robots-parser
A specification-compliant robots.txt parser with wildcard (*) matching support.
simplecrawler
A very straightforward, event-driven web crawler. Features a flexible queue interface and a basic cache mechanism with an extensible backend.
isbot-fast
JavaScript module that detects bots/crawlers/spiders via the user-agent string
shift-parser
ECMAScript parser that produces a Shift format AST
spider-detector
A tiny Node module that quickly detects spiders/crawlers and comes with optional middleware for ExpressJS