dcrawl – Web Crawler For Unique Domains


dcrawl is a simple, but smart, multithreaded web crawler for randomly gathering huge lists of unique domain names.

dcrawl - Web Crawler For Unique Domains


How does dcrawl work?

dcrawl takes one site URL as input and detects all a href= links in the site’s body. Each found link is put into the queue. Successively, each queued link is crawled in the same way, branching out to more URLs found in links on each site’s body.

dcrawl Web Crawler Features

  • Branching out only to predefined number of links found per one hostname.
  • Maximum number of allowed different hostnames per one domain (avoids subdomain crawling hell e.g. blogspot.com).
  • Can be restarted with same list of domains – last saved domains are added to the URL queue.
  • Crawls only sites that return text/html Content-Type in HEAD response.
  • Retrieves site body of maximum 1MB size.
  • Does not save inaccessible domains.

dcrawl Usage


Example:

There are other tools which do similar things, or could be scripted together recursively to perform a similar kind of task – but nothing this focused. Examples would be:

Host-Extract – Enumerate All IP/Host Patterns In A Web Page
Recon-ng – Web Reconnaissance Framework

You can download dcrawl go web crawler here:

dcrawl-master.zip

Or read more here.

Posted in: Hacking Tools

, ,


Latest Posts:


truffleHog - Search Git for High Entropy Strings with Commit History truffleHog – Search Git for High Entropy Strings with Commit History
truffleHog is a Python-based tool to search Git for high entropy strings, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.
AIEngine - AI-driven Network Intrusion Detection System AIEngine – AI-driven Network Intrusion Detection System
AIEngine is a next-generation interactive/programmable Python/Ruby/Java/Lua and Go AI-driven Network Intrusion Detection System engine with many capabilities.
Sooty - SOC Analyst All-In-One CLI Tool Sooty – SOC Analyst All-In-One CLI Tool
Sooty is a tool developed with the task of aiding a SOC analyst to automate parts of their workflow and speed up their process.
UBoat - Proof Of Concept PoC HTTP Botnet Project UBoat – Proof Of Concept PoC HTTP Botnet Project
UBoat is a PoC HTTP Botnet designed to replicate a full weaponised commercial botnet like the famous large scale infectors Festi, Grum, Zeus and SpyEye.
LambdaGuard - AWS Lambda Serverless Security Scanner LambdaGuard – AWS Lambda Serverless Security Scanner
LambdaGuard is a tool which allows you to visualise and audit the security of your serverless assets, an open-source AWS Lambda Serverless Security Scanner.
exe2powershell - Convert EXE to BAT Files exe2powershell – Convert EXE to BAT Files
exe2powershell is used to convert EXE to BAT files, the previously well known tool for this was exe2bat, this is a version for modern Windows.


Comments are closed.