dcrawl – Web Crawler For Unique Domains


dcrawl is a simple, but smart, multithreaded web crawler for randomly gathering huge lists of unique domain names.

dcrawl - Web Crawler For Unique Domains


How does dcrawl work?

dcrawl takes one site URL as input and detects all a href= links in the site’s body. Each found link is put into the queue. Successively, each queued link is crawled in the same way, branching out to more URLs found in links on each site’s body.

dcrawl Web Crawler Features

  • Branching out only to predefined number of links found per one hostname.
  • Maximum number of allowed different hostnames per one domain (avoids subdomain crawling hell e.g. blogspot.com).
  • Can be restarted with same list of domains – last saved domains are added to the URL queue.
  • Crawls only sites that return text/html Content-Type in HEAD response.
  • Retrieves site body of maximum 1MB size.
  • Does not save inaccessible domains.

dcrawl Usage


Example:

There are other tools which do similar things, or could be scripted together recursively to perform a similar kind of task – but nothing this focused. Examples would be:

Host-Extract – Enumerate All IP/Host Patterns In A Web Page
Recon-ng – Web Reconnaissance Framework

You can download dcrawl go web crawler here:

dcrawl-master.zip

Or read more here.

Posted in: Hacking Tools

, ,


Latest Posts:


dSploit APK Download - Hacking & Security Toolkit For Android dSploit APK Download – Hacking & Security Toolkit For Android
dSploit APK Download is a Hacking & Security Toolkit For Android which can conduct network analysis and penetration testing activities.
Scallion - GPU Based Onion Hash Generator Scallion – GPU Based Onion Hash Generator
Scallion is a GPU-driven Onion Hash Generator written in C#, it lets you create vanity GPG keys and .onion addresses (for Tor's hidden services).
WiFi-Dumper - Dump WiFi Profiles and Cleartext Passwords WiFi-Dumper – Dump WiFi Profiles and Cleartext Passwords
WiFi-Dumper is an open-source Python-based tool to dump WiFi profiles and cleartext passwords of the connected access points on a Windows machine.
truffleHog - Search Git for High Entropy Strings with Commit History truffleHog – Search Git for High Entropy Strings with Commit History
truffleHog is a Python-based tool to search Git for high entropy strings, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.
AIEngine - AI-driven Network Intrusion Detection System AIEngine – AI-driven Network Intrusion Detection System
AIEngine is a next-generation interactive/programmable Python/Ruby/Java/Lua and Go AI-driven Network Intrusion Detection System engine with many capabilities.
Sooty - SOC Analyst All-In-One CLI Tool Sooty – SOC Analyst All-In-One CLI Tool
Sooty is a tool developed with the task of aiding a SOC analyst to automate parts of their workflow and speed up their process.


Comments are closed.