dcrawl – Web Crawler For Unique Domains


dcrawl is a simple, but smart, multithreaded web crawler for randomly gathering huge lists of unique domain names.

dcrawl - Web Crawler For Unique Domains


How does dcrawl work?

dcrawl takes one site URL as input and detects all a href= links in the site’s body. Each found link is put into the queue. Successively, each queued link is crawled in the same way, branching out to more URLs found in links on each site’s body.

dcrawl Web Crawler Features

  • Branching out only to predefined number of links found per one hostname.
  • Maximum number of allowed different hostnames per one domain (avoids subdomain crawling hell e.g. blogspot.com).
  • Can be restarted with same list of domains – last saved domains are added to the URL queue.
  • Crawls only sites that return text/html Content-Type in HEAD response.
  • Retrieves site body of maximum 1MB size.
  • Does not save inaccessible domains.

dcrawl Usage


Example:

There are other tools which do similar things, or could be scripted together recursively to perform a similar kind of task – but nothing this focused. Examples would be:

Host-Extract – Enumerate All IP/Host Patterns In A Web Page
Recon-ng – Web Reconnaissance Framework

You can download dcrawl go web crawler here:

dcrawl-master.zip

Or read more here.

Posted in: Hacking Tools

, ,


Latest Posts:


RandIP - Network Mapper To Find Servers RandIP – Network Mapper To Find Servers
RandIP is a nim-based network mapper application that generates random IP addresses and uses sockets to test whether the connection is valid or not with additional tests for Telnet and SSH.
Nipe - Make Tor Default Gateway For Network Nipe – Make Tor Default Gateway For Network
Nipe is a Perl script to make Tor default gateway for network, this script enables you to directly route all your traffic from your computer to the Tor network.
Mosca - Manual Static Analysis Tool To Find Bugs Mosca – Manual Static Analysis Tool To Find Bugs
Mosca is a manual static analysis tool written in C designed to find bugs in the code before it is compiled, much like a grep unix command.
Slurp - Amazon AWS S3 Bucket Enumerator Slurp – Amazon AWS S3 Bucket Enumerator
Slurp is a blackbox/whitebox S3 bucket enumerator written in Go that can use a permutations list to scan externally or an AWS API to scan internally.
US Government Cyber Security Still Inadequate US Government Cyber Security Still Inadequate
Surprise, surprise, surprise - an internal audit of the US Government cyber security situation has uncovered widespread weaknesses, legacy systems and poor adoption of cyber controls and tooling.
BloodHound - Hacking Active Directory Trust Relationships BloodHound – Hacking Active Directory Trust Relationships
BloodHound is for hacking active directory trust relationships and it uses graph theory to reveal the hidden and often unintended relationships within an AD environment.


Comments are closed.