dcrawl – Web Crawler For Unique Domains

Outsmart Malicious Hackers


dcrawl is a simple, but smart, multithreaded web crawler for randomly gathering huge lists of unique domain names.

dcrawl - Web Crawler For Unique Domains


How does dcrawl work?

dcrawl takes one site URL as input and detects all a href= links in the site’s body. Each found link is put into the queue. Successively, each queued link is crawled in the same way, branching out to more URLs found in links on each site’s body.

dcrawl Web Crawler Features

  • Branching out only to predefined number of links found per one hostname.
  • Maximum number of allowed different hostnames per one domain (avoids subdomain crawling hell e.g. blogspot.com).
  • Can be restarted with same list of domains – last saved domains are added to the URL queue.
  • Crawls only sites that return text/html Content-Type in HEAD response.
  • Retrieves site body of maximum 1MB size.
  • Does not save inaccessible domains.

dcrawl Usage


Example:

There are other tools which do similar things, or could be scripted together recursively to perform a similar kind of task – but nothing this focused. Examples would be:

Host-Extract – Enumerate All IP/Host Patterns In A Web Page
Recon-ng – Web Reconnaissance Framework

You can download dcrawl go web crawler here:

dcrawl-master.zip

Or read more here.

Learn about Hacking Tools



Posted in: Hacking Tools

, ,

Latest Posts:


AWSBucketDump - AWS S3 Security Scanning Tool AWSBucketDump – AWS S3 Security Scanning Tool
AWSBucketDump is an AWS S3 Security Scanning Tool, which allows you to quickly enumerate AWS S3 buckets to look for interesting or confidential files.
nbtscan Download - NetBIOS Scanner For Windows & Linux nbtscan Download – NetBIOS Scanner For Windows & Linux
nbtscan is a command-line NetBIOS scanner for Windows that is SUPER fast, it scans for open NetBIOS nameservers on a local or remote TCP/IP network.
Equifax Data Breach - Hack Due To Missed Apache Patch Equifax Data Breach – Hack Due To Missed Apache Patch
The Equifax data breach is pretty huge with 143 million records leaked from the hack in the US alone with unknown more in Canada and the UK.
Seth - RDP Man In The Middle Attack Tool Seth – RDP Man In The Middle Attack Tool
Seth is an RDP Man In The Middle attack tool written in Python to MiTM RDP connections by attempting to downgrade the connection to extract clear text creds
dcrawl - Web Crawler For Unique Domains dcrawl – Web Crawler For Unique Domains
dcrawl is a simple, but smart, multithreaded web crawler for randomly gathering huge lists of unique domain names. It will branch out indefinitely.
Time Warner Hacked - AWS Config Exposes 4M Subscribers Time Warner Hacked – AWS Config Exposes 4M Subscribers
What's the latest on the web, Time Warner Hacked is what it's about now as a bad AWS S3 config (once again) exposes the details of approximately 4M subs.


No comments yet.

Leave a Reply