🔗 Part 4: URL Collection & Analysis
Gather all URLs, endpoints, and parameters from your target
⚠️ Only test domains you own or have permission to test
📌 What is URL Collection?
This phase gathers every URL, endpoint, and parameter you can find for your target. Together they form the attack surface you will later test for vulnerabilities such as XSS, SQLi, and SSRF.
Passive URL Collection
GAU (Get All URLs)
cat alive.txt | gau --threads 10 | sort -u > all_urls.txt
Fetch URLs from AlienVault OTX, Wayback Machine, Common Crawl, and URLScan
Waybackurls
cat alive.txt | waybackurls | sort -u > wayback_urls.txt
Get URLs from Wayback Machine archive
GAU with Filters
echo "example.com" | gau --mc 200 --fc 404,403 --mt text/html | urldedupe > filtered_urls.txt
Keep only status 200 responses with a text/html MIME type and drop 403/404 results
Active Crawling
Katana
katana -u https://example.com -d 3 -jc -kf all -aff -o katana_urls.txt
Fast web crawler with JS parsing and form extraction
Hakrawler
cat alive.txt | hakrawler -d 3 -subs | sort -u > hakrawler_urls.txt
Simple, fast web crawler
Gospider
gospider -S alive.txt -c 10 -d 2 --other-source -o gospider_output
Advanced spider with sitemap, robots.txt parsing
📜 JavaScript Analysis
Extract JS Files
cat all_urls.txt | grep -E "\.js($|\?)" | sort -u > js_files.txt
Extract all JavaScript file URLs
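To see what the pattern keeps, here is a small sketch on made-up URLs (every host and path below is illustrative). It matches .js only at the end of the URL or directly before a query string, so .json and .jsp entries are skipped.

```shell
# Sample input; all URLs here are hypothetical.
urls='https://example.com/static/app.js
https://example.com/static/app.js?v=3
https://example.com/api/data.json
https://example.com/index.jsp
https://example.com/lib/vendor.min.js'

# Keep URLs whose path ends in .js, with or without a query string.
js_files=$(printf '%s\n' "$urls" | grep -E '\.js($|\?)' | sort -u)
printf '%s\n' "$js_files"
```

Versioned copies such as app.js?v=3 remain as separate entries, which is usually what you want when old builds may still expose different endpoints.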
LinkFinder (JS Endpoints)
python3 linkfinder.py -i https://example.com/script.js -o cli
Find endpoints in JavaScript files
SecretFinder
python3 SecretFinder.py -i https://example.com/script.js -o cli
Find secrets (API keys, tokens) in JS files
🔍 Parameter Extraction
Extract URLs with Parameters
cat all_urls.txt | grep '=' | sort -u > params_urls.txt
Find all URLs containing parameters
ParamSpider
python3 paramspider.py -d example.com -o paramspider.txt
Extract URLs with parameters from web archives
Unfurl
cat params_urls.txt | unfurl format '%p%?%q' > extracted_params.txt
Print each URL's path and query string (the format string is quoted so the shell does not treat ? as a glob)
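If unfurl is not available, a rough count of parameter names can be sketched with standard tools. This is a simplified stand-in that assumes plain key=value query strings and does no URL decoding; the sample URLs are hypothetical.

```shell
# Sample input; all URLs here are hypothetical.
urls='https://example.com/item?id=1&page=2
https://example.com/item?id=7
https://example.com/go?redirect=/home&id=3
https://example.com/about'

# Take the query string, split it on '&', keep the name before '=',
# then count how often each name occurs (most frequent first).
param_counts=$(printf '%s\n' "$urls" \
  | grep -F '?' \
  | sed -e 's/^[^?]*?//' -e 's/#.*$//' \
  | tr '&' '\n' \
  | cut -d= -f1 \
  | sort | uniq -c | sort -rn)
printf '%s\n' "$param_counts"
```

Names that appear on many endpoints are good first candidates for the pattern matching in the next section.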
🎯 GF Pattern Matching
GF - XSS Patterns
cat all_urls.txt | gf xss | sort -u > xss_potential.txt
Filter URLs potentially vulnerable to XSS
GF - SQLi Patterns
cat all_urls.txt | gf sqli | sort -u > sqli_potential.txt
Filter URLs potentially vulnerable to SQL injection
GF - SSRF Patterns
cat all_urls.txt | gf ssrf | sort -u > ssrf_potential.txt
Filter URLs potentially vulnerable to SSRF
GF - LFI Patterns
cat all_urls.txt | gf lfi | sort -u > lfi_potential.txt
Filter URLs potentially vulnerable to LFI
GF - Redirect Patterns
cat all_urls.txt | gf redirect | sort -u > redirect_potential.txt
Filter URLs potentially vulnerable to open redirect
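Under the hood, gf is a wrapper around grep that loads named patterns from ~/.gf. The sketch below imitates the idea with a short, hypothetical list of redirect-style parameter names; it is not the actual content of any Gf-Patterns file, and the sample URLs are made up.

```shell
# Sample input; all URLs here are hypothetical.
urls='https://example.com/login?next=/dashboard
https://example.com/item?id=5
https://example.com/out?url=https://other.example
https://example.com/search?q=shoes'

# Hypothetical short list of redirect-style parameter names.
redirect_params='next|url|redirect|return'

# Match a name only where it starts a parameter (after ? or &) and is followed by '='.
redirect_candidates=$(printf '%s\n' "$urls" | grep -iE "[?&](${redirect_params})=")
printf '%s\n' "$redirect_candidates"
```

A match only means the parameter name looks interesting; each candidate still has to be tested by hand or with a scanner.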
🔄 Deduplication & Processing
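Sort & Merge
cat all_urls.txt wayback_urls.txt filtered_urls.txt katana_urls.txt hakrawler_urls.txt | sort -u > all_unique_urls.txt
Merge and deduplicate the collected URL files (listing them explicitly keeps non-URL files such as alive.txt, and the output file itself, out of the merge)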
urldedupe
cat all_unique_urls.txt | urldedupe > deduped_urls.txt
Advanced deduplication with query string handling
uro
cat all_urls.txt | uro > uro_urls.txt
Another deduplication tool with path normalization
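The idea behind query-aware deduplication is that URLs differing only in parameter values rarely need separate testing. The awk sketch below illustrates that idea on made-up URLs; it is not the algorithm urldedupe or uro actually use, and it treats a different parameter order as a different shape.

```shell
# Sample input; all URLs here are hypothetical.
urls='https://example.com/item?id=1
https://example.com/item?id=2
https://example.com/item?id=2&page=1
https://example.com/about'

# Build a key from the part before '?' plus the parameter names (values dropped),
# and print only the first URL seen for each key.
deduped=$(printf '%s\n' "$urls" | awk '
{
  q = index($0, "?")
  key = (q ? substr($0, 1, q - 1) : $0)
  if (q) {
    m = split(substr($0, q + 1), pairs, "&")
    for (i = 1; i <= m; i++) {
      split(pairs[i], kv, "=")
      key = key "|" kv[1]
    }
  }
  if (!(key in seen)) { seen[key] = 1; print }
}')
printf '%s\n' "$deduped"
```

Here item?id=2 is dropped because item?id=1 already covers the same path and parameter set, while item?id=2&page=1 is kept because it adds a new parameter.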
⚡ Complete Pipelines
Full Recon Pipeline
subfinder -d example.com -silent | httpx -silent | gau | grep "=" | uro | tee recon_final.txt
Complete pipeline: subdomains → live hosts → URLs → parameters
XSS-Focused Pipeline
echo "example.com" | gau | gf xss | uro | qsreplace '">alert(1)' | httpx -silent -mr 'alert'
Find and test potential XSS endpoints
📦 Installation Commands
# Install Go tools
go install -v github.com/lc/gau/v2/cmd/gau@latest
go install -v github.com/tomnomnom/waybackurls@latest
go install -v github.com/tomnomnom/gf@latest
go install -v github.com/tomnomnom/unfurl@latest
go install -v github.com/hakluke/hakrawler@latest
go install -v github.com/projectdiscovery/katana/cmd/katana@latest
go install -v github.com/jaeles-project/gospider@latest
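# Install deduplication tools (urldedupe is C++ and needs cmake; uro is Python)
git clone https://github.com/ameenmaali/urldedupe.git
(cd urldedupe && cmake CMakeLists.txt && make)
pip3 install uro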
# Install GF patterns
git clone https://github.com/1ndianl33t/Gf-Patterns
mkdir -p ~/.gf
cp -r Gf-Patterns/*.json ~/.gf/
# Install Python tools
git clone https://github.com/GerbenJavado/LinkFinder.git
pip3 install -r LinkFinder/requirements.txt
git clone https://github.com/m4ll0k/SecretFinder.git
pip3 install -r SecretFinder/requirements.txt
📝 Common Parameter Names to Watch For
?id=
?page=
?file=
?redirect=
?url=
?path=
?return=
?next=
?src=
?q=
?search=
?callback=
?api_key=
?token=
?debug=
?test=
💡 Pro Tips
- ✓ Run passive collection multiple times over weeks - new URLs appear constantly
- ✓ Always check JavaScript files for hidden endpoints and API keys
- ✓ Use GF patterns to focus on high-value targets first
- ✓ Save all URLs - you'll use them for multiple vulnerability types
- ✓ Combine results from multiple tools - each finds different URLs
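To act on the first tip, you can compare a fresh collection run against the previous one and review only what is new. A minimal sketch, assuming two sorted URL lists; the file names urls_old.txt and urls_new.txt and their contents are hypothetical.

```shell
# Build two small sample runs in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 'https://example.com/a' 'https://example.com/b' \
  | sort -u > "$workdir/urls_old.txt"
printf '%s\n' 'https://example.com/a' 'https://example.com/b' 'https://example.com/c?id=1' \
  | sort -u > "$workdir/urls_new.txt"

# comm -13 prints lines that appear only in the second (newer) sorted file.
new_urls=$(comm -13 "$workdir/urls_old.txt" "$workdir/urls_new.txt")
printf '%s\n' "$new_urls"

rm -rf "$workdir"
```

Both inputs must be sorted the same way, which is why each list goes through sort -u before the comparison.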