MateusVerass/Down


  ██████╗  ██████╗ ██╗    ██╗███╗   ██╗
  ██╔══██╗██╔═══██╗██║    ██║████╗  ██║
  ██║  ██║██║   ██║██║ █╗ ██║██╔██╗ ██║
  ██║  ██║██║   ██║██║███╗██║██║╚██╗██║
  ██████╔╝╚██████╔╝╚███╔███╔╝██║ ╚████║
  ╚═════╝  ╚═════╝  ╚══╝╚══╝ ╚═╝  ╚═══╝

Down crawls any URL and downloads everything — images, videos, documents, audio.
Bypasses Akamai · Cloudflare · Imperva via Chrome TLS fingerprint impersonation.



Install

pip install requests beautifulsoup4 curl-cffi

curl-cffi impersonates Chrome's exact TLS fingerprint, bypassing CDN bot detection.
If it's not installed, Down auto-installs it on first run.
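The auto-install behaviour described above could look roughly like the following. This is a minimal sketch, not Down's actual code: the helper names `ensure_module` and `pip_install` and the injectable `installer` parameter are hypothetical, and it assumes `pip` is available for the running interpreter.

```python
import importlib
import subprocess
import sys


def pip_install(package):
    """Install a package with pip for the interpreter running this script."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_module(module_name, package_name, installer=pip_install):
    """Import module_name, installing package_name first if the import fails.

    Hypothetical helper: Down's real bootstrap logic may differ.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        installer(package_name)
        importlib.invalidate_caches()  # let the import system see the new package
        return importlib.import_module(module_name)
```

A call such as `ensure_module("curl_cffi", "curl-cffi")` would cover this case; note that the import name uses an underscore while the package name on PyPI uses a hyphen.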

Usage

python3 down.py <URL>

That's it. Output folder is named automatically from the URL.
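One way to derive a folder name from a URL is sketched below. The rule (host without a leading `www.`, joined to the path segments with hyphens) is an assumption inferred from the single example in this README, where `https://www.war.gov/UFO/` is saved to `./war.gov-UFO`. The function name `output_dir_name` is hypothetical, and Down's real naming logic may handle other URLs differently.

```python
import re
from urllib.parse import urlparse


def output_dir_name(url):
    """Build a directory name from a URL (assumed rule, see lead-in)."""
    parsed = urlparse(url)
    host = parsed.hostname or "site"
    if host.startswith("www."):
        host = host[4:]
    # Keep non-empty path segments, e.g. "/UFO/" -> ["UFO"].
    parts = [p for p in parsed.path.split("/") if p]
    name = "-".join([host] + parts)
    # Replace characters that are awkward in directory names.
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)
```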

positional arguments:
  url                   Target URL

options:
  -o, --output DIR      Output directory (default: auto from URL)
  -t, --types LIST      images, videos, documents, audio, all (default: all)
  -d, --depth INT       Crawl depth (default: 1)
  -j, --threads INT     Concurrent downloads (default: 8)
  --delay FLOAT         Delay between requests (default: 0.1s)
  --no-crawl            Download the URL directly without crawling
  --list                List found URLs without downloading
  --html FILE           Use a local HTML file (last resort bypass)
  --verbose             Show which fetch strategy succeeded
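For readers who want to script against the same interface, the options above map naturally onto `argparse`. The sketch below reproduces the flags and defaults listed in the help text. How down.py actually wires them is not shown in this README, so `build_parser` and `parse_types` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (assumed wiring)."""
    p = argparse.ArgumentParser(prog="down.py")
    p.add_argument("url", help="Target URL")
    p.add_argument("-o", "--output", metavar="DIR", default=None,
                   help="Output directory (default: auto from URL)")
    p.add_argument("-t", "--types", metavar="LIST", default="all",
                   help="images, videos, documents, audio, all")
    p.add_argument("-d", "--depth", metavar="INT", type=int, default=1,
                   help="Crawl depth")
    p.add_argument("-j", "--threads", metavar="INT", type=int, default=8,
                   help="Concurrent downloads")
    p.add_argument("--delay", metavar="FLOAT", type=float, default=0.1,
                   help="Delay between requests, in seconds")
    p.add_argument("--no-crawl", action="store_true",
                   help="Download the URL directly without crawling")
    p.add_argument("--list", action="store_true",
                   help="List found URLs without downloading")
    p.add_argument("--html", metavar="FILE",
                   help="Use a local HTML file")
    p.add_argument("--verbose", action="store_true",
                   help="Show which fetch strategy succeeded")
    return p


def parse_types(value):
    """Split a comma-separated --types value into lowercase category names."""
    return [t.strip().lower() for t in value.split(",") if t.strip()]
```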

Examples

# Download everything from a page
python3 down.py https://example.com

# Only images and videos
python3 down.py https://example.com -t images,videos

# Crawl 2 levels deep with 8 parallel threads (8 is already the default)
python3 down.py https://example.com -d 2 -j 8

# See what would be downloaded without downloading
python3 down.py https://example.com --list

# Download one file directly
python3 down.py https://example.com/video.mp4 --no-crawl
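The `-j` and `--delay` options used above suggest a bounded worker pool with a short pause before each request. The following is a minimal sketch of that pattern using the standard library. `download_all` and the `download_one` callback are hypothetical names, and applying the delay per task is an assumption about how the pause is used.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, download_one, threads=8, delay=0.1):
    """Run download_one(url) for every URL with at most `threads` workers.

    Returns a dict mapping each URL to (ok, result_or_exception).
    """
    results = {}

    def worker(url):
        time.sleep(delay)  # simple politeness pause before each request
        return download_one(url)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {pool.submit(worker, u): u for u in urls}
        for fut in as_completed(futures):
            url = futures[fut]
            try:
                results[url] = (True, fut.result())
            except Exception as exc:  # one failed file must not stop the batch
                results[url] = (False, exc)
    return results
```

Collecting failures instead of raising lets the caller print a per-file status line and a final summary, similar in spirit to the OK counters in the sample run below.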

Real example — war.gov/UFO declassified UAP files

Source: https://www.war.gov/UFO/
U.S. Department of War — Presidential Unsealing and Reporting System for UAP Encounters.
The site uses Akamai EdgeSuite bot protection. Down bypasses it automatically via Chrome TLS impersonation and auto-discovers the embedded CSV data source (161 records, 17 pages) to extract all file links.

python3 down.py https://www.war.gov/UFO/
  ██████╗  ██████╗ ██╗    ██╗███╗   ██╗
  ██╔══██╗██╔═══██╗██║    ██║████╗  ██║
  ██║  ██║██║   ██║██║ █╗ ██║██╔██╗ ██║
  ██║  ██║██║   ██║██║███╗██║██║╚██╗██║
  ██████╔╝╚██████╔╝╚███╔███╔╝██║ ╚████║
  ╚═════╝  ╚═════╝  ╚══╝╚══╝ ╚═╝  ╚═══╝

  Down crawls any URL and downloads everything.
  Images · Videos · Documents · Audio · 166 extensions

  Target  : https://www.war.gov/UFO/
  Output  : ./war.gov-UFO
  Types   : all  (166 extensions)
  Depth   : 1
  Threads : 8
  Engine  : curl-cffi + requests + urllib + curl

[*] Crawling...
[+] Found 278 file(s)

  [  1/278]  OK      7.7 MB  DOD-STRATEGIC-MGMT-PLAN-2023.PDF
  [  2/278]  OK      6.7 MB  2026-NATIONAL-DEFENSE-STRATEGY.PDF
  [  3/278]  OK      1.5 MB  2024-04-30-Composite-Sketch.jpg
  [  4/278]  OK      1.2 MB  FBI-Photo-1.jpg
  [  5/278]  OK      1.5 MB  NASA-UAP-VM6-Apollo-17-1972.jpg
  [  6/278]  OK    841.2 KB  DOW-UAP-PR38-Middle-East-2013.jpg
  ...

============================================================
  Done
  OK   : 278
  Size : ~1.2 GB
  Dir  : ./war.gov-UFO
============================================================

  documents/
    2026-NATIONAL-DEFENSE-STRATEGY.PDF              6.7 MB
    DOD-STRATEGIC-MGMT-PLAN-2023.PDF                7.7 MB
    65_hs1-834228961_62-hq-83894_section_1.pdf      4.1 MB
    65_hs1-834228961_62-hq-83894_section_2.pdf      3.8 MB
    ...  (145 PDFs total)
  images/
    2024-04-30-Composite-Sketch.jpg                 1.5 MB
    FBI-Photo-1.jpg                                 1.2 MB
    NASA-UAP-VM6-Apollo-17-1972.jpg                 1.5 MB
    DOW-UAP-PR19-Middle-East-May-2022.jpg           1.2 MB
    ...  (133 images total)

What Down found automatically on this site:

  • Akamai bot protection → bypassed via curl-cffi Chrome TLS fingerprint
  • JavaScript fetch('/Portals/1/Interactive/2026/UFO/uap-csv.csv') → detected and parsed
  • 161 UAP records across 17 pages → all extracted from the CSV in one pass
  • 145 declassified PDF documents
  • 133 UAP report images (FBI, NASA, DoD, composite sketches)
  • Slideshow images from the main page gallery
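The CSV discovery step in the list above can be approximated with a regular expression over the page source followed by a pass over the CSV cells. This is a sketch under stated assumptions rather than Down's implementation: it only catches `fetch()` calls with a literal string argument, the function names are hypothetical, and the CSV layout in the test data is invented for illustration.

```python
import csv
import io
import re
from urllib.parse import urljoin

# Matches fetch('...') or fetch("...") with a literal string argument.
FETCH_RE = re.compile(r"""fetch\(\s*(['"])(.+?)\1""")

# A cell counts as a file link if it contains a slash and ends in an extension.
FILE_RE = re.compile(r"\.[A-Za-z0-9]{2,5}$")


def find_fetch_urls(html, base_url):
    """Return absolute URLs of literal fetch() targets found in page source."""
    return [urljoin(base_url, m.group(2)) for m in FETCH_RE.finditer(html)]


def links_from_csv(csv_text, base_url):
    """Collect CSV cells that look like file paths or URLs, made absolute."""
    links = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            cell = cell.strip()
            if "/" in cell and FILE_RE.search(cell):
                links.append(urljoin(base_url, cell))
    return links
```

Reading every record from the data file directly is what makes pagination irrelevant: the 17 pages are a rendering detail of the site, while the CSV holds all rows at once.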

How it bypasses bot protection

Down tries 4 strategies automatically, in order:

#  Strategy                                                 Bypasses
1  curl-cffi: Chrome TLS fingerprint impersonation          Akamai, Cloudflare, Imperva
2  requests: fast, standard Python HTTP                     Basic blocks
3  urllib: different SSL stack                              Some TLS-based blocks
4  system curl: native binary with full Sec-Fetch headers   Most remaining CDN checks

If all 4 fail (e.g. login required), save the page with your browser and use --html page.html.
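The ordered fallback described above can be sketched as a list of fetch functions tried one after another. The code below is illustrative only. The function names, the User-Agent string, and the particular Sec-Fetch headers are assumptions, and it presumes a recent curl-cffi release that accepts `impersonate="chrome"` (older releases need a versioned target such as `"chrome110"`). Third-party modules are imported lazily so that a missing package simply counts as a failed strategy.

```python
import subprocess
import urllib.request

# Placeholder User-Agent for illustration; not taken from Down.
UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/120 Safari/537.36"


def via_curl_cffi(url):
    from curl_cffi import requests as creq  # third-party, imported lazily
    r = creq.get(url, impersonate="chrome", timeout=30)
    r.raise_for_status()
    return r.text


def via_requests(url):
    import requests  # third-party, imported lazily
    r = requests.get(url, headers={"User-Agent": UA}, timeout=30)
    r.raise_for_status()
    return r.text


def via_urllib(url):
    req = urllib.request.Request(url, headers={"User-Agent": UA})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def via_system_curl(url):
    cmd = ["curl", "-sSL", "--fail", "-A", UA,
           "-H", "Sec-Fetch-Mode: navigate",
           "-H", "Sec-Fetch-Dest: document", url]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


STRATEGIES = [("curl-cffi", via_curl_cffi), ("requests", via_requests),
              ("urllib", via_urllib), ("curl", via_system_curl)]


def fetch(url, strategies=STRATEGIES):
    """Try each strategy in order; return (name, html) for the first success."""
    errors = []
    for name, func in strategies:
        try:
            return name, func(url)
        except Exception as exc:  # any failure moves on to the next strategy
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))
```

Returning the strategy name alongside the page is what a flag like `--verbose` would need in order to report which strategy succeeded.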

Output structure

<site-name>/
  images/       jpg, png, gif, webp, heic, avif, svg, psd, raw, cr2, dng …
  videos/       mp4, mkv, avi, mov, webm, ts, 3gp, flv, rmvb, mxf …
  documents/    pdf, docx, xlsx, epub, zip, iso, dmg …
  audio/        mp3, flac, wav, aac, ogg, opus, aiff, dsd …
  others/       anything else
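Sorting files into these folders comes down to a case-insensitive extension lookup with `others` as the fallback. The sketch below uses only the extensions spelled out in the tree above, not the full set of 166, and the names `CATEGORIES` and `category_for` are hypothetical.

```python
import os

# Partial mapping: only the extensions named in the output tree above.
CATEGORIES = {
    "images": {"jpg", "png", "gif", "webp", "heic", "avif", "svg", "psd",
               "raw", "cr2", "dng"},
    "videos": {"mp4", "mkv", "avi", "mov", "webm", "ts", "3gp", "flv",
               "rmvb", "mxf"},
    "documents": {"pdf", "docx", "xlsx", "epub", "zip", "iso", "dmg"},
    "audio": {"mp3", "flac", "wav", "aac", "ogg", "opus", "aiff", "dsd"},
}


def category_for(filename):
    """Return the output subfolder for a filename based on its extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    for folder, extensions in CATEGORIES.items():
        if ext in extensions:
            return folder
    return "others"
```

Lowercasing the extension matters in practice: the sample run above includes files ending in `.PDF`, which still belong in `documents/`.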

Supported extensions — 166 total

Type       Count  Formats
images     48     jpg jpeg png gif webp heic avif psd xcf svg ai eps raw cr2 cr3 nef arw dng …
videos     45     mp4 mkv avi mov webm ts m2ts 3gp f4v vob rmvb mxf dv divx xvid …
documents  47     pdf docx xlsx pptx epub mobi djvu zip rar 7z tar gz iso dmg deb …
audio      27     mp3 flac wav aac ogg opus aiff wma ape dsd mid amr …
