Suntooth/youtube-id-scraper

YouTube ID Scraper

Scrapes YouTube video IDs from a list of channels using yt-dlp. It was designed primarily for YouTube IDs; in theory it can be used for other platforms too, but it may need tweaking.

Instructions

  1. Download ids.py.
  2. Install yt-dlp.
  3. In the same directory as ids.py, create a file named batch.txt with a list of YouTube links (channels, videos, or playlists), one per line. Normal yt-dlp batch file rules apply: lines starting with #, ;, or ] will be considered comments and ignored.
  4. Run ids.py. It will create a file named ids.txt containing one entry per line in standard yt-dlp archive format (the platform followed by the ID, e.g. youtube dQw4w9WgXcQ). If ids.txt already exists, videos already listed in it are skipped and new IDs are appended to the end.
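
The two file formats in steps 3 and 4 can be illustrated with a short sketch. This is not code from ids.py: the helper names read_batch_urls and append_new_ids are hypothetical, and the comment handling is a simplified approximation of the batch file rules described above rather than yt-dlp's exact parser.

```python
from pathlib import Path

# Prefixes that mark a comment line in a yt-dlp batch file.
COMMENT_PREFIXES = ("#", ";", "]")


def read_batch_urls(batch_path):
    """Return the non-blank, non-comment lines of a batch file (hypothetical helper)."""
    urls = []
    for line in Path(batch_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith(COMMENT_PREFIXES):
            urls.append(line)
    return urls


def append_new_ids(archive_path, entries):
    """Append archive entries such as "youtube dQw4w9WgXcQ" that are not listed yet.

    Returns the entries that were actually appended (hypothetical helper).
    """
    path = Path(archive_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    seen = {line.strip() for line in existing.splitlines() if line.strip()}
    added = []
    with path.open("a", encoding="utf-8") as fh:
        # Avoid gluing a new entry onto an unterminated last line.
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        for entry in entries:
            entry = entry.strip()
            if entry and entry not in seen:
                fh.write(entry + "\n")
                seen.add(entry)
                added.append(entry)
    return added
```

Calling append_new_ids("ids.txt", ["youtube dQw4w9WgXcQ"]) twice in a row would write the entry only once, which mirrors the skip-and-append behaviour described in step 4.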

Details

  • The only module this program uses (other than yt-dlp itself) is subprocess, which is part of the Python standard library, so nothing extra needs to be installed. It was written in Python 3.12.12.
  • On some systems with specific yt-dlp setups, the paths to ids.txt and batch.txt may need to be absolute rather than relative; otherwise the files may be looked for or created in the folder where yt-dlp is installed. This happened on my Raspberry Pi, for example.
  • Cookies aren't required for using this tool with YouTube. It ignores all warnings and errors, so even if the "sign in to confirm you're not a bot" error occurs, the ID is still logged.
  • The program will loop once it's exhausted the batch.txt list, as it's designed to run constantly without any human input once it's been started. It re-processes batch.txt on every loop.
  • If you want to run this on Windows and you use the exe of yt-dlp, you need to place yt-dlp.exe in the same folder as ids.py and modify ids.py to reference yt-dlp.exe instead of just yt-dlp.
  • Other than ids.txt, no files are permanently downloaded, as the program uses yt-dlp's --simulate option.
  • On my Raspberry Pi 5 with decent internet, this program scrapes about 4-5 IDs per second, or roughly 16k an hour (around 375k a day if that rate is sustained).
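
Several of the points above (absolute paths, the Windows executable name, --simulate, ignoring errors, and the endless loop) can be pictured together. The following is a minimal sketch under stated assumptions, not the actual contents of ids.py: the function names, the pause between passes, and the use of --download-archive with --force-write-archive to record IDs during a simulated run are all assumptions. In particular, this flag-based approach may not record IDs for videos that fail with an error, which ids.py is described as doing.

```python
import subprocess
import sys
import time
from pathlib import Path


def ytdlp_executable(base_dir, platform=sys.platform):
    """Use a local yt-dlp.exe on Windows, otherwise yt-dlp from PATH (assumed layout)."""
    if platform.startswith("win"):
        return str(Path(base_dir) / "yt-dlp.exe")
    return "yt-dlp"


def build_command(base_dir, platform=sys.platform):
    """Build a yt-dlp command line that uses absolute paths for both files."""
    base = Path(base_dir).resolve()
    return [
        ytdlp_executable(base, platform),
        "--simulate",               # do not download any media
        "--ignore-errors",          # keep going when a video fails
        "--no-warnings",
        "--batch-file", str(base / "batch.txt"),
        "--download-archive", str(base / "ids.txt"),
        "--force-write-archive",    # assumption: write archive entries despite --simulate
    ]


def run_forever(base_dir, pause_seconds=60):
    """Re-process batch.txt indefinitely; pause_seconds is a hypothetical setting."""
    while True:
        # check=False: a non-zero exit from yt-dlp should not stop the loop.
        subprocess.run(build_command(base_dir), check=False)
        time.sleep(pause_seconds)
```

A script using this sketch would call run_forever(Path(__file__).resolve().parent) so that batch.txt and ids.txt are always resolved next to the script, regardless of the working directory.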
