(Work in progress; proceed with caution)
Lecdown is a small script to automatically download new/updated lecture notes from a configured course website.
- Check configured web pages for new/updated links and download them
- Rename files with scripting
- Use ETag header to efficiently check for updates
- Download HTML pages as PDF
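The ETag check mentioned above can be illustrated with a conditional GET. This is a minimal sketch using only the Python standard library, not lecdown's actual code; the function name `fetch_if_changed` and its `opener` parameter are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, cached_etag=None, opener=urllib.request.urlopen):
    """Return (body, etag). body is None if the server says nothing changed.

    Hypothetical helper: `opener` exists only so the network call can be
    swapped out, e.g. in tests.
    """
    request = urllib.request.Request(url)
    if cached_etag:
        # Ask the server to send the body only if its ETag differs from ours.
        request.add_header("If-None-Match", cached_etag)
    try:
        with opener(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return None, cached_etag
        raise
```

Storing the returned ETag next to each file and passing it back on the next run means unchanged files cost one small request instead of a full download.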
Install this package:
pip install git+https://github.com/jasonchoimtt/lecdown
Install ChromeDriver:
# On Mac:
brew install chromedriver
# On Linux: I don't know...
# This creates lecdown.json in the current directory
lecdown init
# Add a page to extract links from
lecdown add-source http://path.to/some/course/page
# Download!
lecdown
# List downloaded files
lecdown ls
# Move / Delete downloaded files
lecdown mv weirdly_named_file.pdf lecture_notes.pdf
lecdown rm useless_file.txt
Lecdown works by storing an index in lecdown.json. Currently, it ignores any
HTML links and downloads everything else. It does not recursively scrape
linked pages either. Each downloaded file is associated with its origin link
in lecdown.json.
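The link filtering described above might look roughly like the following. This is a sketch under stated assumptions, not lecdown's implementation: the names `LinkCollector` and `downloadable_links` are hypothetical, and deciding what counts as an "HTML link" by file extension is a guess (the real tool may inspect the Content-Type instead).

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href=...> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they came from.
            self.links.append(urljoin(self.base_url, href))


# Assumption: links with no extension or an HTML extension are pages.
HTML_EXTENSIONS = {"", ".html", ".htm"}


def downloadable_links(page_html, base_url):
    """Return non-HTML http(s) links found directly on the page (no recursion)."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    result = []
    for link in collector.links:
        parsed = urlparse(link)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        extension = posixpath.splitext(parsed.path)[1].lower()
        if extension not in HTML_EXTENSIONS:
            result.append(link)
    return result
```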
Make sure you only use lecdown mv to rename files; otherwise lecdown will
re-download the moved file.
Similarly, if you do not want a file to be re-downloaded, remove it with
lecdown rm instead of deleting it by hand.
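To see why the advice above matters, here is a sketch of how an index keyed by origin link could decide what to download. The schema is hypothetical (the real layout of lecdown.json may differ): each origin URL maps to an entry whose "path" is the local file name, or None once the file was removed on purpose.

```python
import os


def links_to_download(index, scraped_links, root="."):
    """Return the scraped links that still need downloading.

    Assumed index shape (hypothetical): {origin_url: {"path": str or None}}.
    """
    pending = []
    for url in scraped_links:
        entry = index.get(url)
        if entry is None:
            pending.append(url)  # link has never been seen before
        elif entry["path"] is None:
            continue  # removed through the tool: do not fetch again
        elif not os.path.exists(os.path.join(root, entry["path"])):
            # The file was renamed or deleted without updating the index,
            # so it looks missing and gets downloaded again.
            pending.append(url)
    return pending
```

Under this model, a rename done outside the tool leaves the index pointing at a path that no longer exists, which is indistinguishable from a file that was never downloaded.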
Some web pages (e.g. Piazza resources) require a login to be scraped. You can use
lecdown browser to log in to that page, then save the cookie in the console.
Currently, only the link scraper (but not the file downloader) uses the saved
cookie, but it still works for Piazza.