Toscrape

This is a library to scrape the contents of https://books.toscrape.com. It was created as a result of me trying to learn Rust, and following the most suitable Rust idioms was a core focus.

I've experimented with traits, structs, enums, multithreading, etc., in this project.

The documentation is generated and deployed automatically at https://eeriemyxi.github.io/toscrape/.

If you, for whatever startling reason, desire to use this library, you could add it as a dependency by running this command:

cargo add --git https://github.com/eeriemyxi/toscrape.git toscrape
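For reference, adding the dependency by hand should be roughly equivalent to running the command. This is a sketch of the usual Cargo git-dependency syntax; the exact entry that cargo writes may differ slightly (for example, it may pin extra fields).

```toml
# Cargo.toml of your own project
[dependencies]
toscrape = { git = "https://github.com/eeriemyxi/toscrape.git" }
```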

Examples

Fetch a specific book

use toscrape;

fn main() {
    dbg!(toscrape::fetch_book(
        "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
    ))
    .unwrap();
}

Paginate through a particular category

use toscrape;

fn main() {
    dbg!(
        toscrape::paginate_category(
            "https://books.toscrape.com/catalogue/category/books/historical_42/index.html"
        )
        .unwrap()
        .collect::<Vec<_>>()
    );
}

To paginate from a particular page number, you could do:

for ... in toscrape::paginate_category("...").unwrap().page(2) { ... }
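To illustrate what a builder-style starting page means for an iterator, here is a minimal self-contained sketch. The `Paginator` type, its fields, and `new` are hypothetical stand-ins invented for this example; they are not the library's actual types, and the real paginator fetches pages over the network rather than yielding page numbers.

```rust
/// Hypothetical stand-in for a paginated listing (not the toscrape API).
struct Paginator {
    next_page: usize,
    last_page: usize,
}

impl Paginator {
    /// Start at page 1 of a listing that has `last_page` pages.
    fn new(last_page: usize) -> Self {
        Paginator { next_page: 1, last_page }
    }

    /// Builder-style option: begin iterating at `page` instead of page 1.
    fn page(mut self, page: usize) -> Self {
        self.next_page = page;
        self
    }
}

impl Iterator for Paginator {
    /// Each item is the page number a real paginator would go and fetch.
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.next_page > self.last_page {
            return None; // ran past the final page
        }
        let current = self.next_page;
        self.next_page += 1;
        Some(current)
    }
}
```

Because `page` takes `self` by value and returns it, it chains naturally in front of a `for` loop, e.g. `for p in Paginator::new(4).page(2) { ... }`.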

Fetch and iterate all categories, then iterate every page, and then iterate every card in each page

use toscrape;

fn main() {
    for category in toscrape::fetch_categories().unwrap() {
        dbg!(&category);
        for book in category.paginate().unwrap().flatten() {
            dbg!(&book);
            dbg!(&book.full());
        }
    }
}

Fetching book cards in parallel

This example uses threads to fetch results faster, via the .thread_ahead builder option. The argument is the number of extra threads: 0 means only the main thread is used, so 1 would use two threads.

Note

This is limited to BookCard right now. Extending it to BookDetails is yet to be done.

use toscrape;

fn main() {
    for category in toscrape::fetch_categories().unwrap() {
        dbg!(&category);
        for book in category.paginate().unwrap().thread_ahead(5).flatten() {
            dbg!(&book);
            dbg!(&book.full());
        }
    }
}
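The "0 means main thread only, 1 means two threads" convention can be sketched with the standard library alone. This is not how the library implements .thread_ahead; it is a simplified, eager illustration under stated assumptions. `fetch_page`, `fetch_ahead`, and `drain` are hypothetical names, and `fetch_page` only formats a string where a real scraper would make a network request.

```rust
use std::collections::VecDeque;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a network request.
fn fetch_page(page: usize) -> String {
    format!("page {}", page)
}

/// Pop jobs from the shared queue until it is empty, sending each result
/// back together with its original position.
fn drain(queue: &Mutex<VecDeque<(usize, usize)>>, tx: &mpsc::Sender<(usize, String)>) {
    loop {
        // The lock is held only for this one statement.
        let job = queue.lock().unwrap().pop_front();
        match job {
            Some((index, page)) => tx.send((index, fetch_page(page))).unwrap(),
            None => break,
        }
    }
}

/// Fetch `pages` using `extra_threads` workers in addition to the caller.
/// With `extra_threads == 0`, all work runs on the calling thread.
/// Results come back in the same order as `pages`.
fn fetch_ahead(pages: Vec<usize>, extra_threads: usize) -> Vec<String> {
    let total = pages.len();
    // Shared work queue of (position, page number) pairs.
    let queue: Arc<Mutex<VecDeque<(usize, usize)>>> =
        Arc::new(Mutex::new(pages.into_iter().enumerate().collect()));
    let (tx, rx) = mpsc::channel();

    let mut handles = Vec::new();
    for _ in 0..extra_threads {
        let queue = Arc::clone(&queue);
        let tx = tx.clone();
        handles.push(thread::spawn(move || drain(&queue, &tx)));
    }

    // The calling thread takes work too, so 1 extra thread means 2 in total.
    drain(&queue, &tx);
    drop(tx); // close our sender so the receive loop below can end

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // Put results back into their original order.
    let mut results = vec![String::new(); total];
    for (index, body) in rx {
        results[index] = body;
    }
    results
}
```

Calling `fetch_ahead(vec![1, 2, 3], 0)` and `fetch_ahead(vec![1, 2, 3], 2)` should give the same ordered results; only the number of threads doing the work differs.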
