Error importing web and crawler #1

Description

@nuxion

Importing web and crawler fails unless the newspaper library is installed, even though newspaper is supposed to be an optional dependency.

I would like to use the web and crawler modules without installing this library. If that is not possible, please add it to the default dependencies.

In [1]: from datahtml import web, crawler
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 from datahtml import web, crawler

File ~/.local/share/hatch/env/virtual/streams/rAoc7VE_/streams/lib/python3.9/site-packages/datahtml/web.py:3
      1 from typing import Any, Dict, List, Union
----> 3 from datahtml import defaults, errors, news, parsers, rss, sitemap, types
      4 from datahtml._utils import difference_from_now
      5 from datahtml.base import CrawlerSpec

File ~/.local/share/hatch/env/virtual/streams/rAoc7VE_/streams/lib/python3.9/site-packages/datahtml/news.py:5
      2 from datetime import datetime
      3 from typing import Optional
----> 5 from newspaper import Article
      7 from datahtml.base import CrawlerSpec
     10 @dataclass
     11 class ArticleData:

ModuleNotFoundError: No module named 'newspaper'
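
One possible direction, as a minimal sketch only: the traceback shows that news.py imports newspaper at module level, so web.py fails as soon as it imports news. Deferring that import until an article feature is actually used would let web and crawler load without it. The helper names below (require_optional, load_article_class) are hypothetical and are not part of datahtml; only the `from newspaper import Article` import is taken from the traceback.

```python
import importlib
from types import ModuleType


def require_optional(name: str, feature: str) -> ModuleType:
    """Import an optional dependency on demand.

    Hypothetical helper, not part of datahtml. Raises ImportError with a
    readable message only when the dependency itself is missing.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # Re-raise untouched if a different module (for example a
        # sub-dependency of `name`) is the one that is missing.
        if exc.name != name:
            raise
        raise ImportError(
            f"'{name}' is required for {feature} but is not installed."
        ) from exc


def load_article_class():
    """Hypothetical replacement for the module-level import in news.py.

    The import now happens when this function is called, not when
    datahtml.news (and therefore datahtml.web) is imported.
    """
    return require_optional("newspaper", "article parsing").Article
```

With something like this, `from datahtml import web, crawler` would no longer touch newspaper at import time, and only code paths that call load_article_class() would need the library installed.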