cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

Classification model (wired / wireless)
Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
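The two steps above amount to a classify-then-dispatch pattern: the table image is first labelled as wired or wireless, and the cell detector loaded with the weights for that class is then run on it. The following is only a minimal sketch of that pattern, not the library's implementation. `TwoStageTableParser`, `fake_classifier`, the dictionary-based "image" and the box format are all hypothetical stand-ins chosen so the example runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed box format for illustration: (x0, y0, x1, y1) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class TwoStageTableParser:
    """Hypothetical sketch: classify the table, then run the matching detector."""

    classify: Callable[[dict], str]  # returns "wired" or "wireless"
    detectors: Dict[str, Callable[[dict], List[Box]]]  # one detector per class

    def parse(self, table_image: dict) -> Tuple[str, List[Box]]:
        table_class = self.classify(table_image)
        if table_class not in self.detectors:
            raise ValueError(f"no detector registered for class {table_class!r}")
        # Dispatch to the detector that holds the weights for this class.
        return table_class, self.detectors[table_class](table_image)


# Stand-in classifier: a real one would run a model on pixel data.
def fake_classifier(image: dict) -> str:
    return "wired" if image.get("has_ruling_lines") else "wireless"


parser = TwoStageTableParser(
    classify=fake_classifier,
    detectors={
        # Stand-in detectors returning fixed boxes instead of model predictions.
        "wired": lambda image: [(0, 0, 50, 20), (50, 0, 100, 20)],
        "wireless": lambda image: [(0, 0, 100, 20)],
    },
)

table_class, cells = parser.parse({"has_ruling_lines": True})
print(table_class, len(cells))  # wired 2
```

Keeping the detectors in a dictionary keyed by class means a new table class only needs a new entry, without touching the dispatch logic.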

Installation

With uv, add cells2table to your project:

uv add cells2table

Optional extra	Description
`docling`	Required for the docling plugin
`huggingface`	Required for downloading models

Usage

cells2table only extracts structural information from tables. Another library (for example an OCR engine or a PDF text extractor) is needed to extract the content of the cells.
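To illustrate how structure and content can be combined, here is a minimal sketch that assigns text fragments from another tool to detected cells by checking which cell contains the centre of each word box. The box format, the `fill_cells` helper and the sample data are assumptions made for this example; they do not describe the actual output of cells2table or of any particular OCR library.

```python
from typing import List, Tuple

# Assumed box format for illustration: (x0, y0, x1, y1) in pixels.
Box = Tuple[float, float, float, float]


def contains(cell: Box, x: float, y: float) -> bool:
    """Return True if the point (x, y) lies inside the cell box."""
    x0, y0, x1, y1 = cell
    return x0 <= x < x1 and y0 <= y < y1


def fill_cells(cells: List[Box], words: List[Tuple[str, Box]]) -> List[str]:
    """Attach each word to the first cell that contains the word's centre."""
    texts: List[List[str]] = [[] for _ in cells]
    for text, (wx0, wy0, wx1, wy1) in words:
        cx, cy = (wx0 + wx1) / 2, (wy0 + wy1) / 2
        for i, cell in enumerate(cells):
            if contains(cell, cx, cy):
                texts[i].append(text)
                break  # a word belongs to at most one cell
    return [" ".join(parts) for parts in texts]


# Hypothetical cell boxes (structure) and word boxes (content), in reading order.
cells = [(0, 0, 50, 20), (50, 0, 100, 20)]
words = [
    ("Total", (5, 4, 40, 16)),
    ("42", (60, 4, 75, 16)),
    ("EUR", (78, 4, 95, 16)),
]

print(fill_cells(cells, words))  # ['Total', '42 EUR']
```

Words whose centre falls outside every cell are silently dropped in this sketch; a real pipeline would likely want to log or handle them.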

Docling

A docling plugin is provided so that cells2table can be integrated into a complete document conversion pipeline.

Usage example:

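# docling imports used by the example
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, ImageFormatOption, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# External plugins must be allowed so that docling can load the cells2table plugin
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Use the same pipeline options for both PDF and image inputs
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: ImageFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert a document and print it as Markdown, including the parsed tables
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())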
