almendra

almendra — what coffee farmers call the green coffee bean itself.

A fast, re-trainable system for classifying green (unroasted) coffee beans by defect and grade — built to eventually run on a high-throughput sorting machine.

almendra is not just a model. It is a replicable framework: a versioned data pipeline, a config-driven training system with a swappable model architecture, a hardware-agnostic export/benchmark toolchain, and a documented physical capture protocol. The model is the focus — reliable and fast — but it must stay easy to re-train as better data arrives.

Status: Phase 6 (local UI). The full pipeline (ingest → train → eval → export → bench) runs on public data. Phase 5's Pareto sweep picked MobileNetV3-Large with static INT8 quantization as the current deploy choice (0.86 macro-F1, 3.6 MB, ~430 beans/s on a single CPU thread), and a local Streamlit UI now wraps the whole toolkit. See docs/research-log.md for the full log.

The idea

A green bean can hide damage on a face a single photo never sees. So almendra treats one sample as a set of images of one bean — several viewing angles under several illumination spectra (front-light, back-light transillumination, and UV fluorescence) — and a multi-view model fuses them into one per-bean decision.

Two design principles make this both thorough and fast:

Collect rich, deploy lean. The model accepts a variable number of views and is trained with view-dropout. A slow, exhaustive rig collects the richest possible training data; the production machine captures fewer-but-sufficient views at high speed — the same model serves both.
The model is never the bottleneck. A tiny INT8 backbone, batched across all views in flight, runs faster than beans can be singulated or ejected. Speed comes from parallel lanes and strobed capture, not from rushing each bean.
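As a rough illustration of the first principle, the sketch below drops views at random during training and then fuses whatever views remain into one per-bean vector. It is not almendra's actual implementation: the function names `drop_views` and `fuse_mean`, the `keep_prob` parameter, and the choice of mean pooling as the fusion step are all assumptions made for this example.

```python
import random


def drop_views(views, keep_prob=0.7, rng=random):
    """Randomly drop views, always keeping at least one (hypothetical helper)."""
    kept = [v for v in views if rng.random() < keep_prob]
    if not kept:
        # Never return an empty set: fall back to one randomly chosen view.
        kept = [rng.choice(views)]
    return kept


def fuse_mean(view_embeddings):
    """Average per-view embedding vectors into one per-bean vector.

    Mean pooling is an assumed fusion rule; it works for any number of views.
    """
    n = len(view_embeddings)
    dim = len(view_embeddings[0])
    return [sum(e[i] for e in view_embeddings) / n for i in range(dim)]


# Three toy 2-D "embeddings" standing in for three views of one bean.
rng = random.Random(0)
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kept = drop_views(views, keep_prob=0.5, rng=rng)
fused = fuse_mean(kept)
```

Because the fusion step averages over however many views survive, the same function accepts the full set from a slow capture rig or the reduced set from a production machine.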

See docs/methodology.md for the full rationale.

Quickstart

Requires uv.

uv sync --extra dev      # fast, torch-free: enough to lint, test and explore
make info                # print the canonical taxonomy and project status
make test                # run the test suite

To run the full pipeline:

make setup               # install everything (torch, onnx, dvc, ...)
make data                # download public datasets (needs ROBOFLOW_API_KEY)
make ingest              # crop instances + build data/processed/manifest.jsonl
make train               # train the baseline defect classifier
make eval                # evaluate on the test split
make export              # export to ONNX (+ INT8) with a parity check
make bench               # benchmark inference latency
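After `make ingest`, a quick sanity check on data/processed/manifest.jsonl can be useful before training. The sketch below counts records per split and label. The manifest schema is not documented in this README, so the field names `split` and `label` are assumptions; adjust them to the real schema.

```python
import json
from collections import Counter
from pathlib import Path


def count_labels(manifest_path):
    """Count manifest records per (split, label) pair.

    Assumes one JSON object per line with hypothetical "split" and "label" keys.
    """
    counts = Counter()
    with Path(manifest_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            counts[(record["split"], record["label"])] += 1
    return counts
```

Calling `count_labels("data/processed/manifest.jsonl")` would return a `Counter` that makes class imbalance or an empty split easy to spot.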

Local UI

A Streamlit app wraps the whole pipeline behind a bilingual ES/EN interface — tray capture, training (with live charts), evaluation, prediction and settings. A non-technical user can run almendra end-to-end without touching the CLI.

Launch

uv sync --extra ui --extra train --extra export --extra capture
make ui
# equivalent: uv run almendra ui

The app opens at http://localhost:8501. Flags:

uv run almendra ui --port 8888       # use a different port
uv run almendra ui --headless        # don't auto-open a browser (SSH / CI)

| Extra installed | Page it unlocks |
| --- | --- |
| `ui` | the app itself (Streamlit + Plotly) |
| `train` | Train + Evaluate + mis-classified gallery (PyTorch) |
| `export` | Predict (ONNX Runtime) |
| `capture` | Tray Capture (OpenCV) |

Skipping an extra is fine: the page that depends on it shows a clear error instead of crashing. Install the missing extra later and reload the page.
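One way such a guard can work is to check whether an extra's main module is importable before rendering the page. The sketch below is an assumption about the approach, not the app's actual code: the `EXTRA_MODULES` mapping and both helper names are hypothetical, and only the standard-library `importlib.util.find_spec` is used.

```python
from importlib.util import find_spec

# Assumed mapping from each extra to the top-level module it provides.
EXTRA_MODULES = {
    "ui": "streamlit",
    "train": "torch",
    "export": "onnxruntime",
    "capture": "cv2",
}


def missing_extras(extra_modules=EXTRA_MODULES):
    """Return the extras whose module cannot be found, without importing it."""
    return sorted(
        extra
        for extra, module in extra_modules.items()
        if find_spec(module) is None
    )


def install_hint(extra):
    """Build the install command to show in the error message."""
    return f"uv sync --extra {extra}"


for extra in missing_extras():
    print(f"Extra '{extra}' is not installed. Run: {install_hint(extra)}")
```

Using `find_spec` avoids paying the import cost of heavy packages such as PyTorch just to decide whether a page can load.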

What's in the app

🏠 Inicio / Home — dataset stats, recent runs, a health panel, and an inline wizard that walks first-time users through Ingest → Train → Eval.
📷 Bandeja / Tray Capture — drag-and-drop tray photos, see the original next to the rectified+overlay preview, save per-bean crops to data/raw/proprietary_tray/sessions/<id>/.
🧠 Entrenar / Train — pick a backbone and the key knobs (advanced controls live behind an expander), launch training as a subprocess, and watch train_loss + val_macro_f1 update in real time as each epoch completes.
📊 Evaluar / Evaluate — pick a checkpoint and split, run it, see accuracy / macro-F1 / missed-defect-rate, per-class breakdown, confusion matrix heatmap, and a gallery of mis-classified beans.
🚀 Predecir / Predict — upload a single-bean photo, get the predicted class, confidence, Top-3, and an accept/reject verdict from the canonical taxonomy. Uses the most recent ONNX for speed (prefers INT8).
⚙️ Ajustes / Settings — browse the canonical taxonomy, the YAML data sources, and the current Hydra config.
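For readers unfamiliar with the headline metrics on the Evaluate page, the sketch below computes macro-F1 and a missed-defect rate from label lists. Macro-F1 follows its standard definition (the unweighted mean of per-class F1). The missed-defect rate shown here, the share of truly defective beans predicted as sound, is an assumed definition, and the label names (`sound`, `black`, `sour`) are placeholders rather than the canonical taxonomy.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def missed_defect_rate(y_true, y_pred, sound_label="sound"):
    """Fraction of truly defective beans predicted as sound (assumed definition)."""
    defects = [(t, p) for t, p in zip(y_true, y_pred) if t != sound_label]
    if not defects:
        return 0.0
    return sum(p == sound_label for _, p in defects) / len(defects)


y_true = ["sound", "sound", "black", "sour"]
y_pred = ["sound", "black", "black", "sound"]
print(macro_f1(y_true, y_pred))            # (2/3 + 1/2 + 0) / 3 = 7/18
print(missed_defect_rate(y_true, y_pred))  # 1 of 2 defects missed = 0.5
```

The two numbers answer different questions: macro-F1 weights every class equally, while the missed-defect rate isolates the costly error of letting a defective bean through.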

End-to-end test in 5 minutes

# 1. install everything the UI exercises
uv sync --extra ui --extra train --extra export --extra capture

# 2. (optional) ingest the public Robusta baseline so Train/Evaluate have data
export ROBOFLOW_API_KEY=...    # see your Roboflow workspace
make data && make ingest

# 3. launch the UI
make ui

Then in the browser:

Inicio: confirm the health panel shows Python/PyTorch/Taxonomy green; the manifest icon flips to ✅ once data/processed/manifest.jsonl exists.
Entrenar: set the backbone to mobilenet_v3_small and épocas (epochs) to 3 for a smoke test, then choose Iniciar entrenamiento (Start training). The Plotly chart should start updating within a couple of seconds of the first epoch landing.
Evaluar: pick the run you just trained, leave split = test, and choose Ejecutar (Run). You get the headline metrics, the confusion matrix, and the error gallery.
Predecir: from a terminal, run uv run almendra export --checkpoint outputs/ui-<timestamp>/best.pt. Refresh the Predict page, pick the ONNX from the dropdown, and upload any single-bean image.
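The Predict page reports a class, a confidence, a Top-3 list, and an accept/reject verdict. The sketch below shows one way to derive those four values from raw model logits. It is illustrative only: the class names and the `VERDICT` mapping are hypothetical stand-ins for data/taxonomy.yaml, and treating the softmax probability as the confidence is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, class_names, k=3):
    """Return the k most probable (class, probability) pairs, best first."""
    ranked = sorted(
        zip(class_names, softmax(logits)), key=lambda cp: cp[1], reverse=True
    )
    return ranked[:k]


# Hypothetical class-to-verdict mapping; the real one lives in the taxonomy.
VERDICT = {"sound": "accept", "black": "reject", "sour": "reject", "broken": "reject"}


def predict(logits, class_names, verdict=VERDICT):
    """Bundle label, confidence, Top-3, and verdict for one bean."""
    ranked = top_k(logits, class_names)
    label, confidence = ranked[0]
    return {
        "label": label,
        "confidence": confidence,
        "top3": ranked,
        "verdict": verdict[label],
    }


result = predict([0.1, 2.0, 0.3, -1.0], ["sound", "black", "sour", "broken"])
print(result["label"], result["verdict"])
```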

See docs/ui.md for the deeper troubleshooting guide (stuck subprocesses, port conflicts, missing extras).

Repository layout

| Path | Purpose |
| --- | --- |
| `data/taxonomy.yaml` | Canonical SCA-aligned label schema (single source of truth) |
| `data/sources/` | Per-dataset adapters + class mappings |
| `configs/` | Hydra configs: compose models, data and training runs |
| `src/almendra/` | The package: datasets, models, train, eval, export, bench, infer |
| `capture/` | The physical data-capture protocol and bill of materials |
| `docs/` | Methodology, research log, model cards, dataset datasheets, ADRs |
| `scripts/` | Utilities (e.g. public-dataset download) |

Research questions

almendra is run as a rigorous investigation. Each question has a measurable answer, tracked in docs/research-log.md:

Does multi-view fusion measurably lower the missed-defect rate vs a single view?
Does multi-spectral illumination catch defects that RGB front-light reflectance misses?
What is the accuracy / latency / model-size Pareto frontier across backbones?
What accuracy is lost to INT8 quantization, per class?
How few deployment views can we use before per-class recall degrades?
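The Pareto question has a simple computational core: a candidate is on the frontier if no other candidate is at least as good on every axis and strictly better on one. The sketch below shows that test for accuracy, latency, and size. The `Candidate` class and the four entries are made-up placeholders for illustration, not results from the research log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    macro_f1: float    # higher is better
    latency_ms: float  # lower is better
    size_mb: float     # lower is better


def dominates(a, b):
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (
        a.macro_f1 >= b.macro_f1
        and a.latency_ms <= b.latency_ms
        and a.size_mb <= b.size_mb
    )
    strictly_better = (
        a.macro_f1 > b.macro_f1
        or a.latency_ms < b.latency_ms
        or a.size_mb < b.size_mb
    )
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Placeholder values, invented purely to exercise the function.
candidates = [
    Candidate("A", 0.80, 1.0, 2.0),
    Candidate("B", 0.85, 2.0, 4.0),
    Candidate("C", 0.78, 3.0, 5.0),   # worse than A on every axis
    Candidate("D", 0.90, 6.0, 20.0),
]
front = [c.name for c in pareto_front(candidates)]
print(front)  # ['A', 'B', 'D']
```

Picking a deploy choice then means selecting one point from the frontier according to the hardware budget, since every frontier point is a defensible trade-off.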

Roadmap

Phase 0 — Scaffolding ✓
Phase 1 — Data pipeline + single-view public baseline ✓
Phase 2 — Multi-view fusion model ✓
Phase 3 — Physical capture protocol + proprietary Arabica data (blocked on data)
Phase 4 — Multi-spectral illumination (UV, transillumination)
Phase 5 — Speed: backbone sweep, INT8, hardware benchmark ✓
Phase 6 — Local Streamlit UI for the whole toolkit ✓
Parallel research track — NIR / hyperspectral internal-defect inspection

Data & licensing

Code: Apache-2.0.
Datasets are never redistributed — adapter scripts download each one from its original host under its own licence; provenance and licences are recorded in docs/datasheets/.
The label taxonomy is currently provisional and aligned to — but not yet verified against — the official SCA Arabica Green Coffee Defect Handbook.

Contributing

See CONTRIBUTING.md. Contributions to data, defect taxonomy review, and hardware/capture design are especially welcome.
