diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index d3af701..78ab1c2 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -27,7 +27,7 @@ jobs:
         run: uv python install 3.12

       - name: Install Python dependencies
-        run: uv sync --group docs
+        run: uv sync --extra docs

       - name: Build Sphinx documentation
         run: uv run sphinx-build docs docs/_build/html
diff --git a/README.md b/README.md
index 419be72..ae7a86d 100644
--- a/README.md
+++ b/README.md
@@ -3,10 +3,10 @@
 [](https://pypi.org/project/litxbench) [](https://radical-ai.github.io/litxbench)
-LitXBench is a benchmark to evaluate LLMs on extracting material information synthesized in research papers. Read the preprint here.
+LitXBench is a benchmark to evaluate LLMs on extracting material information synthesized in research papers. Read the preprint [here](https://arxiv.org/pdf/2604.07649).
-
+
If you use LitXBench in your research, please cite:
@article{chong2026litxbench,
- title={LitXBench: A Benchmark for Extracting Experiments from Scientific Literature},
- author={Chong, Curtis and Colindres, Jorge},
- year={2026},
+ title = {LitXBench: A Benchmark for Extracting Experiments from Scientific Literature},
+ author = {Curtis Chong and Jorge Colindres},
+ year = {2026},
+ eprint = {2604.07649},
+ archivePrefix = {arXiv},
+ primaryClass = {cs.IR},
+ url = {https://arxiv.org/abs/2604.07649}
}
diff --git a/docs/transcribe.md b/docs/transcribe.md
index bf65428..684ee4f 100644
--- a/docs/transcribe.md
+++ b/docs/transcribe.md
@@ -40,7 +40,7 @@ The transcribe feature processes PDFs using OCR and LLM extraction to produce st
This includes the pydantic-ai dependency needed for extraction.
```bash
- uv sync --group paper
+ uv sync --extra paper
```
5. **Set your API keys**
diff --git a/docs/user/building_extractions.rst b/docs/user/building_extractions.rst
index 1645157..0e372a4 100644
--- a/docs/user/building_extractions.rst
+++ b/docs/user/building_extractions.rst
@@ -11,8 +11,8 @@ Specifying Inputs
When a synthesis step combines multiple raw materials or intermediate products, you
need to specify what feeds into it. There are three ways to do this.
-Via the process string
-^^^^^^^^^^^^^^^^^^^^^^
+Via the ``process`` String
+^^^^^^^^^^^^^^^^^^^^^^^^^^
The first segment of a process string (before the first ``->``) lists the inputs to
the first step. Multiple inputs are comma-separated:
@@ -64,8 +64,8 @@ This is useful when a step *within* the group introduces a new material
Input names must reference either a key in ``raw_materials`` or the ``name`` of a
previously defined output material.
-Via template variables in inputs
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Via Template Variables
+^^^^^^^^^^^^^^^^^^^^^^
Inputs can use template variables, allowing the same synthesis group to mix in
different materials depending on the output material:
@@ -108,7 +108,8 @@ as a single logical measurement.
.. code-block:: python
- from litxbench import Measurement, CoreMeasurementValue, MeasurementStatistic
+ from litxbench import Measurement
+ from litxbench.core.models import CoreMeasurementValue, MeasurementStatistic
*Measurement.group_measurements(
kind=PhaseMeasurementKind.phase_size,
diff --git a/docs/user/core_concepts.rst b/docs/user/core_concepts.rst
index d200bd4..84f15b2 100644
--- a/docs/user/core_concepts.rst
+++ b/docs/user/core_concepts.rst
@@ -4,28 +4,6 @@ Core Concepts
LitXBench represents material extractions as structured Python objects. This page explains
the data model and the design principles behind it.
-The Experiment Extraction Problem
----------------------------------
-
-The experiment extraction task is to output all synthesized materials *m_i* in a paper.
-Each material *m* is created from a synthesis process *p* and has measurements *x*. The experiment extraction task is to output all
-measurements *x* for each material *m*. The result can be represented as a list of tuples
-*(m, p, x)*.
-
-**Note: Materials are not compositions.** Since compositions are measured values,
-there can be multiple composition measurements for each material (e.g. measured by a balance, energy-dispersive X-ray spectroscopy, or optical emission spectroscopy).
-
-Design Principles
------------------
-
-1. **Process lineage over composition** -- A material's properties depend on how it was made,
- not just what it's made of. Measurements are linked to the full synthesis history.
-
-2. **Canonical enumerations** -- Categorical values are mapped to canonical identifiers to prevent
- alias collisions. The ``normalize()`` function documents the mapping between a paper's terminology and the correct canonical value.
-
-3. **Code as representation** -- Materials are expressed as executable Python code rather than
- JSON or plain text. This makes LitXBench benchmarks easy-to-edit, have high auditability for readers, and easily allows code-based extraction validation.
Data Model Overview
-------------------
@@ -126,7 +104,7 @@ Measurements
Measurements capture numeric properties of a material.
-- **Measurement** -- a generic numeric measurement with a ``MeasurementKind``, value, optional
+- **Measurement** -- a generic numeric measurement with a ``kind`` string (e.g. ``AlloyMeasurementKind``), value, optional
`Pint