An interactive tool for comparing how well different local LLMs classify and extract structured data from articles in the Philosophical Transactions of the Royal Society (1665–1886).
Part of the Secrets to Patent project.
🌐 Live site: https://digitalhistory-lund.github.io/SecToPat-PhilTransModelSelection/
| Path | Purpose |
|---|---|
index.html |
Self-contained comparison viewer with all data embedded |
results.json |
Raw model evaluation results |
about.html |
About page with citation and license info |
Side-by-side comparison of model outputs for a selection of Philosophical Transactions articles. Models evaluated include gemma4, ministral-3, qwen3.5, phi3.5, and others. For each article the viewer shows the original OCR'd text alongside the structured extraction from each selected model — including the experiment classification and extracted locations.
The HTML is generated by build_bench_viewer.py in the parent repository.
The schema evolved during benchmarking. Two prompt versions are present in the data:
v2 — 3 questions (used for: gemma4, ministral-3):
You are analysing an article from the Philosophical Transactions of the Royal Society
(17th–19th century). Answer three questions:
1. is_experiment: Does this article describe one or more experiments or systematic
observations? Answer "yes", "no", or "unsure".
2. locations: List every spatial location where an experiment or observation took place.
For each, provide:
- place: the immediate setting at room or building scale (e.g. "private study",
"ship's cabin", "kitchen"); null if not mentioned
- geography: named place at city, region, country, estate, or vessel scale
(e.g. "London", "aboard HMS Endeavour"); null if not mentioned
- detail: specific spatial detail within the place (e.g. "by the south window",
"in a dark corner"); null if not mentioned
Do not include apparatus or containers as locations. Return an empty list if no
spatial setting is mentioned.
3. participants: List every person named in the article who conducted, observed, or
contributed to the experiment or observation. For each, provide:
- name: the person's name as it appears in the text
- role: their role if stated (e.g. "experimenter", "observer", "subject",
"correspondent", "author"); null if not clear
Return an empty list if no individuals are named.
v1 — 2 questions (used for: llama3.2, phi3.5, qwen3.5):
Same as v2 but without question 3 (participants). Output schema: is_experiment + locations only.
Structured output was enforced via JSON schema using the Ollama API.
This work is licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License
(CC BY-NC 4.0). See LICENSE.
Machine-readable metadata is in CITATION.cff.
For questions or feedback, contact Mathias Johansson at MathiasJohansson@kultur.lu.se, or open an issue.