Lexicon Platform is a reusable lexical content repository. It contains the canonical source material, language plugins, runtime pack builders, local lexicon storage, and the file-based distribution contract needed to ship language data into an application or service.
It powers Vokabell, a German vocabulary-learning app built with Flutter.
Lexical knowledge for language learning is scattered and inconsistent: definitions vary by source, examples are uneven, CEFR levels are rarely assigned coherently, and the relations between words (synonyms, antonyms, sense families) are mostly missing. Assembling that live, per request, would be slow, costly, non-deterministic, and offline-hostile, and a learner cannot tell a good entry from a bad one.
Lexicon solves this by consolidating the knowledge once into a single curated source of truth, then materialising it into static, versioned, downloadable packs. The payoff:
- Deterministic and reviewable. A word always returns the same vetted entry. Quality is fixed once and only improves; the source is content-as-code, so every change is a reviewable diff and nothing ships unreviewed.
- Leveled and interconnected, consistently. CEFR levels (A1→C2) and concept relations are global judgments made once and anchored to authoritative references, not re-derived each time - which is exactly where ad-hoc generation is least reliable.
- Offline, instant, free at the point of use. No network dependency, no per-use cost; it works with no signal and scales to any number of users without scaling cost.
- An owned asset, not a runtime dependency on an external service.
In short: pay the curation cost once, serve quality forever.
- Content-as-code, deterministic packs. A single curated source is built into versioned runtime packs plus a file-based distribution contract. The same word always returns the same vetted entry; every change is a reviewable diff.
- Capability-driven language plugins. Language behaviour (noun declension, separable-verb decomposition, ...) lives in per-language plugins that advertise capabilities. The core stays language-neutral, so adding a language is a plugin, not a core change.
- Deterministic morphology + curated irregulars. German noun declension and separable-verb decomposition (auf|stehen) are rule-derived; irregulars come from curated overrides, never guessed. Stable form ids are keyed on the grammatical slot, so surface edits never orphan a learner's progress.
- Self-optimising model selection, nothing hardcoded. When several local models are installed, a dueling-bandit learns from an LLM judge's per-field preferences and routes work to whichever model is best on this content, exploring across runs.
- Human-gated generation. Nothing AI-drafted ships unreviewed: records are marked needs_review, a guardrail gate auto-promotes only the clean, corroborated ones, and the build excludes the rest.
- Offline-first, owned asset. No network dependency, no per-use cost; packs ship into the app and work with no signal, scaling to any number of users.
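To make the morphology bullet concrete, here is a minimal sketch of rule-derived separable-verb decomposition with curated overrides and slot-keyed form ids. Everything in it is an assumption for illustration: the prefix list, the `curatedOverrides` map, `decompose`, `formId`, and the id format are hypothetical and are not the `lexicon_german` API.

```dart
/// Illustrative subset of German separable prefixes, longest first.
/// (Assumption: the real plugin's rule set is not shown in this README.)
const separablePrefixes = [
  'zurück', 'auf', 'aus', 'ein', 'mit', 'vor', 'ab', 'an', 'zu',
];

/// Curated overrides win over rules. A null value means "do not split".
/// 'antworten' starts with "an" but is not a separable verb.
const Map<String, String?> curatedOverrides = {'antworten': null};

class Decomposition {
  const Decomposition(this.prefix, this.base);
  final String? prefix;
  final String base;

  @override
  String toString() => prefix == null ? base : '$prefix|$base';
}

/// Splits an infinitive into prefix and base verb.
/// Overrides are consulted first, so irregulars are never guessed.
Decomposition decompose(String infinitive) {
  if (curatedOverrides.containsKey(infinitive)) {
    final prefix = curatedOverrides[infinitive];
    return prefix == null
        ? Decomposition(null, infinitive)
        : Decomposition(prefix, infinitive.substring(prefix.length));
  }
  for (final prefix in separablePrefixes) {
    // Require a plausible remainder so very short words are left intact.
    if (infinitive.startsWith(prefix) &&
        infinitive.length > prefix.length + 2) {
      return Decomposition(prefix, infinitive.substring(prefix.length));
    }
  }
  return Decomposition(null, infinitive);
}

/// Form id keyed on the lemma id and grammatical slot, not on the surface
/// string, so correcting a spelling keeps the same id (hypothetical format).
String formId(String lemmaId, String slot) => '$lemmaId#$slot';

void main() {
  print(decompose('aufstehen')); // auf|stehen
  print(decompose('antworten')); // antworten
  print(formId('de.verb.aufstehen', 'pres.ind.3sg'));
}
```

Because the id depends only on the lemma and the slot, a learner's progress stored against that id would survive an edit to the surface form.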
Contributors grow the source pack by proposing entries (see authoring/). To make that
fast, the repo ships an optional local toolchain for anyone running a local language
model: it can draft entries and fill gaps, and - when several models are installed - a
self-optimising selector (a dueling-bandit that learns from a judge's per-field
preferences) routes work to whichever model is actually best on this content, with no
hardcoded model choices.
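The selector idea can be sketched as a small dueling bandit. This is an illustrative sketch under stated assumptions, not the repository's implementation: the class `DuelingSelector`, the epsilon-greedy exploration, the Copeland-style scoring, and the shape of the judge's output (field name mapped to the preferred model) are all hypothetical.

```dart
import 'dart:math';

class DuelingSelector {
  DuelingSelector(this.models, {this.epsilon = 0.2, Random? random})
      : assert(models.length >= 2),
        _random = random ?? Random(),
        _wins = {
          for (final m in models) m: {for (final o in models) if (o != m) o: 0},
        };

  final List<String> models;
  final double epsilon;
  final Random _random;

  /// _wins[a][b] counts the fields on which the judge preferred a over b.
  final Map<String, Map<String, int>> _wins;

  /// Smoothed estimate that [a] beats [b]; 0.5 when the pair is unseen.
  double winRate(String a, String b) {
    final w = _wins[a]![b]!, l = _wins[b]![a]!;
    return (w + 1) / (w + l + 2);
  }

  /// Copeland-style score: how many rivals this model beats more often than not.
  int score(String m) =>
      models.where((o) => o != m && winRate(m, o) > 0.5).length;

  /// Current best model; ties keep the earlier model in the list.
  String best() => models.reduce((a, b) => score(b) > score(a) ? b : a);

  /// Picks the next pair: usually the current best against a random rival,
  /// but with probability [epsilon] a random model takes the first seat.
  List<String> nextDuel() {
    final first = _random.nextDouble() < epsilon
        ? models[_random.nextInt(models.length)]
        : best();
    final rivals = models.where((m) => m != first).toList();
    return [first, rivals[_random.nextInt(rivals.length)]];
  }

  /// Records the judge's per-field verdicts for one duel between [a] and [b].
  void record(String a, String b, Map<String, String> preferredByField) {
    for (final winner in preferredByField.values) {
      if (winner != a && winner != b) continue; // ignore unknown verdicts
      final loser = winner == a ? b : a;
      _wins[winner]![loser] = _wins[winner]![loser]! + 1;
    }
  }
}

void main() {
  final selector = DuelingSelector(['model-a', 'model-b'], random: Random(42));
  // Hypothetical judge output for one drafted entry.
  selector.record('model-a', 'model-b', {
    'definitions': 'model-b',
    'synonyms': 'model-b',
    'examples': 'model-a',
  });
  print(selector.best()); // model-b
  print(selector.nextDuel());
}
```

Persisting the win counts between runs is what would let such a selector keep exploring and improving across runs; that storage step is left out here.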
Nothing generated ships automatically. Every drafted record is marked needs_review; a
guardrail gate auto-promotes only the clean ones and holds the rest for a human, and the
pack build excludes anything still under review. The reviewer's final check is the git
diff. Details in authoring/README.md.
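The review flow described above can be sketched as a status field plus two filters. This is a minimal illustration, not the repository's gate: the `DraftRecord` fields, the `passesGuardrails` criteria (non-empty fields and at least two corroborating sources), and the function names are assumptions.

```dart
enum ReviewStatus { needsReview, approved }

class DraftRecord {
  DraftRecord({
    required this.lemma,
    required this.definition,
    required this.corroboratingSources,
    this.status = ReviewStatus.needsReview, // every draft starts under review
  });

  final String lemma;
  final String definition;
  final int corroboratingSources;
  ReviewStatus status;
}

/// Hypothetical guardrails: the real checks are documented in authoring/README.md.
bool passesGuardrails(DraftRecord r) =>
    r.lemma.trim().isNotEmpty &&
    r.definition.trim().isNotEmpty &&
    r.corroboratingSources >= 2;

/// Auto-promotes only clean records; everything else stays needsReview
/// and waits for a human.
void runGate(List<DraftRecord> records) {
  for (final r in records) {
    if (r.status == ReviewStatus.needsReview && passesGuardrails(r)) {
      r.status = ReviewStatus.approved;
    }
  }
}

/// The pack build excludes anything still under review.
List<DraftRecord> buildPack(List<DraftRecord> records) =>
    records.where((r) => r.status == ReviewStatus.approved).toList();

void main() {
  final drafts = [
    DraftRecord(lemma: 'aufstehen', definition: 'to get up', corroboratingSources: 2),
    DraftRecord(lemma: 'Haus', definition: '', corroboratingSources: 3),
  ];
  runGate(drafts);
  print(buildPack(drafts).map((r) => r.lemma).toList()); // [aufstehen]
}
```

The second draft has an empty definition, so it is held for review and never reaches the pack.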
The toolchain is driven by a single console - npm run lexicon - with an
interactive menu and an unattended autopilot. How to use it and how to read a run
(the definitions / synonyms / examples DE IT EN output) are documented in
docs/CONSOLE.md.
- canonical source packs
- editorial templates and authoring tools
- generated runtime packs
- file-based distribution artifacts
- Dart packages for contracts, parsing, storage, import, and optional language-plugin add-ons
- docs/README.md: documentation map and reading order
- docs/guides/CONSUMER_GUIDE.md: integration model for applications and services
- docs/guides/PACK_AUTHORING.md: source pack authoring workflow
- docs/reference/LEXICON_FILE_CONTRACT_0_1_0.md: distribution contract
- docs/reference/TOOLS.md: tool catalog and workflow roles
- docs/reference/WORKFLOW_COMMANDS.md: canonical command reference
- CHANGELOG.md: platform, tooling, package, and release history
- docs/reference/CONTENT_CHANGELOG.md: concept-first lexical-content history
- docs/guides/RELEASING.md: release checklist and verification flow
- CONTRIBUTING.md: contribution guide
- docs/README.md
- docs/guides/
- docs/reference/
- docs/policies/
- packs/templates/
- packs/lexicon_source/: current canonical source pack
- packs/lexicon_*_{a1,a2,b1,b2}/
- tools/pipeline/
- tools/reports/
- tools/maintenance/
- tools/lib/
- packages/lexicon_platform/
- packages/lexicon_content/
- packages/lexicon_content_db/
- packages/lexicon_core/
- packages/lexicon_german/: current German language-plugin package (optional add-on)
- packages/lexicon_italian/: current Italian language-plugin package (optional add-on)
Use lexicon_platform as the main dependency.
Git dependency example:

    dependencies:
      lexicon_platform:
        git:
          url: https://github.com/massimomazzariol/Lexicon.git
          path: packages/lexicon_platform
          ref: v0.5.1

Local path example:

    dependencies:
      lexicon_platform:
        path: ../Lexicon/packages/lexicon_platform

Granular exports are available from the umbrella package:

    import 'package:lexicon_platform/core.dart';
    import 'package:lexicon_platform/content.dart';
    import 'package:lexicon_platform/content_db.dart';

Optional language-plugin imports should be added only when an app needs them. The current repository ships German and Italian as concrete add-ons:

    dependencies:
      lexicon_platform:
        git:
          url: https://github.com/massimomazzariol/Lexicon.git
          path: packages/lexicon_platform
          ref: v0.5.1
      lexicon_german:
        git:
          url: https://github.com/massimomazzariol/Lexicon.git
          path: packages/lexicon_german
          ref: v0.5.1
      lexicon_italian:
        git:
          url: https://github.com/massimomazzariol/Lexicon.git
          path: packages/lexicon_italian
          ref: v0.5.1

    import 'package:lexicon_german/lexicon_german.dart'; // optional add-on
    import 'package:lexicon_italian/lexicon_italian.dart'; // optional add-on

This is not required for consumers that only need the generic contracts,
runtime-pack parsing, and storage/import layers. lexicon_platform stays
neutral and does not re-export language plugins.
Consumer-side language selection and pack resolution can now be expressed through one shared contract:

    // availablePackManifests is supplied by the consumer (defined elsewhere).
    final consumerResolution = LexiconConsumerSelectionResolver.resolve(
      catalog: availablePackManifests,
      availablePlugins: const [GermanLexiconPlugin(), ItalianLexiconPlugin()],
      request: LexiconConsumerSelectionRequest(
        targetLanguage: 'de',
        baseLanguage: 'it',
        hintLanguage: 'en',
        level: 'A1',
      ),
    );

    // The resolution exposes the chosen pack and the plugin coverage.
    final selectedPack = consumerResolution.selectedManifest;
    final activePluginLanguages = consumerResolution.activePluginLanguages;
    final missingPluginLanguages = consumerResolution.missingPluginLanguages;

Lower-level APIs such as LexiconStaticPluginRegistry and
LexiconPackResolver still exist when an integration needs finer control.
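To illustrate what an active/missing plugin split could look like, here is a self-contained sketch. It does not use or describe the real lexicon_platform contracts: `LanguagePlugin`, `GermanPlugin`, `PluginResolution`, `resolvePlugins`, and the capability names are hypothetical stand-ins, and which languages the real resolver actually checks for plugins is not stated in this README.

```dart
/// Hypothetical stand-in for a language plugin that advertises capabilities.
abstract class LanguagePlugin {
  String get language;
  Set<String> get capabilities;
}

class GermanPlugin implements LanguagePlugin {
  const GermanPlugin();

  @override
  String get language => 'de';

  @override
  Set<String> get capabilities => const {'nounDeclension', 'separableVerbs'};
}

class PluginResolution {
  const PluginResolution(this.active, this.missing);
  final List<String> active;
  final List<String> missing;
}

/// Partitions the requested languages into those covered by an installed
/// plugin and those without one. The core only looks at the language code,
/// so it stays language-neutral.
PluginResolution resolvePlugins(
  Iterable<String> requested,
  Iterable<LanguagePlugin> available,
) {
  final byLanguage = {for (final p in available) p.language: p};
  final active = <String>[];
  final missing = <String>[];
  for (final lang in requested) {
    (byLanguage.containsKey(lang) ? active : missing).add(lang);
  }
  return PluginResolution(active, missing);
}

void main() {
  final result = resolvePlugins(['de', 'it'], const [GermanPlugin()]);
  print(result.active); // [de]
  print(result.missing); // [it]
}
```

An app could use a non-empty missing list to fall back to generic behaviour, since the plugins are optional add-ons.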
Use docs/reference/WORKFLOW_COMMANDS.md as the command source of truth.
Typical source-pack entrypoint:

    pnpm node tools/pipeline/run_pack_pipeline.mjs \
      --pack-dir packs/lexicon_source \
      --with-forms

Detailed workflow:
- docs/guides/PACK_AUTHORING.md
- docs/reference/TOOLS.md
Lexicon Platform uses a permissive split:
- code in packages/, tools/, and local demo code is licensed under Apache-2.0
- lexical content, packs, and documentation are licensed under CC BY 4.0 unless noted otherwise
This is intended to keep the repository open, reusable, and easy to extend, including in proprietary software, while preserving attribution to the source project.
See:
- LICENSE
- LICENSE-CONTENT.md
- NOTICE
- ATTRIBUTION.md
For the end-to-end integration flow, including local DB import, distribution,
and hosting options, see docs/guides/CONSUMER_GUIDE.md.