Saiki (採記) is a small toolkit for Anki-based language learning workflows:
listening playlists, word mining, YouTube transcript mining, TTS sentence
imports, and known/new word comparison.
The name is a coined Japanese compound from 採 as in gathering/collecting and
記 as in remembering or recording. Pronunciation: saiki, roughly
"sigh-key".
saiki --help

- Python 3.12+
- Anki with AnkiConnect
- ffmpeg
- spaCy models for word mining:
python -m spacy download es_core_news_sm
python -m spacy download ja_core_news_lg

Setup example:
python3.12 -m venv ~/.venv/saiki
source ~/.venv/saiki/bin/activate
pip install -U pip
pip install -e .
sudo dnf install ffmpeg

If you installed with pip install -e . (editable mode), changes to the source
are live immediately, with no reinstall needed. Only re-run pip install -e . if
you pull changes that add new dependencies or scripts.
git pull
pip install -e . # only needed if pyproject.toml changed

The default edge-tts backend is included. To install the optional Python-backed
engines (piper, kokoro):
pip install ".[tts]"
# System package for espeak-ng.
sudo dnf install espeak-ng

Other package-manager names:
sudo apt-get install espeak-ng
sudo pacman -S espeak-ng

Backend notes:
- edge-tts: installed by pip install edge-tts; no API key, but it uses Microsoft Edge's online TTS service.
- gtts: installed by requirements.txt; no API key, but it uses Google's online TTS service through gtts-cli.
- piper: installed by pip install piper-tts; you still need a compatible .onnx voice model, usually with its matching .onnx.json config file.
- espeak-ng: installed through your OS package manager, not pip.
- kokoro: installed by pip install kokoro-onnx soundfile; you still need kokoro-v1.0.onnx and voices-v1.0.bin, plus any language-specific G2P setup required by your Kokoro release.
Example model downloads for the README smoke tests:
mkdir -p ~/.local/share/saiki/models
# Piper Spanish voice model plus matching config.
wget -O ~/.local/share/saiki/models/es_ES-davefx-medium.onnx \
https://huggingface.co/rhasspy/piper-voices/resolve/main/es/es_ES/davefx/medium/es_ES-davefx-medium.onnx
wget -O ~/.local/share/saiki/models/es_ES-davefx-medium.onnx.json \
https://huggingface.co/rhasspy/piper-voices/resolve/main/es/es_ES/davefx/medium/es_ES-davefx-medium.onnx.json
# Kokoro ONNX model plus voices bundle.
wget -O ~/.local/share/saiki/models/kokoro-v1.0.onnx \
https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
wget -O ~/.local/share/saiki/models/voices-v1.0.bin \
https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin

Saiki's default tts_model_dir is ~/.local/share/saiki/models. Relative
model paths such as es_ES-davefx-medium.onnx are resolved under that
directory. You can override it in YAML with tts_model_dir or for one command
with --tts-model-dir.
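
The path rule described above can be sketched in a few lines of Python. This is only an illustration: resolve_model_path is a hypothetical helper name, not a function taken from Saiki's source, and the handling of absolute paths (passed through unchanged) is an assumption.

```python
from pathlib import Path

# Default directory as stated in this README.
DEFAULT_TTS_MODEL_DIR = "~/.local/share/saiki/models"


def resolve_model_path(model: str, tts_model_dir: str = DEFAULT_TTS_MODEL_DIR) -> Path:
    """Hypothetical helper: relative model paths are placed under tts_model_dir.

    Assumption: absolute paths (and ~-prefixed paths) are used as given.
    """
    path = Path(model).expanduser()
    if path.is_absolute():
        return path
    return Path(tts_model_dir).expanduser() / path


print(resolve_model_path("es_ES-davefx-medium.onnx"))
print(resolve_model_path("/opt/voices/custom.onnx"))
```

With this rule, a config entry such as tts_model: es_ES-davefx-medium.onnx stays portable across machines as long as each one has the model in its own tts_model_dir.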
Defaults are built in, but you can override them with YAML:
~/.config/saiki/config.yaml

Or pass a config explicitly:
saiki --config ./config.yaml words jp

Example:
anki_connect_url: http://localhost:8765
media_dir: ~/.var/app/net.ankiweb.Anki/data/Anki2/User 1/collection.media
audio_output_root: ~/Languages/Anki/anki-audio
word_output_root: ~/Languages/Anki/anki-words
sentence_dir: ~/Languages/Anki
tts_model_dir: ~/.local/share/saiki/models
note_model: Basic
fields:
  front: Front
  back: Back
languages:
  jp:
    name: japanese
    transcript_code: ja
    tts_backend: edge-tts
    tts_voice: ja-JP-NanamiNeural
    tts_tempo: 1
    decks: ["日本語"]
    field: Back
    word_model: ja_core_news_lg
    sentence_file: sentences_jp.txt
  es:
    name: spanish
    transcript_code: es
    tts_backend: edge-tts
    tts_voice: es-ES-ElviraNeural
    tts_tempo: 1
    decks: ["Español"]
    field: Back
    word_model: es_core_news_sm
    sentence_file: sentences_es.txt

A copyable template is also available at examples/config.yaml.
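
One plausible way to layer a YAML file over built-in defaults is a recursive merge, sketched below with plain dictionaries. Whether Saiki merges nested sections key by key or replaces them wholesale is not stated in this README, so treat the behaviour shown here as an assumption; merge_config is a hypothetical name.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Hypothetical deep merge: override values win, nested dicts merge key by key."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-ins for the built-in defaults and a parsed user config file.
defaults = {
    "note_model": "Basic",
    "languages": {
        "es": {"tts_backend": "edge-tts", "tts_voice": "es-ES-ElviraNeural"},
    },
}
user = {"languages": {"es": {"tts_voice": "es-MX-DaliaNeural"}}}

config = merge_config(defaults, user)
print(config["languages"]["es"])
```

Under this assumption, a user file only needs to list the keys it changes; everything else keeps its default value.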
Supported language codes by default:
- jp
- es
Extract audio referenced by [sound:...] tags from configured decks and create
an .m3u playlist.
saiki audio jp
saiki audio es --concat
saiki audio jp --media-dir ~/.local/share/Anki2/User\ 1/collection.media --copy-only-new

Outputs go to ~/Languages/Anki/anki-audio/<language>/ by default.
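
As a rough illustration of the idea behind this command, the sketch below pulls filenames out of [sound:...] tags and writes them as a simple playlist. The function names are hypothetical, the sample field text is invented, and the real command also copies files out of the media directory, which is not shown here.

```python
import re

# Matches Anki sound tags such as [sound:hola.mp3] and captures the filename.
SOUND_TAG = re.compile(r"\[sound:([^\]]+)\]")


def sound_files(field_text: str) -> list[str]:
    """Return every filename referenced by a [sound:...] tag, in order."""
    return SOUND_TAG.findall(field_text)


def build_m3u(filenames: list[str]) -> str:
    """Build playlist text; duplicates are dropped while keeping first-seen order."""
    unique = dict.fromkeys(filenames)
    return "\n".join(["#EXTM3U", *unique]) + "\n"


# Invented sample fields standing in for note content fetched from Anki.
fields = ["[sound:hola.mp3] Hola", "Adiós [sound:adios.mp3][sound:hola.mp3]"]
names = [name for text in fields for name in sound_files(text)]
print(build_m3u(names), end="")
```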
Extract frequent words from Anki notes using AnkiConnect and spaCy.
saiki words jp
saiki words es --deck "Español"
saiki words es --query 'deck:"Español" tag:youtube'
saiki words jp --min-freq 3 --out words_jp.txt
saiki words jp --full-field

Output format:
word frequency
Examples:
comer 12
hablar 9
行く (行き) 8
見る (見た) 6
Extract vocabulary from an Anki TSV export file instead of using AnkiConnect:
saiki words --lang es --input Español.txt --field 2 --field-section first --output words_es_content.txt --debug words_es_debug.tsv
saiki words --lang jp --input Japanese.txt --field 2 --output words_jp_content.txt

When --input is provided, --field specifies the 1-based column index of the
text field (default 2). Only that column is mined; other columns such as audio
or Anki tags are ignored. By default, all blank-line-separated sections inside
the selected field are kept. Use --field-section first when your card format
stores target-language text before a translation or note in the same field.
The file-based pipeline:
- Parses Anki # header lines (#separator:tab, #html:true, #tags column:N)
- Uses Python's csv module for robust TSV parsing
- Removes [sound:...mp3] markers, HTML tags, URLs, and email addresses from the field text
- Uses the configured language's spaCy model, token filter, and output format
- Tracks POS counts, surface forms, example sentences, and source line numbers
- Tracks original spaCy lemmas for debugging
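
The first three steps of that pipeline (header parsing, TSV reading, field cleanup) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not Saiki's code: read_field and clean_field are hypothetical names, the regular expressions are deliberately loose, and the spaCy and bookkeeping steps are left out.

```python
import csv
import html
import io
import re

SOUND_RE = re.compile(r"\[sound:[^\]]*\]")
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")

# Assumed mapping from Anki's named separators to single characters.
NAMED_SEPARATORS = {"tab": "\t", "comma": ",", "semicolon": ";", "pipe": "|", "space": " "}


def clean_field(text: str) -> str:
    """Drop sound markers, HTML tags, URLs, and emails, then collapse whitespace.

    Note: collapsing whitespace also removes blank lines, so any section
    splitting would have to happen before this step.
    """
    for pattern in (SOUND_RE, TAG_RE, URL_RE, EMAIL_RE):
        text = pattern.sub(" ", text)
    return " ".join(html.unescape(text).split())


def read_field(export_text: str, field: int = 2) -> list[str]:
    """Read one 1-based column from an Anki text export, honouring #separator."""
    lines = export_text.splitlines(keepends=True)
    separator = "\t"
    start = 0
    for line in lines:
        if not line.startswith("#"):
            break  # header lines only appear at the top of the file
        start += 1
        key, _, value = line[1:].strip().partition(":")
        if key == "separator":
            fallback = value if len(value) == 1 else "\t"
            separator = NAMED_SEPARATORS.get(value.lower(), fallback)
    reader = csv.reader(io.StringIO("".join(lines[start:])), delimiter=separator)
    return [clean_field(row[field - 1]) for row in reader if len(row) >= field]


sample = "#separator:tab\n#html:true\n1\t<b>Yo como</b> manzanas.[sound:a.mp3]\ttag1\n"
print(read_field(sample))
```

Passing the remaining text to csv.reader as one stream, rather than splitting it by line first, keeps quoted fields that contain line breaks intact.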
Output files produced:
- words_<lang>_content.txt: cleaned vocabulary
- words_<lang>_debug.tsv: per-entry debug info (only with --debug)
Debug TSV example:
entry count pos_counts top_surface_forms example_sentences source_lines original_lemmas
comer 8 VERB:8 como, come, comen Yo como manzanas.; Ustedes los comen con arroz. 70,979 comer
Compare deck vocabulary against a target list:
saiki compare --deck-words words_es_content.txt --target-words target_es_top_1000.txt --output missing_from_deck.txt
saiki compare --deck-words words_es_content.txt --target-words target_es_top_1000.txt --min-frequency 3

The comparison normalises case and optionally strips accents for matching, so cómo and
como are treated as the same word. Use --min-frequency to ignore accidental
low-frequency words in the deck.
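
A minimal sketch of this kind of matching, using only the standard library, might look like the following. The function names and the in-memory data shapes are assumptions for illustration; the actual comparison reads its inputs from the files named on the command line.

```python
import unicodedata


def normalize(word: str, strip_accents: bool = True) -> str:
    """Case-fold a word and, optionally, remove combining accent marks."""
    word = word.casefold()
    if strip_accents:
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", word)


def missing_from_deck(deck_counts: dict[str, int], target_words: list[str],
                      min_frequency: int = 1) -> list[str]:
    """Return target words with no sufficiently frequent match in the deck."""
    known = {normalize(w) for w, n in deck_counts.items() if n >= min_frequency}
    return [w for w in target_words if normalize(w) not in known]


deck = {"cómo": 5, "raro": 1}
print(missing_from_deck(deck, ["como", "raro", "hablar"], min_frequency=3))
```

Stripping combining marks this way also folds ñ into n, and it would remove voicing marks from Japanese kana, which is one reason to keep accent stripping optional.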
Mine vocabulary or sentence rows from YouTube subtitles.
saiki youtube es VIDEO_ID
saiki youtube es VIDEO_ID --top 50
saiki youtube jp VIDEO_ID --mode sentences
saiki youtube es VIDEO_ID --raw --no-stopwords

Export Anki-ready sentence rows:
saiki youtube es VIDEO_ID --mode sentences --out youtube.tsv

Export only rows that appear to contain unknown vocabulary:
saiki youtube es VIDEO_ID \
--mode sentences \
--out youtube_new.tsv \
--known-words ~/Languages/Anki/anki-words/spanish/words_es.txt \
--only-new

Sentence exports contain:
sentence timestamp video_url vocab_guess
Generate TTS audio and add sentence cards to Anki.
saiki import es
saiki import jp ~/Languages/Anki/sentences_jp.txt
saiki import es youtube.tsv --tags youtube,manual
saiki import es --tts-voice es-MX-DaliaNeural
saiki import es --dry-run # preview sentences without touching Anki or TTS

The importer accepts plain text sentence files and TSV/CSV files with a
sentence column. The text-to-speech tag is always added. If --tags is not
provided, the AI-generated tag is added as well.
TTS is configured per language with tts_backend. Supported backends are:
- edge-tts: default backend using Microsoft Edge neural voices; configure tts_voice.
- gtts: free backend using gtts-cli; configure tts_code and tts_tld.
- piper: local/offline neural TTS; configure tts_model with a model path. The stock Piper catalog includes Spanish voices, but not Japanese.
- espeak-ng: local/offline lightweight TTS; configure tts_voice. Spanish is supported; Japanese is documented as kana-only and is not recommended for normal Japanese sentence cards.
- kokoro: local/offline neural TTS; configure tts_model, tts_voices, tts_voice, and tts_code; some Japanese setups also need tts_vocab_config. Kokoro lists Japanese and Spanish voices, but upstream notes that non-English quality can be thin.
You can override backend settings for one import:
saiki import jp sentences_jp.txt \
--tts-backend edge-tts \
--tts-voice ja-JP-KeitaNeural

Voice-listing helpers:
saiki tts-voices jp
saiki tts-voices es --backend edge-tts

Test a TTS backend without creating Anki cards:
saiki tts-test es --out /tmp/saiki_edge_default_es.mp3
saiki tts-test jp --tts-backend edge-tts --tts-voice ja-JP-NanamiNeural --out /tmp/saiki_edge_jp.mp3
saiki tts-test es --tts-backend edge-tts --tts-voice es-ES-ElviraNeural --out /tmp/saiki_edge_es.mp3
saiki tts-test es --tts-backend gtts --tts-code es --tts-tld es --out /tmp/saiki_gtts_es.mp3
saiki tts-test es --tts-backend piper --tts-model es_ES-davefx-medium.onnx --tts-config es_ES-davefx-medium.onnx.json --out /tmp/saiki_piper_es.mp3
saiki tts-test es --tts-backend espeak-ng --tts-voice es --out /tmp/saiki_espeak_es.mp3
saiki tts-test es --tts-backend kokoro --tts-model kokoro-v1.0.onnx --tts-voices voices-v1.0.bin --tts-voice ef_dora --out /tmp/saiki_kokoro_es.mp3

For kokoro, put tts_model, tts_voices, and any needed tts_vocab_config
in your config file rather than typing every path each time.
Compare any generated word list against an existing known list:
saiki compare-words transcript_words.txt ~/Languages/Anki/anki-words/spanish/words_es.txt

This prints entries from the first file whose word key does not appear in the second file.
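
The comparison can be pictured as a set lookup on each line's word key. In the sketch below the key is assumed to be the first whitespace-separated token of a line, which fits entries such as "comer 12" and "行く (行き) 8"; how Saiki actually derives the key is not specified here, and both function names are hypothetical.

```python
def word_key(line: str) -> str:
    """Assumed key: the first whitespace-separated token of a word-list line."""
    parts = line.split()
    return parts[0] if parts else ""


def new_entries(candidate_lines: list[str], known_lines: list[str]) -> list[str]:
    """Return candidate lines whose word key is absent from the known list."""
    known = {word_key(line) for line in known_lines} - {""}
    return [
        line for line in candidate_lines
        if word_key(line) and word_key(line) not in known
    ]


transcript = ["comer 4", "viajar 2", "行く (行き) 8"]
known = ["comer 12", "hablar 9", "行く (行った) 3"]
print(new_entries(transcript, known))
```

Because only the key is compared, differing frequencies or surface forms in the two files do not affect the result.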
The default configuration assumes Basic notes with audio on Front and the
target-language sentence on Back. Word mining reads only the first visible
line by default; use --full-field to process the whole field.
- Add support for different Anki note/card types, including configurable field mappings per language and per import workflow.
- Support multiple import profiles, such as sentence cards, vocab cards, audio cards, and cloze cards.
- Let YouTube exports map directly into configurable note fields, not just a fixed sentence column.
- Add richer transcript filtering, such as minimum/maximum sentence length, duplicate removal, and punctuation cleanup.
- Add optional audio slicing from videos when timestamp data is available.
- Improve known/new word matching with better lemmatization for transcript vocabulary.
- Add more language profiles beyond Japanese and Spanish.
- Build a GUI for common workflows like transcript review, sentence selection, import previews, and configuration editing.
- Add integration tests with mocked AnkiConnect responses.
- Add shell completion.
pip install -e ".[dev]"
pytest

This project is licensed under the MIT License. See LICENSE.
