Saiki

Saiki (採記) is a small toolkit for Anki-based language learning workflows: listening playlists, word mining, YouTube transcript mining, TTS sentence imports, and known/new word comparison.

The name is a coined Japanese compound from 採 as in gathering/collecting and 記 as in remembering or recording. Pronunciation: saiki, roughly "sigh-key".

saiki --help

Requirements

Python 3.12+
Anki with AnkiConnect
ffmpeg
spaCy models for word mining:

python -m spacy download es_core_news_sm
python -m spacy download ja_core_news_lg

Setup example:

python3.12 -m venv ~/.venv/saiki
source ~/.venv/saiki/bin/activate
pip install -U pip
pip install -e .
sudo dnf install ffmpeg

Updating

If you installed with pip install -e . (editable mode), source changes take effect immediately and no reinstall is needed. Re-run pip install -e . only after pulling changes that add new dependencies or scripts.

git pull
pip install -e .   # only needed if pyproject.toml changed

Optional TTS Backends

The default edge-tts backend is included. To install the optional Python-backed engines (piper, kokoro):

pip install ".[tts]"

# System package for espeak-ng.
sudo dnf install espeak-ng

Other package-manager names:

sudo apt-get install espeak-ng
sudo pacman -S espeak-ng

Backend notes:

edge-tts: installed by pip install edge-tts; no API key, but it uses Microsoft Edge's online TTS service.
gtts: installed by requirements.txt; no API key, but it uses Google's online TTS service through gtts-cli.
piper: installed by pip install piper-tts; you still need a compatible .onnx voice model, usually with its matching .onnx.json config file.
espeak-ng: installed through your OS package manager, not pip.
kokoro: installed by pip install kokoro-onnx soundfile; you still need kokoro-v1.0.onnx and voices-v1.0.bin, plus any language-specific G2P setup required by your Kokoro release.

Example model downloads for the README smoke tests:

mkdir -p ~/.local/share/saiki/models

# Piper Spanish voice model plus matching config.
wget -O ~/.local/share/saiki/models/es_ES-davefx-medium.onnx \
  https://huggingface.co/rhasspy/piper-voices/resolve/main/es/es_ES/davefx/medium/es_ES-davefx-medium.onnx
wget -O ~/.local/share/saiki/models/es_ES-davefx-medium.onnx.json \
  https://huggingface.co/rhasspy/piper-voices/resolve/main/es/es_ES/davefx/medium/es_ES-davefx-medium.onnx.json

# Kokoro ONNX model plus voices bundle.
wget -O ~/.local/share/saiki/models/kokoro-v1.0.onnx \
  https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
wget -O ~/.local/share/saiki/models/voices-v1.0.bin \
  https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin

Saiki's default tts_model_dir is ~/.local/share/saiki/models. Relative model paths such as es_ES-davefx-medium.onnx are resolved under that directory. You can override it in YAML with tts_model_dir or for one command with --tts-model-dir.
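
The lookup rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not Saiki's actual code: the helper name resolve_model_path is hypothetical, and treating absolute or ~-prefixed paths as "use as given" is an assumption.

```python
from pathlib import Path

# Default taken from the README; everything else here is illustrative.
DEFAULT_MODEL_DIR = Path("~/.local/share/saiki/models")


def resolve_model_path(model: str, model_dir: Path = DEFAULT_MODEL_DIR) -> Path:
    """Resolve a model name or path (hypothetical helper).

    Absolute and ~-prefixed paths are returned as given (assumption);
    relative paths are looked up under model_dir.
    """
    candidate = Path(model).expanduser()
    if candidate.is_absolute():
        return candidate
    return model_dir.expanduser() / candidate
```

With this rule, es_ES-davefx-medium.onnx resolves inside the model directory, while a path such as /opt/voices/custom.onnx is left untouched.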

Configuration

Defaults are built in, but you can override them with YAML:

~/.config/saiki/config.yaml

Or pass a config explicitly:

saiki --config ./config.yaml words jp

Example:

anki_connect_url: http://localhost:8765
media_dir: ~/.var/app/net.ankiweb.Anki/data/Anki2/User 1/collection.media
audio_output_root: ~/Languages/Anki/anki-audio
word_output_root: ~/Languages/Anki/anki-words
sentence_dir: ~/Languages/Anki
tts_model_dir: ~/.local/share/saiki/models
note_model: Basic
fields:
  front: Front
  back: Back
languages:
  jp:
    name: japanese
    transcript_code: ja
    tts_backend: edge-tts
    tts_voice: ja-JP-NanamiNeural
    tts_tempo: 1
    decks: ["日本語"]
    field: Back
    word_model: ja_core_news_lg
    sentence_file: sentences_jp.txt
  es:
    name: spanish
    transcript_code: es
    tts_backend: edge-tts
    tts_voice: es-ES-ElviraNeural
    tts_tempo: 1
    decks: ["Español"]
    field: Back
    word_model: es_core_news_sm
    sentence_file: sentences_es.txt

A copyable template is also available at examples/config.yaml.
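
As a rough mental model of "built-in defaults plus YAML overrides", the sketch below overlays a partial override on a defaults dictionary. It assumes a recursive (deep) merge, which the README does not confirm; merge_config and the trimmed DEFAULTS dictionary are hypothetical, and the override is written as a literal dict so the example stays standard-library only.

```python
from copy import deepcopy
from typing import Any

# Illustrative subset of defaults; the real built-in values may differ.
DEFAULTS: dict[str, Any] = {
    "anki_connect_url": "http://localhost:8765",
    "note_model": "Basic",
    "languages": {
        "es": {
            "tts_backend": "edge-tts",
            "tts_voice": "es-ES-ElviraNeural",
            "tts_tempo": 1,
        },
    },
}


def merge_config(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Recursively overlay overrides on defaults without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# In practice the overrides would come from parsing config.yaml;
# a literal dict stands in for the parsed file here.
overrides = {"languages": {"es": {"tts_voice": "es-MX-DaliaNeural"}}}
config = merge_config(DEFAULTS, overrides)
```

Under this assumption, a config file only needs to list the keys that differ from the defaults; the Spanish backend and tempo above are inherited unchanged.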

Supported language codes by default:

jp
es

CLI

Audio

Extract audio referenced by [sound:...] tags from configured decks and create an .m3u playlist.

saiki audio jp
saiki audio es --concat
saiki audio jp --media-dir ~/.local/share/Anki2/User\ 1/collection.media --copy-only-new

Outputs go to ~/Languages/Anki/anki-audio/<language>/ by default.
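
For readers curious how [sound:...] tags map to a playlist, here is a minimal sketch. It is not Saiki's implementation: the function names are hypothetical, and the playlist is assumed to list one media path per line, de-duplicated, after an #EXTM3U header.

```python
import re
from pathlib import Path

# Matches Anki sound tags such as [sound:clip.mp3] and captures the filename.
SOUND_TAG = re.compile(r"\[sound:([^\]]+)\]")


def sound_files(field_text: str) -> list[str]:
    """Return the filenames referenced by [sound:...] tags, in order."""
    return SOUND_TAG.findall(field_text)


def build_playlist(fields: list[str], media_dir: Path) -> str:
    """Build .m3u playlist text, skipping repeated filenames."""
    seen: set[str] = set()
    lines = ["#EXTM3U"]
    for text in fields:
        for name in sound_files(text):
            if name not in seen:
                seen.add(name)
                lines.append(str(media_dir / name))
    return "\n".join(lines) + "\n"
```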

Words

Extract frequent words from Anki notes using AnkiConnect and spaCy.

saiki words jp
saiki words es --deck "Español"
saiki words es --query 'deck:"Español" tag:youtube'
saiki words jp --min-freq 3 --out words_jp.txt
saiki words jp --full-field

Output format:

word frequency

Examples:

comer 12
hablar 9
行く (行き) 8
見る (見た) 6

Words from TSV Export

Extract vocabulary from an Anki TSV export file instead of using AnkiConnect:

saiki words --lang es --input Español.txt --field 2 --field-section first --output words_es_content.txt --debug words_es_debug.tsv
saiki words --lang jp --input Japanese.txt --field 2 --output words_jp_content.txt

When --input is provided, --field specifies the 1-based column index of the text field (default 2). Only that column is mined; other columns such as audio or Anki tags are ignored. By default, all blank-line-separated sections inside the selected field are kept. Use --field-section first when your card format stores target-language text before a translation or note in the same field.

The file-based pipeline:

Parses Anki # header lines (#separator:tab, #html:true, #tags column:N)
Uses Python's csv module for robust TSV parsing
Removes [sound:...mp3] markers, HTML tags, URLs, and email addresses from the field text
Uses the configured language's spaCy model, token filter, and output format
Tracks POS counts, surface forms, example sentences, and source line numbers
Tracks original spaCy lemmas for debugging
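
The parsing and cleanup steps above can be approximated with the standard library alone. The sketch below is an illustration under stated assumptions, not the project's code: it skips the leading # header lines instead of interpreting them, treats <br><br> as a blank line between sections, and uses simplified regular expressions. clean_field and read_field are hypothetical names, and the spaCy stages are omitted.

```python
import csv
import html
import io
import re

SOUND = re.compile(r"\[sound:[^\]]*\]")
TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"\S+@\S+\.\S+")


def clean_field(text: str, first_section_only: bool = False) -> str:
    """Strip sound markers, HTML tags, URLs, and email addresses."""
    # Turn HTML line breaks into newlines so sections can be split later.
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    for pattern in (SOUND, TAG, URL, EMAIL):
        text = pattern.sub(" ", text)
    text = html.unescape(text)
    if first_section_only:
        # Keep only the text before the first blank line.
        text = re.split(r"\n\s*\n", text.strip(), maxsplit=1)[0]
    return " ".join(text.split())


def read_field(export_text: str, field: int = 2, first_section_only: bool = False) -> list[str]:
    """Return the cleaned text of one 1-based column from a TSV export."""
    lines = export_text.splitlines(keepends=True)
    # Header lines such as "#separator:tab" are assumed to sit at the top.
    start = 0
    while start < len(lines) and lines[start].startswith("#"):
        start += 1
    data = io.StringIO("".join(lines[start:]), newline="")
    cleaned = []
    for row in csv.reader(data, delimiter="\t"):
        if len(row) >= field:
            value = clean_field(row[field - 1], first_section_only)
            if value:
                cleaned.append(value)
    return cleaned
```

Letting the csv module handle quoting matters because a quoted field in an export can span several physical lines, and a plain split on tabs would break such rows apart.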

Output files produced:

words_<lang>_content.txt: cleaned vocabulary
words_<lang>_debug.tsv: per-entry debug info (only with --debug)

Debug TSV example:

entry	count	pos_counts	top_surface_forms	example_sentences	source_lines	original_lemmas
comer	8	VERB:8	como, come, comen	Yo como manzanas.; Ustedes los comen con arroz.	70,979	comer

Compare

Compare deck vocabulary against a target list:

saiki compare --deck-words words_es_content.txt --target-words target_es_top_1000.txt --output missing_from_deck.txt
saiki compare --deck-words words_es_content.txt --target-words target_es_top_1000.txt --min-frequency 3

The comparison normalises case and can optionally strip accents before matching, so that cómo and como are treated as the same word when accent stripping is on. Use --min-frequency to ignore accidental low-frequency words in the deck.
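
Case and accent normalisation of this kind is commonly done with Python's unicodedata module. The following is a minimal sketch of that general technique, not necessarily what Saiki does internally; normalise and missing_from_deck are hypothetical names.

```python
import unicodedata
from collections.abc import Iterable


def normalise(word: str, strip_accents: bool = True) -> str:
    """Casefold a word and optionally drop combining accent marks."""
    word = unicodedata.normalize("NFC", word.strip()).casefold()
    if strip_accents:
        # Decompose, drop combining marks, then recompose.
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        word = unicodedata.normalize("NFC", word)
    return word


def missing_from_deck(
    target_words: Iterable[str],
    deck_words: Iterable[str],
    strip_accents: bool = True,
) -> list[str]:
    """Return target words whose normalised form is absent from the deck."""
    known = {normalise(w, strip_accents) for w in deck_words}
    return [w for w in target_words if normalise(w, strip_accents) not in known]
```

Note that this approach also turns ñ into n and removes Japanese voicing marks (が becomes か), so accent stripping is best kept optional and used only for languages where it is harmless.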

YouTube

Mine vocabulary or sentence rows from YouTube subtitles.

saiki youtube es VIDEO_ID
saiki youtube es VIDEO_ID --top 50
saiki youtube jp VIDEO_ID --mode sentences
saiki youtube es VIDEO_ID --raw --no-stopwords

Export Anki-ready sentence rows:

saiki youtube es VIDEO_ID --mode sentences --out youtube.tsv

Export only rows that appear to contain unknown vocabulary:

saiki youtube es VIDEO_ID \
  --mode sentences \
  --out youtube_new.tsv \
  --known-words ~/Languages/Anki/anki-words/spanish/words_es.txt \
  --only-new

Sentence exports contain:

sentence    timestamp    video_url    vocab_guess

Import

Generate TTS audio and add sentence cards to Anki.

saiki import es
saiki import jp ~/Languages/Anki/sentences_jp.txt
saiki import es youtube.tsv --tags youtube,manual
saiki import es --tts-voice es-MX-DaliaNeural
saiki import es --dry-run   # preview sentences without touching Anki or TTS

The importer accepts plain-text sentence files and TSV/CSV files with a sentence column. The tag text-to-speech is always added. If --tags is not provided, the tag AI-generated is added as well.

TTS is configured per language with tts_backend. Supported backends are:

edge-tts: default backend using Microsoft Edge neural voices; configure tts_voice.
gtts: free backend using gtts-cli; configure tts_code and tts_tld.
piper: local/offline neural TTS; configure tts_model with a model path. The stock Piper catalog includes Spanish voices, but not Japanese.
espeak-ng: local/offline lightweight TTS; configure tts_voice. Spanish is supported; espeak-ng documents its Japanese support as kana-only, so it is not recommended for normal Japanese sentence cards.
kokoro: local/offline neural TTS; configure tts_model, tts_voices, tts_voice, and tts_code; some Japanese setups also need tts_vocab_config. Kokoro lists Japanese and Spanish voices, but upstream notes that non-English quality can be thin.

You can override backend settings for one import:

saiki import jp sentences_jp.txt \
  --tts-backend edge-tts \
  --tts-voice ja-JP-KeitaNeural

Voice-listing helpers:

saiki tts-voices jp
saiki tts-voices es --backend edge-tts

Test a TTS backend without creating Anki cards:

saiki tts-test es --out /tmp/saiki_edge_default_es.mp3
saiki tts-test jp --tts-backend edge-tts --tts-voice ja-JP-NanamiNeural --out /tmp/saiki_edge_jp.mp3
saiki tts-test es --tts-backend edge-tts --tts-voice es-ES-ElviraNeural --out /tmp/saiki_edge_es.mp3
saiki tts-test es --tts-backend gtts --tts-code es --tts-tld es --out /tmp/saiki_gtts_es.mp3
saiki tts-test es --tts-backend piper --tts-model es_ES-davefx-medium.onnx --tts-config es_ES-davefx-medium.onnx.json --out /tmp/saiki_piper_es.mp3
saiki tts-test es --tts-backend espeak-ng --tts-voice es --out /tmp/saiki_espeak_es.mp3
saiki tts-test es --tts-backend kokoro --tts-model kokoro-v1.0.onnx --tts-voices voices-v1.0.bin --tts-voice ef_dora --out /tmp/saiki_kokoro_es.mp3

For kokoro, put tts_model, tts_voices, and any needed tts_vocab_config in your config file rather than typing every path each time.

Known/New Words

Compare any generated word list against an existing known list:

saiki compare-words transcript_words.txt ~/Languages/Anki/anki-words/spanish/words_es.txt

This prints entries from the first file whose word key does not appear in the second file.
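
The comparison amounts to a set difference on word keys. The sketch below assumes the word key is the first whitespace-separated token of each line, compared case-insensitively; the README does not define the key precisely, so treat word_key and new_entries as hypothetical.

```python
from collections.abc import Iterable


def word_key(line: str) -> str:
    """Return the first token of a line, casefolded (assumed word key)."""
    parts = line.split()
    return parts[0].casefold() if parts else ""


def new_entries(candidate_lines: Iterable[str], known_lines: Iterable[str]) -> list[str]:
    """Keep candidate lines whose word key is absent from the known list."""
    known = {word_key(line) for line in known_lines} - {""}
    return [
        line
        for line in candidate_lines
        if word_key(line) and word_key(line) not in known
    ]
```

Because only the key is compared, counts and parenthesised surface forms on either side do not affect the result.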

Card Assumptions

The default configuration assumes Basic notes with audio on Front and the target-language sentence on Back. Word mining reads only the first visible line by default; use --full-field to process the whole field.

To Do

Add support for different Anki note/card types, including configurable field mappings per language and per import workflow.
Support multiple import profiles, such as sentence cards, vocab cards, audio cards, and cloze cards.
Let YouTube exports map directly into configurable note fields, not just a fixed sentence column.
Add richer transcript filtering, such as minimum/maximum sentence length, duplicate removal, and punctuation cleanup.
Add optional audio slicing from videos when timestamp data is available.
Improve known/new word matching with better lemmatization for transcript vocabulary.
Add more language profiles beyond Japanese and Spanish.
Build a GUI for common workflows like transcript review, sentence selection, import previews, and configuration editing.
Add integration tests with mocked AnkiConnect responses.
Add shell completion.

Tests

pip install -e ".[dev]"
pytest

License

This project is licensed under the MIT License. See LICENSE.
