Echoline

Echoline is a local web app for sentence-by-sentence English listening dictation. Upload an audio file, let whisper.cpp generate transcript segments, then practice by listening to one sentence, writing what you hear, repeating the sentence, and revealing the answer when ready.

It is a general English listening tool, not a CET-only app. It works naturally with exam audio, VOA, TED, podcasts, course recordings, and custom materials.

[Screenshot: Echoline practice workspace]

Highlights

  • Upload local audio and transcribe it through a local whisper.cpp server.
  • Convert raw Whisper fragments into natural sentence-level practice units.
  • Preserve sentence timestamps for precise replay, restart, previous, and next controls.
  • Hide or reveal the current sentence answer during dictation practice.
  • Mark each sentence as practiced or difficult.
  • Save dictation text and progress in browser local storage.
  • Use visible controls or desktop shortcuts for high-frequency listening practice.

Keyboard Shortcuts

  • Space: play / pause.
  • ArrowLeft: restart the current sentence; if already near the start, jump to the previous sentence.
  • ArrowRight: jump to the next sentence.
  • R: repeat the current sentence.
  • H: show / hide the current sentence answer.

Shortcuts are ignored while typing in inputs, textareas, selects, or editable content.
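
The shortcut rules above can be sketched as a few pure functions. This is only an illustration, not Echoline's actual handler: the names `isTypingTarget`, `actionForKey`, and `resolveArrowLeft` are hypothetical, and the one-second "near the start" threshold is an assumed value that the README does not specify.

```typescript
// Hypothetical action names; the README lists the keys, not the internal API.
type ShortcutAction = "togglePlay" | "restartOrPrevious" | "next" | "repeat" | "toggleAnswer";

// Minimal shape of an event target, so the logic works without a real DOM.
interface TargetLike {
  tagName?: string;
  isContentEditable?: boolean;
}

const EDITABLE_TAGS = new Set(["INPUT", "TEXTAREA", "SELECT"]);

// True when the user is typing, in which case shortcuts should be ignored.
function isTypingTarget(target: TargetLike | null): boolean {
  if (!target) return false;
  if (target.isContentEditable) return true;
  return EDITABLE_TAGS.has((target.tagName ?? "").toUpperCase());
}

// Maps a KeyboardEvent.key value to an action, or null if nothing should happen.
function actionForKey(key: string, target: TargetLike | null): ShortcutAction | null {
  if (isTypingTarget(target)) return null;
  switch (key) {
    case " ": // KeyboardEvent.key for the space bar
      return "togglePlay";
    case "ArrowLeft":
      return "restartOrPrevious";
    case "ArrowRight":
      return "next";
    case "r":
    case "R":
      return "repeat";
    case "h":
    case "H":
      return "toggleAnswer";
    default:
      return null;
  }
}

// ArrowLeft: returns the index of the sentence to play from its start.
// Assumption: "near the start" means within nearStartSeconds of the sentence start.
function resolveArrowLeft(
  currentTime: number,
  sentenceStart: number,
  index: number,
  nearStartSeconds = 1,
): number {
  const nearStart = currentTime - sentenceStart <= nearStartSeconds;
  return nearStart && index > 0 ? index - 1 : index;
}
```

In a browser, `actionForKey(event.key, event.target as HTMLElement)` would be called from a `keydown` listener; keeping the decision logic free of DOM types makes it easy to unit test.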

Requirements

  • Node.js and npm.
  • CMake and a C/C++ compiler.
  • ffmpeg available on PATH.

On macOS with Homebrew:

brew install cmake ffmpeg

Quick Start

Install web dependencies:

npm install

Create local configuration:

cp .env.example .env

If you cloned the repository without submodules, initialize whisper.cpp first:

git submodule update --init --recursive

Start Echoline and the bundled whisper.cpp server:

npm run start:all

Open:

http://127.0.0.1:5173

Stop both services:

npm run stop:all

The shell scripts can also be run directly:

./scripts/start.sh
./scripts/stop.sh

Configuration

Runtime configuration lives in .env. The file is intentionally ignored by Git; .env.example is the committed template.

VITE_WHISPER_BASE_URL=http://127.0.0.1:8080
VITE_WHISPER_ENDPOINT_STYLE=whispercpp

WHISPER_HOST=127.0.0.1
WHISPER_PORT=8080
WHISPER_MODEL=whisper.cpp/models/ggml-tiny.en.bin
WHISPER_MODEL_NAME=tiny.en
WHISPER_THREADS=6
WHISPER_LANGUAGE=en
WHISPER_EXTRA_ARGS=

WEB_HOST=127.0.0.1
WEB_PORT=5173

Important fields:

  • VITE_WHISPER_BASE_URL: URL of the whisper server that Vite's same-origin proxy forwards requests to.
  • VITE_WHISPER_ENDPOINT_STYLE: whispercpp tries /inference first; openai tries /v1/audio/transcriptions first.
  • WHISPER_MODEL: local model file used by whisper-server.
  • WHISPER_MODEL_NAME: name of the model to download automatically if the WHISPER_MODEL file is missing.
  • WHISPER_EXTRA_ARGS: optional extra flags passed to whisper-server.
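
As a rough illustration of how VITE_WHISPER_ENDPOINT_STYLE could select the request path, here is a minimal sketch. The function name `endpointOrder` is hypothetical. Two things are assumptions rather than facts from this README: that the other path is tried second as a fallback (the README only says which path is tried first), and that unknown values fall back to the whispercpp order.

```typescript
const WHISPERCPP_PATH = "/inference";
const OPENAI_PATH = "/v1/audio/transcriptions";

// Returns the endpoint paths in the order they would be attempted.
// Assumption: any value other than "openai" is treated as "whispercpp".
function endpointOrder(style: string | undefined): string[] {
  return style === "openai"
    ? [OPENAI_PATH, WHISPERCPP_PATH]
    : [WHISPERCPP_PATH, OPENAI_PATH];
}
```

A caller would iterate over the returned paths and stop at the first one that responds successfully.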

The default model is tiny.en, which starts quickly but is less accurate than larger models. For better transcription quality, switch to base.en:

WHISPER_MODEL=whisper.cpp/models/ggml-base.en.bin
WHISPER_MODEL_NAME=base.en

Then run npm run start:all; the script downloads the configured model if needed.

How Startup Works

scripts/start.sh keeps the local runtime self-contained:

  1. Creates .env from .env.example if needed.
  2. Uses the whisper.cpp submodule in the project root.
  3. Builds whisper.cpp/build/bin/whisper-server if missing.
  4. Rebuilds whisper.cpp if an old build still points at a stale project path.
  5. Downloads the configured ggml model if missing.
  6. Starts whisper-server and waits for /health.
  7. Starts the Vite web app and waits for the local URL.
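
The "start, then wait for /health" pattern in steps 6 and 7 amounts to polling with a bounded number of retries. scripts/start.sh is a shell script, so the following TypeScript is only a sketch of the idea, not a transcription of it. The name `waitForHealthy` and the default attempt count and delay are assumptions.

```typescript
// A probe resolves to true when the service is healthy; it may reject
// (for example with "connection refused") while the server is still starting.
type Probe = () => Promise<boolean>;

// Polls the probe until it succeeds or the attempts run out.
async function waitForHealthy(probe: Probe, attempts = 30, delayMs = 1000): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // Treat a failed request as "not ready yet" and keep polling.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example probe against the default whisper-server address (Node 18+ or a browser):
//   const probe = () => fetch("http://127.0.0.1:8080/health").then((r) => r.ok);
//   const ready = await waitForHealthy(probe);
```

Bounding the retries means a broken build or a missing model shows up as a clear startup failure instead of an indefinite hang.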

Build output, downloaded models, logs, and pid files are ignored by Git:

whisper.cpp/build/
whisper.cpp/models/ggml-*.bin
logs/
.run/

Sentence Post-Processing

whisper.cpp often returns acoustic chunks instead of complete written sentences. Echoline post-processes the response before showing practice units:

  1. Split coarse segments that contain multiple sentences.
  2. Merge adjacent fragments until a natural sentence boundary appears, usually a period, question mark, or exclamation mark.
  3. Avoid common false boundaries such as decimal numbers and common English abbreviations.
  4. Preserve the start time of the first merged fragment and the end time of the last merged fragment.
  5. Fall back to sentence splitting with approximate timings if Whisper returns text without timestamps.

This keeps the practice flow close to: listen to one sentence, write one sentence, repeat, then compare.
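
Steps 2 to 4 above can be sketched as a single merge pass. This is a simplified illustration under stated assumptions, not the project's implementation: the `Fragment` shape, the function names, and the short abbreviation list are all hypothetical, and steps 1 and 5 (splitting coarse segments and the no-timestamp fallback) are not covered.

```typescript
interface Fragment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Assumption: a tiny illustrative list; a real list would be longer.
const ABBREVIATIONS = new Set(["mr.", "mrs.", "ms.", "dr.", "prof.", "e.g.", "i.e.", "vs."]);

// True when "3." is followed by "5 dollars", i.e. a decimal split across fragments.
function isSplitDecimal(left: string, right: string): boolean {
  return /\d\.$/.test(left.trim()) && /^\d/.test(right.trim());
}

// True when the text ends at a real sentence boundary (., ?, or !).
function endsSentence(text: string, nextText = ""): boolean {
  const trimmed = text.trim();
  if (!/[.?!]["')\]]*$/.test(trimmed)) return false;
  const lastWord = (trimmed.split(/\s+/).pop() ?? "").toLowerCase();
  if (ABBREVIATIONS.has(lastWord)) return false;
  return !isSplitDecimal(trimmed, nextText);
}

// Merges raw fragments into sentence-level units, keeping the first
// fragment's start time and the last fragment's end time.
function mergeFragments(fragments: Fragment[]): Fragment[] {
  const sentences: Fragment[] = [];
  let pending: Fragment | null = null;
  for (let i = 0; i < fragments.length; i++) {
    const fragment = fragments[i];
    const text = fragment.text.trim();
    if (text === "") continue;
    if (pending === null) {
      pending = { start: fragment.start, end: fragment.end, text };
    } else {
      // Rejoin a split decimal without a space; otherwise separate with one.
      const separator = isSplitDecimal(pending.text, text) ? "" : " ";
      pending = { start: pending.start, end: fragment.end, text: pending.text + separator + text };
    }
    const nextText = fragments[i + 1]?.text ?? "";
    if (endsSentence(pending.text, nextText)) {
      sentences.push(pending);
      pending = null;
    }
  }
  // Keep trailing text even if it lacks final punctuation.
  if (pending !== null) sentences.push(pending);
  return sentences;
}
```

Because each merged unit keeps real fragment timestamps, replay and restart controls can seek to the merged `start` without any re-alignment.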

Project Layout

Echoline/
  asset/               README screenshot and project assets
  src/                 React app source
  scripts/start.sh     Start web + whisper.cpp
  scripts/stop.sh      Stop web + whisper.cpp
  whisper.cpp/         whisper.cpp Git submodule
  .env.example         Local configuration template

whisper.cpp is tracked as a Git submodule because it is an upstream third-party project. Fresh clones should use:

git clone --recurse-submodules https://github.com/Neurocoda/Echoline.git

Development

npm test
npm run lint
npm run build

Send a manual request to the local Whisper server while it is running:

curl http://127.0.0.1:8080/inference \
  -F "file=@/absolute/path/to/audio.mp3" \
  -F response_format=verbose_json \
  -F temperature=0.0 \
  -F temperature_inc=0.2

Health checks (first directly against whisper-server, then through the web app's proxy):

curl http://127.0.0.1:8080/health
curl http://127.0.0.1:5173/api/whisper/health
