NarrativePulse is my final project for an introductory Python course. It is a small CLI tool that analyzes writing style from plain text files and compares two documents.

Features:

- `analyze <file>`: prints style metrics for one document
- `compare <file_a> <file_b>`: computes a style similarity score and shows metric deltas
- Finds repeated bigrams/trigrams as simple repetition hotspots
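
The repetition hotspot idea can be illustrated with a short sketch. This is not the project's actual code: the function name `repeated_ngrams` and its `top` parameter (modeled loosely on the `--top` flag) are assumptions made for illustration, and it expects tokens that are already normalized.

```python
from collections import Counter


def repeated_ngrams(tokens, n, top=5):
    """Return up to `top` n-grams that occur more than once, most frequent first.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Slide a window of size n across the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    # Keep only the n-grams that actually repeat.
    repeated = [(gram, count) for gram, count in counts.most_common() if count > 1]
    return repeated[:top]


tokens = "the cat sat on the mat and the cat sat on the rug".split()
print(repeated_ngrams(tokens, 2))  # repeated bigrams
print(repeated_ngrams(tokens, 3))  # repeated trigrams
```

Counting with `Counter` keeps the sketch small; filtering on `count > 1` is what turns a plain frequency table into a list of repetitions.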

Metrics:

- `lexical_diversity`: unique tokens / all tokens
- `sentence_rhythm`: variation in sentence lengths (coefficient of variation)
- `dialogue_ratio`: share of sentences containing quote characters
- `avg_sentence_length`: average token count per sentence
- `style_signature`: compact 4-value vector used for comparison
- `style_similarity`: cosine similarity between two style signatures
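
As a rough sketch of how the four per-document metrics could be computed from the definitions above: the function name `style_metrics`, the `(raw_text, tokens)` input shape, the choice of population standard deviation for the coefficient of variation, and the set of quote characters checked are all assumptions, not a description of the project's internals.

```python
import statistics


def style_metrics(sentences):
    """Compute the four metrics from (raw_text, tokens) pairs.

    Hypothetical helper: the input shape and name are illustrative only.
    """
    tokens = [tok for _, toks in sentences for tok in toks]
    lengths = [len(toks) for _, toks in sentences]
    if not tokens:
        # Empty input: return zeros rather than dividing by zero.
        return dict.fromkeys(
            ("lexical_diversity", "sentence_rhythm",
             "dialogue_ratio", "avg_sentence_length"), 0.0)

    mean_len = statistics.fmean(lengths)
    # Coefficient of variation: standard deviation divided by the mean.
    # Population standard deviation is an assumption made here.
    rhythm = statistics.pstdev(lengths) / mean_len
    # A sentence counts as dialogue if it contains any quote character.
    quoted = sum(1 for raw, _ in sentences if any(q in raw for q in '"“”'))

    return {
        "lexical_diversity": len(set(tokens)) / len(tokens),
        "sentence_rhythm": rhythm,
        "dialogue_ratio": quoted / len(sentences),
        "avg_sentence_length": mean_len,
    }


sentences = [
    ('He said "go".', ["he", "said", "go"]),
    ("We went home.", ["we", "went", "home"]),
    ("He said no again and again.", ["he", "said", "no", "again", "and", "again"]),
]
print(style_metrics(sentences))
```

In this example there are 12 tokens with 9 distinct values, so `lexical_diversity` is 0.75, and the sentence lengths 3, 3, 6 give an `avg_sentence_length` of 4.0.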

How it works:

- Read UTF-8 `.txt`/`.md` files.
- Split into paragraphs, sentences, and normalized tokens.
- Compute the metrics above.
- For comparison mode, calculate cosine similarity from the two signatures.
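
The comparison step can be sketched as a plain cosine similarity over two 4-value signatures. The function name and the example signature values below are made up for illustration and do not come from the project or its sample files.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical signatures: (lexical_diversity, sentence_rhythm,
# dialogue_ratio, avg_sentence_length). The ordering is an assumption.
sig_a = (0.62, 0.48, 0.20, 14.5)
sig_b = (0.59, 0.52, 0.10, 18.0)
print(round(cosine_similarity(sig_a, sig_b), 4))
```

One thing to keep in mind: if the signature is left unscaled, a component on a larger scale such as `avg_sentence_length` dominates the dot product, so most pairs of documents score close to 1.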

Install and run:

    uv pip install -e .
    uv run -m narrativepulse --help

Analyze one document:

    uv run -m narrativepulse analyze examples/sample_a.txt --top 5

Compare two documents:

    uv run -m narrativepulse compare examples/sample_a.txt examples/sample_b.txt --top 5

Example output:

    NarrativePulse compare report
    style_similarity: 0.9973 (very high)
    ...
    metric_deltas (A - B):
    - lexical_diversity: 0.0252
    - sentence_rhythm: -0.0428

Limitations:

- This is a rule-based project, not a deep NLP model.
- Sentence splitting is punctuation-based (`.`, `!`, `?`), so edge cases such as abbreviations (for example "Dr.") may be split incorrectly.
- Dialogue detection is quote-marker based.
- Very short texts can produce unstable similarity scores.
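
To make the splitting and dialogue limitations concrete, here is a minimal sketch of a punctuation-based splitter and a quote-marker check. The regular expressions and function names are assumptions chosen for illustration; the project's own rules may differ.

```python
import re

# Split on whitespace that follows ., ! or ? (assumed rule).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# Normalized tokens: lowercase runs of letters, digits, and apostrophes.
TOKEN = re.compile(r"[a-z0-9']+")
QUOTES = ('"', "“", "”")


def split_sentences(text):
    """Naive punctuation-based sentence splitter."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and extract word-like tokens."""
    return TOKEN.findall(sentence.lower())


def has_dialogue(sentence):
    """True if the sentence contains any quote character."""
    return any(q in sentence for q in QUOTES)


text = 'Dr. Lee arrived at 9 a.m. sharp. "Ready?" she asked.'
for sentence in split_sentences(text):
    print(repr(sentence), tokenize(sentence), has_dialogue(sentence))
```

With this rule the sample text is cut into four pieces (`'Dr.'`, `'Lee arrived at 9 a.m.'`, `'sharp.'`, `'"Ready?" she asked.'`) even though a reader would see two sentences. Extra fragments like these shift the sentence-length metrics, and on a very short text a few of them can change the scores noticeably.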

Run the tests:

    python3 -m unittest discover -s tests