Course: AI380S26 — AI in Healthcare
Author: Jiel Selmani
This project applies five core NLP techniques to a corpus of approximately 3,000 posts from the public r/GriefSupport subreddit and uses the resulting feature set to build a two-stage screening prototype that scores a free-text reflection against the seven dimensions of the PG-13-R prolonged-grief-disorder framework (Prigerson et al., 2021).
- Stage 1 extracts structured features from the reflection: VADER sentiment, an NRC-derived emotion lexicon, LIWC-style linguistic categories, and per-dimension PG-13-R scores.
- Stage 2 forwards the text and Stage 1 scores to a large language model (Anthropic Claude or OpenAI GPT) for narrative interpretation.
This is a research prototype. It is not a clinical assessment, not a diagnostic instrument, and not a substitute for professional care.
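The two-stage flow described above can be sketched in a few lines. This is a minimal illustration only, not the code in src/screener.py: the dimension names and keyword sets are placeholders invented for the example (they are not the PG-13-R items or the project's lexicons), and `interpret` is a hypothetical callable standing in for the LLM call.

```python
import re

# Placeholder dimensions and keywords, invented for illustration only.
# They are NOT the PG-13-R items or the lexicon used in src/screener.py.
PLACEHOLDER_LEXICON = {
    "dimension_a": {"miss", "longing", "yearn"},
    "dimension_b": {"numb", "empty", "alone"},
}


def stage1_scores(text):
    """Return, per dimension, the fraction of tokens that match its keywords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        name: sum(token in words for token in tokens) / total
        for name, words in PLACEHOLDER_LEXICON.items()
    }


def screen(text, interpret=None):
    """Run Stage 1, then Stage 2 only when an interpreter callable is given."""
    result = {"stage1": stage1_scores(text)}
    if interpret is not None:
        # Stage 2 receives both the raw text and the Stage 1 scores.
        result["stage2"] = interpret(text, result["stage1"])
    return result
```

Calling `screen(text)` with no interpreter returns Stage 1 scores only, which mirrors how the prototype runs locally without an API key.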
grief-analysis/
├── src/ analysis pipeline (collect, preprocess, analyze, screen)
│ ├── collect_data.py download r/GriefSupport posts via Arctic Shift
│ ├── preprocess.py clean and normalize raw posts
│ ├── analysis.py five core NLP techniques (figures 1–6)
│ ├── deep_analysis.py cross-loss, complicated grief, trajectories
│ └── screener.py two-stage screening prototype
├── api/ Flask JSON API exposing the screener
│ └── server.py
├── web/ Vite + React + Tailwind frontend
├── data/ raw and processed post data
├── results/ CSVs and JSON summaries
├── figures/ generated plots (fig0–fig11)
├── references/ cited PDFs
├── report/ write-up
└── requirements.txt
- Python 3.11 or newer
- Node.js 18 or newer (for the web UI)
- An API key for Anthropic or OpenAI (only required for Stage 2 interpretation; Stage 1 runs locally without a key)
cd grief-analysis
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

The first run of analysis.py or screener.py will download the NLTK
resources it needs (vader_lexicon, stopwords, punkt).
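To fetch those resources ahead of time (for example, before working offline), something like the following may work. It is a sketch that assumes the three resource names listed above are all that is needed; `ensure_nltk_resources` is a hypothetical helper, not a function from this project.

```python
RESOURCES = ("vader_lexicon", "stopwords", "punkt")


def ensure_nltk_resources(download=None):
    """Download each resource and report whether the download call succeeded."""
    if download is None:
        import nltk  # imported lazily so the helper can be tested without NLTK

        def download(name):
            # nltk.download returns True on success or if already up to date.
            return nltk.download(name, quiet=True)

    return {name: bool(download(name)) for name in RESOURCES}


if __name__ == "__main__":
    print(ensure_nltk_resources())
```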
cd web
npm install

The API and the web UI run as two separate processes.
# from the project root, with the venv active
python api/server.py
# → http://localhost:5001

Endpoints:

| Method | Path | Body |
|---|---|---|
| GET | /api/health | none |
| POST | /api/screen | { text, provider, api_key?, model? } |
provider is anthropic or openai. If api_key is omitted, only Stage 1
results are returned.
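A request to the screening endpoint could be built with the standard library as sketched below. The field names come from the table above; sending them as a JSON body with a `Content-Type: application/json` header is an assumption, and the shape of the response is not documented here, so `send` simply returns whatever JSON the server sends back. `build_screen_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "http://localhost:5001/api/screen"


def build_screen_request(text, provider, api_key=None, model=None):
    """Build a POST request; optional fields are left out when not provided."""
    payload = {"text": text, "provider": provider}
    if api_key:
        payload["api_key"] = api_key
    if model:
        payload["model"] = model
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running server and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the Flask server running, `send(build_screen_request("some reflection", "anthropic"))` omits the key and should therefore return Stage 1 results only.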
cd web
npm run dev
# → http://localhost:5173

Vite proxies /api/* to the Flask server on port 5001. Open the Settings
panel in the header to choose Anthropic or OpenAI, paste an API key, and
optionally override the default model. Keys are stored in browser
localStorage and sent only with screening requests.
Default models:
- Anthropic: claude-sonnet-4-5-20250929
- OpenAI: gpt-4o-mini
Each step writes to data/, results/, or figures/. Run them in order
from the project root with the venv active:
python src/collect_data.py # download posts via Arctic Shift
python src/preprocess.py # clean and normalize
python src/analysis.py # five core NLP techniques (figs 1–6)
python src/deep_analysis.py # cross-loss, complicated grief, trajectories
python src/screener.py --evaluate # validate Stage 1 against the corpus

Other screener.py modes:
python src/screener.py --demo # Stage 1 on sample texts
python src/screener.py --demo --api-key sk-... # Full pipeline on samples
python src/screener.py --interactive # Paste-and-screen

- results/grief_posts_analyzed.csv, grief_posts_deep_analysis.csv: per-post feature tables
- results/analysis_summary.json, deep_analysis_summary.json: aggregate summaries
- results/screener_evaluation.json: Stage 1 validation against the corpus
- figures/fig0–fig11: methodology pipeline, sentiment, emotion, linguistic features, word clouds, topics, temporal trends, cross-loss fingerprints, complicated grief, trajectories, community response, and screener evaluation plots
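The five ordered pipeline steps listed earlier could also be driven from a single script. The sketch below is not part of the repository. It assumes it is run from the project root with the venv active, and `build_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys

# Script paths and flags in the order given in this README.
PIPELINE = [
    ["src/collect_data.py"],
    ["src/preprocess.py"],
    ["src/analysis.py"],
    ["src/deep_analysis.py"],
    ["src/screener.py", "--evaluate"],
]


def build_commands(python=sys.executable):
    """Prefix each step with the interpreter that should run it."""
    return [[python, *step] for step in PIPELINE]


def run_pipeline(runner=subprocess.run):
    """Run the steps in order; check=True stops at the first failing step."""
    for command in build_commands():
        runner(command, check=True)


if __name__ == "__main__":
    run_pipeline()
```

Stopping at the first failure matters here because each step reads files that the previous step writes.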
This is a research prototype built for coursework. It is not a clinical assessment, not a diagnostic instrument, and not a substitute for professional care. If you or someone you know is struggling with grief or mental health, please reach out to a qualified professional.