mindds/ATP-Note-Processing

ATP Note Processing

This folder contains the code used to preprocess clinical notes and extract structured information using Azure OpenAI.

Overview

  • main.py: Lightweight entry point for the note extraction workflow.
  • client.py: Azure OpenAI client setup and authentication helper.
  • templates.py: JSON-like extraction templates used by info_extractn.
  • utils.py: Data cleanup and post-processing utility functions.
  • preprocessing-notes.ipynb: Notebook workflow that imports the extraction utilities and processes notes.
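
To make "JSON-like extraction template" concrete, here is a minimal sketch of how such a template could be combined with a note to form a prompt. The field names, the `example_template` variable, and the `build_prompt` helper are all hypothetical; the real definitions live in templates.py and may look quite different.

```python
import json

# Hypothetical template: these field names are illustrative only and are
# NOT taken from templates.py.
example_template = {
    "diagnosis": "",
    "medications": [],
    "follow_up_date": "",
}


def build_prompt(note_text: str, template: dict) -> str:
    """Combine a note and an empty template into one extraction prompt."""
    schema = json.dumps(template, indent=2)
    return (
        "Fill in the JSON template using only facts stated in the note. "
        "Leave a field empty if the note does not mention it.\n\n"
        f"Template:\n{schema}\n\nNote:\n{note_text}"
    )
```

Keeping the template as plain data means the same prompt-building code could serve both a general template and a biomarker-specific one.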

Structure

  • main.py
    • Exposes info_extractn(), which runs the Azure OpenAI extraction pipeline on a note.
    • Re-exports the shared helpers and templates.
  • client.py
    • Configures openai.AzureOpenAI with Azure credentials and endpoint.
  • templates.py
    • Stores template and template_biomarker definitions used for JSON extraction.
  • utils.py
    • Provides note formatting and cleaning helpers:
      • format_notes
      • extract_suvr_values
      • replace_missing
      • check_qc
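
As an illustration of the kind of cleanup helper listed above, the following sketch pulls SUVR numbers out of free text with a regular expression. It is not the code from utils.py: the function name `find_suvr_values` and the pattern are assumptions made for this example, and real notes may phrase SUVR results in ways this pattern misses.

```python
import re

# Assumed pattern: the token "SUVR", up to 20 non-digit filler characters
# (such as " = ", ": ", or " of "), then an integer or decimal number.
_SUVR_PATTERN = re.compile(r"\bSUVR\b[^\d\n]{0,20}?(\d+(?:\.\d+)?)", re.IGNORECASE)


def find_suvr_values(text: str) -> list[float]:
    """Return every number that directly follows an SUVR mention, in order."""
    return [float(value) for value in _SUVR_PATTERN.findall(text)]
```

For example, `find_suvr_values("Amyloid PET: SUVR = 1.42; prior SUVR of 1.10.")` returns `[1.42, 1.1]`, and a note with no SUVR mention returns an empty list.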

Usage

  1. Open preprocessing-notes.ipynb.
  2. Import the extraction functions from main.py.
  3. Load the raw notes CSV and call format_notes().
  4. Run info_extractn(text, template) or info_extractn(text, template_biomarker).
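
The steps above can be wrapped in a small batch loop. The sketch below is an assumption about how one might do that, not the notebook's actual code: `extract_all` is a hypothetical helper that takes the extraction function as an argument, so in the notebook `extract_fn` would be info_extractn imported from main.py. The return type of info_extractn is not documented here, so the output is stored as-is.

```python
from typing import Any, Callable, Iterable


def extract_all(
    notes: Iterable[str],
    extract_fn: Callable[[str, Any], Any],
    template: Any,
) -> list[dict]:
    """Call extract_fn(text, template) for each note and record the outcome."""
    results = []
    for index, text in enumerate(notes):
        try:
            output = extract_fn(text, template)
            results.append({"index": index, "output": output, "error": None})
        except Exception as exc:
            # Record the failure and continue, so one bad note does not
            # abort the whole batch.
            results.append({"index": index, "output": None, "error": str(exc)})
    return results
```

In the notebook this might be called as `extract_all(formatted_notes, info_extractn, template)`, where `formatted_notes` stands for whatever format_notes() produces.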

Notes

  • The project uses Azure OpenAI with DefaultAzureCredential().
  • templates.py contains the extraction schema definitions.
  • utils.py includes final data normalization helpers for the extracted DataFrame.
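
For readers unfamiliar with keyless authentication, this is a minimal sketch of building an `AzureOpenAI` client with `DefaultAzureCredential()`. It is not a copy of client.py. The helper names, the environment variable names, and the default API version are assumptions for this example; the `openai` and `azure-identity` packages must be installed before `build_client` is called.

```python
import os


def load_azure_settings(env=os.environ) -> dict:
    """Read connection settings from the environment (variable names assumed)."""
    endpoint = env.get("AZURE_OPENAI_ENDPOINT")
    if not endpoint:
        raise RuntimeError("AZURE_OPENAI_ENDPOINT is not set")
    return {
        "azure_endpoint": endpoint,
        # Assumed default; use the API version your deployment supports.
        "api_version": env.get("OPENAI_API_VERSION", "2024-02-01"),
    }


def build_client(settings: dict):
    """Create an AzureOpenAI client that authenticates with Entra ID tokens."""
    # Imported lazily so this module loads without the Azure packages installed.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(),
        "https://cognitiveservices.azure.com/.default",
    )
    return AzureOpenAI(
        azure_endpoint=settings["azure_endpoint"],
        api_version=settings["api_version"],
        azure_ad_token_provider=token_provider,
    )
```

With this approach no API key is stored in the code or the notebook; `DefaultAzureCredential()` picks up whatever identity is available, such as an `az login` session or a managed identity.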

Validation

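  • Python files have been syntax-checked with python3 -m py_compile. This confirms that the files parse; it does not exercise the extraction logic or the Azure connection.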
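
The same syntax check can be run from Python with the standard library's `py_compile` module. The sketch below assumes a hypothetical helper named `syntax_check`; it is offered as a convenience and is not part of the repository.

```python
import py_compile


def syntax_check(paths):
    """Compile each file and return {path: error message} for any that fail."""
    failures = {}
    for path in paths:
        try:
            # doraise=True turns syntax errors into PyCompileError exceptions
            # instead of printing them to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures[path] = exc.msg
    return failures
```

Calling `syntax_check(["main.py", "client.py", "templates.py", "utils.py"])` from the repository folder returns an empty dict when every file parses.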
