transfer FHIR pipeline to branch #1155

Open
jhnwu3 wants to merge 3 commits into master from add/fhir_ehr_mamba

@jhnwu3 (Collaborator) commented May 31, 2026

This pull request introduces comprehensive support for FHIR (Fast Healthcare Interoperability Resources) datasets in PyHealth, including a generic, YAML-configurable FHIR ingest engine, a pre-configured MIMIC-IV-on-FHIR dataset, and a full clinical prediction pipeline using new tasks and models. The documentation is significantly expanded to cover these new features, and an end-to-end example is provided for users. Key changes are grouped below:

FHIR Dataset Support and Documentation:

  • Added FHIRDataset, a generic, YAML-configurable dataset for ingesting HL7 FHIR NDJSON exports, along with detailed documentation and usage instructions. This engine supports flexible configuration of resource flattening and event schema via YAML, with caching and validation. (docs/api/datasets/pyhealth.datasets.FHIRDataset.rst, pyhealth/datasets/fhir/__init__.py)
  • Introduced MIMIC4FHIR, a subclass of FHIRDataset pre-configured for the PhysioNet MIMIC-IV-on-FHIR export, including documentation and resource coverage details. (docs/api/datasets/pyhealth.datasets.MIMIC4FHIR.rst)
  • Registered FHIRDataset and MIMIC4FHIR in the main datasets API and documentation. (docs/api/datasets.rst, pyhealth/datasets/__init__.py)
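
For reviewers less familiar with FHIR bulk exports, here is a minimal sketch of what config-driven flattening of NDJSON resources into event rows can look like. It is not the FHIRDataset implementation from this PR: `FLATTEN_CONFIG`, `get_path`, and `flatten_ndjson` are hypothetical names, and a plain dict stands in for the YAML config so the snippet needs only the standard library.

```python
import json
from typing import Any, Dict, Iterable, List, Optional

# Hypothetical mapping: resource type -> {output column: dotted path}.
# In a YAML-driven engine this dict would be loaded from a config file.
FLATTEN_CONFIG: Dict[str, Dict[str, str]] = {
    "Condition": {
        "patient_id": "subject.reference",
        "timestamp": "recordedDate",
        "code": "code.coding.0.code",
    },
    "Observation": {
        "patient_id": "subject.reference",
        "timestamp": "effectiveDateTime",
        "code": "code.coding.0.code",
    },
}


def get_path(resource: Any, path: str) -> Optional[Any]:
    """Follow a dotted path through nested dicts/lists; None if missing."""
    current = resource
    for part in path.split("."):
        if isinstance(current, list):
            if not part.isdigit() or int(part) >= len(current):
                return None
            current = current[int(part)]
        elif isinstance(current, dict):
            if part not in current:
                return None
            current = current[part]
        else:
            return None
    return current


def flatten_ndjson(
    lines: Iterable[str], config: Dict[str, Dict[str, str]]
) -> List[Dict[str, Any]]:
    """Turn NDJSON lines (one FHIR resource per line) into flat event rows."""
    events: List[Dict[str, Any]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        resource = json.loads(line)
        resource_type = resource.get("resourceType")
        mapping = config.get(resource_type)
        if mapping is None:
            continue  # resource type not configured, so skip it
        row: Dict[str, Any] = {"event_type": resource_type}
        for column, path in mapping.items():
            row[column] = get_path(resource, path)
        events.append(row)
    return events


sample = [
    '{"resourceType": "Condition", "subject": {"reference": "Patient/p1"}, '
    '"recordedDate": "2020-01-02", "code": {"coding": [{"code": "I10"}]}}',
    '{"resourceType": "Encounter", "subject": {"reference": "Patient/p1"}}',
    '{"resourceType": "Observation", "subject": {"reference": "Patient/p1"}, '
    '"effectiveDateTime": "2020-01-03", "code": {"coding": [{"code": "8480-6"}]}}',
]
events = flatten_ndjson(sample, FLATTEN_CONFIG)
```

In this sketch the unconfigured Encounter line is skipped, which mirrors the idea that adapting to another FHIR export should mostly mean editing the mapping rather than the code.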

New Task and Model for FHIR-based Clinical Prediction:

  • Added MPFClinicalPredictionTask, supporting multitask prompted fine-tuning (MPF) style binary clinical prediction on FHIR token timelines, with documentation. (docs/api/tasks/pyhealth.tasks.mpf_clinical_prediction.rst, docs/api/tasks.rst)
  • Introduced EHRMambaCEHR, a model combining CEHR-style embeddings and Mamba blocks for FHIR token streams, with API documentation and registration. (docs/api/models/pyhealth.models.EHRMambaCEHR.rst, docs/api/models.rst)
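
To make "token timeline" concrete, the sketch below builds one sample from flat event rows, assuming the common MPF convention of placing a task-specific prompt token at the front of the sequence. The function names, token strings, and sample layout are illustrative assumptions, not the interface of MPFClinicalPredictionTask or EHRMambaCEHR.

```python
from typing import Any, Dict, List


def build_timeline(
    events: List[Dict[str, Any]], task_token: str, max_len: int
) -> List[str]:
    """Order events by time, tokenize them, and prepend a task prompt token."""
    if max_len < 1:
        raise ValueError("max_len must leave room for the task token")
    # ISO 8601 timestamps in one format sort correctly as plain strings.
    ordered = sorted(events, key=lambda e: e["timestamp"])
    tokens = [f'{e["event_type"]}:{e["code"]}' for e in ordered]
    # Keep the most recent events, reserving one slot for the task token.
    budget = max_len - 1
    tokens = tokens[-budget:] if budget > 0 else []
    return [task_token] + tokens


def encode(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map tokens to integer ids, growing the vocabulary on first sight."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


events = [
    {"event_type": "Observation", "code": "8480-6", "timestamp": "2020-01-03"},
    {"event_type": "Condition", "code": "I10", "timestamp": "2020-01-02"},
    {"event_type": "Condition", "code": "E11", "timestamp": "2020-01-01"},
]
vocab: Dict[str, int] = {"[PAD]": 0}
timeline = build_timeline(events, task_token="[MORTALITY]", max_len=3)
# Hypothetical sample layout: token ids plus a binary label.
sample = {"input_ids": encode(timeline, vocab), "label": 1}
```

With `max_len=3`, the oldest event is dropped and the task token stays at position 0, so one model can be prompted for different binary tasks by swapping only that first token.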

Example and Usability Improvements:

  • Added a runnable example (examples/mimic4fhir_mpf_ehrmamba.py) demonstrating the full pipeline: dataset loading, task setup, model instantiation, training, and evaluation on the MIMIC-IV FHIR demo dataset.

Internal Improvements:

These changes make PyHealth a first-class tool for working with FHIR data, enabling both out-of-the-box use with MIMIC-IV and easy adaptation to other FHIR exports.

@jhnwu3 requested a review from Logiquo on May 31, 2026, 03:11