scRT-agent is a focused workflow for paired single-cell RNA sequencing (scRNA-seq) and single-cell T cell receptor sequencing (scTCR-seq) analysis. It combines a fixed single-cell analysis layer with role-specific LLM agents for literature-grounded hypothesis generation, clone-aware validation, downstream analysis planning, mechanism interpretation, and report generation.
The workflow is RNA-first: transcriptional states, programs, tissues, conditions and patient-level contrasts define the biological question. TCR clonotypes are used as supporting evidence for lineage, persistence, clone expansion, sharing, receptor follow-up priority and state occupancy. Clone expansion or sharing is not interpreted as antigen specificity without orthogonal evidence.
- Desktop launcher panel through scrta-agent gui or gui.bat.
- Guided terminal workflow through scrta-agent interactive.
- LLM-assisted input preparation for common scRNA-seq and scTCR-seq file layouts, producing a workflow-ready RNA .h5ad file and normalized TCR table.
- Project-folder and multi-sample archive input support for common sequencing delivery layouts.
- User-controlled hypothesis selection and editing before the deep-dive stage.
- Dataset profiling for .h5ad scRNA-seq objects and tabular scTCR files.
- Standard paired scRNA/scTCR analysis script generation.
- T-cell subclustering and marker-based state annotation support.
- Clone-size bins compatible with common repertoire summaries.
- Patient-aware, tissue-aware and clone-size-aware summary tables.
- RAG context injection from a local JSONL literature index.
- Biology-first hypothesis generation and hypothesis selection.
- LLM-written deep-dive and downstream analysis scripts.
- Biological interpretation, mechanism mapping and next-test proposals.
- Exported run artifacts, scripts, logs, tables and figures.
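As an illustration of the clone-size bins mentioned in the feature list, the sketch below counts cells per clonotype and assigns each cell a bin label. The bin labels and cut-offs are hypothetical placeholders chosen for this example; the cut-offs used by scRT-agent itself are not stated here.

```python
from collections import Counter

# Hypothetical bin edges in cells per clonotype (assumption, not the package's values).
# Each entry is (label, lowest size, highest size); None means no upper bound.
CLONE_SIZE_BINS = [
    ("single", 1, 1),
    ("small", 2, 5),
    ("medium", 6, 20),
    ("large", 21, None),
]


def clone_size_bin(size):
    """Return the bin label for a clonotype observed in `size` cells."""
    for label, low, high in CLONE_SIZE_BINS:
        if size >= low and (high is None or size <= high):
            return label
    raise ValueError(f"clone size must be a positive integer, got {size!r}")


def bin_cells_by_clone_size(cell_to_clonotype):
    """Map each cell barcode to the size bin of the clonotype it carries."""
    sizes = Counter(cell_to_clonotype.values())
    return {
        cell: clone_size_bin(sizes[clonotype])
        for cell, clonotype in cell_to_clonotype.items()
    }
```

Keeping the bin edges in one table makes it straightforward to swap in whatever cut-offs a given repertoire summary uses.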
    git clone https://github.com/tangaode/scRT-agent.git
    cd scRT-agent
    pip install -e ".[analysis,llm]"

For development:

    pip install -e ".[analysis,llm,dev]"
    pytest

The main workflow expects:

- rna_h5ad_path: an .h5ad file containing the scRNA-seq matrix and cell metadata.
- tcr_path: a tabular scTCR file such as a 10x filtered_contig_annotations.csv file or a table containing barcode, clonotype, chain, CDR3 and V/J gene columns.
Useful RNA metadata columns include patient, sample, tissue, condition, timepoint, response group, cluster and cell type. The workflow attempts to profile available metadata and infer join keys before analysis.
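One way to picture join-key inference is to compare barcode overlap between the RNA and TCR tables before and after stripping a trailing numeric suffix such as -1. This is a minimal sketch under that assumption; the function names are hypothetical and the logic scRT-agent actually applies may differ.

```python
def normalize_barcode(barcode):
    """Drop a trailing numeric suffix such as '-1' from a cell barcode."""
    barcode = barcode.strip()
    core, sep, suffix = barcode.rpartition("-")
    if sep and suffix.isdigit():
        return core
    return barcode


def barcode_overlap(rna_barcodes, tcr_barcodes, normalize=False):
    """Fraction of TCR barcodes that also occur among the RNA barcodes."""
    key = normalize_barcode if normalize else (lambda b: b)
    rna = {key(b) for b in rna_barcodes}
    tcr = {key(b) for b in tcr_barcodes}
    if not tcr:
        return 0.0
    return len(rna & tcr) / len(tcr)


def choose_join_strategy(rna_barcodes, tcr_barcodes):
    """Pick raw or suffix-stripped barcodes, whichever matches more TCR cells."""
    raw = barcode_overlap(rna_barcodes, tcr_barcodes, normalize=False)
    normalized = barcode_overlap(rna_barcodes, tcr_barcodes, normalize=True)
    if normalized > raw:
        return "normalized", normalized
    return "raw", raw
```

In multi-sample data, stripping suffixes can make barcodes from different samples collide, so a sample identifier would normally need to be part of the join key as well.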
The interactive preparation layer can also start from common raw or processed inputs:
- Project folders containing multiple sample folders or sample archives.
- Sample archives such as ZIP, TAR, TAR.GZ and TGZ files.
- RNA .h5ad files.
- 10x filtered_feature_bc_matrix or raw_feature_bc_matrix directories.
- GEO-style prefixed 10x triplets such as SAMPLE.matrix.mtx.gz, SAMPLE.barcodes.tsv.gz and SAMPLE.features.tsv.gz.
- 10x HDF5 gene-expression matrices.
- Dense text expression matrices in CSV, TSV or TXT format.
- Loom or AnnData zarr stores when the required Python readers are installed.
- 10x VDJ contig tables, clonotype tables, AIRR TSV files, or other tabular TCR files with barcode and clonotype or receptor-sequence fields.
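To illustrate how GEO-style prefixed 10x triplets could be recognized, the sketch below groups file names by sample prefix and keeps only prefixes with a complete matrix, barcodes and features set. The regular expression and function name are assumptions made for this example, not the package's own detection code.

```python
import re
from collections import defaultdict

# Matches names such as SAMPLE.matrix.mtx.gz or GSM1_S1_barcodes.tsv.gz.
# genes.tsv is the older 10x name for the features file.
TRIPLET_PATTERN = re.compile(
    r"^(?P<prefix>.+?)[._]"
    r"(?P<kind>matrix\.mtx|barcodes\.tsv|features\.tsv|genes\.tsv)"
    r"(?:\.gz)?$"
)


def group_10x_triplets(filenames):
    """Return {sample_prefix: {'matrix': ..., 'barcodes': ..., 'features': ...}}."""
    found = defaultdict(dict)
    for name in filenames:
        match = TRIPLET_PATTERN.match(name)
        if not match:
            continue
        kind = match.group("kind").split(".")[0]
        if kind == "genes":
            kind = "features"
        found[match.group("prefix")][kind] = name
    required = {"matrix", "barcodes", "features"}
    # Keep only samples for which all three files are present.
    return {prefix: parts for prefix, parts in found.items() if set(parts) == required}
```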
Large matrices are never sent to the LLM. The LLM reviews a file inventory and
proposes a preparation plan; conversion is performed locally with standard
Python readers. When multiple compatible samples are found, scRT-agent combines
them into one .h5ad file and one normalized TCR table while preserving
sample_id, input_sample_id and input_source_path metadata.
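The idea of reviewing a file inventory rather than the data itself can be sketched as follows: walk the project folder and record only relative paths, sizes and suffixes, never file contents. The function name, the field names and the entry cap are hypothetical; the inventory format used by scRT-agent may look different.

```python
from pathlib import Path


def build_file_inventory(root, max_entries=200):
    """List files under `root` as small metadata records, without reading them."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        entries.append(
            {
                "relative_path": path.relative_to(root).as_posix(),
                "size_bytes": path.stat().st_size,
                "suffixes": "".join(path.suffixes),
            }
        )
        # Cap the inventory so that very large delivery folders stay compact.
        if len(entries) >= max_entries:
            break
    return entries
```

A summary like this is small enough to include in a prompt, while the matrices themselves stay on disk for local conversion.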
Set one of the following API keys before running the workflow:

    export OPENAI_API_KEY="your_api_key"
    # or
    export SCRTA_AGENT_API_KEY="your_api_key"

For local desktop use, the key can also be placed in a root-level .env or .scrta_agent.env file next to gui.bat:

    OPENAI_API_KEY=your_api_key

For OpenAI-compatible endpoints, the same file can include:

    SCRTA_AGENT_API_KEY=your_api_key
    SCRTA_AGENT_API_BASE=https://your-compatible-endpoint/v1

Local .env files are ignored by Git and should not be committed.

For OpenAI-compatible endpoints:

    export SCRTA_AGENT_API_BASE="https://your-compatible-endpoint/v1"

The default model can be overridden with --model.
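The key lookup described above can be pictured with a short sketch that reads simple KEY=VALUE files and combines them with environment variables. The precedence shown here (environment before file, SCRTA_AGENT_API_KEY before OPENAI_API_KEY) is an assumption made for illustration, as are the function names; the package may resolve settings in a different order.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_settings(environ=None, env_files=(".env", ".scrta_agent.env")):
    """Return the API key and optional base URL (assumed precedence, see lead-in)."""
    environ = dict(os.environ if environ is None else environ)
    file_values = {}
    for env_file in env_files:
        file_values.update(read_env_file(env_file))

    def lookup(name):
        # Assumption: a value set in the environment wins over one in a file.
        return environ.get(name) or file_values.get(name)

    return {
        "api_key": lookup("SCRTA_AGENT_API_KEY") or lookup("OPENAI_API_KEY"),
        "api_base": lookup("SCRTA_AGENT_API_BASE"),
    }
```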
For the guided workflow:

    scrta-agent gui

On Windows, double-click gui.bat from the repository root after installation
or run it from a terminal. The desktop launcher provides file browsers,
configuration save/reload, run status, live logs and a Hypothesis Review panel.
When candidate hypotheses are generated, the panel is populated with the
candidate list and editable fields for the selected hypothesis, explanation,
required tests, falsification criteria and source tables. The workflow waits
until the user confirms a hypothesis before continuing to deep-dive analysis.
For the terminal wizard:

    scrta-agent interactive

To prepare inputs without launching the full workflow:

    scrta-agent prepare \
      --rna-input /path/to/rna_project_folder \
      --tcr-input /path/to/tcr_project_folder \
      --out ./prepared_inputs/example \
      --analysis-name example_scrna_sctcr

Then run the main workflow on the prepared files:

    scrta-agent run \
      --rna /path/to/sample.h5ad \
      --tcr /path/to/filtered_contig_annotations.csv \
      --analysis-name example_scrna_sctcr \
      --out ./runs \
      --brief "Identify RNA-defined T-cell states with conservative TCR lineage support." \
      --execute

To manually choose and edit the selected hypothesis after LLM hypothesis generation:

    scrta-agent run \
      --rna /path/to/sample.h5ad \
      --tcr /path/to/filtered_contig_annotations.csv \
      --analysis-name example_interactive_selection \
      --out ./runs \
      --execute \
      --interactive-hypothesis-selection

With a local RAG index:

    scrta-agent run \
      --rna /path/to/sample.h5ad \
      --tcr /path/to/tcr.tsv.gz \
      --analysis-name example_rag_run \
      --out ./runs \
      --rag-index /path/to/rag_chunks.jsonl \
      --rag-top-k 10 \
      --brief "Propose and test biology-first hypotheses for this paired scRNA/scTCR cohort." \
      --execute

Disable optional loops if needed:

    scrta-agent run \
      --rna /path/to/sample.h5ad \
      --tcr /path/to/tcr.tsv \
      --out ./runs \
      --no-deep-dive \
      --no-mechanism-loop \
      --no-downstream-analysis

The repository includes helper scripts for legal/open full-text retrieval and structured card generation. A typical local build is:

    python scripts/build_scrna_sctcr_rag.py \
      --out ./rag_kb/scrna_sctcr \
      --seed-csv /path/to/literature_cards.csv

The resulting JSONL chunks can be passed with --rag-index.
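To show what top-k retrieval from a JSONL index can look like, the sketch below loads one JSON object per line and ranks chunks by how many query tokens they share. Both the text field name and the token-overlap scoring are assumptions for this example; the actual chunk schema and the retrieval method used by scRT-agent are not described here.

```python
import json
import re


def load_chunks(path):
    """Read a JSONL file: one JSON object per non-empty line."""
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                chunks.append(json.loads(line))
    return chunks


def tokenize(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k_chunks(chunks, query, k=10, text_field="text"):
    """Rank chunks by token overlap with the query; `text_field` is hypothetical."""
    query_tokens = tokenize(query)
    scored = []
    for index, chunk in enumerate(chunks):
        overlap = len(query_tokens & tokenize(str(chunk.get(text_field, ""))))
        if overlap:
            scored.append((overlap, index, chunk))
    # Highest overlap first; ties keep the original file order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:k]]
```

Token overlap is only a stand-in that keeps the example dependency-free; an embedding-based or BM25 ranker would slot into the same place.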
Each run writes a timestamped directory under the selected output root. Common artifacts include:
- dataset_profile.md and dataset_profile.json
- environment.md and environment.json
- rag_context_*.md
- agent_*.md
- rag_grounded_hypothesis_candidates.md
- selected_hypothesis.md and selected_hypothesis.json
- scripts/scrna_sctcr_joint_analysis.py
- scripts/hypothesis_deep_dive.py
- scripts/hypothesis_downstream_analysis.py
- scripts/biology_mechanism.py
- scripts/publication_figures.py
- analysis_outputs/*.csv
- analysis_outputs/figures/*.png
- analysis_outputs/publication_figures/*.pdf
- final_report.md
List agent roles:

    scrta-agent agents
    scrta-agent agents --json

Run from a JSON config:

    scrta-agent run --config examples/config.example.json

Important options:

- --execute: run the generated analysis script.
- --interactive-hypothesis-selection: pause after hypothesis generation so the user can select and edit the hypothesis before deep-dive analysis.
- --repair-attempts N: retry script execution after transient failures.
- --script-timeout SECONDS: set script execution timeout.
- --rag-index PATH: inject local RAG chunks into agent prompts.
- --rag-top-k N: number of retrieved chunks per agent call.
- --model MODEL: LLM model name.
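The interplay between a script timeout and a retry budget can be sketched generically with the standard library. This is not scRT-agent's implementation: the function name and the returned record format are hypothetical, and the sketch simply re-runs the script unchanged, whereas the tool may do more between attempts.

```python
import subprocess
import sys


def run_script_with_retries(script_path, timeout_seconds=600, repair_attempts=2):
    """Run a Python script, retrying up to `repair_attempts` extra times on failure."""
    attempts = []
    for attempt in range(1, repair_attempts + 2):
        try:
            result = subprocess.run(
                [sys.executable, str(script_path)],
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # The child process is killed once the timeout elapses.
            attempts.append({"attempt": attempt, "returncode": None, "stderr": "timed out"})
            continue
        attempts.append(
            {"attempt": attempt, "returncode": result.returncode, "stderr": result.stderr}
        )
        if result.returncode == 0:
            break
    return attempts
```

Returning every attempt, rather than only the last one, keeps the failure history available for the run logs.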
scRT-agent treats TCR evidence conservatively:
- Clone expansion supports clonal enrichment, not antigen specificity.
- Shared clonotypes support lineage relatedness or state occupancy, not migration by themselves.
- CDR3 similarity or V/J usage can prioritize receptor follow-up, but does not establish antigen identity without experimental validation.
- Patient structure, sample composition and clone-size effects should be controlled before drawing cohort-level conclusions.
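The last point can be made concrete with a small sketch that computes the fraction of cells in expanded clones per patient rather than on pooled cells. The input format, the function name and the threshold of two cells for calling a clone expanded are assumptions for this example.

```python
from collections import Counter, defaultdict


def expanded_fraction_by_patient(cells, min_clone_size=2):
    """cells: iterable of (patient, clonotype) pairs, one pair per cell.

    Clone size is counted within each patient, so a clonotype label that
    appears in two patients is treated as two separate clones.
    """
    clone_sizes = Counter(cells)  # (patient, clonotype) -> number of cells
    total = defaultdict(int)
    expanded = defaultdict(int)
    for (patient, _clonotype), n_cells in clone_sizes.items():
        total[patient] += n_cells
        if n_cells >= min_clone_size:
            expanded[patient] += n_cells
    return {patient: expanded[patient] / total[patient] for patient in total}
```

For example, if one patient contributes a single eight-cell clone and another contributes only singletons, the pooled expanded fraction looks substantial even though it is driven entirely by the first patient; the per-patient view makes that visible.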
- src/scrta_agent/: package source code.
- src/scrta_agent/prompts/: role prompts.
- src/scrta_agent/templates/: generated Python script templates.
- scripts/: optional RAG and literature preparation utilities.
- skills/: domain workflow rules loaded by the package.
- examples/: minimal configuration example.
- tests/: lightweight package tests.
This repository is provided for research use. Add a project-specific license before redistribution if required by your institution or journal.