BrainDecode: Computational workflow for detecting, validating, and analyzing amino acid substitutions from alternate RNA decoding in aging and neurodegeneration.

This repository provides tools to identify, validate, and quantify amino acid substitutions that arise from alternate RNA decoding, as detected in LC-MS proteomics data. The pipelines analyze the multiplexed LC-MS proteomics datasets described in Bai et al. (2020), Ping et al. (2018), and Takasugi et al. (2024).

This project builds on the work of the Slavov Laboratory.

The repository contains the code, templates, sample maps, reusable dependency tables, and analysis notebooks. Large raw/search inputs, MaxQuant output, and plot exports are kept outside the Git repository in Project_BrainDecode.

Repository Layout

BrainDecode/
├── Dependencies/
│   ├── Analysis_Outputs/
│   │   ├── Bai_2020/
│   │   ├── Ping_2018/
│   │   └── Takasugi_2024/
│   ├── Bai_2020/
│   ├── PD_2026/
│   ├── Ping_2018/
│   │   ├── acg/
│   │   └── fc/
│   ├── Sample_maps/
│   └── Takasugi_2024/
├── MQ_templates/
├── Scripts/
│   ├── Analysis scripts/
│   ├── Generation scripts/
│   └── Pipeline scripts/
├── LICENSE
└── README.md

External Project Data

Large data files are expected in a sibling folder:

Project_BrainDecode/
├── Analysis_Inputs/
│   ├── Bai_2020/
│   ├── Ping_2018/
│   └── Takasugi_2024/
├── mq_output/
└── Plots/
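
As a minimal sketch of how a notebook might locate this sibling folder, the snippet below resolves Project_BrainDecode relative to a BrainDecode checkout and reports which expected subfolders are missing. The folder names come from the layout above; the function names are hypothetical and are not part of the repository.

```python
from pathlib import Path

# Folder names are taken from the layout above; helper names are illustrative.
EXTERNAL_DIR_NAME = "Project_BrainDecode"
EXPECTED_SUBDIRS = (
    "Analysis_Inputs/Bai_2020",
    "Analysis_Inputs/Ping_2018",
    "Analysis_Inputs/Takasugi_2024",
    "mq_output",
    "Plots",
)


def external_data_root(repo_root):
    """Return the sibling Project_BrainDecode folder for a BrainDecode checkout."""
    return Path(repo_root).resolve().parent / EXTERNAL_DIR_NAME


def missing_external_dirs(repo_root):
    """List the expected external subfolders that do not exist yet."""
    root = external_data_root(repo_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

Calling `missing_external_dirs(".")` from the repository root before running a notebook gives an early warning if a data folder has not been created or mounted yet.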

Key Folders

Scripts/Generation scripts/ contains scripts that generate MaxQuant XML files, pipeline scripts, and translation resources.

Scripts/Pipeline scripts/ contains the pipeline template scripts used for SAAP detection, validation, and quantification.

Scripts/Analysis scripts/ contains the Jupyter notebooks for dependency generation and downstream analyses.

MQ_templates/ contains MaxQuant XML templates.

Dependencies/Sample_maps/ contains sample map spreadsheets used by the notebooks.

Dependencies/Ping_2018/, Dependencies/Takasugi_2024/, and Dependencies/Bai_2020/ contain reusable analysis dependency files such as fragment dictionaries, PTM heatmap data, dataset metrics, and validation summaries.

Dependencies/Analysis_Outputs/ contains non-plot analysis outputs such as .xlsx, .tsv, and .p files. Plot files should not be stored there.

Analysis Workflows

1. Place or generate raw analysis inputs in Project_BrainDecode/Analysis_Inputs/.
2. Place MaxQuant output in Project_BrainDecode/mq_output/.
3. Use scripts in Scripts/Generation scripts/ to generate XML or pipeline files as needed.
4. Run notebooks in Scripts/Analysis scripts/.
5. Save tables and reusable outputs to Dependencies/Analysis_Outputs/.
6. Save plots to Project_BrainDecode/Plots/.
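
The two save destinations can be kept apart with a small routing helper. The sketch below is illustrative only: `output_path` and `PLOT_EXTENSIONS` are hypothetical names, and the set of extensions treated as plots is an assumption to adjust to whatever formats the notebooks actually export.

```python
from pathlib import Path

# Assumption: these extensions identify plot exports; adjust as needed.
PLOT_EXTENSIONS = {".png", ".svg", ".pdf", ".jpg", ".jpeg", ".tif", ".tiff"}


def output_path(repo_root, filename, dataset=None):
    """Choose a save location that follows the workflow described above.

    Plot files go to the external Project_BrainDecode/Plots/ folder.
    Everything else (.xlsx, .tsv, .p, ...) goes to
    Dependencies/Analysis_Outputs/, optionally inside a dataset subfolder
    such as "Bai_2020".
    """
    repo_root = Path(repo_root).resolve()
    if Path(filename).suffix.lower() in PLOT_EXTENSIONS:
        return repo_root.parent / "Project_BrainDecode" / "Plots" / filename
    target = repo_root / "Dependencies" / "Analysis_Outputs"
    if dataset is not None:
        target = target / dataset
    return target / filename
```

For example, `output_path(".", "metrics.tsv", dataset="Ping_2018")` resolves inside the repository, while `output_path(".", "volcano.png")` resolves to the external Plots folder, so plot files never land in Dependencies/Analysis_Outputs/.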

Data Generation Workflows

Step 1: Custom protein databases

Use RNA-seq data matched to LC-MS proteomics data to create sample-specific protein databases.

The code for this step is in custom_protein_database_pipeline; the README.md in that directory gives detailed instructions for running it.

If no matched RNA-seq data are available, this step can be skipped. In that case, interpret quantified amino acid substitutions with caution: without a sample-specific database, there is lower confidence that a substitution is not encoded in the genome.

Step 2: Identifying modified peptides with MaxQuant

Guide for running MaxQuant in Linux.

Step 3: Identifying candidate alternate translation events

Step 4: Validation search with MaxQuant (or another proteomics data search engine)

Step 5: Quantify alternate decoding events

Step 6: Downstream data analysis

Running BLASTp

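# Install BLAST+ under ~/bin (create the directory if it does not exist)
mkdir -p ~/bin
cd ~/bin
# LATEST/ only holds the newest release; if the download fails,
# update the version number in the file name below
wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.17.0+-x64-linux.tar.gz
tar -xzf ncbi-blast-2.17.0+-x64-linux.tar.gz
rm ncbi-blast-2.17.0+-x64-linux.tar.gz

# Add the BLAST+ binaries to PATH for future shells
echo 'export PATH=$HOME/bin/ncbi-blast-2.17.0+/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

# Confirm that the installation works
blastp -version
makeblastdb -version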
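
Once both binaries are on PATH, a typical use is to build a protein database from a FASTA file and search query sequences against it. The sketch below assembles those two commands from Python, for example from within an analysis notebook. The function names and all file names are hypothetical, and the parameters this workflow actually uses for BLASTp are not stated here; only standard BLAST+ options (`-in`, `-dbtype`, `-out`, `-query`, `-db`, `-outfmt`, `-evalue`) appear.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta, db_name):
    """Command that builds a protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", str(fasta), "-dbtype", "prot", "-out", str(db_name)]


def blastp_cmd(query_fasta, db_name, out_tsv, evalue=10.0):
    """Command that searches protein queries against the database.

    Output format 6 is BLAST's tab-separated table; 10 is blastp's
    default E-value threshold.
    """
    return [
        "blastp",
        "-query", str(query_fasta),
        "-db", str(db_name),
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-out", str(out_tsv),
    ]


def run_blastp(fasta, query_fasta, db_name, out_tsv):
    """Run both steps; raises if the BLAST+ binaries are not on PATH."""
    for tool in ("makeblastdb", "blastp"):
        if shutil.which(tool) is None:
            raise FileNotFoundError(f"{tool} not found on PATH")
    subprocess.run(makeblastdb_cmd(fasta, db_name), check=True)
    subprocess.run(blastp_cmd(query_fasta, db_name, out_tsv), check=True)
```

Keeping command construction separate from execution makes the exact invocation easy to inspect or log before anything runs. For short peptide queries, BLAST+ also offers `-task blastp-short`; whether this workflow needs it is not specified.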
