GitHub - Dougherty-Lab/astrocyte_sn-mpra

Overview

This repository contains analysis code and custom processing scripts used for massively parallel reporter assay (MPRA) experiments, single-nucleotide mutagenesis studies, imaging analyses, and RNA structural feature prediction. It includes manuscript analysis notebooks, reusable functions, and a custom barcode counting pipeline for sequencing-based MPRA datasets.

Repository Contents

Manuscript and Data Analysis

File	Description
`SN_MPRA_manuscript_code.Rmd`	Analysis code for the original tiling single-nucleotide MPRA (SN-MPRA) library.
`SN_MPRA_followup_manuscript_code.Rmd`	Analysis code for follow-up single-nucleotide mutagenesis MPRA experiments.
`imaging_stats_analysis.R`	Statistical analysis scripts for imaging datasets.
`localization_mpra_functions.R`	Shared functions used by `SN_MPRA_manuscript_code.Rmd` and `SN_MPRA_followup_manuscript_code.Rmd`.

RNA Structure / Sequence Feature Analysis

File	Description
`rG4.ipynb`	RNA G-quadruplex (rG4) detection notebook adapted from the online `rG4detector` package.
`vienna_RNA.ipynb`	RNA folding / ΔG prediction notebook adapted from the `ViennaRNA` package.

Custom Barcode Counting Pipeline

These files comprise a custom pipeline for counting barcodes from sequencing reads generated in MPRA experiments.

File	Description
`MPRA_count.py`	Main component of the custom barcode counting workflow.
`count.py`	Barcode counting utility script.
`count.txt`	Supporting configuration or reference file used in barcode counting.
`countSetup.py`	Setup/configuration script for counting pipeline.
`merge.py`	Script for merging intermediate count files or sequencing outputs.
`pandaseq.sh`	Shell script for paired-end read assembly using PANDAseq.
`processFQ.py`	FASTQ preprocessing script for barcode counting pipeline.
`setup_multi.py`	Setup script for multiprocessing barcode matching workflow.
`string_match_multi.c`	C implementation for high-speed string matching.
`string_match_multi.pyx`	Cython wrapper/source for accelerated string matching.
`string_match_multi.cpython-313-x86_64-linux-gnu.so`	Compiled shared object for Python integration of string matching module.
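
To illustrate what barcode counting involves, here is a minimal sketch in pure Python. It is not the logic of `MPRA_count.py` or `count.py`. It assumes exact matching, a single fixed barcode length, and a barcode that starts at a known offset in each read. The function names, the `start` parameter, and the example reads are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (second of every four lines) from FASTQ text."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip().upper()


def count_barcodes(
    reads: Iterable[str], known_barcodes: Iterable[str], start: int = 0
) -> Tuple[Counter, int]:
    """Count exact barcode matches at a fixed offset in each read.

    Returns (counts per known barcode, number of reads with no match).
    """
    known = set(known_barcodes)
    lengths = {len(barcode) for barcode in known}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes barcodes of one fixed length")
    length = lengths.pop()

    # Start every known barcode at zero so absent barcodes still appear.
    counts = Counter({barcode: 0 for barcode in known})
    unmatched = 0
    for read in reads:
        candidate = read[start:start + length]
        if candidate in known:
            counts[candidate] += 1
        else:
            unmatched += 1
    return counts, unmatched


# Hypothetical four-read FASTQ input, given as a list of lines.
example_fastq = [
    "@read1", "ACGTACGTTTTT", "+", "IIIIIIIIIIII",
    "@read2", "TTTTGGGGAAAA", "+", "IIIIIIIIIIII",
    "@read3", "ACGTACGTCCCC", "+", "IIIIIIIIIIII",
    "@read4", "NNNNNNNNAAAA", "+", "IIIIIIIIIIII",
]
counts, unmatched = count_barcodes(
    fastq_sequences(example_fastq), ["ACGTACGT", "TTTTGGGG"]
)
print(dict(counts), unmatched)
```

A set lookup like this is fast for exact matches. Tolerating mismatches needs a different approach, which may be why the repository ships a compiled string matching module.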

Requirements

Software requirements depend on which components are used. Common dependencies may include:

R (with tidyverse, rmarkdown, and statistical packages)
Python 3.x
Jupyter Notebook
Cython
GCC / C compiler
PANDAseq
ViennaRNA
rG4detector and associated Python packages

Typical Workflow

Preprocess sequencing reads using `pandaseq.sh` and `processFQ.py`
Count barcodes using the custom counting scripts
Analyze MPRA datasets using the R Markdown notebooks
Run imaging statistics using `imaging_stats_analysis.R`
Predict RNA structural features using the provided notebooks

Notes

Some notebooks/scripts incorporate code adapted from external packages (`rG4detector`, `ViennaRNA`).
File paths and input formats may need to be updated for your local environment.
Compiled binaries (.so) may need to be rebuilt depending on operating system and Python version.
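
One quick way to check whether the shipped `.so` is likely to need rebuilding is to compare its file name suffix with the extension suffixes your interpreter accepts. The sketch below uses only the Python standard library. The helper name `extension_matches_interpreter` is hypothetical and not part of this repository. A matching suffix is necessary but not sufficient, since system library versions can still differ.

```python
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def extension_matches_interpreter(filename: str) -> bool:
    """Return True if the file's extension suffix is one this interpreter loads.

    The module name is everything before the first dot; the rest
    (for example ".cpython-313-x86_64-linux-gnu.so") is the ABI-tagged suffix.
    """
    name = Path(filename).name
    _module_name, dot, rest = name.partition(".")
    if not dot:
        return False
    return "." + rest in EXTENSION_SUFFIXES


so_name = "string_match_multi.cpython-313-x86_64-linux-gnu.so"
if extension_matches_interpreter(so_name):
    print(f"{so_name} matches this interpreter's extension suffixes")
else:
    print(f"{so_name} does not match; accepted suffixes: {EXTENSION_SUFFIXES}")
```

If the suffix does not match, the module presumably has to be rebuilt from `string_match_multi.pyx` with Cython and a C compiler, both listed under Requirements.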

Citation

If using this repository in academic work, please cite the associated manuscript(s) and relevant external tools/packages.

Contact

For questions or collaboration inquiries, please open an issue or contact the repository owner.
