bioinf-fi/rDNA

Human rDNA Assembly Repository

This repository contains datasets for rDNA assembly and analysis, using the individual PAN027 (HG06807) from the WashU pedigree. The project aims to assemble diploid rDNA arrays as completely as possible, given the current state of technologies and algorithms. The datasets below were predominantly generated in the Miga lab at UCSC, in collaboration with WashU.

rDNA links

Overview of the available rDNA datasets

| Dataset | Number of reads | Coverage | >Q15 (QV) | >Q30 (QV) | Sequencing Date |
|---|---|---|---|---|---|
| WGS | 14,922 | 27x | 92.8% | 0.8% | 06/2025 |
| AS | 13,774 | 29x | 92.4% | 2.1% | 08/2025 |
| HTFC | 18,569 | 35x | 96.3% | 2.3% | 11/2025 |
| WGS + AS | 28,696 | 56x | 92.6% | 1.4% | - |
| WGS + AS + HTFC | 47,265 | 91x | 94.1% | 1.8% | - |
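
The combined rows can be cross-checked from the per-dataset rows. The sketch below assumes that read counts and coverage are simple sums and that the >Q15 and >Q30 percentages are read-weighted averages; this is an assumption that happens to reproduce the table values, not a documented description of how they were computed. The function name `combine` is illustrative only.

```python
# Per-dataset values copied from the table above:
# (number of reads, coverage in x, % reads >Q15, % reads >Q30)
DATASETS = {
    "WGS": (14922, 27, 92.8, 0.8),
    "AS": (13774, 29, 92.4, 2.1),
    "HTFC": (18569, 35, 96.3, 2.3),
}


def combine(names):
    """Combine datasets: sum reads and coverage, read-weight the percentages.

    The read-weighted average is an assumption made for this sketch.
    """
    rows = [DATASETS[name] for name in names]
    total_reads = sum(r[0] for r in rows)
    total_coverage = sum(r[1] for r in rows)
    pct_q15 = sum(r[0] * r[2] for r in rows) / total_reads
    pct_q30 = sum(r[0] * r[3] for r in rows) / total_reads
    return total_reads, total_coverage, round(pct_q15, 1), round(pct_q30, 1)


if __name__ == "__main__":
    print("WGS + AS:", combine(["WGS", "AS"]))
    print("WGS + AS + HTFC:", combine(["WGS", "AS", "HTFC"]))
```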

Methodology

These datasets were basecalled with hyperbasecalling, a new experimental ONT model that offers higher precision than the standard SUP (Super Accuracy) model. The increased precision comes at a computational cost: hyperbasecalling is approximately 10x slower than SUP.

To identify reads containing rDNA units, we aligned the sequencing data against an rDNA array reference (10x KY962518-ROT reference) using minimap2 version 2.30-r1287. This method outperformed single-unit reference mapping (one KY962518-ROT), yielding significantly higher coverage.
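
The mapping step above can be sketched as follows. This is a minimal illustration, not the exact pipeline used here: it assumes that the "10x" reference is ten tandem copies of a single KY962518-ROT unit, and that the reads are mapped with the minimap2 `map-ont` preset (the source states only the minimap2 version, not the options). The function names and file names are hypothetical.

```python
def make_tandem_reference(unit_seq, copies=10):
    """Concatenate `copies` head-to-tail copies of one rDNA unit sequence."""
    return unit_seq * copies


def format_fasta(name, seq, width=60):
    """Return a FASTA record with the sequence wrapped at `width` columns."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def minimap2_command(reference_fasta, reads_fastq, threads=8):
    """Build a minimap2 command line (argument list) for ONT reads.

    -a emits SAM, -x map-ont selects the ONT preset (an assumption here),
    -t sets the number of threads.
    """
    return [
        "minimap2", "-a", "-x", "map-ont", "-t", str(threads),
        reference_fasta, reads_fastq,
    ]


if __name__ == "__main__":
    # Toy unit sequence; a real unit would be read from the KY962518-ROT FASTA.
    unit = "ACGTTGCA"
    print(format_fasta("toy_unit_x10", make_tandem_reference(unit), width=40), end="")
    print(" ".join(minimap2_command("rDNA_10x.fa", "reads.fastq.gz")))
```

The resulting argument list can be passed to `subprocess.run` with standard output redirected to a SAM file. One plausible reason for the higher coverage with a multi-unit reference is that long reads spanning several rDNA units can align along their full length, rather than being split or clipped against a single unit.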

Current rDNA assemblies

Chromosome 14 maternal (active) and paternal (inactive) rDNA arrays.

| maternal rDNA array | paternal rDNA array |
|---|---|
| chr14 maternal | chr14 paternal |

Other datasets
