bigWig format #13

Conservation tracks ([PhastCons/PhyloP 100-way](https://genome.ucsc.edu/cgi-bin/hgTables?hgsid=4037674825_jpNk3kmBl1rvoVWT5aVrvqfeutF0&db=hg38&hgta_group=compGeno&hgta_track=cons100way&hgta_table=phastCons100way&hgta_regionType=genome&position=chr7%3A155%2C592%2C223-155%2C605%2C565&hgta_outputType=maf&hgta_outFileName=) or [here](http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons100way/hg38.phastCons100way.bw)) from UCSC are in [bigWig format](https://genome.ucsc.edu/goldenPath/help/bigWig.html). They are dense per-base signals. Converting bigWig to BED increases the file size substantially, because the compressed binary signal becomes one text row per base or run of bases.

For sparse annotations (promoters, enhancers,…) it makes sense to provide BED files and have statgen paint them onto SNPs. For dense bigWigs, there seems to be no reasonable BED representation that isn’t either huge (full per-base) or lossy (thresholded/binned).

Maybe it makes sense to add a helper, e.g. `load_bigwig_annotations(bigwigs, reference)`, that:
- queries bigWigs at the reference SNPs,
- builds a SNP × annotation matrix,
- returns an AnnotationPanel with continuous columns.
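
A rough sketch of the first two steps, to make the proposal concrete. Everything named here (`SignalTrack`, `Snp`, `query_tracks_at_snps`, `DictTrack`) is hypothetical and not part of statgen. The track interface assumes a pyBigWig-style `values(chrom, start, end)` call over 0-based half-open coordinates, and SNP positions are assumed to be 1-based. Wrapping the result in an AnnotationPanel is left out because that depends on statgen's constructor.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Protocol, Sequence, Tuple


class SignalTrack(Protocol):
    """Assumed interface: pyBigWig-style values() over 0-based half-open coordinates."""

    def values(self, chrom: str, start: int, end: int) -> Sequence[float]: ...


@dataclass(frozen=True)
class Snp:
    chrom: str
    pos: int  # assumed 1-based, as in VCF


def query_tracks_at_snps(
    tracks: Dict[str, SignalTrack], snps: Sequence[Snp]
) -> Tuple[List[str], List[List[float]]]:
    """Return (annotation names, SNP x annotation matrix of per-base signal)."""
    names = list(tracks)
    matrix = []
    for snp in snps:
        start = snp.pos - 1  # convert 1-based position to 0-based start
        row = [float(tracks[name].values(snp.chrom, start, start + 1)[0]) for name in names]
        matrix.append(row)
    return names, matrix


class DictTrack:
    """In-memory stand-in for an open bigWig file, for illustration only."""

    def __init__(self, signal: Dict[Tuple[str, int], float]):
        self.signal = signal  # keyed by (chrom, 0-based position)

    def values(self, chrom: str, start: int, end: int) -> List[float]:
        # Positions without data come back as NaN.
        return [self.signal.get((chrom, p), math.nan) for p in range(start, end)]


tracks = {
    "phastCons": DictTrack({("chr7", 99): 0.8}),
    "phyloP": DictTrack({("chr7", 99): 2.5, ("chr7", 199): -0.3}),
}
snps = [Snp("chr7", 100), Snp("chr7", 200)]
names, matrix = query_tracks_at_snps(tracks, snps)
```

Positions with no signal stay as NaN in the matrix, so the caller can decide whether to impute or drop them.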

For indels, possible mappings are: pick the leftmost reference base of the indel as its representative position or, for larger indels, summarize the signal over the reference interval they span (e.g., mean/median/max in [start, end)).
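
The two indel options could look roughly like the sketch below. The names `indel_signal` and `ListTrack` and the `leftmost_up_to` cutoff are assumptions for illustration. The interval is taken as 0-based half-open, and the track is again assumed to expose a pyBigWig-style `values(chrom, start, end)`.

```python
import math
from statistics import mean, median
from typing import List, Sequence


class ListTrack:
    """In-memory stand-in for an open bigWig file, for illustration only."""

    def __init__(self, signal: Sequence[float]):
        self.signal = list(signal)

    def values(self, chrom: str, start: int, end: int) -> List[float]:
        return self.signal[start:end]


_SUMMARIES = {"mean": mean, "median": median, "max": max}


def indel_signal(track, chrom: str, start: int, end: int,
                 how: str = "mean", leftmost_up_to: int = 1) -> float:
    """Signal for a variant spanning the reference interval [start, end).

    Spans of at most `leftmost_up_to` bases use the leftmost reference base;
    longer spans are summarized with `how`, ignoring NaN (missing) values.
    """
    if end <= start:
        raise ValueError("expected a non-empty interval with start < end")
    if end - start <= leftmost_up_to:
        return float(track.values(chrom, start, start + 1)[0])
    observed = [v for v in track.values(chrom, start, end) if not math.isnan(v)]
    if not observed:
        return math.nan  # no signal anywhere in the interval
    return float(_SUMMARIES[how](observed))


track = ListTrack([0.1, 0.2, 0.9, 0.4, math.nan])
single_base = indel_signal(track, "chr1", 1, 2)            # leftmost base only
deletion_max = indel_signal(track, "chr1", 1, 4, "max")    # summary over 3 bases
deletion_median = indel_signal(track, "chr1", 1, 4, "median")
```

Which summary is appropriate probably depends on the annotation: `max` keeps a single highly conserved base visible, while `mean` dilutes it over long deletions.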


