bigWig format #13

@juuf

Conservation tracks from UCSC (PhastCons/PhyloP 100-way) are in bigWig format (https://genome.ucsc.edu/goldenPath/help/bigWig.html). They are dense per-base signals, so converting bigWig to BED inflates the size considerably: every base becomes its own text record instead of a value in a compressed, indexed binary file.

For sparse annotations (promoters, enhancers, …) it makes sense to provide BEDs and have statgen paint them onto SNPs. For dense bigWigs, there seems to be no reasonable BED representation that isn’t either huge (full per-base) or lossy (thresholded/binned).

Maybe it makes sense to add a helper (e.g., load_bigwig_annotations(bigwigs, reference)) that:

  • queries bigWigs at the reference SNPs,
  • builds a SNP × annotation matrix,
  • returns an AnnotationPanel with continuous columns.
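A rough sketch of the first two steps, assuming pyBigWig as the reader (its `open` and `values(chrom, start, end)` calls with 0-based, half-open coordinates). The names `open_bigwigs` and `bigwig_matrix_at_snps` are hypothetical, SNP positions are assumed to be 1-based as in VCF, and wrapping the result in an AnnotationPanel is left out because that constructor is not shown here:

```python
import math


def open_bigwigs(paths):
    """Open {annotation name: bigWig path} with pyBigWig (assumed installed)."""
    import pyBigWig  # imported lazily so the matrix code works without it

    return {name: pyBigWig.open(path) for name, path in paths.items()}


def bigwig_matrix_at_snps(tracks, snps):
    """Build a SNP x annotation matrix as a list of rows.

    tracks: {annotation name: object with pyBigWig-style values(chrom, start, end)}
    snps:   iterable of (chrom, pos) with 1-based pos
    Returns (annotation names, matrix); missing signal is stored as NaN.
    """
    names = list(tracks)
    matrix = []
    for chrom, pos in snps:
        row = []
        for name in names:
            try:
                # 1-based position p corresponds to the interval [p - 1, p).
                value = tracks[name].values(chrom, pos - 1, pos)[0]
            except (RuntimeError, KeyError, IndexError):
                # Assumption: the reader raises for unknown chromosomes or
                # out-of-range intervals; treat those as "no signal".
                value = math.nan
            row.append(value)
        matrix.append(row)
    return names, matrix
```

The matrix stays continuous (no thresholding or binning), and only the reference SNPs are ever read, so the per-base track never has to be materialized as text.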

For indels, possible mappings would be: pick the leftmost reference base of the indel as its representative position or, for larger indels, summarize over the reference interval they span (e.g., mean/median/max signal in [start, end)).
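The two indel options could be sketched like this. It assumes VCF-style variants (1-based `pos`, `ref` allele starting at `pos`) and a track object with a pyBigWig-style `values(chrom, start, end)` method; the function name `variant_signal` and the `span_threshold` cutoff for what counts as a "larger" indel are hypothetical:

```python
import math
import statistics

# Summary functions applied over the reference interval of larger indels.
SUMMARIES = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "max": max,
}


def variant_signal(track, chrom, pos, ref, summary="mean", span_threshold=10):
    """Return one continuous value for a SNP or indel.

    Variants whose reference allele is shorter than span_threshold use the
    leftmost reference base; longer ones are summarized over [start, end).
    """
    start = pos - 1          # 0-based start of the reference allele
    end = start + len(ref)   # half-open end of the reference allele
    if len(ref) < span_threshold:
        end = start + 1      # leftmost reference base only

    # Drop positions without signal (NaN) before summarizing.
    values = [v for v in track.values(chrom, start, end) if not math.isnan(v)]
    if not values:
        return math.nan
    return SUMMARIES[summary](values)
```

With this convention SNPs and VCF-style insertions (single anchor base in `ref`) fall into the leftmost-base case automatically, so only long deletions go through the interval summary.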
