When using an aligner which is allowed to soft-clip the reads, this can error out with an index error at poss[idx] as len(read.query_length()) used to define idx will be greater than len(read.get_reference_positions(). Would it be possible for iCount to support use of aligners with soft-clipping?
|
poss = read.get_reference_positions() |
|
if read.is_reverse: |
|
strand = '-' |
|
xlink_pos = poss[-1] + 1 |
|
end_pos = poss[0] |
|
else: |
|
strand = '+' |
|
xlink_pos = poss[0] - 1 |
|
xlink_pos = 1 if xlink_pos < 1 else xlink_pos # Case of neg. pos on circular MT |
|
end_pos = poss[-1] |
|
|
|
chrom = read.reference_name |
|
second_start, is_strange = _second_start(read, poss, strand, chrom, segmentation, gap_th) |
|
if is_strange: |
|
metrics.strange_recs += 1 |
|
|
|
# Position of middle nucleotide. Because we can have spliced reads, middle position is |
|
# not necessarily the middle of region the read maps to. Take one nucleotide upstream |
|
# of center in case length is even, happens by default on + strand |
|
idx = read.query_length // 2 - 1 if (strand == '-' and read.query_length % 2 == 0) \ |
|
else read.query_length // 2 |
|
middle_pos = poss[idx] |
As an example, BAM read:
NS500105:608:HHF3NBGXN:4:22601:5900:10521:N:0:GTCCGCrbc:TTAAGAC 147 chr1 189874 0 46S30M = 189693 -211 CACGACATCCTCCTCCCAGTCGCCCCTGTAGCTCCCCTACCTCCAAGAGGGTGTGGGATGGTGGAGGGGTTTGAGA EEEAEA6EAAAAEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEE/EEEAEEAEEEEEEEEEAEEEEEEEEAAA/A AS:i:60 XS:i:98 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:30 YS:i:96 YT:Z:CP NH:i:2
poss: [189873, 189874, 189875, 189876, 189877, 189878, 189879, 189880, 189881, 189882, 189883, 189884, 189885, 189886, 189887, 189888, 189889, 189890, 189891, 189892, 189893, 189894, 189895, 189896, 189897, 189898, 189899, 189900, 189901, 189902] (length=30)
idx:37
When using an aligner which is allowed to soft-clip the reads, this can error out with an index error at
poss[idx]aslen(read.query_length())used to defineidxwill be greater thanlen(read.get_reference_positions(). Would it be possible for iCount to support use of aligners with soft-clipping?iCount/iCount/mapping/xlsites.py
Lines 471 to 492 in 4260bae
As an example, BAM read:
NS500105:608:HHF3NBGXN:4:22601:5900:10521:N:0:GTCCGCrbc:TTAAGAC 147 chr1 189874 0 46S30M = 189693 -211 CACGACATCCTCCTCCCAGTCGCCCCTGTAGCTCCCCTACCTCCAAGAGGGTGTGGGATGGTGGAGGGGTTTGAGA EEEAEA6EAAAAEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEE/EEEAEEAEEEEEEEEEAEEEEEEEEAAA/A AS:i:60 XS:i:98 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:30 YS:i:96 YT:Z:CP NH:i:2poss:[189873, 189874, 189875, 189876, 189877, 189878, 189879, 189880, 189881, 189882, 189883, 189884, 189885, 189886, 189887, 189888, 189889, 189890, 189891, 189892, 189893, 189894, 189895, 189896, 189897, 189898, 189899, 189900, 189901, 189902](length=30)idx:37