Evolutionary genomics based on PacBio HiFi long-read sequencing data reveals the importance of structural variants in shaping population-specific differences between Chinese and Indian rhesus macaques (Macaca mulatta)
chmod u+x count_genotypes_ver_4.py
chmod u+x count_genotypes_ver_5.py
chmod u+x find_private_sv_ver_2.py
g++ -o bonferroni_vaf_bin_ver_2 bonferroni_vaf_bin_ver_2.cpp -lm
g++ -o find_significant_private_sv_ver_2 find_significant_private_sv_ver_2.cpp
-
Prepare a tab-delimited .txt file from the .vcf that contains basic information for each SV (chromosome, position, ID, SV type, SV length, and the genotype of each individual at the site)
bcftools query -H -f '%CHROM\t%POS\t%ID\t%SVTYPE\t%SVLEN[\t%GT]\n' rhesus.SVs.vcf.gz -o rhesus.SVs.txt
and then separate by population
sed 's/#//' rhesus.SVs.txt | cut -f 1-15 > rhesus.SVs.Chi.txt
sed 's/#//' rhesus.SVs.txt | cut -f 1-5,16- > rhesus.SVs.Ind.txt
-
Calculate the reference, alternative, and minor allele frequencies in both populations
python3 ./count_genotypes_ver_4.py rhesus.SVs.Chi.txt rhesus.SVs.Chi.AF.txt
python3 ./count_genotypes_ver_4.py rhesus.SVs.Ind.txt rhesus.SVs.Ind.AF.txt
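The script count_genotypes_ver_4.py is not reproduced here, so the following is only a minimal sketch of how reference, alternative, and minor allele frequencies could be derived from the genotype columns of one row. The function name `allele_frequencies` is hypothetical, and the handling of missing calls (skipped) and of non-zero allele indices (all counted as alternative) is an assumption, not a description of the actual script.

```python
import re


def allele_frequencies(genotypes):
    """Return (ref_af, alt_af, minor_af) for diploid GT strings like '0/1' or '1|1'.

    Assumptions (not taken from the original script):
    - missing alleles ('.') are ignored,
    - any non-zero allele index counts as an alternative allele.
    Returns None when no allele was called at the site.
    """
    ref = 0
    alt = 0
    for gt in genotypes:
        # Split on either the unphased ('/') or phased ('|') separator.
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue
            if allele == "0":
                ref += 1
            else:
                alt += 1
    called = ref + alt
    if called == 0:
        return None
    ref_af = ref / called
    alt_af = alt / called
    return ref_af, alt_af, min(ref_af, alt_af)


def row_frequencies(line):
    """Apply allele_frequencies to one tab-delimited row (genotypes start at column 6)."""
    fields = line.rstrip("\n").split("\t")
    return allele_frequencies(fields[5:])
```

For a row such as `chr1<TAB>100<TAB>sv1<TAB>DEL<TAB>-50<TAB>0/0<TAB>0/1`, `row_frequencies` counts three reference alleles and one alternative allele out of four called alleles.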
-
Calculate the alternative allele count and frequency of each SV in each population
python3 count_genotypes_ver_5.py rhesus.SVs.Chi.txt rhesus.SVs.Chi.AC.AF.txt
python3 count_genotypes_ver_5.py rhesus.SVs.Ind.txt rhesus.SVs.Ind.AC.AF.txt
-
Merge the population-specific alternative allele counts and frequencies of each SV and add header information
paste <(cut -f 1-8 rhesus.SVs.Chi.AC.AF.txt) <(cut -f 6-8 rhesus.SVs.Ind.AC.AF.txt) | sed '1d' > rhesus.SVs.AC.AF.txt
{ printf 'CHROM\tPOS\tID\tSVTYPE\tSVLEN\tChi_GENOS\tChi_VAC\tChi_VAF\tInd_GENOS\tInd_VAC\tInd_VAF\n'; cat rhesus.SVs.AC.AF.txt; } > tmp && mv tmp rhesus.SVs.AC.AF.txt
-
Identify population-private SVs
python3 find_private_sv_ver_2.py rhesus.SVs.AC.AF.txt rhesus.SVs.PopPriv.txt
sed '1d' rhesus.SVs.PopPriv.txt | awk '{if ($6 == "Chi") print}' > rhesus.SVs.PopPriv.Chi.txt
sed '1d' rhesus.SVs.PopPriv.txt | awk '{if ($6 == "Ind") print}' > rhesus.SVs.PopPriv.Ind.txt
-
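As an illustration of the population-private step, the sketch below labels an SV as private to a population when its alternative allele is observed there and never in the other population. This definition is an assumption, since find_private_sv_ver_2.py is not shown here and may apply further criteria (for example, a minimum number of called genotypes). The function names `private_population` and `find_private` are hypothetical; the column names are those of the header written to rhesus.SVs.AC.AF.txt.

```python
import csv


def private_population(chi_vac, ind_vac):
    """Return 'Chi' or 'Ind' if the alternative allele occurs in only that population.

    Assumed definition: variant allele count > 0 in one population and 0 in the other.
    Returns None for shared SVs and for SVs absent from both populations.
    """
    if chi_vac > 0 and ind_vac == 0:
        return "Chi"
    if ind_vac > 0 and chi_vac == 0:
        return "Ind"
    return None


def find_private(lines):
    """Yield (ID, population) for each private SV in tab-delimited lines with a header."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        label = private_population(int(row["Chi_VAC"]), int(row["Ind_VAC"]))
        if label is not None:
            yield row["ID"], label


if __name__ == "__main__":
    # Reads the merged table produced in the previous step.
    with open("rhesus.SVs.AC.AF.txt", newline="") as handle:
        for sv_id, population in find_private(handle):
            print(sv_id, population, sep="\t")
```

Keeping the decision in a small pure function makes it easy to swap in a stricter definition of "private" without touching the file handling.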
Apply a Bonferroni correction
./bonferroni_vaf_bin_ver_2 -in rhesus.SVs.PopPriv.Chi.txt -out rhesus.SVs.PopPriv.Chi.Bonferroni.txt
./bonferroni_vaf_bin_ver_2 -in rhesus.SVs.PopPriv.Ind.txt -out rhesus.SVs.PopPriv.Ind.Bonferroni.txt
-
Identify significant population-private SVs
./find_significant_private_sv_ver_2 -in rhesus.SVs.PopPriv.Chi.txt -cv rhesus.SVs.PopPriv.Chi.Bonferroni.txt -out rhesus.SVs.PopPriv.Chi.Bonferroni.signifiant.txt
./find_significant_private_sv_ver_2 -in rhesus.SVs.PopPriv.Ind.txt -cv rhesus.SVs.PopPriv.Ind.Bonferroni.txt -out rhesus.SVs.PopPriv.Ind.Bonferroni.signifiant.txt
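For orientation on the last two steps, here is a generic sketch of the Bonferroni rule: with m tests and a family-wise error rate alpha, a single test is called significant when its p-value is at most alpha / m. The two compiled programs are not reproduced here, so this does not model their VAF binning, their test statistic, or the format of the critical-value file passed with -cv. The per-SV p-values, the default alpha of 0.05, and the function names are all hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def significant_svs(p_values, alpha=0.05):
    """Return the IDs whose p-value passes the Bonferroni-corrected threshold.

    p_values: dict mapping an SV ID to a p-value (hypothetical input; how the
    p-values are obtained is outside the scope of this sketch).
    """
    if not p_values:
        return []
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [sv_id for sv_id, p in p_values.items() if p <= threshold]


if __name__ == "__main__":
    example = {"sv1": 0.001, "sv2": 0.02, "sv3": 0.04}
    # Three tests at alpha = 0.05 give a per-test threshold of 0.05 / 3.
    print(significant_svs(example))
```

The correction is conservative by design: as the number of private SVs tested grows, the per-test threshold shrinks proportionally.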