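This is a small tool I wrote a while ago. It is pretty crude, and the name is just as casual: skr, haha. It mainly converts fq to fa, merges VCF files from multiple chromosomes, and so on. There are not many features (mostly because writing C is such a pain T_T), and I usually only use it to collect statistics on fastq files.
The tool is available here: https://github.com/sharkLoc/skrTools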
usage:
Program: skr
Usage: skr
    fq2fa       translate fastq file to fasta
    fqstat      summary statistics of fastq file
    mergeVcf    merge vcf files from list
    statVcf     summary statistics of vcf file
    makewind    make bed from a list file
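Collecting statistics on fastq files:
The output reports the average read length, GC content, total number of reads, and total number of bases, along with the count and percentage of A, T, G, C, and N bases, and finally the Q20 and Q30 results.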
skr fqstat -i xx1.fq.gz -I xx2.fq.gz
Output:
Iterm                   reads_1.fq            reads_2.fq
read average length:    150                   150
read GC content(%):     48.42                 48.48
total read Count:       34946389              34946389
total base Count:       5241958350            5241958350
base A Count:           1352284833(25.80%)    1342903044(25.62%)
base C Count:           1270459966(24.24%)    1246706604(23.78%)
base G Count:           1267522866(24.18%)    1294357728(24.69%)
base T Count:           1351401800(25.78%)    1357986115(25.91%)
base N Count:           288885(0.01%)         4859(0.00%)
Number of base calls with quality value of 20 or higher (Q20+) (%)    5113248711(97.54%)    5092440219(97.15%)
Number of base calls with quality value of 30 or higher (Q30+) (%)    4886887711(93.23%)    4832524601(92.19%)
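For readers curious how numbers like these can be derived, here is a minimal Python sketch of the same kind of statistics. It is not the skr implementation (skr is written in C, and its source is not shown here). The function names `open_fastq` and `fastq_stats` are made up for illustration, and the sketch assumes strict 4-line FASTQ records, Phred+33 quality encoding, and GC content defined as (G + C) / total bases.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fastq_stats(handle, phred_offset=33):
    """Summarize a FASTQ stream; assumes 4 lines per record and Phred+33 qualities."""
    bases = Counter()
    reads = q20 = q30 = 0
    for i, line in enumerate(handle):
        line = line.rstrip("\r\n")
        if i % 4 == 1:            # sequence line
            reads += 1
            bases.update(line.upper())
        elif i % 4 == 3:          # quality line
            for ch in line:
                q = ord(ch) - phred_offset
                if q >= 20:
                    q20 += 1
                if q >= 30:
                    q30 += 1
    total = sum(bases.values())

    def pct(n):
        # Percentage of all bases; 0.0 for empty input to avoid dividing by zero.
        return 100.0 * n / total if total else 0.0

    return {
        "reads": reads,
        "bases": total,
        "avg_len": total / reads if reads else 0.0,
        "gc_pct": pct(bases["G"] + bases["C"]),
        "counts": {b: bases[b] for b in "ACGTN"},
        "q20": q20,
        "q20_pct": pct(q20),
        "q30": q30,
        "q30_pct": pct(q30),
    }
```

A call such as `fastq_stats(open_fastq("xx1.fq.gz"))` would return a dictionary for one file. The skr output above is consistent with these definitions: 34946389 reads × 150 bases gives 5241958350 total bases, and (1270459966 + 1267522866) / 5241958350 is about 48.42% GC for reads_1.fq.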