1. Download and install: https://bitbucket.org/mroachawri/purge_haplotigs/wiki/Install
$ sudo apt install bedtools
$ bedtools --version
bedtools v2.26.0
$ sudo apt install samtools
$ samtools --version
samtools 1.7
Using htslib 1.7-2
Copyright (C) 2018 Genome Research Ltd.
$ sudo apt install r-base r-base-dev
$ sudo su - -c "R -e \"install.packages('ggplot2', repos='http://cran.rstudio.com/')\""
# download the latest release from https://github.com/lh3/minimap2/releases (currently v2.13)
$ wget https://github.com/lh3/minimap2/releases/download/v2.13/minimap2-2.13_x64-linux.tar.bz2
$ tar xf minimap2-2.13_x64-linux.tar.bz2
$ mkdir ~/bin
$ printf "export PATH=\$PATH:~/bin\n" >> ~/.bashrc
$ source ~/.bashrc
$ cp minimap2-2.13_x64-linux/minimap2 ~/bin/
$ minimap2 -V
2.13-r850
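Appending the export line to .bashrc every time the setup is repeated leaves duplicate PATH entries, and writing it with a single `>` would wipe any existing settings. A minimal sketch of an idempotent alternative follows; `append_once` is a hypothetical helper name introduced here for illustration, not part of any of the tools installed above.

```shell
#!/usr/bin/env bash
# append_once LINE FILE
# Hypothetical helper: append LINE to FILE only if that exact line
# is not already present, so repeated runs do not duplicate it.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (commented out so that sourcing this block changes nothing):
# append_once 'export PATH=$PATH:~/bin' ~/.bashrc
```

Because the check is an exact whole-line match, a line that differs only in whitespace would still be appended.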
# download the latest release from https://github.com/mummer4/mummer/releases (currently 4.0.0.beta2)
$ wget https://github.com/mummer4/mummer/releases/download/v4.0.0beta2/mummer-4.0.0beta2.tar.gz
$ tar xf mummer-4.0.0beta2.tar.gz
$ cd mummer-4.0.0beta2
$ ./configure
$ make
$ cd ../
$ ln -s ~/mummer-4.0.0beta2/delta-filter ~/bin/delta-filter
$ ln -s ~/mummer-4.0.0beta2/nucmer ~/bin/nucmer
$ ln -s ~/mummer-4.0.0beta2/show-coords ~/bin/show-coords
$ nucmer -V
4.0.0beta2
Install Purge Haplotigs in your home directory. No compiling is needed; the scripts in purge_haplotigs/bin only have to be reachable from the system PATH (done here with a symlink in ~/bin).
# clone the git repository
$ git clone https://bitbucket.org/mroachawri/purge_haplotigs.git
$ ln -s ~/purge_haplotigs/bin/purge_haplotigs ~/bin/purge_haplotigs
$ purge_haplotigs
USAGE:
purge_haplotigs
COMMANDS:
-- Purge Haplotigs pipeline:
readhist First step, generate a read-depth histogram for the genome
contigcov Second step, get contig coverage stats and flag 'suspect' contigs
purge Third step, identify and reassign haplotigs
-- Other scripts
ncbiplace Generate a placement file for submission to NCBI
test Test everything!
$ purge_haplotigs test
#
ALL TESTS PASSED
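If the test run fails, a common cause is a dependency that is not on the PATH. The sketch below is one way to list what is missing; `missing_tools` is a hypothetical helper name, and the tool list in the usage comment simply repeats the programs installed in the steps above.

```shell
#!/usr/bin/env bash
# missing_tools NAME...
# Hypothetical helper: print the name of every command in the argument
# list that cannot be found on the current PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (prints nothing when everything is installed):
# missing_tools bedtools samtools R minimap2 nucmer delta-filter show-coords purge_haplotigs
```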
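Map the long reads (PacBio subreads here, hence the map-pb preset) to the genome assembly and sort the alignments into aligned.bam:
minimap2 -t 4 -ax map-pb genome.fasta subreads.fasta.gz --secondary=no \
| samtools sort -@ 8 -m 1G -o aligned.bam -T tmp.ali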
Generate a coverage histogram by running the first script. It writes a PNG histogram, which you inspect to choose the read-depth cutoffs, and a BEDTools 'genomecov' output file that the second step needs.
purge_haplotigs hist -b aligned.bam -g genome.fasta [ -t threads ]
Run the second script using the cutoffs from the previous step to analyse the coverage on a contig by contig basis. This script produces a contig coverage stats csv file with suspect contigs flagged for further analysis or removal.
purge_haplotigs cov -i aligned.bam.gencov -l LOW -m MID -h HIGH \
[-o coverage_stats.csv -j 80 -s 80 ]
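The cov step takes three integer read-depth cutoffs read off the histogram: a low cutoff (-l), the midpoint between the haploid and diploid peaks (-m), and a high cutoff (-h). As a hedged sketch, the helper below checks that the three values are integers in increasing order before printing the command line. `cov_command` is a hypothetical name, and the values 5, 40 and 190 in the usage comment are made-up placeholders, not recommendations.

```shell
#!/usr/bin/env bash
# cov_command GENCOV_FILE LOW MID HIGH
# Hypothetical helper: validate the three cutoffs and print the
# corresponding `purge_haplotigs cov` command without running it.
cov_command() {
    gencov="$1"; low="$2"; mid="$3"; high="$4"
    for v in "$low" "$mid" "$high"; do
        case "$v" in
            # reject empty strings and anything containing a non-digit
            ''|*[!0-9]*)
                echo "cutoffs must be non-negative integers" >&2
                return 1 ;;
        esac
    done
    if [ "$low" -ge "$mid" ] || [ "$mid" -ge "$high" ]; then
        echo "expected LOW < MID < HIGH" >&2
        return 1
    fi
    printf 'purge_haplotigs cov -i %s -l %s -m %s -h %s\n' \
        "$gencov" "$low" "$mid" "$high"
}

# Usage, with placeholder cutoffs:
# cov_command aligned.bam.gencov 5 40 190
```

Printing the command rather than executing it makes it easy to review the cutoffs before committing to a run.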
Run the purging pipeline. This script automatically runs a BEDTools windowed coverage analysis (if generating dotplots) and minimap2 alignments to assess which contigs to reassign and which to keep. The pipeline makes several iterations of purging. Optionally, pass a BED file of repeats with -r for improved handling of repetitive regions.
purge_haplotigs purge -g genome.fasta -c coverage_stats.csv
The purge step leaves you with five output files.