PacBio smrtpipe commandline
Revision as of 10:25, 8 November 2013
HOW TO run smrtpipe.py on the command line.
smrtpipe.py is the PacBio pipeline for filtering and mapping PacBio runs, as well as for running HGAP. For filtering-only jobs, it splits the raw reads at the adaptor and writes all subreads in FASTA or FASTQ format.
Input:
- a .fofn file (called smrtcells.fofn below) listing the files to process
- bax.h5 files (or, for software versions before 2.0, bas.h5 files) from the run (one per movie); this is the raw output from the PacBio instrument
- a settings.xml file that specifies the job. See the end of this page for how to obtain such an XML file
Common steps
Set up environment
module load smrtanalysis/x.y.z # at the time of writing, 'x.y.z' is 2.0.1
Generate smrtcells.fofn file
find path/to/runfile | grep 'bax\.h5$' > smrtcells.fofn # NOTE: for pre-2.0 versions, use 'bas\.h5$'!
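Before converting the list, it can help to confirm that it is not empty and that every listed file exists. The following is a minimal sketch, not part of the smrtanalysis tools: the function name check_fofn is made up for illustration, and it assumes the .fofn file holds one path per line.

```shell
# check_fofn FILE (hypothetical helper, not part of smrtanalysis):
# succeed only if FILE lists at least one path and every listed path exists.
check_fofn() {
    local fofn=$1
    local path
    local missing=0
    local count=0
    while IFS= read -r path || [ -n "$path" ]; do
        [ -z "$path" ] && continue          # skip blank lines
        count=$((count + 1))
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=$((missing + 1))
        fi
    done < "$fofn"
    if [ "$count" -eq 0 ]; then
        echo "no files listed in $fofn" >&2
        return 1
    fi
    [ "$missing" -eq 0 ]
}
```

Possible usage: check_fofn smrtcells.fofn && echo "file list looks OK"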
Convert this file to correct xml format:
fofnToSmrtpipeInput.py smrtcells.fofn >input.xml
Copy a settings XML file, e.g. filter_only_settings.xml, HGAP_settings.xml, ...
rsync /projects/nscdata/scripts/pacbio/smrtpipe_xml_files/x.y.z/desired_settings.xml .
NOTE: this may not work within a screen session (unconfirmed).
Actual filtering (NPROC is the number of CPUs the job gets):
smrtpipe.py -D TMP=./ -D SHARED_DIR=./ -D NPROC=24 --params=desired_settings.xml xml:input.xml &> smrtpipe.err
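Because the command above sends all output to smrtpipe.err, a failed run is easy to miss. The sketch below wraps any command so that its output goes to a log file and a failure is reported on the terminal. The name run_logged is hypothetical, and the sketch assumes that smrtpipe.py signals failure through a non-zero exit status, which has not been verified here.

```shell
# run_logged LOGFILE CMD [ARGS...] (hypothetical helper):
# run CMD with stdout and stderr written to LOGFILE, then return CMD's
# exit status, printing a short message if it was non-zero.
run_logged() {
    local log=$1
    local status=0
    shift
    "$@" > "$log" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status, see $log" >&2
    fi
    return "$status"
}
```

Possible usage: run_logged smrtpipe.err smrtpipe.py -D TMP=./ -D SHARED_DIR=./ -D NPROC=24 --params=desired_settings.xml xml:input.xml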
How to obtain the settings.xml file
After a SMRT Portal upgrade, the settings file for the new version may not yet be in the folder /projects/nscdata/scripts/pacbio/smrtpipe_xml_files. In that case:
- In SMRT Portal on cod2, set up a job using the protocol you want to run, e.g. RS_Filter_Only or RS_HGAP_Assembly. You don't have to execute it.
- note down the job number
- from the folder /projects/nscdata/smrtportal/userdata/jobs/016/, find the folder with your job number
- copy the settings.xml file over to /projects/nscdata/scripts/pacbio/smrtpipe_xml_files/x.y.z/
- give it a descriptive name (check the other folders), e.g. filter_only_settings.xml, HGAP_settings.xml
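The last three steps can be sketched as a small shell function. This is only an illustration: the name copy_job_settings is made up, and it assumes the job folder name ends in the job number (for example a zero-padded form of it), which should be checked against the actual folder names.

```shell
# copy_job_settings JOBS_DIR JOB_NUMBER DEST_FILE (hypothetical helper):
# find the single folder directly under JOBS_DIR whose name ends in
# JOB_NUMBER and copy its settings.xml to DEST_FILE.
# Refuses to overwrite an existing DEST_FILE.
copy_job_settings() {
    local jobs_dir=$1
    local job=$2
    local dest=$3
    local matches
    matches=$(find "$jobs_dir" -mindepth 1 -maxdepth 1 -type d -name "*$job")
    if [ -z "$matches" ]; then
        echo "no job folder matching $job in $jobs_dir" >&2
        return 1
    fi
    if [ "$(printf '%s\n' "$matches" | wc -l)" -ne 1 ]; then
        echo "more than one job folder matches $job:" >&2
        printf '%s\n' "$matches" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists, not overwriting" >&2
        return 1
    fi
    cp "$matches/settings.xml" "$dest"
}
```

Possible usage, with 12345 as a placeholder for the job number noted down earlier: copy_job_settings /projects/nscdata/smrtportal/userdata/jobs/016 12345 /projects/nscdata/scripts/pacbio/smrtpipe_xml_files/x.y.z/filter_only_settings.xml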