SMRT Analysis: The input.xml file

From mn/ibv/bioinfwiki

Latest revision as of 16:12, 9 March 2015

The input.xml file

To run the smrtpipe program, you must supply an input.xml file. This XML file lists the bas.h5/bax.h5 read files that the pipeline will use. The fofnToSmrtpipeInput.py script, included in the smrtanalysis package, can create it from a file-of-filenames (fofn) file. The fofn file is simply a text file containing one filename per line, for instance:


/path/to/first_bax.h5

/path/to/second_bax.h5

/path/to/third_bax.h5
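
A fofn like the one above can be typed by hand or generated. The following is a minimal Python sketch, assuming all read files sit in one directory and end in .bax.h5; the function name write_fofn and the default pattern are hypothetical and not part of the smrtanalysis package.

```python
import glob
import os


def write_fofn(read_dir, fofn_path, pattern="*.bax.h5"):
    """Write one absolute filename per line and return the list written.

    read_dir, fofn_path and pattern are illustrative names (assumptions).
    """
    # Sort so that the order of the entries is reproducible between runs.
    files = sorted(glob.glob(os.path.join(read_dir, pattern)))
    absolute = [os.path.abspath(name) for name in files]
    with open(fofn_path, "w") as handle:
        for name in absolute:
            handle.write(name + "\n")
    return absolute
```

For example, write_fofn("/path/to/reads", "/path/to/fofn.txt") would produce a file that can be passed to fofnToSmrtpipeInput.py.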


This text file is converted to an XML file using the fofnToSmrtpipeInput.py script:


module load smrtanalysis/2.3.0

fofnToSmrtpipeInput.py /path/to/fofn.txt > input.xml


The resulting input.xml file:


<?xml version="1.0"?>

<pacbioAnalysisInputs>

  <dataReferences>

    <url ref="run:0000000-0000"><location>/path/to/first_bax.h5</location></url>

    <url ref="run:0000000-0001"><location>/path/to/second_bax.h5</location></url>

    <url ref="run:0000000-0002"><location>/path/to/third_bax.h5</location></url>

  </dataReferences>

</pacbioAnalysisInputs>
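
Before starting a job it can be useful to check which files an input.xml actually references. The following is a minimal Python sketch using only the standard library, assuming the file has the un-namespaced structure shown above; the function name read_locations is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_locations(xml_text):
    """Return (ref, location) pairs for every <url> under <dataReferences>.

    The location is None for <url> elements without a <location> child.
    Assumes the layout shown above, with no XML namespaces.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for url in root.findall("./dataReferences/url"):
        location = url.find("location")
        if location is not None and location.text:
            path = location.text.strip()
        else:
            path = None
        pairs.append((url.get("ref"), path))
    return pairs
```

Reading the file with open("input.xml").read() and passing the text to read_locations gives a list that can be compared against the original fofn.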

 

The fofnToSmrtpipeInput.py script can also set an id, name, and comment for the smrtpipe job to be run. See fofnToSmrtpipeInput.py -h for details.

In addition to bas.h5/bax.h5 files, the input.xml can also reference fastq or fasta files. The fofnToSmrtpipeInput.py script does not support this, so such an input.xml must be created manually:


<?xml version="1.0"?>

<pacbioAnalysisInputs>

   <dataReferences>

       <url ref="run:0000000-0001"><location>/path/to/bax.h5</location></url>

       <url ref="fastq:/path/to/Fastq"/>

   </dataReferences>

</pacbioAnalysisInputs>
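
Instead of editing the XML by hand, a file with this structure could also be generated. The following is a minimal Python sketch and not part of smrtanalysis: the function name build_input_xml is hypothetical, and the "run:0000000-NNNN" numbering simply copies the pattern seen in the fofnToSmrtpipeInput.py output above, which is an assumption about what smrtpipe expects. The result is well-formed but written on a single line, without the indentation shown in the examples.

```python
import xml.etree.ElementTree as ET


def build_input_xml(bax_files, fastq_files=()):
    """Return input.xml text referencing bax.h5 files and fastq files.

    The ref numbering mimics the examples on this page (an assumption).
    """
    root = ET.Element("pacbioAnalysisInputs")
    refs = ET.SubElement(root, "dataReferences")
    for index, path in enumerate(bax_files):
        # bax.h5 entries carry the path in a <location> child element.
        url = ET.SubElement(refs, "url", ref="run:0000000-%04d" % index)
        ET.SubElement(url, "location").text = path
    for path in fastq_files:
        # fastq entries carry the path in the ref attribute itself.
        ET.SubElement(refs, "url", ref="fastq:" + path)
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")
```

For example, the text returned by build_input_xml(["/path/to/bax.h5"], ["/path/to/Fastq"]) can be written to input.xml with an ordinary file write.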