
This document describes how to use the high-performance compute nodes of the cod group at CEES. We have the following resources available:

cod1: 24 CPUs, 128 GB of RAM, ~1 TB local disk space
cod2: 24 CPUs, 128 GB of RAM, ~1.3 TB local disk space
cod3: 64 CPUs, 512 GB of RAM, ~4 TB local disk space
cod4: 64 CPUs, 512 GB of RAM, ~4 TB local disk space


Getting access

Ask Lex or Karin.


Mailing list

If you're not already on it, get subscribed to the cees-hpc mailing list: https://sympa.uio.no/bio.uio.no/subscribe/cees-hpc

If you intend to use one of the nodes for an extended period of time, please send an email to this list.


Logging in

ssh username@cod1.titan.uio.no

When you are on the UiO network it is enough to write:

ssh cod1

All nodes have titan and your titan home area mounted to them, so all the files located in /projects are available on the cod nodes. In addition, the nodes have local disks, currently:

/data --> for permanent files, e.g. input to your program

/work --> working area for your programs

A few important things to note:

  • reading and writing data to/from /data and /work is much faster and more efficient than to/from /projects
  • data on /projects is backed up by USIT, but NOT the data on /data and /work

This leads to the following strategy for how to choose which disk to use (a sketch of a typical workflow follows the list):

  • for a quick command, you may use /projects
  • for a long running program, or one that generates a lot of data over a long time, use /data and /work
  • having your program write a lot over a long time to a file on /projects causes problems for the backup system, as the file may be changed during backup.
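
For example, a minimal sketch of this workflow (the program name my_program, the file names and the user folders are hypothetical placeholders, not actual CEES files):

# copy the input from the backed-up project area to the fast local disk
cp /projects/454data/in_progress/myproject/input.fasta /data/username/

# run the job with its working files on the local /work disk
cd /work/username
my_program /data/username/input.fasta > results.txt

# when the job has finished, copy the results back to the project area
cp results.txt /projects/454data/in_progress/myproject/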


TIP: using screen

After starting a long-running job, you cannot close the terminal window, or the job will be cancelled. Instead, run the job from within a 'screen': type

screen

You have now started a 'new' terminal. Start your job, then press ctrl-a d, that is, the CTRL key together with the 'a' key, followed by the 'd' key. You are now back in the terminal where you started. You can close this terminal; the one behind the screen still exists and continues running. To get back into the now hidden one, type

screen -rd
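
Putting it together, a typical session could look like this (long_running_job.sh is a hypothetical placeholder for your own command):

screen                    # start a new screen session
./long_running_job.sh     # start your job inside the screen
                          # press ctrl-a d to detach; the job keeps running
screen -ls                # list your detached screen sessions
screen -rd                # reattach to the session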

CEES project data

The CEES projects are organised as follows:

/projects/454data is the main area (at some point this will change names). Access is for those in the seq454 Unix user group (soon to change name to ceesdata). Please check that files and folders you create have the right permissions:

chgrp -R seq454 yourfolder
chmod -R 770 yourfolder
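
To verify the result, you can list the folder (yourfolder is again a placeholder); the group should be seq454 and directories should show drwxrwx---:

ls -ld yourfolder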

It is possible to restrict access to a subgroup of those that are in the seq454 group; please ask Lex/Karin.

Folders in /projects/454data:

projects --> finished projects, data belonging to an (accepted) publication
databases --> permanent files, e.g. reference genomes
in_progress --> the main working area. Here you store data for unfinished projects
bin --> here programs and scripts are located
lib --> libraries needed by certain programs
src --> source files for some of the programs in the bin folder
exchange --> for exchanging files with non-members


Other folders:
bioportal --> used by the NSC
cees --> will be removed at a later date
scripts --> to be removed (empty)
utsp --> to be removed
www_docs_old --> to be removed
runs --> to be removed


It is recommended to put the /projects/454data/bin folder in your path. Include the following line in your ~/.bash_login file:

export PATH=/projects/454data/bin:$PATH
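
To check that the change has taken effect in a new login shell, you can inspect your PATH:

echo $PATH                # /projects/454data/bin should appear at the front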

Note to SLURM users

If you are used to submitting jobs through a SLURM script, be aware that this will not work on the cod nodes. Here you'll have to give the command directly on the command line.
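
For example, instead of wrapping your program in a SLURM script and submitting it with sbatch, run it directly (program name, options and file names are hypothetical; running it inside a screen, as described above, is recommended):

my_program --threads 8 input.fasta > output.txt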


Job scripts

You can use a job script: collect a bunch of commands and put them in an executable file. Run the script with

source yourcommands.sh
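
A minimal sketch of such a script (all program and file names are hypothetical placeholders):

# yourcommands.sh -- the commands are executed one after the other
cd /work/username
first_program input.fasta > intermediate.txt
second_program intermediate.txt > final_results.txt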