Difference between revisions of "User manual cod nodes old"

Revision as of 12:36, 3 March 2014

This document describes how to use the high-performance compute nodes of the cod group at CEES. We have the following resources available:

cod1: 24 CPUs, 128 GB of RAM, ~1 TB disk space --> mainly for mysql databases
cod2: 24 CPUs, 128 GB of RAM, ~1.3 TB disk space
cod3: 64 CPUs, 512 GB of RAM, ~24 TB disk space
cod4: 64 CPUs, 512 GB of RAM, ~24 TB disk space


General use

Getting access

Provide Lex Nederbragt with your UiO username.


Mailing list

If you're not already on it, get subscribed to the cod-nodes mailing list: https://sympa.uio.no/bio.uio.no/subscribe/cod-nodes. We use this list to distribute information on the use of the CEES cod nodes.

If you intend to use one of the nodes for an extended period of time, please send an email to this list!


Logging in

ssh username@cod1.uio.no

When you are on the UiO network it is enough to write ssh cod1

Check that you are a member of the group 'seq454' by simply typing:

groups

If 'seq454' is not listed, please contact Lex Nederbragt.
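
If you want to do the same check from within a script, a minimal sketch could look like this. The helper name in_group is made up for this example; it only relies on the standard id, tr and grep commands.

```shell
# Hypothetical helper: succeeds if the current user is a member of the given group.
in_group() {
    # id -nG prints all group names of the current user on one line
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if in_group seq454; then
    echo "You are a member of seq454"
else
    echo "seq454 is not listed, please contact Lex Nederbragt"
fi
```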

TIP: using screen

After starting a long-running job, you cannot close the terminal window, or the job will be cancelled. Instead, run the job from within a 'screen': type

screen

  • You have now started a 'new' terminal; start your job in it
  • Press ctrl-a-d, that is, the CTRL key together with the 'a' key, followed by the 'd' key
  • Now you're back in the terminal where you started. You can close this terminal; the one behind the screen still exists and keeps running

To get back into the now hidden one, type

screen -rd

Data and software

Disks

All nodes have the 'abel' (the UiO supercomputer cluster) disks and your abel home area mounted. So, all the files located in /projects are available on the cod nodes, see below. In addition, the nodes have local disks, currently:
/data --> for permanent files, e.g. input to your program
/work --> working area for your programs

CEES project data

The CEES projects are organised as follows:

/projects/cees is the main area. Access is for those in the seq454 unix user group (the name is a relic from when we first started using this area). Please check that files and folders you create have the right permissions:

chgrp -R seq454 yourfolder
chmod -R 770 yourfolder
chmod -R g+s yourfolder

The last command sets the setgid bit, which ensures that new files and folders created inside yourfolder belong to the seq454 group as well.
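
To see the effect of these commands, you can try them on a throw-away folder first. The following is only a sketch: a temporary directory stands in for a folder under /projects/cees, the variable name perm_demo is made up, and chgrp is left out because the seq454 group only exists on our systems. It assumes Linux, as on the cod nodes.

```shell
# Sketch only: work in a temporary directory instead of /projects/cees
perm_demo=$(mktemp -d)
mkdir "$perm_demo/yourfolder"

# Same permission commands as above (without chgrp)
chmod -R 770 "$perm_demo/yourfolder"
chmod -R g+s "$perm_demo/yourfolder"

# A subfolder created afterwards gets the group and the setgid bit of its parent
mkdir "$perm_demo/yourfolder/sub"
ls -ld "$perm_demo/yourfolder" "$perm_demo/yourfolder/sub"

# test -g is true when the setgid bit is set
if [ -g "$perm_demo/yourfolder/sub" ]; then
    echo "setgid bit inherited by the new subfolder"
fi
```

Note that the read/write permissions of new files are not set by g+s but by your umask, see 'Setting up your environment'.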

It is possible to restrict access to a subgroup of those that are in the seq454 group; please ask Lex Nederbragt.

Folders in /projects/cees:

projects --> finished projects, data belonging to an (accepted) publication
databases --> permanent files, e.g. reference genomes
in_progress --> the main working area. Here you store data for unfinished projects
bin --> programs and scripts are located here
lib --> needed by certain programs
src --> source files for some of the programs in the bin folder
scripts --> for home-made scripts
exchange --> for exchanging files with non-members

TIP: see https://wiki.uio.no/mn/bio/cees-bioinf/index.php/Organising_your_work for advice on how to set up a folder for your project in /projects/cees/in_progress

Choosing where to work with and store your data

  • reading and writing data to and from /data and /work will be much faster and more efficient than to /projects/cees
  • data on /projects/cees is backed up by USIT, but NOT data on /data and /work

This leads to the following strategy for how to choose which disk to use:

  • for something short and quick, you can directly work on data in /projects/cees
  • for a long-running program, or one that generates a lot of data over a long time, use the locally attached /data and /work
  • once the long-running job is done, you can move the data you want to keep to /projects/cees
  • NOTE: having your program write a lot over a long time to a file on /projects/cees causes problems for the backup system, as the file may be changed during backup
  • for long-term storage of data you do not need regular access to, please use the norstore allocation (ask Lex Nederbragt)
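
The strategy above could be sketched as follows. This is an illustration only: the two temporary directories stand in for a folder on /work and a project folder under /projects/cees/in_progress, and the 'job' is a single echo command.

```shell
# Stand-ins: on the cod nodes WORK would be a folder on /work and
# KEEP a project folder under /projects/cees/in_progress
WORK=$(mktemp -d)
KEEP=$(mktemp -d)

# 1. Run the long-running job with its output on the local disk
( cd "$WORK" && echo "result of a long-running job" > output.txt )

# 2. When the job is done, copy what you want to keep to the backed-up area
cp -a "$WORK/output.txt" "$KEEP/"

# 3. Remove the local copy only after checking that the copy is identical
if cmp -s "$WORK/output.txt" "$KEEP/output.txt"; then
    rm "$WORK/output.txt"
    echo "output.txt moved to $KEEP"
fi
```

Copying first and deleting only after the comparison avoids losing data if the copy is interrupted, since /data and /work are not backed up.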

Software

  • locally installed programs are in /projects/cees/bin
  • much software is available through the 'module' system, see this manual.
  • we can make our own modules, see the instructions (forthcoming)
  • to use our local modules, see below under 'setting up your environment'
  • we can add our own local python and perl modules, please ask

NOTE: all nodes run Red Hat Enterprise Linux version 6, except cod2, which uses version 5 (normally, you will not notice this)

Setting up your environment

  • to have access to the commonly used programs and scripts, add the /projects/cees/bin folder to your path
  • to help with automatically setting permissions for new files and folders, use the 'umask 0002' command
  • to use our own modules, add the path to local modules

In order to achieve all this, include the following lines in your ~/.bash_login (/usit/abel/u1/username/.bash_login) file:

export PATH=/projects/cees/bin:$PATH
umask 0002
module use /projects/cees/bin/modules

Please create the file if it doesn't already exist.
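
To check what the first two lines do, you could run something like the sketch below in a terminal. The temporary directory and the variable name umask_demo are only there for the illustration; the 'module use' line is left out because the module command is only available on our systems.

```shell
export PATH=/projects/cees/bin:$PATH
umask 0002

# Check that the folder is now part of the search path
case ":$PATH:" in
    *:/projects/cees/bin:*) echo "PATH includes /projects/cees/bin" ;;
esac

# With umask 0002, new files and folders are writable for the group
umask_demo=$(mktemp -d)
touch "$umask_demo/newfile"
mkdir "$umask_demo/newdir"
ls -ld "$umask_demo/newfile" "$umask_demo/newdir"
```

With this umask, the new file should be listed with -rw-rw-r-- and the new folder with drwxrwxr-x, so others in the group can write to them.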

Note to SLURM users

If you are used to submitting jobs through a SLURM script, note that this will not work on the cod nodes. Here you have to give the command directly on the command line.


Job scripts

You can use a job script: collect a bunch of commands and put them in an executable file. Run the commands with

source yourcommands.sh
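
As an illustration, a small job script could be set up and run as sketched below. The folder, the file names and the sort step are made up for this example, and '.' is used as the portable spelling of 'source'.

```shell
# Hypothetical example: a temporary folder holds the input, the script and the log
jobdir=$(mktemp -d)
printf 'b\na\n' > "$jobdir/input.txt"

# Write the job script; the quoted 'EOF' keeps $jobdir unexpanded until the script runs
cat > "$jobdir/yourcommands.sh" <<'EOF'
echo "step 1: sorting input"
sort "$jobdir/input.txt" > "$jobdir/sorted.txt"
echo "step 2: done"
EOF

# Run all commands in the current shell and keep the messages in a log file
. "$jobdir/yourcommands.sh" > "$jobdir/job.log" 2>&1
cat "$jobdir/job.log"
```

Because the commands run in your current shell, the script can use variables you have already set, but an 'exit' inside the script would also end your session.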

Local services

cod1

  • mysql
  • the stacks program for analysis of RAD tag data

cod2

  • the smrtportal software for secondary analysis of PacBio runs