Difference between revisions of "Abel use old"
Revision as of 17:09, 18 March 2015
Introduction
We have been given a large allocation on Abel for computational work. This page explains how to get access and start using the resources. All use of Abel needs to draw CPU hours from an allocation.
Mailing list
If you're not already subscribed, join the appropriate mailing list. We use this list to distribute information on the use of the CEES HPC resources, both our own nodes and the CPU allocation on Abel. See the main wiki page, then come back here.
Getting access to CPU hours
See getting access.
Using Abel
Interactive login
See also here.
ssh abel.uio.no
Getting a single cpu for 11 hrs
qlogin --account nn9244k --nodes 1 --ntasks-per-node 1
Same, for 24 hrs
qlogin --account nn9244k --nodes 1 --ntasks-per-node 1 --time 24:00:00
NOTE: you are sharing the node with others, so do not use more CPUs than the number you asked for.
NOTE: a pipeline of Unix commands may use one CPU per command:
grep something somefile | sort | uniq -c
This may use three cpus!
Getting a whole node with 16 CPUs and 64 GB RAM:
qlogin --account nn9244k --nodes 1 --ntasks-per-node 16 --time 24:00:00
Even though each node has 16 CPUs, hyperthreading lets you run up to 32 processes simultaneously.
You have a large work area available as well:
echo $SCRATCH
cd $SCRATCH
Running
squeue -u USERNAME
will tell you the job ID. The work area is
/work/jobID.d
NOTE: all data in this area is deleted once you log out.
Quitting:
logout (or ctrl-d)
SLURM scripts
More information is coming; until then, see here.
Temporary, fast access disk space on Abel
From the Abel newsletter #3:
Update on Abel scratch file-system usage
While a job runs, it has access to a temporary scratch directory on the shared file system /work. The directory is individual for each job, is automatically created, and is deleted when the job finishes (or gets requeued). There is no backup of this directory. The name of the directory is stored in the environment variable $SCRATCH, which is set within the job script. If your job is I/O intensive, we strongly recommend copying its work files to $SCRATCH and running the program there.
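The recommendation above (copy the work files to $SCRATCH, run there, copy the results back) could be sketched roughly as follows. This is only an illustration: the function name run_in_scratch, the file names, and the sort | uniq -c placeholder command are made up for the example, and the fallback to a temporary directory when $SCRATCH is unset is an assumption added so the sketch can be tried outside a job.

```shell
#!/bin/bash
# Sketch: stage an input file to scratch, process it there, copy the result back.
# Arguments: $1 = input file, $2 = output file (hypothetical names).
run_in_scratch() {
    local input="$1" output="$2"
    # $SCRATCH is set inside a job; fall back to a temp dir elsewhere (assumption).
    local workdir="${SCRATCH:-$(mktemp -d)}"

    # Stage the input onto the fast scratch area.
    cp "$input" "$workdir/" || return 1
    (
        cd "$workdir" || exit 1
        # Placeholder for the real, I/O-intensive program.
        sort "$(basename "$input")" | uniq -c > result.txt
    ) || return 1
    # Copy the result back: the scratch directory is deleted when the job finishes.
    cp "$workdir/result.txt" "$output"
}
```

In a job script this would be called as, for example, run_in_scratch input.txt result.txt. The final copy matters because nothing left in $SCRATCH survives the end of the job.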
Sometimes, one needs to use a file for several jobs, or have it available some time after the job finishes. To accommodate this need, we have now created a directory /work/users/$USER for each user, where $USER is the user's user name. The purpose of the directory is to stage files that are needed by more than one job. Files in this directory are automatically deleted after a certain time (currently 45 days). There is no backup of files in /work/users/.
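Because files in /work/users/$USER are deleted automatically, it can help to check which files are getting old. The sketch below lists files whose modification time is more than a given number of days in the past. The function name list_expiring and the 40-day default are made up for the example, and it is an assumption that the cleanup is based on modification time; the newsletter does not say which timestamp is used.

```shell
#!/bin/bash
# Sketch: list files older than a given number of days.
# Arguments: $1 = directory (default /work/users/$USER), $2 = age in days (default 40).
# Assumption: age is judged by modification time, which may differ from the real cleanup rule.
list_expiring() {
    local dir="${1:-/work/users/$USER}" days="${2:-40}"
    find "$dir" -type f -mtime +"$days" -print
}
```

Running list_expiring with no arguments checks your own staging directory; anything it prints should be copied elsewhere if it is still needed.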
SLURM tips
Jobs in our queue
squeue -A nn9244k
Listing your jobs
squeue -u username
Information on your job
scontrol show job JOBID
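The output of scontrol show job is a long list of KEY=VALUE pairs. A small filter can pull out a single field; the sketch below is one possible way to do that. The function name job_field is made up for the example, and it only handles simple values without spaces or "=" signs, such as JobState.

```shell
#!/bin/bash
# Sketch: extract one KEY=VALUE field from "scontrol show job" output read on stdin.
# Usage: scontrol show job JOBID | job_field JobState
job_field() {
    # Put each KEY=VALUE pair on its own line, then print the value of the first match.
    tr ' ' '\n' | awk -F= -v key="$1" '$1 == key { print $2; exit }'
}
```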
Cancel your job
scancel JOBID
Cancel all your jobs
scancel -u USERNAME
Running SLURM script as a shell script when not submitted through SLURM
Add these lines at the beginning of your slurm script, but after the "#SBATCH" instructions:
if [ -n "$SLURM_JOB_ID" ]; then
    # running in a slurm job
    source /cluster/bin/jobsetup
fi
Now run the script as
source script.slurm
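Put together, a script that works both ways might look like the sketch below. The #SBATCH values are placeholders rather than recommended settings, and the mode variable and the final echo are additions made only to show which branch was taken.

```shell
#!/bin/bash
#SBATCH --account=nn9244k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1G

# To a plain shell the #SBATCH lines are ordinary comments.
if [ -n "$SLURM_JOB_ID" ]; then
    # running in a slurm job
    source /cluster/bin/jobsetup
    mode="slurm job $SLURM_JOB_ID"
else
    # running directly in a shell: skip the job setup
    mode="plain shell"
fi

echo "Running as: $mode"
```

When sourced from an interactive shell, SLURM_JOB_ID is normally unset, so the job setup is skipped and the rest of the script runs unchanged.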