5. Retrieve larger data sets on MARS
Latest revision as of 15:24, 12 March 2013

Back to ECMWF overview [http://muspelheim.nilu.no/w/index.php/ECMWF]


BATCH JOBS

Data retrieval is done by submitting a shell script.

When running a program on the UNIX system, use the batch system rather than interactive mode: you submit the job with explicit commands so that it runs unattended.

nohup similarly runs a job unattended on a Unix system, but more sophisticated batch systems exist for managing jobs.
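For comparison, a hypothetical nohup one-liner (the script name and log file are illustrative) that keeps a job running after you log out:

```shell
# Run myscript.ksh unattended; stdout and stderr go to myscript.log,
# and the job keeps running after the terminal is closed.
nohup ./myscript.ksh > myscript.log 2>&1 &
```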

The batch system currently available on ECgate and HPCF is called LoadLeveler and jobs are submitted with the command llsubmit.

The default shell is either Korn (ksh) or C-shell (csh).

CREATING A SCRIPT

Log in to ECgate.

To retrieve large numbers of files from MARS, create a ksh script in your home directory on ecgate.

For a script example see the script from Sam-Erik:

/nilu/home/sec/ecmwf/ecmwf_starg_all.ksh

At the beginning of the script, set the Batch System keywords:

#@ shell        = /usr/bin/ksh
#@ job_type     = serial
#@ job_name     = request_to_MARS
#@ output       = /scratch/ms/no/$user/MARS_out
#@ error        = /scratch/ms/no/$user/MARS_err
#@ initialdir   = /scratch/ms/no/$user/
#@ class        = normal
#@ notification = complete
#@ account_no   = spnoflex
#@ queue


Do not use environment variables such as $SCRATCH or $USER to define LL keywords: the directives are parsed before the shell runs, so the variables are not expanded.
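For example (the username in the path is illustrative), write the value out literally rather than relying on the shell:

```
#@ output = $SCRATCH/MARS_out              (wrong: $SCRATCH is not expanded)
#@ output = /scratch/ms/no/xyz/MARS_out    (correct: literal path)
```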

Then add your request information, which might look like this:

retrieve,
 class    = od,      ("Operational archive")
 stream   = oper,    ("Operational atmospheric model")
 expver   = 1,       ("Experiment version"; always use 1)
 date     = -1,      ("Analysis/Forecast base date"; date = -n means n days before today)
 time     = 00:00,   ("Analysis/Forecast base time")
 step     = 0/to/72/by/6,
 type     = cf,
 levtype  = pl,
 levelist = 100/150/200/250/300/400/500/700/850/925/1000,
 param    = 129.128/130.128/131.128/132.128/133.128,
 grid     = 0.5/0.5,
 area     = 65/0/55/20,
 target   = "ecmwf_data.grib"  (Output file containing the data)

A summary of MARS keywords can be found here:

http://www.ecmwf.int/publications/manuals/mars/guide/MARS_keywords.html


To transfer files to zardoz, add the following to your ksh script:

ectrans -remote my_ms_association@genericFtp \
-source ecmwf_data.grib \
-mailto user@nilu.no \
-onfailure \
-remove \
-verbose
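When retrieving fields for several dates, the request and transfer can be wrapped in a loop: for each date, write a small request file, run mars on it, and send the resulting file with ectrans. A minimal ksh sketch (dates and file names are illustrative, and the request is abbreviated; add the remaining keywords from the example above):

```shell
# Sketch: one MARS request file per date; each request is executed and
# its output transferred before the next date is processed.
for cdate in 20130301 20130302 20130303; do
    req=ct1ed_$cdate                      # request file for this date
    cat > "$req" <<EOF
retrieve,
 class   = od,
 stream  = oper,
 expver  = 1,
 date    = $cdate,
 time    = 00:00,
 target  = "ecmwf_ctl3d_$cdate.grib"
EOF
    # Run the request and transfer the result (only possible on ecgate,
    # where the mars and ectrans commands exist):
    if command -v mars >/dev/null 2>&1; then
        mars "$req"
        ectrans -remote my_ms_association@genericFtp \
                -source "ecmwf_ctl3d_$cdate.grib" \
                -remove
    fi
done
```

Keeping one request file per date also leaves a record of exactly what was asked of MARS for each day.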


Instead of the "retrieve" keyword in the above script you can use other keywords to "read", "compute" or "write" data. As a rule, you should have as few retrieval requests as possible, with each request retrieving as much data as possible.
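For instance, a "read" request works on a file already on disk instead of going back to the archive. A sketch (file names follow the example above; the parameter choice is illustrative), extracting one field from the retrieved file:

```
read,
 source = "ecmwf_data.grib",
 param  = 130.128,
 target = "ecmwf_temperature.grib"
```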

SUBMIT YOUR JOB

Submit your job as a batch job to LoadLeveler.

To submit your script:

llsubmit myscript


MONITOR YOUR JOB

llq -u <UserId>   To view where your job is in the queue
llq -l <jobId>    To get a detailed description of a job
llq -s <jobId>    To determine why the job is not running
llcancel <jobId>  To cancel your job


See man llq for more options