Retrieve larger data sets from MARS on ECgate or HPCF
Back to ECMWF overview

Updated by nik: 12 March 2013


To retrieve large numbers of files from MARS you need to log in to the ECgate server and submit jobs in the form of scripts.


BATCH JOBS

The retrieval of data from MARS is done through a submission of a shell-script.

When running a program on the UNIX system, use the batch system (not interactive mode). That means you submit the job with explicit commands so that it runs unattended under Unix.

nohup is a simple way to run a job unattended on a Unix system, but more sophisticated batch systems exist for handling jobs.
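
For illustration (the script name myscript.ksh is just a placeholder), a job can be detached from the terminal with nohup like this:

# Run detached from the terminal; stdout/stderr go to myscript.log
nohup ./myscript.ksh > myscript.log 2>&1 &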

The batch system currently available on ECgate and HPCF is called LoadLeveler and jobs are submitted with the command llsubmit.


OBS OBS! From June 2013 the batch system on ECgate will change from LoadLeveler to SLURM, and jobs will then be submitted with the command sbatch.

This page should be updated to contain the commands for the new batch system when it is up and running.
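
As a rough sketch of what the SLURM counterparts of the LoadLeveler keywords below might look like (these are standard sbatch directives, but the ECgate-specific values are not confirmed here): job_name maps to --job-name, output/error to --output/--error (with %j expanding to the job ID), and notification to --mail-type/--mail-user. Submission would then be sbatch myscript.

#SBATCH --job-name=request_to_MARS
#SBATCH --output=request_to_MARS.%j.out
#SBATCH --error=request_to_MARS.%j.err
#SBATCH --mail-type=END
#SBATCH --mail-user=<userId>@ecmwf.int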

CREATING A SCRIPT

Log in to ECgate.

Create a ksh script in your home directory on ECgate.

(The default shell is either Korn (ksh) or C-shell (csh)).


In the beginning of the script, set the Batch System keywords:

#@ shell        = /usr/bin/ksh                   (Specify the shell)
#@ job_type     = serial                         (Indicates that this is a serial job)
#@ job_name     = request_to_MARS                (Name of job)
#@ initialdir   = /scratch/ms/no/sb9/test/       (Pathname of the initial working directory)
#@ output       = $(job_name).$(jobid).out       (*.out file)
#@ error        = $(job_name).$(jobid).err       (Error file)
#@ class        = normal                         (Indicates the priority of the job, usually "normal")
#@ notification = complete                       (Sends notification on completion)
#@ notify_user  = <userId>@ecmwf.int             (Defaults to your userId; change if needed)
#@ account_no   = spnoflex                       (FLEXPART account)
#@ queue                                         (Indicates the end of the keywords, mandatory)

OBS: Do not use environment variables like $USER or $SCRATCH in the LoadLeveler keywords!


Then add one or more requests which might look like this:

retrieve,
class    = od,                                             ("Operational archive")
stream   = oper,                                           ("Which model forecast" - here: the operational atmospheric model)
expver   = 1,                                              ("Experiment version"; always use 1, which is the verified version)
date     = 20130312,                                       ("Specifies the Analysis/Forecast base date")
time     = 0,                                              ("Specifies the Analysis/Forecast base time")
step     = 0/to/72/by/6,                                   ("Specifies the Analysis/Forecast time steps to retrieve data for, counted from the base time")
type     = an,                                             ("Type of field"; Forecast=fc, Analysis=an)
levtype  = pl,                                             ("Type of level"; pressure level=pl, model level=ml, surface=sfc...)
levelist = 100/150/200/250/300/400/500/700/850/925/1000,   ("Levels for the specified levtype"; can also be specified as "all")
param    = 129.128/130.128/131.128/132.128/133.128,        ("Meteorological parameters to retrieve"; can be specified in various ways, e.g. t, temperature, 130, 130.128)
grid     = 0.5/0.5,                                        ("Post-processing": the data are interpolated to a grid with these lat/long increments)
target   = "ecmwf_data.grib"                               ("Storage": pathname where the retrieved data are stored)

You can repeat this retrieve section; the values are inherited by the next retrieval section, so in the second section you only need to specify the keywords you want to change.
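
As a minimal sketch of how this might look in the ksh script, with the request fed to the mars command through a here-document (the second retrieve only overrides date and target; levels, parameter and file names are illustrative placeholders):

mars <<EOF
retrieve,
class    = od,
stream   = oper,
expver   = 1,
date     = 20130312,
time     = 0,
type     = an,
levtype  = pl,
levelist = 500,
param    = 130.128,
grid     = 0.5/0.5,
target   = "t500_20130312.grib"
retrieve,
date     = 20130313,
target   = "t500_20130313.grib"
EOF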


A summary of MARS keywords can be found here:

http://www.ecmwf.int/publications/manuals/mars/guide/MARS_keywords.html


The GRIB data in MARS are in spherical harmonics for upper-air fields, but by specifying grid one can interpolate the data to a lon/lat grid, or by specifying gaussian one can interpolate the data to a regular or reduced Gaussian grid.
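
A hedged sketch of the Gaussian variant, assuming the classic MARS convention where gaussian selects the grid type and grid then takes the Gaussian number (here an N128 grid) instead of lat/long increments; i.e. replace the grid = 0.5/0.5 line in the request above with:

gaussian = regular,
grid     = 128,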


To transfer files to zardoz, add the following to your ksh script:

ectrans -remote my_ms_association@genericFtp \
-source ecmwf_data.grib \
-mailto user@nilu.no \
-onfailure \
-remove \
-verbose

RETRIEVE, LIST, READ, COMPUTE and WRITE CALLS

Instead of the retrieve keyword in the above script you can use other keywords to list, read, compute or write data.

list     Lists the data in the archive (metadata). See amount of data, number of fields, number of tapes. Alternative to the archive catalogue on web-MARS.
read     Reads fields from a local disk/file (already retrieved data). Used to filter/manipulate already retrieved data. Need to specify source.
compute  Performs computation on GRIB fields. Need to specify formula, e.g. formula="sqrt(u*u + v*v)" to calculate wind speed from the u and v velocities.
write    Writes the computed data to a target GRIB file.


With these keywords it is useful to use fieldset as temporary storage for further processing: it sets a variable that can be referenced in a later request.

At the end of the call to MARS, all fieldsets are released.
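
A sketch of how these pieces fit together, computing wind speed from the u and v velocity components (params 131.128 and 132.128) via fieldsets; the date, level and file name are illustrative:

retrieve,
class    = od,
stream   = oper,
expver   = 1,
date     = 20130312,
time     = 0,
type     = an,
levtype  = pl,
levelist = 500,
param    = 131.128,
fieldset = u
retrieve,
param    = 132.128,
fieldset = v
compute,
formula  = "sqrt(u*u + v*v)",
fieldset = speed
write,
fieldset = speed,
target   = "windspeed.grib"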


As a rule, you should have as few retrieve calls as possible, and have each retrieval call collect as much data as possible.

This is to limit the number of calls to the archive, given the architectural structure of the archive. Some data are stored on tape, and these are collected "manually".

Accessing the same tape multiple times, in the form of many small retrieval commands, is a large effort.

You can collect a lot of data in a single retrieve call and subsequently split it into multiple output files with the read/compute/write commands.
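
For example, a sketch of splitting the file retrieved above into one file per parameter with read (the target names are placeholders):

read,
source = "ecmwf_data.grib",
param  = 130.128,
target = "temperature.grib"
read,
source = "ecmwf_data.grib",
param  = 129.128,
target = "geopotential.grib"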

However, to retrieve e.g. ERA-Interim data for a full year, it would be most appropriate to split the retrieval up into 12 calls, one for each month (see the sketch after this list), considering

- the architecture of MARS
- volume of data extracted
- restrictions on disk space
- restart availability
- queuing (the larger the request the slower)
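
A sketch of such a monthly split, assuming ERA-Interim analyses are in class ei and that a date range retrieves one field per day (levels, parameter and target names are illustrative):

retrieve,
class    = ei,
stream   = oper,
expver   = 1,
type     = an,
levtype  = pl,
levelist = 500,
param    = 130.128,
time     = 0,
grid     = 0.5/0.5,
date     = 20120101/to/20120131,
target   = "ei_201201.grib"
retrieve,
date     = 20120201/to/20120229,
target   = "ei_201202.grib"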


By optimizing your request you can jump forward in the queue!

SUBMIT YOUR JOB

Submit your job as a batch job to LoadLeveler.

To submit your script:

llsubmit myscript


MONITOR YOUR JOB

llq -u <userId>     To view where your job is in the queue
llq -l <jobId>      To get a detailed description of a job
llq -s <jobId>      To determine why the job is not running
llcancel <jobId>    To cancel your job


See man llq for more options.
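
Once the switch to SLURM happens, the rough equivalents would be (these are standard SLURM commands; the ECgate-specific configuration is not confirmed here):

squeue -u <userId>          To view your jobs in the queue
scontrol show job <jobId>   To get a detailed description of a job
scancel <jobId>             To cancel your job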