Retrieve large data sets from MARS with ecmwfapi
ecmwfapi is a Python interface that lets you retrieve data from ECMWF directly on your own computer (it does not need to run at ECMWF). To use it you need:
- to set up your environment for Python
- to install your ECMWF API key
The easiest way to set up your environment is to load the flexpart modulefile:
module load flexpart
Note: if you want to run WRF instead, load wrf instead of flexpart:
module load wrf
If you have any problems with your environment, look here.
To install your API key:
To access ECMWF you need an API key. First log in at https://apps.ecmwf.int/auth/login/ and then retrieve your key at https://api.ecmwf.int/v1/key/. This requires an account on the ECMWF web site; if you don't have one, please self-register at https://apps.ecmwf.int/registration/
Copy the information shown on the key page and paste it into the file $HOME/.ecmwfapirc:
{
    "key"   : "XXXXXXXXXXXXXXXXXXXXXX",
    "email" : "john.smith@example.com"
}
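Before the first retrieval it can help to check that the key file is well-formed. The following is a minimal sketch using only the Python standard library. The helper name check_ecmwfapirc is hypothetical (it is not part of ecmwfapi), and it only assumes that the file holds the "key" and "email" fields shown above.

```python
import json
import os


def check_ecmwfapirc(path):
    """Return a list of problems found in an API key file (empty if none)."""
    if not os.path.isfile(path):
        return ["file not found: %s" % path]
    try:
        with open(path) as handle:
            settings = json.load(handle)
    except ValueError as error:  # raised by json for malformed content
        return ["not valid JSON: %s" % error]
    if not isinstance(settings, dict):
        return ["expected a JSON object with 'key' and 'email'"]
    # Assumption: only the two fields shown in the example file are checked.
    return ["missing or empty field: %s" % name
            for name in ("key", "email") if not settings.get(name)]


if __name__ == "__main__":
    problems = check_ecmwfapirc(os.path.expanduser("~/.ecmwfapirc"))
    print("\n".join(problems) if problems else "key file looks fine")
```

The check says nothing about whether the key itself is still valid; only a real request to ECMWF can confirm that.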
If you need to retrieve data to run your model on abel, you should copy this file to abel too.
Example
#!/usr/bin/env python
#
# (C) Copyright 2012-2013 ECMWF.
#
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation nor
# does it submit to any jurisdiction.
#
from ecmwfapi import ECMWFDataServer

# To run this example, you need an API key
# available from https://api.ecmwf.int/v1/key/

server = ECMWFDataServer()

current_date=20120801
server.retrieve({
    'dataset' : "interim",
    'date' : "%s"%(current_date),
    'time' : "00",
    'step' : "0",
    'stream' : "oper",
    'levtype' : "sfc",
    'levelist' : "all",
    'type' : "an",
    'class' : "ei",
    'grid' : "128",
    'param' : "148",
    'target' : "ERA_148_%s.grb"%(current_date),
})
Specifying "class" is not enough: you also need to set "dataset" (see below).
To run one of these scripts, save it to a file and type ./<filename>.py (you may first have to make it executable with chmod u+x <filename>.py).
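To fetch the same field for several days, the request can be built in a loop. The following is a minimal sketch modelled on the example above. The function names daily_requests and download are hypothetical, and the request keys are copied from the example rather than taken from the MARS documentation.

```python
from datetime import date, timedelta


def daily_requests(first_day, number_of_days, param="148"):
    """Build one request dictionary per day, modelled on the example above."""
    requests = []
    for offset in range(number_of_days):
        stamp = (first_day + timedelta(days=offset)).strftime("%Y%m%d")
        requests.append({
            'dataset': "interim",   # required in addition to 'class'
            'class': "ei",
            'stream': "oper",
            'type': "an",
            'levtype': "sfc",
            'date': stamp,
            'time': "00",
            'step': "0",
            'grid': "128",
            'param': param,
            'target': "ERA_%s_%s.grb" % (param, stamp),
        })
    return requests


def download(requests):
    """Send each request to ECMWF (needs ecmwfapi and a valid API key)."""
    # Imported here so that building the requests works without ecmwfapi.
    from ecmwfapi import ECMWFDataServer
    server = ECMWFDataServer()
    for request in requests:
        server.retrieve(request)


# Show what would be requested, without contacting ECMWF.
for request in daily_requests(date(2012, 8, 1), 3):
    print("%s -> %s" % (request['date'], request['target']))
```

Calling download(daily_requests(date(2012, 8, 1), 3)) would then perform the retrievals, one target file per day.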
Script to specify domain and time period
A more comprehensive download script is shown below. To run it, specify the start and end dates on the command line:
python wrf_era_interim.py --start_year 1992 --end_year 1992 # one year
or
./wrf_era_interim.py --start_year 1992 --end_year 1992 --start_month 01 --end_month 01 # one month
./wrf_era_interim.py --start_year 1992 --end_year 1992 --start_month 01 --end_month 01 --start_day 01 --end_day 05
or, if you want the data in one file for each day:
for i in {1..31}; do ./wrf_era_interim.py --start_year=1992 --end_year=1992 --start_month=01 --end_month=01 --start_day="$i" --end_day="$i"; done
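When no days are given, the script works out the first and last day of each month itself. The following is a minimal sketch of that date logic in isolation. The function name month_ranges is hypothetical, and like the script it applies the same month range to every year.

```python
import calendar


def month_ranges(start_year, end_year, start_month=1, end_month=12):
    """Yield (first_day, last_day) as YYYYMMDD strings, one pair per month."""
    for year in range(start_year, end_year + 1):
        for month in range(start_month, end_month + 1):
            # monthrange returns (weekday of the 1st, number of days in month)
            last_day = calendar.monthrange(year, month)[1]
            yield ("%04d%02d01" % (year, month),
                   "%04d%02d%02d" % (year, month, last_day))


# Print the 'date' strings that would go into the requests.
for first, last in month_ranges(1992, 1992, 1, 3):
    print("%s/to/%s" % (first, last))
```

Using calendar.monthrange means leap years need no special handling: February 1992 ends on the 29th.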
The script writes a separate file for each request (up to four if you download both forecast and analysis fields for both surface and pressure-level parameters; the script as shown writes three). To combine them into one file, concatenate them with cat file1.grb file2.grb > fileall.mars, for example:
cat an_sfc_19920101_19920131.grb an_pl_19920101_19920131.grb fc_sfc_19920101_19920131.grb > ma199201.mars
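If the concatenation has to happen inside a Python workflow rather than in the shell, the same byte-for-byte result can be had with the standard library. This is a minimal sketch, and the function name concatenate is hypothetical.

```python
import shutil


def concatenate(sources, target):
    """Append the bytes of each source file to target, in order (like cat)."""
    with open(target, "wb") as out:
        for name in sources:
            with open(name, "rb") as part:
                # Copies in chunks, so large GRIB files are not read into memory.
                shutil.copyfileobj(part, out)
```

For the month above this would be called as concatenate(["an_sfc_19920101_19920131.grb", "an_pl_19920101_19920131.grb", "fc_sfc_19920101_19920131.grb"], "ma199201.mars").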
Then, make sure that all pressure levels are present in the .mars file (otherwise, real.exe will give an error message about num_metgrid_levels):
grib_ls ma199201.mars # or g1print ma199201.mars
The number of metgrid levels should be 38 (starting at 1,2,3,5,7,10,20... and ending at 950,975,1000 hPa)
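Once the levels have been read off the grib_ls output, a short check can point out which ones are missing. The sketch below rests on an assumption not stated in the text above: that the expected set is the 37 standard ERA-Interim pressure levels, which together with the surface level would give the 38 metgrid levels. The names ERA_INTERIM_LEVELS and missing_levels are hypothetical, and the list of found levels has to be supplied by hand.

```python
# Assumption: the 37 standard ERA-Interim pressure levels, in hPa.
ERA_INTERIM_LEVELS = [
    1, 2, 3, 5, 7, 10, 20, 30, 50, 70,
    100, 125, 150, 175, 200, 225, 250,
    300, 350, 400, 450, 500, 550, 600, 650, 700, 750,
    775, 800, 825, 850, 875, 900, 925, 950, 975, 1000,
]


def missing_levels(found):
    """Return the expected pressure levels that are absent from found."""
    present = set(int(level) for level in found)
    return [level for level in ERA_INTERIM_LEVELS if level not in present]


# Example: a file where the 775 hPa level was not retrieved.
levels_in_file = [level for level in ERA_INTERIM_LEVELS if level != 775]
print(missing_levels(levels_in_file))
```

An empty result means that every expected pressure level was found.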
#!/usr/bin/env python
from ecmwfapi import ECMWFDataServer
import calendar
import os
import shutil
from optparse import OptionParser


def main():
    usage = "usage: %prog --start_year YYYY --end_year YYYY [--start_month MM] [--end_month MM] [--start_day DD] [--end_day DD]"
    parser = OptionParser(usage=usage)
    parser.add_option("-s", "--start_year", dest="start_year",
                      help="start year YYYY", metavar="start_year", type=int)
    parser.add_option("-e", "--end_year", dest="end_year",
                      help="end year YYYY", metavar="end_year", type=int)
    parser.add_option("--start_month", dest="start_month",
                      help="start month MM", metavar="start_month", type=int)
    parser.add_option("--end_month", dest="end_month",
                      help="end month MM", metavar="end_month", type=int)
    parser.add_option("--start_day", dest="start_day",
                      help="start day DD", metavar="start_day", type=int)
    parser.add_option("--end_day", dest="end_day",
                      help="end day DD", metavar="end_day", type=int)
    (options, args) = parser.parse_args()

    # Only the start year is mandatory; everything else has a default.
    if not options.start_year:
        parser.error("start year must be specified!")
    else:
        start_year=options.start_year
    if not options.end_year:
        end_year=start_year
    else:
        end_year=options.end_year
    if not options.start_month:
        start_month=1
    else:
        start_month=options.start_month
    if not options.end_month:
        end_month=12
    else:
        end_month=options.end_month

    server = ECMWFDataServer()
    print(start_year)
    print(end_year)
    for year in range(start_year, end_year+1):
        print('YEAR %s' % year)
        for month in range(start_month,end_month+1):
            # Default to the whole month when no days are given.
            if not options.start_day:
                sdate="%s%02d01"%(year,month)
            else:
                sdate="%s%02d%02d"%(year,month,int(options.start_day))
            if not options.end_day:
                lastday=calendar.monthrange(year,month)[1]
                edate="%s%02d%s"%(year,month,lastday)
            else:
                edate="%s%02d%02d"%(year,month,int(options.end_day))
            print('get data from %s to %s (YYYYMMDD)' % (sdate, edate))

            # Analysis fields, surface
            server.retrieve({
                'dataset' : "interim",
                'date' : "%s/to/%s"%(sdate,edate),
                'time' : "00/06/12/18",
                'step' : "00",
                'stream' : "oper",
                'levtype' : "sfc",
                'type' : "an",
                'class' : "ei",
                'grid' : "0.5/0.5",
                'param' : "165/166/167/168/134/151/235/31/34/33/141/139/170/183/236/39/40/41/42",
                #### 'param' : "167.128/168.128/165.128/166.128/134.128/151.128/39.128/40.128/41.128/42.128/139.128/170.128/183.128/236.128/31.128/172.128/129.128/235.128/141.128/34.128",
                'area' : "70./-9./46./37.",
                'target' : "an_sfc_%s_%s.grb"%(sdate,edate),
            })

            # Analysis fields, pressure levels
            server.retrieve({
                'dataset' : "interim",
                'date' : "%s/to/%s"%(sdate,edate),
                'time' : "00/06/12/18",
                'step' : "00",
                'stream' : "oper",
                'levtype' : "pl",
                'levelist' : "all",
                'type' : "an",
                'class' : "ei",
                'grid' : "0.5/0.5",
                'param' : "129/130/131/132/133/157",
                #### 'param' : "130.128/131.128/132.128/157.128/129.128",
                'frame' : "OFF",
                'area' : "70./-9./46./37.",
                'target' : "an_pl_%s_%s.grb"%(sdate,edate),
            })

            # Forecast fields, surface
            server.retrieve({
                'dataset' : "interim",
                'date' : "%s/to/%s"%(sdate,edate),
                'time' : "00/12",
                'step' : "06/12",
                'stream' : "oper",
                'levtype' : "sfc",
                'type' : "fc",
                'class' : "ei",
                'grid' : "0.5/0.5",
                'param' : "169.128/175.128/228.128",
                'area' : "70./-9./46./37.",
                'target' : "fc_sfc_%s_%s.grb"%(sdate,edate),
            })


if __name__ == "__main__":
    main()
Choosing a dataset
The "dataset" parameter is one of:
Dataset | Description | Licence |
---|---|---|
era15 | ECMWF Global Reanalysis Data - ERA-15 (Jan 1979 - Dec 1993) | general |
era20cmv0 | ERA-20CM: Ensemble of climate model integrations (Experimental version) | general |
era40 | ECMWF Global Reanalysis Data - ERA-40 (Sep 1957 - Aug 2002) | general |
eraclim | ERA-20CM: Ensemble of climate model integrations | general |
icoads | ICOADS v2.5.1 with interpolated 20CR feedback | general |
interim | ECMWF Global Reanalysis Data - ERA Interim (Jan 1979 - present) | general |
ispd | ISPD v2.2 | general |
macc | MACC | macc |
macc_ghg_inversions | N/A | macc_ghg_inversions |
macc_nrealtime | MACC Near Real-time | macc_nrealtime |
tigge | TIGGE (THORPEX Interactive Grand Global Ensemble) | tigge |
yotc | YOTC (Year of Tropical Convection) | general |
To access these datasets, you need to agree to the corresponding terms and conditions, which can be found under the "Licence" link in the table above. See http://apps.ecmwf.int/datasets/ for the content of the datasets. The other parameters are described at: http://www.ecmwf.int/publications/manuals/mars/guide/index.html
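A mistyped dataset name is easier to catch before the request is sent. The following is a minimal sketch that encodes the table above as a dictionary. The names DATASET_LICENCES and licence_for are hypothetical, and the dictionary is only as current as the table.

```python
# Dataset name -> licence, copied from the table above.
DATASET_LICENCES = {
    "era15": "general",
    "era20cmv0": "general",
    "era40": "general",
    "eraclim": "general",
    "icoads": "general",
    "interim": "general",
    "ispd": "general",
    "macc": "macc",
    "macc_ghg_inversions": "macc_ghg_inversions",
    "macc_nrealtime": "macc_nrealtime",
    "tigge": "tigge",
    "yotc": "general",
}


def licence_for(dataset):
    """Return the licence name for a dataset, or raise ValueError if unknown."""
    try:
        return DATASET_LICENCES[dataset]
    except KeyError:
        raise ValueError("unknown dataset %r; expected one of: %s"
                         % (dataset, ", ".join(sorted(DATASET_LICENCES))))


print(licence_for("interim"))
```

The returned name tells you which terms and conditions have to be accepted before a retrieval from that dataset can succeed.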