In this chapter we will explore different ways to retrieve the ECMWF real-time open data using Python libraries. The main focus of this handbook will be on the earthkit and ecmwf-opendata packages.
The development environment setup
The tutorials use the Python programming language and its libraries:
earthkit to speed up weather and climate science workflows
ecmwf-opendata to download the ECMWF open data
xarray for multi-dimensional arrays of geospatial data
pandas to perform powerful operations on datasets
matplotlib for creating static and interactive visualizations
cartopy for cartographic visualizations
plotly for interactive data visualization
geopandas to handle geographic data of pandas objects
xagg for aggregating raster data over polygons
We will install our packages using !pip3 install <package name> commands. Jupyter Notebook runs each of these as a shell command because the line begins with an exclamation mark (!).
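Before installing anything, it can be useful to see which of the packages listed above are already present. The following is a minimal sketch using only the standard library; the helper name installed_version and the selection of packages are illustrative assumptions, and the names must be the distribution names used on PyPI (for example earthkit-data rather than earthkit.data).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI; adjust the list to the tutorial at hand.
packages = ["earthkit-data", "ecmwf-opendata", "xarray", "pandas", "matplotlib"]
missing = [name for name in packages if installed_version(name) is None]
print("Missing packages:", missing or "none")
```

Any name reported as missing can then be installed with the !pip3 install command described above.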
1. The earthkit and ecmwf-opendata packages
Here we will retrieve ECMWF real-time open data from the ECMWF Production Data Store (ECPDS).
If the packages are not installed yet, uncomment the code below and run it.
# datetime is part of the Python standard library and does not need to be installed
# !pip3 install earthkit-data ecmwf-opendata requests
from ecmwf.opendata import Client
import earthkit.data as ekd

# Download the 12-hour forecast of 2 m temperature from yesterday's 00 UTC AIFS run
client = Client(source="ecmwf")
request = {
    "date": -1,
    "time": 0,
    "step": 12,
    "type": "fc",
    "stream": "oper",
    "levtype": "sfc",
    "model": "aifs-single",
    "param": "2t",
}
client.retrieve(request, "2t.grib2")
ds_2t = ekd.from_source("file", "2t.grib2")
ds_2t.ls()

# Retrieve the u and v wind components on three pressure levels directly with earthkit
ds_uv = ekd.from_source(
    "ecmwf-open-data",
    time=12,
    param=["u", "v"],
    levelist=[1000, 850, 500],
    step=0,
)
ds_uv.ls()

# Retrieve mean sea level pressure (msl) and total precipitation (tp) from yesterday's 12 UTC run
data = ekd.from_source(
    "ecmwf-open-data",
    date=-1,
    time=12,
    step=0,
    param=["msl", "tp"],
    stream="oper",
    type="fc",
    levtype="sfc",
    model=["ifs", "aifs-single"],
)
data.describe()
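The requests above use date=-1 rather than an explicit date. Assuming that ecmwf-opendata treats 0 and negative integers as day offsets relative to today (0 for today, -1 for yesterday), the mapping can be sketched as follows. The helper resolve_relative_date is hypothetical and is not part of either package; it only illustrates the convention.

```python
import datetime


def resolve_relative_date(date, today=None):
    """Hypothetical helper: map 0, -1, -2, ... to an absolute YYYYMMDD string."""
    today = today or datetime.date.today()
    if isinstance(date, int) and date <= 0:
        # Zero or negative integers are read as offsets in days from today
        return (today + datetime.timedelta(days=date)).strftime("%Y%m%d")
    # Anything else is assumed to be an absolute date already
    return str(date)


# With "today" fixed to 9 June 2025, date=-1 resolves to 8 June 2025
print(resolve_relative_date(-1, today=datetime.date(2025, 6, 9)))  # 20250608
```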
2. The requests package
import datetime
import os

import requests
import earthkit.data as ekd

DATADIR = './'
# Today's date; a date up to four days earlier can be used instead
today = datetime.date.today().strftime('%Y%m%d')
timez = "00z/"
model = "aifs-single/"
resol = "0p25/"
stream_ = "oper"
type_ = "fc"
step = "6"
filename = f'{today}{timez[:-2]}0000-{step}h-{stream_}-{type_}.grib2'
url = f'https://data.ecmwf.int/ecpds/home/opendata/{today}/{timez}{model}{resol}{stream_}/{filename}'
target = os.path.join(DATADIR, filename)

with requests.Session() as s:
    start = datetime.datetime.now()
    # Stream the response so the file is written in chunks rather than held in memory
    response = s.get(url, stream=True)
    response.raise_for_status()
    with open(target, mode="wb") as file:
        for chunk in response.iter_content(chunk_size=10 * 1024):
            file.write(chunk)
    end = datetime.datetime.now()
    diff = end - start
print(f'The {filename} file was downloaded in {diff.total_seconds():.1f} seconds.')

data = ekd.from_source("file", target)
data.ls()
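The URL in the example above follows a fixed pattern: root, date, run time, model, resolution and stream, followed by a filename built from the same components. A minimal sketch of that pattern as a reusable function is given below. The function name opendata_url and its signature are assumptions made for illustration, and the date is an arbitrary example that may no longer be available on the server.

```python
def opendata_url(date, time, model, resol, stream, type_, step,
                 root="https://data.ecmwf.int/ecpds/home/opendata"):
    """Hypothetical helper: assemble the file URL from its components."""
    # Filename pattern: <YYYYMMDD><HH>0000-<step>h-<stream>-<type>.grib2
    filename = f"{date}{time}0000-{step}h-{stream}-{type_}.grib2"
    return f"{root}/{date}/{time}z/{model}/{resol}/{stream}/{filename}"


url = opendata_url("20250608", "00", "aifs-single", "0p25", "oper", "fc", 6)
print(url)
```

Keeping the components as separate arguments makes it straightforward to loop over several steps or run times without repeating the string formatting.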
3. The wget command-line tool
To install wget on a Debian-based Linux distribution such as Ubuntu, execute:
sudo apt-get install wget
For very large files, use the -b option to download in the background. wget then writes a wget-log file in your working directory, which you can use to check the progress and status of the download. To save the retrieved file in another directory, use the -P option.
ROOT="https://data.ecmwf.int/forecasts"
yyyymmdd="20250525"
HH="00"
model="ifs"
resol="0p25"
stream="oper"
step="24"
U="h"
type="fc"
format="grib2"
wget -P ../datadownload/ -b "${ROOT}/${yyyymmdd}/${HH}z/${model}/${resol}/${stream}/${yyyymmdd}${HH}0000-${step}${U}-${stream}-${type}.${format}"
4. The curl command-line tool
When you need to download a single field from a GRIB file, inspect the corresponding index file and look for the parameter of interest. For example, to download only the 2 m temperature at step=0h from the 00 UTC HRES forecast on 08 June 2025, find its entry in the index file:
{"domain": "g", "date": "20250608", "time": "0000", "expver": "0001", "class": "od", "type": "fc", "stream": "oper", "step": "0", "levtype": "sfc", "param": "2t", "_offset": 62556419, "_length": 660091}
Then use the values of the _offset and _length keys to calculate start_bytes and end_bytes:
start_bytes = _offset = 62556419
end_bytes = _offset + _length - 1 = 62556419 + 660091 - 1 = 63216509
ROOT="https://data.ecmwf.int/forecasts"
yyyymmdd="20250608"
HH="00"
model="ifs"
resol="0p25"
stream="oper"
step="0"
U="h"
type="fc"
format="grib2"
start_bytes=62556419
end_bytes=63216509
curl --range "${start_bytes}-${end_bytes}" "${ROOT}/${yyyymmdd}/${HH}z/${model}/${resol}/${stream}/${yyyymmdd}${HH}0000-${step}${U}-${stream}-${type}.${format}" --output 2t.grib2
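The byte-range calculation can also be done programmatically. The sketch below parses one index entry, derives the inclusive byte range from _offset and _length, and shows how the same HTTP Range request could be issued from Python. It uses the standard-library urllib.request in place of curl. The function names byte_range and download_field are assumptions made for illustration, and the index entry is an abbreviated copy of the one shown above.

```python
import json
import shutil
import urllib.request


def byte_range(index_line):
    """Return the inclusive (start, end) byte positions for one index entry."""
    entry = json.loads(index_line)
    start = entry["_offset"]
    end = start + entry["_length"] - 1
    return start, end


def download_field(url, index_line, target):
    """Fetch a single GRIB message with an HTTP Range request (not called here)."""
    start, end = byte_range(index_line)
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(request) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)


# Abbreviated index entry for 2 m temperature at step 0
index_line = '{"param": "2t", "step": "0", "_offset": 62556419, "_length": 660091}'
print(byte_range(index_line))  # (62556419, 63216509)
```

Calling download_field with the GRIB file URL, the index entry and a target filename would write only the selected message to disk, provided the file is still available on the server.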