In this chapter we will explore different ways to retrieve ECMWF real-time open data using Python libraries. The main focus of this handbook will be on the earthkit and ecmwf-opendata packages.
The development environment setup¶
The tutorials use the Python programming language and the following libraries:

- earthkit to speed up weather and climate science workflows
- ecmwf-opendata to download the ECMWF open data
- numpy for scientific computing with multi-dimensional arrays
- cfgrib to map GRIB files to netCDF
- requests (or urllib3, wget) for sending HTTP requests
- xarray for multi-dimensional arrays of geospatial data
- eccodes to decode and encode GRIB/BUFR files
- matplotlib for creating static and interactive visualizations
- cartopy for cartographic visualizations
- plotly (or metview) for interactive data visualization
We will install the packages using !pip3 install <package name> commands. Jupyter Notebooks execute these as shell commands because each command begins with the ! mark.
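For example, the core packages can be installed from a notebook cell like this (a minimal sketch; the earthkit.data module used below is distributed on PyPI as earthkit-data):

!pip3 install earthkit-data ecmwf-opendata
!pip3 install numpy xarray cfgrib eccodes requests
!pip3 install matplotlib cartopy plotly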
1. The earthkit and ecmwf-opendata packages¶
Here we will retrieve ECMWF real-time open data from the ECMWF Data Store.
from ecmwf.opendata import Client
import earthkit.data as ekd
client = Client(source="ecmwf")
request = {
    "date": -1,           # yesterday's run (0 would mean today)
    "time": 0,            # 00 UTC run
    "step": 12,           # forecast step in hours
    "type": "fc",         # forecast
    "stream": "oper",     # operational stream
    "levtype": "sfc",     # surface fields
    "model": "aifs-single",
    "param": "2t",        # 2 metre temperature
}
client.retrieve(request, "2t.grib2")
ds_2t = ekd.from_source("file", "2t.grib2")
ds_2t.ls()
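Once the file is read back with earthkit, the field values can be handed straight to the scientific stack; a minimal sketch:

values = ds_2t[0].to_numpy()        # field values as a numpy array
print(values.shape, values.mean())  # quick sanity check of the grid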
ds_uv = ekd.from_source(
    "ecmwf-open-data",
    time=12,
    param=["u", "v"],
    levelist=[1000, 850, 500],
    step=0,
)
ds_uv.ls()
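The fieldlist can then be filtered in memory; for example, a small sketch selecting the 500 hPa u-component with earthkit's sel() on GRIB metadata keys:

u500 = ds_uv.sel(shortName="u", level=500)
u500.ls()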
When we set the source to azure, the data hosted on Microsoft's Azure cloud will be accessed.
client = Client(source="azure")
request = {
    "time": 12,
    "type": "fc",
    "step": 0,
    "param": "2t",
}
# client.retrieve(request, "azure_2t_data.grib2")
# dm_2t = ekd.from_source("file", "azure_2t_data.grib2")
# dm_2t.ls()
Below are two examples of downloading data from Amazon's AWS location.
client = Client(source="aws")
request = {
    "time": 0,
    "type": "fc",
    "step": 24,
    "param": "2t",
}
client.retrieve(request, "aws_2t_data.grib2")
da_2t = ekd.from_source("file", "aws_2t_data.grib2")
da_2t.ls()
data = ekd.from_source("s3", {
    "endpoint": "s3.amazonaws.com",
    "region": "eu-central-1",
    "bucket": "ecmwf-forecasts",
    "objects": "20230118/00z/0p4-beta/oper/20230118000000-0h-oper-fc.grib2",
}, anon=True)
ds = data.to_xarray()
ds
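From here the usual xarray operations apply; for instance, listing the variable names produced by the GRIB-to-xarray conversion:

print(list(ds.data_vars))  # variable names assigned during the conversion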
2. The requests package¶
import requests
import datetime
DATADIR = './'
today = datetime.date.today().strftime('%Y%m%d')  # the current date; data up to four days before today is also available
timez = "00z/"
model = "aifs-single/"
resol = "0p25/"
stream_ = "oper"
type_ = "fc"
step = "6"
filename = f'{today}{timez[:-2]}0000-{step}h-{stream_}-{type_}.grib2'
with requests.Session() as s:
    try:
        start = datetime.datetime.now()
        response = s.get(f'https://data.ecmwf.int/ecpds/home/opendata/{today}/{timez}{model}{resol}{stream_}/{filename}', stream=True)
        if response.status_code == 200:
            with open(f'{DATADIR}{filename}', mode="wb") as file:
                for chunk in response.iter_content(chunk_size=10 * 1024):
                    file.write(chunk)
            end = datetime.datetime.now()
            diff = end - start
            print(f'The {filename} file downloaded in {diff.seconds} seconds.')
        else:
            print(f'There is no file {filename} to download (HTTP {response.status_code}).')
    except requests.exceptions.RequestException as e:
        print(f'The download of {filename} failed: {e}')
ds = ekd.from_source("file", f'{DATADIR}{filename}')
The 20250608000000-6h-oper-fc.grib2 file downloaded in 8 seconds.
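For unreliable connections, the session can also retry transient failures automatically via urllib3 (which requests uses under the hood); a minimal sketch with illustrative retry settings, reusing the variables defined above:

from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient server errors up to five times with exponential backoff
retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
with requests.Session() as s:
    s.mount("https://", HTTPAdapter(max_retries=retries))
    response = s.get(f'https://data.ecmwf.int/ecpds/home/opendata/{today}/{timez}{model}{resol}{stream_}/{filename}',
                     stream=True, timeout=60)
    response.raise_for_status()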
3. The wget command-line tool¶
To install wget on Linux, execute
sudo apt-get install wget
For extremely large files, it is recommended to use the -b option, which downloads the content in the background. A wget-log file will also appear in your working directory; it can be used to check the download progress and status. You can save the retrieved file in another directory using the -P option.
ROOT="https://data.ecmwf.int/forecasts"
yyyymmdd="20250525"
HH="00"
model="ifs"
resol="0p25"
stream="oper"
step="24"
U="h"
type="fc"
format="grib2"
wget -P ../datadownload/ -b "$ROOT/$yyyymmdd/${HH}z/$model/$resol/$stream/$yyyymmdd${HH}0000-$step$U-$stream-$type.$format"
ROOT="https://data.ecmwf.int/forecasts"
yyyymmdd="20250608"
HH="00"
model="ifs"
resol="0p25"
stream="oper"
step="0"
U="h"
type="fc"
format="grib2"
start_bytes=62556419
end_bytes=63216509
wget "$ROOT/$yyyymmdd/$HH"z"/$model/$resol/$stream/$yyyymmdd$HH"0000"-$step$U-$stream-$type.$format" --header="Range: bytes=$start_bytes-$end_bytes"
4. The curl command-line tool¶
When you need to download a single field from a GRIB file, inspect the corresponding index file and look for the parameter of interest. For example, to download only the 2m temperature at step=0h from the 00 UTC HRES forecast on 08 June 2025:
{"domain": "g", "date": "20250608", "time": "0000", "expver": "0001", "class": "od", "type": "fc", "stream": "oper", "step": "0", "levtype": "sfc", "param": "2t", "_offset": 62556419, "_length": 660091}
use the values of the _offset and _length keys to calculate start_bytes and end_bytes (note that the _offset and _length values of a specific field are different for each forecast run!):
start_bytes = _offset = 62556419
end_bytes = _offset + _length - 1 = 62556419 + 660091 - 1 = 63216509
ROOT="https://data.ecmwf.int/forecasts"
yyyymmdd="20250608"
HH="00"
model="ifs"
resol="0p25"
stream="oper"
step="0"
U="h"
type="fc"
format="grib2"
start_bytes=62556419
end_bytes=63216509
curl --range "$start_bytes-$end_bytes" "$ROOT/$yyyymmdd/${HH}z/$model/$resol/$stream/$yyyymmdd${HH}0000-$step$U-$stream-$type.$format" --output 2t.grib2
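The same single-field retrieval can also be scripted in Python; a minimal sketch assuming, as above, that the index file is published next to the GRIB file with the .index extension and holds one JSON object per line (remember that the open data is only kept online for a few days, so the date below must be adjusted):

import json
import requests

base = ("https://data.ecmwf.int/forecasts/20250608/00z/ifs/0p25/oper/"
        "20250608000000-0h-oper-fc")

# Find the 2 m temperature entry in the index file
entry = None
for line in requests.get(f"{base}.index").text.splitlines():
    field = json.loads(line)
    if field.get("param") == "2t":
        entry = field
        break

if entry is not None:
    # Byte range: start at _offset, end at _offset + _length - 1
    start = entry["_offset"]
    end = start + entry["_length"] - 1
    headers = {"Range": f"bytes={start}-{end}"}
    with open("2t.grib2", "wb") as f:
        f.write(requests.get(f"{base}.grib2", headers=headers).content)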