Skip to article frontmatterSkip to article content

Real-time Forecasts


Here we will focus on the latest forecast of 2 metre temperature of the AIFS-Single over the world over next seven days.

1. Set Up Your Environment and Find ECMWF Open Data

Open data will be downloaded from a publicly available Amazon S3 Bucket. First, the following Python libraries need to be installed in the current Jupyter kernel:

  • ecmwf-opendata to download data,

  • earthkit to analyse data,

  • xarray for multi-dimensional arrays of geospatial data,

  • pandas to perform powerful operations on datasets,

  • geopandas to handle geographic data of pandas objects,

  • xagg to aggregate gridded data to polygons, and

  • plotly for interactive data visualization.

If the packages are not installed yet, uncomment the code below and run it.

# !pip3 install earthkit ecmwf-opendata xarray pandas geopandas xagg plotly
from ecmwf.opendata import Client
import earthkit.data as ekd

import os

import xarray as xr
import pandas as pd
import geopandas as gpd
import xagg as xa
import plotly.express as px

Data and plots directories

DATADIR = './data_dir/'
os.makedirs(DATADIR, exist_ok=True)

PLOTSDIR = './plots/'
os.makedirs(PLOTSDIR, exist_ok=True)

When using the ls() method, a list of all the fields in the file we downloaded will be displayed.

filename = f"{DATADIR}2t_0-168h.grib2"

client = Client(source="aws")
client.retrieve(
    date=0,
    time=0,
    step=[step for step in range(0, 174, 6)],
    stream="oper",
    type="fc",
    levtype="sfc",
    levelist=[],
    param="2t",
    model="aifs-single",
    target=filename
    )
data = ekd.from_source("file", filename)
data.ls()
Loading...

We will extract the date of the latest forecast from our dataset.

today = data.index("date")[0]
today
20250831

Our data can be converted to Xarray using the .to_xarray() method and the longitude coordinates brought to a [-180, 180] grid.
ECMWF data are by default on a [0, 360] grid.

ds = data.to_xarray()
ds_180 = ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')

We need to convert 2-metre temperature from Kelvin to Celsius.

t2m_K = ds_180['2t']
t2m = t2m_K - 273.15
t2m = t2m.assign_attrs(t2m_K.attrs)
t2m.attrs['units'] = '° C'

The latitude and longitude columns will be renamed.

t2m = t2m.rename({'latitude':'lat','longitude':'lon'})
t2m
Loading...

2. GeoJSON data

Our GeoJSON file containing a vector map of the world can be read with the geopandas package. The sov_a3 column will be renamed.
When using the head() method, a selected number of rows n (n=5 by default) and information about the fields will be displayed.

countries = gpd.read_file(f"{DATADIR}globe.geojson")
countries = countries[['sov_a3','geometry']]
countries = countries.rename(columns={'sov_a3':'cntry_id'})
countries = countries.sort_values(by=['cntry_id'])
countries.head()
Loading...

For data visualisation purposes, we will calculate the centroid of our geometry object.

x_cntry, y_cntry = gpd.GeoSeries(countries.geometry).union_all().centroid.xy
_lat = y_cntry[0]
_lon = x_cntry[0]
_lat, _lon
(1.1627400958433622, 18.216117510970225)

3. Aggregating the data

We need to create a weight map of pixels in the gridded dataset by taking the intersect between each country polygon and all pixel polygons and compute the average area of overlap. For each country, then we will calculate the weighted average of the 2-metre temperatures. For instance, a pixel that covers a larger area of a country will have a different weight than a pixel that covers a smaller area.

# get overlap between pixels and polygons
weightmap = xa.pixel_overlaps(t2m, countries, subset_bbox=False)
# aggregate data in t2m onto polygons
aggregated = xa.aggregate(t2m, weightmap)
ds2t = aggregated.to_dataframe()
# export the data into a .csv file
ds2t.to_csv(f"{DATADIR}agg_t2m.csv")

To read our .csv file into Pandas DataFrame, we will use the read_csv() function. Then, we will sort the data by the step and cntry_id columns.

ds2t = pd.read_csv(f"{DATADIR}agg_t2m.csv")
ds2t = ds2t.sort_values(by=['step', 'cntry_id'])
ds2t.head()
Loading...

We want to display the step range as a list of numbers from 0 to 168 and thus we will update it.

step1 = ds2t.iloc[0]['step']
step1
'0 days 00:00:00'
steps = ds2t['step'].unique()
steps
array(['0 days 00:00:00', '0 days 06:00:00', '0 days 12:00:00', '0 days 18:00:00', '1 days 00:00:00', '1 days 06:00:00', '1 days 12:00:00', '1 days 18:00:00', '2 days 00:00:00', '2 days 06:00:00', '2 days 12:00:00', '2 days 18:00:00', '3 days 00:00:00', '3 days 06:00:00', '3 days 12:00:00', '3 days 18:00:00', '4 days 00:00:00', '4 days 06:00:00', '4 days 12:00:00', '4 days 18:00:00', '5 days 00:00:00', '5 days 06:00:00', '5 days 12:00:00', '5 days 18:00:00', '6 days 00:00:00', '6 days 06:00:00', '6 days 12:00:00', '6 days 18:00:00', '7 days 00:00:00'], dtype=object)
ds2t['step'] = ds2t['step'].replace({'0 days 00:00:00': '0',
                                     '0 days 06:00:00': '6',
                                     '0 days 12:00:00': '12',
                                     '0 days 18:00:00': '18',
                                     '1 days 00:00:00': '24',
                                     '1 days 06:00:00': '30',
                                     '1 days 12:00:00': '36',
                                     '1 days 18:00:00': '42',
                                     '2 days 00:00:00': '48',
                                     '2 days 06:00:00': '54',
                                     '2 days 12:00:00': '60',
                                     '2 days 18:00:00': '66',
                                     '3 days 00:00:00': '72',
                                     '3 days 06:00:00': '78',
                                     '3 days 12:00:00': '84',
                                     '3 days 18:00:00': '90',
                                     '4 days 00:00:00': '96',
                                     '4 days 06:00:00': '102',
                                     '4 days 12:00:00': '108',
                                     '4 days 18:00:00': '114',
                                     '5 days 00:00:00': '120',
                                     '5 days 06:00:00': '126',
                                     '5 days 12:00:00': '132',
                                     '5 days 18:00:00': '138',
                                     '6 days 00:00:00': '144',
                                     '6 days 06:00:00': '150',
                                     '6 days 12:00:00': '156',
                                     '6 days 18:00:00': '162',
                                     '7 days 00:00:00': '168'}, regex=True)

4. Data visualisation

The interactive plot and animation below show analysis of 2-metre temperature for today at 00 UTC.

# Specify a time step up to 168
step_n = '168'

# Plot the data
fig = px.choropleth(ds2t.loc[ds2t.step == step_n],
                        geojson=countries,
                        featureidkey='properties.cntry_id',
                        locations='cntry_id',
                        color='2t',
                        color_continuous_scale="Spectral_r",
                        range_color = [-40, 50])
fig.update_layout(title=f"AIFS: 2 metre temperature (˚C) on {today} (+{step_n}h)",
                  width=650,
                  height=420)
fig.show()
Loading...
fig = px.choropleth_map(ds2t,
                        geojson=countries,
                        featureidkey='properties.cntry_id',
                        locations='cntry_id',
                        color='2t',
                        animation_frame = 'step',
                        animation_group='cntry_id',
                        color_continuous_scale="Spectral_r",
                        range_color = [-40, 50],
                        map_style="satellite")
fig.update_layout(title=f"AIFS: 2 metre temperature (˚C) on {today} over the next seven days",
                  map_center={"lat": _lat, "lon": _lon},
                  map_zoom=0.8,
                  height=755)
fig.show()
Loading...