Building Electricity Networks#

The preparation process of the PyPSA-Eur energy system model consists of a group of snakemake rules which are briefly outlined and explained in detail in the sections below.

Not all data dependencies are shipped with the git repository. Instead we provide separate data bundles which can be obtained using the retrieve* rules (Retrieving Data). Once the necessary data has been downloaded, a base PyPSA network can be built with the following rules:

  • build_shapes generates GeoJSON files with shapes of the countries, exclusive economic zones and NUTS3 areas.

  • base_network builds and stores the base network with all buses, HVAC lines and HVDC links, and determines Voronoi cells for all substations.

The network is then simplified by preparing approximations of the network model, for which it is computationally viable to co-optimize generation, storage and transmission capacities.

  • simplify_network transforms the transmission grid to a 380 kV only equivalent network, while

  • cluster_network uses a k-means based clustering technique to partition the network into a given number of zones and then reduce the network to a representation with one bus per zone.

The simplification and clustering steps are described in detail in the paper

Then, the process continues by calculating conventional power plant capacities, potentials, and per-unit availability time series for variable renewable energy carriers and hydro power plants with the following rules:

  • build_powerplants for today’s thermal power plant capacities using powerplantmatching, allocating each power plant to its matching clustered region,

  • determine_availability_matrix for the land eligibility analysis of each cutout grid cell for PV, onshore and offshore wind,

  • build_renewable_profiles for the hourly capacity factors and installation potentials constrained by land-use in each substation’s Voronoi cell for PV, onshore and offshore wind, and

  • build_hydro_profile for the hourly per-unit hydro power availability time series.

The rules add_electricity and prepare_network then tie all the different data inputs together into a detailed PyPSA network stored in networks/base_s_{clusters}

Rule build_cutout#

Create cutouts with atlite.

For this rule to work you must have

For details on the weather data read the atlite documentation. If you need help specifically for creating cutouts the corresponding section in the atlite documentation should be helpful.


  • cutouts/{cutout}: weather data from either the ERA5 reanalysis weather dataset or SARAH-3 satellite-based historic weather data with the following structure:

ERA5 cutout:

  • (time, y, x): Surface pressure.

  • temperature (time, y, x): Air temperature 2 meters above the surface.

  • soil temperature (time, y, x): Soil temperature between 1 meter and 3 meters depth (layer 4).

  • influx_toa (time, y, x): Top of Earth’s atmosphere (TOA) incident solar radiation.

  • influx_direct (time, y, x): Total sky direct solar radiation at surface.

  • (time, y, x): Runoff (volume per area).

  • (y, x): Forecast surface roughness (roughness length).

  • (y, x): Surface elevation above sea level.

  • albedo (time, y, x): Albedo, a measure of the diffuse reflection of solar radiation. Calculated from the relation between surface solar radiation downwards (J m**-2) and surface net solar radiation (J m**-2). Takes values between 0 and 1.

  • influx_diffuse (time, y, x): Diffuse solar radiation at surface, i.e. surface solar radiation downwards minus direct solar radiation.

  • (time, y, x): Wind speeds at 100 meters (regardless of direction).

A SARAH-3 cutout can be used to amend the fields temperature, influx_toa, influx_direct, albedo, influx_diffuse of ERA5 using satellite-based radiation observations.

Rule clean_osm_data#

This script is used to clean OpenStreetMap (OSM) data for creating a PyPSA-Eur ready network.

The script performs various cleaning operations on the OSM data, including:

  • Cleaning voltage, circuits, cables, wires, and frequency columns

  • Splitting semicolon-separated cells into new rows

  • Distributing values to circuits based on the number of splits

  • Adding line endings to substations based on line data
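
The splitting and distribution steps can be illustrated with a minimal sketch in plain Python. The record layout, the helper names split_semicolon_rows and distribute_circuits, and the even-distribution rule are assumptions made for illustration; they are not taken from the actual script.

```python
def split_semicolon_rows(row, columns):
    """Split semicolon-separated cells of one record into several records.

    `row` is a dict; `columns` lists the keys whose values may contain
    semicolon-separated entries such as "380000;220000".
    """
    parts = {col: str(row[col]).split(";") for col in columns}
    n_splits = max(len(values) for values in parts.values())
    new_rows = []
    for i in range(n_splits):
        new_row = dict(row)
        for col, values in parts.items():
            # Columns with fewer entries than splits fall back to their first entry.
            new_row[col] = values[i] if len(values) == n_splits else values[0]
        new_row["split_count"] = n_splits
        new_rows.append(new_row)
    return new_rows


def distribute_circuits(total_circuits, n_splits):
    """Distribute a total circuit count as evenly as possible over the splits
    (hypothetical rule, chosen only for this example)."""
    base, remainder = divmod(total_circuits, n_splits)
    return [base + (1 if i < remainder else 0) for i in range(n_splits)]


# Hypothetical OSM line record with two voltage levels on one way.
line = {"id": "way/1", "voltage": "380000;220000", "frequency": "50"}
rows = split_semicolon_rows(line, ["voltage", "frequency"])
circuits = distribute_circuits(3, len(rows))
```

Here the single record becomes two records, one per voltage level, and the three circuits are spread over them.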

Rule build_osm_network#

Rule base_network#

Creates the network topology from an ENTSO-E map extract (March 2022) or OpenStreetMap data (Aug 2024) as a PyPSA network.


Creates the network topology from an ENTSO-E map extract, and creates Voronoi shapes for each bus representing both onshore and offshore regions.
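
The defining property of a Voronoi cell is that every location inside it is closer to its own bus than to any other bus. The sketch below illustrates only that property by assigning sample points to their nearest bus. The planar coordinates and the function names are hypothetical, and the rule itself produces polygon shapes rather than point assignments.

```python
import math


def nearest_bus(point, buses):
    """Return the name of the bus closest to `point` (Euclidean distance)."""
    return min(buses, key=lambda name: math.dist(point, buses[name]))


def voronoi_assignment(points, buses):
    """Group sample points by the bus whose Voronoi cell contains them."""
    regions = {name: [] for name in buses}
    for point in points:
        regions[nearest_bus(point, buses)].append(point)
    return regions


# Hypothetical buses and sample points in planar coordinates.
buses = {"bus_a": (0.0, 0.0), "bus_b": (10.0, 0.0)}
points = [(1.0, 1.0), (9.0, -1.0), (4.0, 3.0)]
regions = voronoi_assignment(points, buses)
```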

Rule build_transmission_projects#

Gets the transmission projects defined in the config file, concatenates and deduplicates them. Projects are later included in


  • networks/ Base network topology for the electricity grid. This is processed in

  • data/transmission_projects/"project_name"/: Takes the transmission projects from the subfolder of data/transmission_projects. The subfolder name is the project name.

  • offshore_shapes.geojson: Shapefile containing the offshore regions. Used to determine if a new bus should be added for a new line or link.

  • europe_shape.geojson: Shapefile containing the shape of Europe. Used to determine if a project is within the considered countries.


  • transmission_projects/new_lines.csv: New project lines to be added to the network. This includes new lines and upgraded lines.

  • transmission_projects/new_links.csv: New project links to be added to the network. This includes new links and upgraded links.

  • transmission_projects/adjust_lines.csv: For lines which are upgraded, the decommissioning year of the existing line is adjusted to the build year of the upgraded line.

  • transmission_projects/adjust_links.csv: For links which are upgraded, the decommissioning year of the existing link is adjusted to the build year of the upgraded link.

  • transmission_projects/new_buses.csv: For some links, we have to add new buses (e.g. North Sea Wind Power Hub).
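
As a rough sketch of the logic described above, the following plain-Python example concatenates two project lists, drops duplicates, and moves the decommissioning year of upgraded lines to the build year of the upgrade. All field names ("id", "upgrades", "build_year", "decommissioning_year") and the input data are hypothetical and only serve the illustration.

```python
def concat_and_deduplicate(*project_lists):
    """Concatenate project lists and keep the first entry for each project id."""
    seen = {}
    for projects in project_lists:
        for project in projects:
            seen.setdefault(project["id"], project)
    return list(seen.values())


def adjust_existing(existing, projects):
    """Return existing lines whose decommissioning year is moved to the
    build year of the project that upgrades them."""
    adjusted = {}
    for project in projects:
        line_id = project.get("upgrades")
        if line_id in existing:
            adjusted[line_id] = {
                **existing[line_id],
                "decommissioning_year": project["build_year"],
            }
    return adjusted


# Hypothetical inputs: two project sources that share project "P1".
source_a = [
    {"id": "P1", "upgrades": "line_7", "build_year": 2030},
    {"id": "P2", "upgrades": None, "build_year": 2035},
]
source_b = [
    {"id": "P1", "upgrades": "line_7", "build_year": 2030},
    {"id": "P3", "upgrades": "line_9", "build_year": 2028},
]
existing = {
    "line_1": {"decommissioning_year": 2070},
    "line_7": {"decommissioning_year": 2060},
    "line_9": {"decommissioning_year": 2055},
}

projects = concat_and_deduplicate(source_a, source_b)
adjust_lines = adjust_existing(existing, projects)
```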

Rule build_shapes#

Creates GIS shape files of the countries, exclusive economic zones, NUTS3 areas, and OSM ADM1 areas (for BA, MD, UA, and XK).

Rule build_gdp_pop_non_nuts3#

Rule build_electricity_demand#

This rule downloads the load data from Open Power System Data Time series. For all countries in the network, the per-country load time series are extracted from the dataset. After filling small gaps linearly and large gaps by copying a time slice of a given period, the load data is exported to a .csv file.
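
The two gap-filling strategies can be sketched with pandas as follows. The threshold of three hours for a "small" gap, the choice of copying from exactly one week earlier, and the function name fill_gaps are assumptions for this example and may differ from what the rule actually does.

```python
import numpy as np
import pandas as pd


def fill_gaps(load, max_linear_gap=3, period=pd.Timedelta(days=7)):
    """Interpolate short gaps linearly; fill longer gaps with the values
    observed one `period` earlier."""
    is_gap = load.isna()
    # Label consecutive runs of equal gap status and measure each run's gap length.
    run_id = (is_gap != is_gap.shift()).cumsum()
    run_length = is_gap.groupby(run_id).transform("sum")
    small = is_gap & (run_length <= max_linear_gap)

    filled = load.copy()
    filled[small] = load.interpolate(method="linear")[small]

    # Shift the index forward so that each timestamp sees the value one period earlier.
    earlier = load.shift(freq=period)
    return filled.fillna(earlier)


# Hypothetical hourly load with a repeating daily pattern over two weeks.
index = pd.date_range("2013-01-01", periods=336, freq=pd.Timedelta(hours=1))
pattern = np.arange(336) % 24
load = pd.Series(pattern.astype(float), index=index)
load.iloc[30:32] = np.nan    # small gap, filled linearly
load.iloc[200:212] = np.nan  # large gap, copied from one week earlier

filled = fill_gaps(load)
```

Only the long gap is copied from the earlier week; the short gap is bridged by interpolation between its neighbouring values.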

Rule simplify_network#

Lifts electrical transmission network to a single 380 kV voltage layer, removes dead-ends of the network, and reduces multi-hop HVDC connections to a single link.



The rule simplify_network does up to three things:

  1. Create an equivalent transmission network in which all voltage levels are mapped to the 380 kV level by the function simplify_network(...).

  2. DC only sub-networks that are connected at only two buses to the AC network are reduced to a single representative link in the function simplify_links(...).

  3. Stub lines and links, i.e. dead-ends of the network, are sequentially removed from the network in the function remove_stubs(...).
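
The sequential removal of dead-ends in step 3 can be sketched on a plain edge list. This is an illustration of the idea only: the function below shares its name with remove_stubs(...) but is a hypothetical stand-in that works on tuples of bus names, not on a PyPSA network, and it does not re-attach loads or generators of removed buses.

```python
def remove_stubs(edges):
    """Iteratively drop buses with a single neighbour and return the
    set of buses that remain."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    stubs = [bus for bus, adj in neighbours.items() if len(adj) == 1]
    while stubs:
        bus = stubs.pop()
        adj = neighbours.get(bus)
        if adj is None or len(adj) != 1:
            continue  # already removed, or no longer a dead-end
        (neighbour,) = adj
        del neighbours[bus]
        neighbours[neighbour].discard(bus)
        # Removing a stub may turn its neighbour into a new stub.
        if len(neighbours[neighbour]) == 1:
            stubs.append(neighbour)
    return set(neighbours)


# A ring 1-2-3 with a two-bus dead-end (3-4-5) and a one-bus dead-end (2-6).
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (2, 6)]
remaining = remove_stubs(edges)
```

Because removal is sequential, the chain 3-4-5 disappears in two passes: bus 5 first, then bus 4 once it has become a dead-end itself.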

Rule cluster_network#

Creates networks clustered to {cluster} number of zones with aggregated buses and transmission corridors.
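
To give an impression of what a k-means based partition looks like, here is a minimal, unweighted k-means sketch in plain Python. It is not the clustering code used by the rule: the deterministic initialisation, the planar coordinates and the absence of any bus weighting are simplifications assumed for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm: returns (assignment, centroids)."""
    centroids = list(points[:k])  # deterministic start, for illustration only
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move every centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


# Hypothetical bus coordinates forming two obvious groups.
bus_names = ["b0", "b1", "b2", "b3", "b4", "b5"]
coords = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
assignment, centroids = kmeans(coords, k=2)

# One representative zone per bus, analogous to a bus-to-cluster mapping.
busmap = dict(zip(bus_names, assignment))
```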




Is it possible to run the model without the simplify_network rule?

No, the network clustering methods in the PyPSA module pypsa.clustering.spatial do not work reliably with multiple voltage levels and transformers.

Exemplary unsolved network clustered to 512 nodes:


Exemplary unsolved network clustered to 256 nodes:


Exemplary unsolved network clustered to 128 nodes:


Exemplary unsolved network clustered to 37 nodes:


Rule build_monthly_prices#

This script extracts monthly fuel prices of oil, gas, coal and lignite, as well as CO2 prices.


The rule build_monthly_prices collects monthly fuel prices and CO2 prices and translates them from different input sources to PyPSA syntax.

Data sources:

  [1] Fuel price index, Destatis

  [2] Average annual fuel price for lignite, ENTSO-E

  [3] CO2 prices, emission spot primary auction, EEX

Data was accessed on 16.5.2023.

Rule build_ship_raster#

Transforms the global ship density data from the World Bank Data Catalogue to the size of the considered cutout. The global ship density raster is later used for the exclusion when calculating the offshore potentials.


  • resources/ Reduced version of global shipping traffic density from World Bank Data Catalogue to reduce computation time.


Rule determine_availability_matrix_MD_UA#

Creates a land eligibility analysis for Ukraine and Moldova with different datasets.

Rule determine_availability_matrix#

The script performs a land eligibility analysis to determine what share of land is available for developing the selected technology at each cutout grid cell. The script uses the atlite library and several GIS datasets such as the CORINE land use data, LUISA land use data, Natura2000 nature reserves, GEBCO bathymetry data, and shipping lanes.
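
Conceptually, the available share of a grid cell is the fraction of the finer land-use pixels inside it that are not excluded. The following sketch computes that fraction on a toy raster. The function name, the 0/1 mask and the fixed block size are assumptions for illustration; the script itself relies on atlite and real GIS layers.

```python
def availability_share(eligible, block):
    """Share of eligible fine-raster pixels inside each coarse grid cell.

    `eligible` is a 2D list of 0/1 flags on the fine land-use raster and
    `block` is the (hypothetical) number of fine pixels per grid cell edge.
    """
    rows, cols = len(eligible), len(eligible[0])
    shares = []
    for r0 in range(0, rows, block):
        shares.append([])
        for c0 in range(0, cols, block):
            pixels = [
                eligible[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            shares[-1].append(sum(pixels) / len(pixels))
    return shares


# Toy mask: 1 = eligible pixel, 0 = excluded (e.g. nature reserve).
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
]
shares = availability_share(mask, block=2)
```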



  • resources/availability_matrix_{clusters}_{technology}.nc

Rule build_renewable_profiles#

Calculates for each clustered region the (i) installable capacity (based on land-use from determine_availability_matrix), (ii) the available generation time series (based on weather data), and (iii) the average distance from the node for onshore wind, AC-connected offshore wind, DC-connected offshore wind and solar PV generators.


Hydroelectric profiles are built in the script build_hydro_profile.


  • resources/profile_{technology}.nc with the following structure:

    • profile (dimensions bus, time): the per-unit hourly availability factors for each bus

    • p_nom_max: maximal installable capacity at the bus (in MW)

    • average_distance: average distance of units in the region to the grid bus for onshore technologies and to the shoreline for offshore technologies (in km)



This script functions at two main spatial resolutions: the resolution of the clustered network regions, and the resolution of the cutout grid cells for the weather data. Typically the weather data grid is finer than the network regions, so we have to work out the distribution of generators across the grid cells within each region. This is done by taking account of a combination of the available land at each grid cell (computed in determine_availability_matrix) and the capacity factor there.

Based on the availability matrix, the script first computes how much of the technology can be installed at each cutout grid cell. To compute the layout of generators in each clustered region, the installable potential in each grid cell is multiplied by the capacity factor at that grid cell. This is done since we assume more generators are installed at cells with a higher capacity factor.


This layout is then used to compute the generation availability time series from the weather data cutout from atlite.

The maximal installable potential for the node (p_nom_max) is computed by adding up the installable potentials of the individual grid cells.
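
The steps above can be sketched for a single region in plain Python. The sketch assumes that the layout uses the mean capacity factor of each cell and that the regional profile is the layout-weighted average of the cell capacity factors. Both are assumptions for illustration, as are the function name and the input numbers.

```python
def region_profile(potential, capacity_factor):
    """Compute layout, availability profile and p_nom_max for one region.

    potential: installable capacity per grid cell (MW)
    capacity_factor: per grid cell, a list of hourly capacity factors
    """
    mean_cf = [sum(cf) / len(cf) for cf in capacity_factor]
    # Layout: potential weighted by how good the site is.
    layout = [p * m for p, m in zip(potential, mean_cf)]
    total = sum(layout)
    hours = len(capacity_factor[0])
    # Profile: layout-weighted average of the cell capacity factors per hour.
    profile = [
        sum(w * cf[t] for w, cf in zip(layout, capacity_factor)) / total
        for t in range(hours)
    ]
    # p_nom_max: sum of the installable potentials of the grid cells.
    p_nom_max = sum(potential)
    return layout, profile, p_nom_max


# Two hypothetical grid cells with two hours of weather data each.
potential = [100.0, 50.0]
capacity_factor = [[0.2, 0.4], [0.6, 0.6]]
layout, profile, p_nom_max = region_profile(potential, capacity_factor)
```

In this toy case the smaller but windier cell receives the same layout weight as the larger cell, so both contribute equally to the regional profile.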

Rule build_hydro_profile#

Build hydroelectric inflow time-series for each country.


  • resources/





    Dimensions countries, time: inflow to the state of charge (in MW), e.g. due to river inflow into hydro reservoirs.

Rule build_powerplants#

Retrieves conventional powerplant capacities and locations from powerplantmatching, assigns these to buses and creates a .csv file. It is possible to amend the powerplant database with custom entries provided in data/custom_powerplants.csv. Lastly, for every substation, powerplants with zero-initial capacity can be added for certain fuel types automatically.


  • resources/powerplants_s_{clusters}.csv: A list of conventional power plants (i.e. neither wind nor solar) with fields for name, fuel type, technology, country, capacity in MW, duration, commissioning year, retrofit year, latitude, longitude, and dam information as documented in the powerplantmatching README; additionally it includes information on the closest substation/bus in networks/base_s_{clusters}.nc.


The configuration options electricity: powerplants_filter and electricity: custom_powerplants can be used to control whether data should be retrieved from the original powerplants database or from custom amendments. Both options accept query strings in the syntax of pandas.DataFrame.query. In addition, the configuration option electricity: everywhere_powerplants can be used to place powerplants with zero initial capacity of certain fuel types at all substations.

  1. Adding all powerplants from custom:

    powerplants_filter: false
    custom_powerplants: true
  2. Replacing powerplants in e.g. Germany by custom data:

    powerplants_filter: Country not in ['Germany']
    custom_powerplants: true

    or


    powerplants_filter: Country not in ['Germany']
    custom_powerplants: Country in ['Germany']
  3. Adding additional build year constraints:

    powerplants_filter: Country not in ['Germany'] and YearCommissioned <= 2015
    custom_powerplants: YearCommissioned <= 2015
  4. Adding powerplants at all substations for 4 conventional carrier types:

    everywhere_powerplants: ['Natural Gas', 'Coal', 'nuclear', 'OCGT']
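
To see how such query strings act on a table, the sketch below applies the string-valued settings from examples 2 and 3 to two small, made-up DataFrames. The helper apply_filters and the sample data are hypothetical; only the query expressions are copied from the examples, and boolean settings are deliberately not covered.

```python
import pandas as pd

# Made-up stand-ins for the powerplant database and the custom entries.
database = pd.DataFrame(
    {
        "Name": ["plant_a", "plant_b", "plant_c", "plant_d"],
        "Country": ["Germany", "France", "Germany", "France"],
        "YearCommissioned": [1990, 2005, 2010, 2018],
    }
)
custom = pd.DataFrame(
    {
        "Name": ["custom_1", "custom_2"],
        "Country": ["Germany", "Spain"],
        "YearCommissioned": [2012, 2014],
    }
)


def apply_filters(database, custom, powerplants_filter, custom_powerplants):
    """Filter both tables with DataFrame.query and stack the results."""
    kept = database.query(powerplants_filter)
    added = custom.query(custom_powerplants)
    return pd.concat([kept, added], ignore_index=True)


# Example 2: replace German plants by custom data.
replaced = apply_filters(
    database, custom, "Country not in ['Germany']", "Country in ['Germany']"
)

# Example 3: additionally restrict the commissioning year.
restricted = apply_filters(
    database,
    custom,
    "Country not in ['Germany'] and YearCommissioned <= 2015",
    "YearCommissioned <= 2015",
)
```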

Rule add_electricity#

Adds existing electrical generators and hydro-electric plants, as well as extendable (greenfield) generators and battery and hydrogen storage, to the clustered network.


The rule add_electricity ties all the different data inputs from the preceding rules together into a detailed PyPSA network that is stored in networks/base_s_{clusters}. It includes:

  • today’s transmission topology and transfer capacities (optionally including lines which are under construction according to the config settings lines: under_construction and links: under_construction),

  • today’s thermal and hydro power generation capacities (for the technologies listed in the config setting electricity: conventional_carriers), and

  • today’s load time-series (upsampled in a top-down approach according to population and gross domestic product)
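
A top-down distribution of a national load series can be sketched as follows: each region receives a share of the national load that depends on its GDP and its population. The weighting of 0.6 for GDP and 0.4 for population, the function name and the input numbers are assumptions for this example, not values taken from the rule.

```python
def distribute_load(country_load, gdp, population, gdp_weight=0.6):
    """Split a national load series across regions using a combined
    GDP/population distribution key (hypothetical weighting)."""
    total_gdp = sum(gdp.values())
    total_pop = sum(population.values())
    shares = {
        region: gdp_weight * gdp[region] / total_gdp
        + (1 - gdp_weight) * population[region] / total_pop
        for region in gdp
    }
    return {
        region: [share * value for value in country_load]
        for region, share in shares.items()
    }


# Hypothetical country with two regions and two hours of load (MW).
country_load = [1000.0, 1200.0]
gdp = {"north": 300.0, "south": 100.0}
population = {"north": 2.0, "south": 2.0}
regional_load = distribute_load(country_load, gdp, population)
```

Since the shares sum to one, the regional series add up to the national series in every hour.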

It further adds extendable generators with zero capacity for

  • photovoltaic, onshore and AC- as well as DC-connected offshore wind installations with today’s locational, hourly wind and solar capacity factors (but no current capacities),

  • additional open- and combined-cycle gas turbines (if OCGT and/or CCGT is listed in the config setting electricity: extendable_carriers)

Furthermore, it attaches additional extendable components to the clustered network with zero initial capacity:

  • StorageUnits of carrier ‘H2’ and/or ‘battery’. If this option is chosen, every bus is given an extendable StorageUnit of the corresponding carrier. The energy and power capacities are linked through a parameter that specifies the energy capacity as maximum hours at full dispatch power and is configured in electricity: max_hours. This linkage leads to one investment variable per storage unit. The default max_hours values lead to long-term hydrogen and short-term battery storage units.

  • Stores of carrier ‘H2’ and/or ‘battery’ in combination with Links. If this option is chosen, the script adds extra buses with corresponding carrier where energy Stores are attached and which are connected to the corresponding power buses via two links, one each for charging and discharging. This leads to three investment variables for the energy capacity, charging and discharging capacity of the storage unit.
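
The difference between the two storage representations can be made concrete with a small sketch. The class names and the numerical values (including max_hours of 6 and 168) are hypothetical and chosen only to show how many quantities are sized independently in each case.

```python
from dataclasses import dataclass


@dataclass
class StorageUnitSizing:
    """One investment variable: power capacity; energy follows from max_hours."""
    p_nom: float      # power capacity in MW
    max_hours: float  # hours at full dispatch power

    @property
    def energy_capacity(self):
        return self.p_nom * self.max_hours  # MWh


@dataclass
class StoreWithLinksSizing:
    """Three investment variables: energy, charging and discharging capacity."""
    e_nom: float             # energy capacity in MWh
    charger_p_nom: float     # charging link capacity in MW
    discharger_p_nom: float  # discharging link capacity in MW

    @property
    def hours_at_full_discharge(self):
        return self.e_nom / self.discharger_p_nom


# Hypothetical sizes: a short-term battery and a long-term hydrogen unit.
battery = StorageUnitSizing(p_nom=100.0, max_hours=6.0)
hydrogen = StorageUnitSizing(p_nom=100.0, max_hours=168.0)

# With Stores and Links the ratio is free rather than fixed by max_hours.
flexible = StoreWithLinksSizing(e_nom=5000.0, charger_p_nom=200.0, discharger_p_nom=50.0)
```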

Rule prepare_network#

Prepares the PyPSA network for solving according to the {opts} and {ll} wildcards, for example by

  • adding an annual limit of carbon-dioxide emissions,

  • adding an exogenous price per tonne emissions of carbon-dioxide (or other kinds),

  • setting an N-1 security margin factor for transmission line capacities,

  • specifying an expansion limit on the cost of transmission expansion,

  • specifying an expansion limit on the volume of transmission expansion, and

  • reducing the temporal resolution by averaging over multiple hours or segmenting time series into chunks of varying lengths using tsam.
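
The first variant of the temporal reduction, averaging over a fixed number of hours, can be sketched in a few lines of plain Python. The function name and the returned weights are assumptions for illustration. Segmentation into chunks of varying length with tsam is not shown.

```python
def average_every(values, n_hours):
    """Average consecutive blocks of `n_hours` values.

    Returns the block means and, for each block, the number of original
    hours it represents (usable as a snapshot weighting).
    """
    means, weights = [], []
    for start in range(0, len(values), n_hours):
        block = values[start:start + n_hours]
        means.append(sum(block) / len(block))
        weights.append(len(block))
    return means, weights


# Six hypothetical hourly values reduced to a 3-hourly resolution.
hourly = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
means, weights = average_every(hourly, n_hours=3)
```

Keeping the weights allows energy totals to be preserved: the weighted sum of the averaged values equals the sum of the original hourly values.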



The rule prepare_elec_networks runs the rule prepare_network for all scenarios in the configuration file.