The green transition: Why the availability of power system data matters

The green transition has overwhelmed the power grid connection process for new customers. Swift and cost-effective connection relies on the expertise of power system analysts. To achieve this, Statnett believe we must enable analysts to adopt a data-driven approach akin to data scientists.

In 2019, Statnett therefore established the FRIDA program with a cross-disciplinary product team whose objective was to empower analyst by making it much easier to find and use power system data for analysis. In this post we’ll present how the team accomplished this task. The first part focuses on on the current industry challenge faced by Statnett and network operators around the world. The second part presents the tools and processes that were created to tackle it.

Overwhelming demand for grid capacity poses challenges for the power system

Statnett holds three key roles in the power system: grid owner, system operator, and transmission system network planner. We currently face challenges in:

Grid Connection Requests: An increasing number of requests from power producers and industrial projects, such as battery factories and data centers, demand timely responses, proper guidance for optimal project planning, and accelerated grid connection processes.
High Power System Utilization: During certain times of the year or when components are disconnected, the power system operates at maximum capacity. Consequently, optimizing incident management, maintenance planning, and disconnections is becoming increasingly vital.

Overall, Statnett’s decisions and services significantly impact both our customers’ value creation and the security of supply. Even minor improvements can yield considerable value.

Power system analysts require access to comprehensive amounts of data

To achieve these improvements, Statnett’s power system engineers need easy access to reliable information from various areas of the organization. They analyze vast quantities of data, including:

Graph model: The Common Information Model (CIM) is a comprehensive graph model that details all relevant components within the power system. It outlines connections, electrical attributes, and metadata, such as ownership. However, its complexity, consisting of millions of triplets and hundreds of data-classes, requires a level of IT knowledge that most analyst don’t have, which prevents efficient data exploration.
Time series: Long-term data, especially extreme values, is crucial to identify stress scenarios in the system. Given the significant influence of weather on the Norwegian power system, analysts typically require a decade’s worth of data to build confidence. For most equipment Statnett receives measurements from sensors and estimates from a network solver about every ten seconds.
Events: Information on faults, maintenance, and special regulations affecting system utilization is vital, along with other market data and the cost of non-delivered energy that influences bottleneck-related costs.

Traditionally, this information was only accessible through source systems or data warehouses using reports. Analysts spent 30-50% of their time locating, extracting, and ensuring data quality. To get the customers connected the analysts should spend their time finding solutions, not data.

The biggest challenge is just the sheer volume of projects. There are only so many power engineers out there who can do the sophisticated studies we need to do to ensure the system stays reliable, and everyone else is trying to hire them, too
Ken Seiler, who leads system planning at PJM Interconnection. Source: Wind and Solar Energy Projects Risk Overwhelming America’s Antiquated Electrical Grids – The New York Times (nytimes.com)

The Power SDK makes power system data instantly available in Python

As a team we worked closely with the power system engineers to identify the areas and user stories that would be most useful to focus on. We mostly focused on creating a web app to meet their needs for finding components and studying observations related to them. This is further explained in the movie below. The movie is in Norwegian.

In the remaining part of this post, we will focus on what we did to meet the wish from our power system engineers to have the power system data instantly available in Python. The Statnett Power SDK is a Python package built on top of Statnetts data platform, to make it easier to retrive data from multiple sources and analyse this data to make better decisions.

The Power SDK was initially developed in cooperation with Cognite. Cognite Data Fusion is a part of Statnetts data platform and the primary source of information in the Power SDK. You may find the original source code here. The Power SDK has since then been further developed in Statnett. The Power SDK aims to address three common issues that made the initial user threshold too high:

End users needed to connect to different databases and understand the data structure and format, which typically differed from database to database.
Related data could be identified through code, but this required knowledge of graph queries and graph traversals, which most analysts lacked. For instance, adherence to the same price.
The standard SDK provided by the platform used unfamiliar terms and lacked functions for typical use cases within the power analysis domain.

Below we have provided examples of how the Power SDK addresses these issues. Let’s say that an analyst needs to know the power produced in price area NO5. With the Power SDK, she may get this information through querying in terms that are familiar to how she navigates the power system mentally.

To get all substations in a price area you may write:

Python

substations = client.substations.list(bidding_area="Elspot NO5")

You may also specify the substations you want by name, voltage level, grid type and whether to include historic substations as well. You may even define areas based on the power lines entering the area. The area object will contain all the substations and power lines within the area.

From the area or list of substations you may find all connected components, such as generating units or power transformers:

Python

generating_units = substations.generating_units()

From these components you may retrieve all time series to the generators. Here I retrieve all time series of ThreePhaseActivePower related to the generators:

Python

generating_units.time_series(measurment_type="ThreePhaseActivePower")

To get events related to maintenance instead:

Python

generating_units.events(type="Driftsstans")

Once you have the time series or events, a user with basic Python skills may easily do powerful analysis. The example below shows how to find the total production in a power area over time. The area object refered to in the example could be defined by (in this example fictitious substation names):

Python

area = client.power_area(substations=['Substation1', 'Substation2']).expand_area(level=2)

or by specifying which power lines that defines the electric area.

The code below finds all relevant time series associated with the generators in the power area defined earlier, summarizes these and plots hourly values over the past year.

The analysts need guidance and support to embrace new tools

Getting busy analysts to change how they work is challenging unless you provide easy to use tools with reliable access to support. Otherwise, the barrier to trying new ways of doing things can be too significant, especially in a busy work environment. Nevertheless, we believe that the workload does not have to be large to make a change.

To provide support, we have established a common chat for users to facilitate discussions with the team and other users. Both the team and expert users can quickly provide feedback, making it easier for users to explore new approaches. Moreover, we have set up JupyterHub, enabling users without Python experience to get started faster. This also allows departments to use notebooks as lightweight dashboards, facilitating the sharing of insights.

As a result, the adoption of our Power SDK has increased dramatically, and the service has also reduced the cost and time to market for integrating other data sources for analysts. For instance, the Entsoe-py package has become popular and reduced the load on our internal data platform team.

Furthermore, helping users who encounter challenges provides us with a better understanding of their needs and pain points, facilitating the development of our products. Through our dialogue with users, we can capture the value of what we do and share our knowledge with others. For example, feedback from a user states:

Previously, I would have spent a week extracting data from the data arehouse to investigate load and flow into and through the Oslo area. Now I can create good assumptions within 1 hour.
Power System Analyst in Statnetts Regional Grid Planning department

The feedback we receive from users is useful for sharing knowledge and gaining support from management to continue our development efforts.

The analysts have used the tools to improve their methods

We are thrilled to see that analysts have a keen interest in using statistical methods to make better decisions. With our experience and expertise, we can increase the scope of data analysis and data science in Statnett, empowering analysts to generate valuable insights.

The Power SDK can also load weather data from Statnetts substations. This figure shows the relationship between the peak hourly load per day and average outdoor temperature. The figure is done by an analyst working in our long term planning department and used to assess the expected peak load in a region in Norway.

These assessments may have great economic and environmental impact as well: As the figure shows, the load is increasing at a lower rate as it gets colder. Compared to a linear trend, which was common historically, this leads to a lower expected peak demand and hence lower demand for grid capacity.

If you want more information or have ideas for cooperation, please don’t hesitate to contact us.

Data Science @ Statnett

Leave a ReplyCancel reply

Quantifying uncertainty in forecasts – methods and lessons from mFRR flow prediction

From 10,000 Manual Checks to Continuous Insight

Improving and evaluating prediction intervals

Trending

Quantifying uncertainty in forecasts – methods and lessons from mFRR flow prediction

From 10,000 Manual Checks to Continuous Insight

Improving and evaluating prediction intervals

Do you wish you could learn how to automate power grid operation, increase transfer capacity, predict transformer failure, and more in one day? – Data Science at #techday