Writing readable mathematical code

In Statnett, we solve problems that involve a significant amount of mathematics. We need code to solve these problems, and writing readable, maintainable, testable and efficient code that computes intricate mathematical equations can be challenging. This blog post will look at one way to tackle this challenge. As a practical example, we will discuss the architectural and style choices behind the (soon to be open source) linerate Python package, which implements the conductor heat model described in CIGRE TB 601.

Dynamic line rating and CIGRE TB 601

Let’s first look briefly at the usecase we consider:  CIGRE’s Technical Brochure number 601, or CIGRE601. CIGRE601 is an essential part of our dynamic line rating work, which you can read more about in one of our previous blog posts. Here, we focus on the architectural and programming style choices we made when we developed code that implements the CIGRE 601 thermal heating model.

The mathematics of CIGRE 601 is not overly complicated, but there is a lot of it, which can make it difficult to follow if you’re not careful. At its core, CIGRE601 is a heat balance equation:

$$P_J(I) + P_s(\mathbf{w}) = P_c(T, \mathbf{w}) + P_r(T, \mathbf{w}). $$

In other words, the heating from the current ($P_J$) plus the solar heating ($P_s$) is equal to the convective cooling ($P_c$) plus the radiative (black body) cooling ($P_r$). In other words: the heating from the sun and the electricity flowing through is equal to the cooling from the wind and the radiated heat waves when the line has a constant temperature. In our case, we want to provide a current, $I$, and weather conditions, $\mathbf{w}$, and use these to solve the above equation with respect to the line temperature, $T$.

To evaluate only a single term of the above equation, we must evaluate up to another 20 other “sub-equations”. In the code, we need to keep track of all these sub-equations: what should be their input and what should be their output? We also want the implementations to be readable correct, and efficient.


To get readable correct, and efficient code, we need to consider the software architecture. We need an architecture that strives for readability and testability without sacrificing speed (too much). How can we accomplish this?

A user-friendly frontend with a functional backend

As a basic user of a toolkit, I want to run code on the form

model = Cigre601(line_parameters, weather_parameters, time)
convective_cooling = model.compute_convective_cooling(temperature)

However, we also want access to the equations themselves. While the code above is excellent for computing properties for specific lines (like the total convective cooling or ampacity rating), it could be better structured for exploring how the equations behave. To facilitate such exploration, we also want access to the sub-terms themselves.

What’s more, if we use an object-oriented approach to implement these sub-terms, then it might be appealing to store their values as properties or attributes. By doing so, we can save time on repeated computations. However, we also hide what each function depends on and provides. We want an interface that makes it clear what information we need for each computation, and if the methods implementing the equations have access to self.anything_at_all, then it is unclear what they need.

Instead, I think that a functional approach, where each mathematical equation is implemented as a pure function— a deterministic function without state — is better. Therefore, linerate has a single function for (more or less) each equation in CIGRE 601. This makes is very easy to see how the code ties together with the equations. For example, the method that computes the Joule heating is defined by

def compute_joule_heating(current, resistance):
    I = current  # noqa
    R = resistance
    return I * I * R

And yes, this approach will create a minor additional computational overhead, but I found it insignificant compared to the benefits such an approach gave.

Still, I’m not saying that an object-oriented approach is only a bad thing. For linerate, we have the best of both worlds: a functional “backend” that makes it easy to explore each equation in CIGRE601 and an object-oriented “frontend” that makes it easy to compute ampacity ratings for a given power line.

Performant Python code

An essential part of mathematical code is its performance. We do not want to use native Python code for any numerical operation. Instead, we want to use the vectorisation capabilities of NumPy.

Specifically, we only use basic binary operators and NumPy universal functions (ufuncs). In the cases when we cannot use builtin ufuncs or binary operators (for example table lookups), I recommend using numba.vectorize to create your own blazing fast JIT-compiled ufuncs. By structuring the code this way, we never need to write another loop!

Say, for example, that we want to compute the line rating for all integer air temperatures between $0^\circ \text{C}$ and $6^\circ \text{C}$ and five linearly spaced wind speeds between $0 \frac{\text{m}}{\text{s}}$ and $1 \frac{\text{m}}{\text{s}}$. If we make sure that all code only uses binary operators and numpy ufuncs, then we only need to write


air_temperature = np.arange(0, 7).reshape(7, 1)
wind_speed = np.linspace(0, 1, 5).reshape(1, 5)
weather = Weather(
model = Cigre601(weather=weather, [...])
line_rating = model.compute_steady_state_ampacity(current=2000)

The magic happens in the highlighted lines. By adding singleton axes to the array, we enable NumPy array broadcasting. Broadcasting is almost magic! Typically, when we do array computations, we need array dimensions that match up. However, singleton dimensions are “wildcards” and will match any other dimension size by repeating values in the singleton dimension until they match up. I think the best way to understand broadcasting is by computing your own multiplication table:

x = np.arange(0, 5).reshape(1, 5)
y = np.arange(0, 5).reshape(5, 1)

print("x * y:")
print(x * y)
[[0 1 2 3 4]]
x * y:
[[ 0  0  0  0  0]
 [ 0  1  2  3  4]
 [ 0  2  4  6  8]
 [ 0  3  6  9 12]
 [ 0  4  8 12 16]]

We see that with broadcasting we can very easily compute the product of all possible combinations of entries in x and y. By only using ufuncs in our numerical code, we enable similar approaches, for example, to compute the ampacity rating for all possible combinations of weather- and line parameters.

Automatic testing of mathematical code

Okay, we got the code architecture sorted and the code super fast, so now we only need tests. Fortunately, another benefit of using pure functions for the equations is that they become straightforward to test.

Here, we will discuss two types of tests that we can make for our code: acceptance tests and property-based unit tests. The acceptance test defines a minimal set of requirements for the implementation and is provided by the standard. In CIGRE 601, we can find these under Annex E. Examples of calculation.

The more interesting part is the property-based unit tests. Property-based testing is a beautiful way to write elegant tests on the form:

given input that satisfies X, output should satisfy Y.

In Python, we use the Hypothesis library for property based testing, and I think the best way to understand it is with an example.

Consider the sorted-function. Once we sort a list, its numbers should be in ascending order, which means that lst[i] <= lst[i+1], and a test for this property could look like this:

from itertools import pairwise
import hypothesis.strategies as st
from hypothesis import given

def test_sorted_is_ascending(lst):
    lst = sorted(lst)
    for l1, l2 in pairwise(lst):
        assert l1 <= l2

Do you think this test will pass? Click here to try it out and see for yourself!

In our case, we have mathematical equations that we want to test. And the nice thing with mathematical equations is that they have a vast amount of properties we can test! For example, the equation to compute the convective cooling, $P_c$, states that

$$ P_c = \pi \lambda_f (T_s – T_a) \text{Nu}, $$

where $\lambda_f$ is the thermal conductivity of air, $\text{Nu}$ is the Nusselt number, and $T_s$ and $T_a$ is the temperature of the conductor (surface) and air, respectively. We know that this function is linear with respect to $\lambda_f$. Moreover, if we set $T_s – T_a = \text{Nu} = 1$, then $P_c = \pi \lambda_f$, which gives us the following property-based test

def test_convective_cooling_scales_linearly_with_thermal_conductivity_of_air(
    Nu = 1
    T_s = 1
    T_a = 0
    lambda_f = thermal_conductivity_of_air
    P_c = convective_cooling.compute_convective_cooling(T_s, T_a, Nu, lambda_f)
    assert P_c == np.pi * approx(lambda_f)

Here, we let Hypothesis select arbitrary values for $\lambda_f$, and only test whether whether $P_c = \pi \ \lambda_f$ when $\text{Nu} = T_s – T_a = 1$. If we didn’t use Hypothesis, we would either have to hard-code the test cases (e.g. with pytest.mark.parametrize) or iterate over random values for $\lambda_f$, which would make it less obvious what the code actually tests.

Style considerations

Selecting good variable names

One of the first style considerations we encounter when implementing mathematical equations with code is what we should name our variables. Do we use mathematical symbols or semantic names for the variable names?

Recall the equation for the convective cooling:

$$ P_c = \pi \lambda_f (T_s – T_a) \text{Nu}. $$

With Python, we could write the following equivalent function:

def compute_convective_cooling(
    lambda_f: float,
    T_s: float,
    T_a: float,
    Nu: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    return pi * lambda_f * (T_s - T_a) * Nu

However, while this makes it easy to see how the mathematics works, it could be more user-friendly. For example, do you remember what lambda_f represents?

We could, of course, switch it around and use argument names with semantic meaning:

def compute_convective_cooling(
    thermal_conductivity_of_air: float,
    surface_temperature: float,
    air_temperature: float,
    nusselt_number: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    return pi * thermal_conductivity_of_air * (surface_temperature - air_temperature) * nusselt_number

but then it becomes challenging to connect the code and the equations.

To get code where the variables both have semantic meaning, and the equations are easy to follow, we can combine these two approaches:

def compute_convective_cooling(
    thermal_conductivity_of_air: float,
    surface_temperature: float,
    air_temperature: float,
    nusselt_number: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    lambda_f = thermal_conductivity_of_air
    T_s = surface_temperature
    T_a = air_temperature
    Nu = nusselt_number
    return pi * lambda_f * (T_s - T_a) * Nu

This way, the API clarifies what we need to know to compute the convective cooling, while it is also easy to compare the code with the reference equations.

Selecting good function names

One of my main pet peeves when I deal with mathematical code is function names. All functions (unless you have a very good reason) should have a verb in their name. After all, a function does do something, so why not explicitly state what it does?

In mathematical code, naming functions with verbs is even more critical. Take, for example, the convective_cooling function we defined above. The function is very readable. We see what the inputs are, what it computes and how. Still, if we want to use this function to compute some convective cooling, we should also give that variable a good name. Any suggestions for a good name? I would probably want to go with convective_cooling. But wait, that’s the name of the function, so we might end up writing

convective_cooling = convective_cooling([...])

This is not ideal.

What is the solution to the naming problem? Simply add a verb to the function name. For example, “compute”:

def compute_convective_cooling(
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    lambda_f = thermal_conductivity_of_air
    T_s = surface_temperature
    T_a = air_temperature
    Nu = nusselt_number
    return lambda * (T_s - T_a) * Nu

convective_cooling = compute_convective_cooling(

Sure, the function names get longer, but the code also becomes much more pleasant to read!

Using typing.annotated to provide units

A handy new Python feature (3.9) is annotated types. Annotated types allow us to provide extra information to our type hints. This is very useful for physical equations, where we can use the annotations to represent units. In linerate, we do this in the units.py-file

from typing import Union

    from typing import Annotated
except ImportError:  # Python version <3.9
    from typing_extensions import Annotated

import numpy as np
import numpy.typing as npt

FloatOrFloatArray = Union[float, np.float64, npt.NDArray[np.float64]]

OhmPerMeter = Annotated[FloatOrFloatArray, "Ω/m"]
Ampere = Annotated[FloatOrFloatArray, "A"]
Radian = Annotated[FloatOrFloatArray, "rad"]

We use these annotations in the functions that implement the mathematical equations:

def compute_joule_heating(current: Ampere, resistance: OhmPerMeter) -> WattPerMeter:
    I = current  # noqa
    R = resistance
    return I * I * R

Using annotations like this makes it extraordinarily easy to know what units we use in the different functions. But unfortunately, the units are not type-checked, so we can still write the wrong units by accident. I’m hopeful that a package like Pint can provide static checking of units in the future, but that is not possible (at least for now).

Extensive use of documentation strings

I believe docstrings are particularly important when we implement mathematical equations. A bare minimum is listing the source of the equations (e.g. Equation (17) on page 24 of CIGRE TB 601.). Even better, if you’re using Sphinx for the documentation, you can include the sphinxcontrib.bibtex plugin, which enables you to link the documentation against a BibTeX database (e.g. Equation (17) on page 24 of :cite:p:`cigre601`.).

After listing which equation(s) each function represents, we may want also to describe the notation. I like using the function arguments and returns field of the docstring for this, preferably formatted with LaTeX and using either the NumPy docstring style or the Google docstring style. A nice benefit we get by including type hints is that Sphinx can automatically parse these to provide the types in the docstring, also. To enable this, set

autodoc_typehints = "description"
autodoc_typehints_description_target = "documented"

in the Sphinx conf.py file. Unfortunately, I have yet to figure out how to get it to include the return types properly, but at least we don’t need to worry about the input types.

A final and somewhat optional step is to also include information about the quantities in the docstring so that the user does not need to look at the references. If we combine all these steps, we get a docstring kind of like the one below

def compute_convective_cooling(
    surface_temperature: Celsius,
    air_temperature: Celsius,
    nusselt_number: Unitless,
    thermal_conductivity_of_air: WattPerMeterPerKelvin,
) -> WattPerMeter:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`, given by
    .. math::
        P_c = \lambda_f \ Nu \ (T_a - T_s).

        :math:`T_s~\left[^\circ\text{C}\right]`. The conductor surface temperature.
        :math:`T_a~\left[^\circ\text{C}\right]`. The ambient air temperature.
        :math:`Nu`. The nusselt number.
        :math:`\lambda_f~\left[\text{W}~\text{m}^{-1}~\text{K}^{-1}\right]`. The thermal
        conductivity of air at the given temperature.

    Union[float, float64, ndarray[Any, dtype[float64]]]
        :math:`P_c~\left[\text{W}~\text{m}^{-1}\right]`. The convective cooling of the conductor.
        Either due to wind, or passive convection, whichever is largest.

Closing notes

There is no single solution to writing maintainable and readable mathematical code. This blog post demonstrates the choices behind linerate, but I am sure there are ways to improve these decisions. My main pain point is units, and for a future project, I hope to check if Pint might be part of a solution to ensuring consistent type usage. Moreover, I would test the code more thoroughly if given infinite time. For example, instead of checking that a function is linear by fixing all parameters but one, we could numerically differentiate it and check that the derivative is constant almost everywhere. If you have any ideas for writing maintainable and readable mathematical code, please share them in the comments below!

Cimsparql: Loading power system data into pandas dataframes in Python

In 2019, we started working on a model that should be able to handle intra-zonal constraints in the upcoming balancing market. That methodology has been presented in a previous post in January 2022. In this post, we will focus on an open source Python library called cimsparql that we have developed to support this model. For the model to be able to perform any analysis, it needs data that describe the state of the power system. At Statnett, these data are available as CIM (Common Information Model) profiles. The data is made available through a triple store (GraphDB/Blazegraph/rdf4j), using a resource description framework which is a standard model for data interchange.

The information about the power system available in these CIM profiles can be used for different purposes, and what information should be extracted depends on the requirement of your model. In the previously presented post, a DC optimal power flow model is used. Thus we need data on generation, demand and transmission lines. The purpose of the cimsparql package is to extract this information from the triple store, through a set of predefined sparql queries, and loading them into Python as pandas dataframes. Cimsparql will also make sure that columns in the dataframes have correct types, either string, float or integer, as defined by the CIM standard.

Cimsparql uses the SPARQLWrapper library to remotely execute sparql queries, and extends it with extra functionality, assuming the data conform to the CIM standard. Even though the package is an important part of the balancing market model, it is open source available from github and can be installed using pip.

~/pip install cimsparql

Once the library is installed, it must be configured to query a triple store using the ServiceConfig class in cimsparql.graphdb. The example below assumes you have a graphdb server with a CIM model in a repository called “micro_t1_nl”. This test case, available at the cimsparql repository on github, is used to test the development of the predefined queries.

  >>> service_cfg = ServiceConfig(repo="micro_t1_nl")
  >>> model = get_cim_model(service_cfg)

If you need to provide other configurations such as server, username and password, this can be done with the same ServiceConfig class.

Once the model is configured, the data can be loaded into a pandas dataframe using the predefined queries. In the example below, topological node information is extracted from the triple store.

>>> bus = model.bus_data()
>>> print(bus.to_string())
                                           busname      un
795a117d-7caf-4fc2-a8d9-dc8f4cf2344a  NL_Busbar__4  220.00
6bdc33de-d027-49b7-b98f-3b3d87716615   N1230822413   15.75
81b0e447-181e-4aec-8921-f1dd7813bebc   N1230992195  400.00
afddd60d-f7e6-419a-a5c2-be28d29beaf9   NL-Busbar_2  220.00
97d7d14a-7294-458f-a8d7-024700a08717    NL_TR_BUS2   15.75

Here the values in the nominal voltage column has been converted to float values as defined by the CIM standard, while node and bus names are strings.

All the predefined queries can be executed using the cimsparql.model.CimModel class. Examples are the already shown bus_data as well as loads, synchronous_machines, ac_lines and coordinates. The latter extracts coordinates of all equipment in the model from the CIM Geographical Location profile. Cimsparql orders the rows in the dataframe such that it is straightforward to use with plotly’s map functionality. The example below was made in a Jupyter notebook.

df = model.coordinates()
lines = df.query("rdf_type == 'http://iec.ch/TC57/2013/CIM-schema-cim16#ACLineSegment'")
stations = df.query("rdf_type == 'http://iec.ch/TC57/2013/CIM-schema-cim16#Substation'")
center_x, center_y = df["x"].mean(), df["y"].mean()

fig = px.line_mapbox(lines, lon="x", lat="y", color="mrid", height=1000)
fig2 = px.scatter_mapbox(stations, lon="x", lat="y", color="mrid", size=[1]*len(stations))
fig.update_geos(countrycolor="black", showcountries=True, showlakes=True, showrivers=True, fitbounds="locations")

all_fig = go.Figure(data=fig.data + fig2.data, layout = fig.layout)
AC line segments and substations included in the model

The main goal of cimsparql is to read data for the purpose of running power flow analysis using sparql queries to read data from triple store into pandas dataframes in Python. Currently the package is used internally at Statnett, where we also have some data which is yet not covered by the CIM standard. Thus some of the queries contains a namespace which will probably only be used by Statnett. However, this should not pose any problem for the use of this package elsewhere, as these namespaces or columns have been made optional. So any query towards a data set that does not contain these, will just produce a column for the given namespace with NaN values.

The package can also be uses in cases where the predefined queries does not produce data for a specific purpose. In this case, the user can provide their own queries as a string argument to the get_table_and_convert method. The example below list out the numbers of ac line segments for each voltage level in your data.

>>> query='''
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?un (count(?mrid) as ?n) where { 
?mrid rdf:type cim:ACLineSegment;
   cim:ConductingEquipment.BaseVoltage/cim:BaseVoltage.nominalVoltage ?un.
} group by ?un'''
>>> df = model.get_table_and_convert(query)

So to summarize, the main contribution of cimsparl is a set of predefined queries for the purpose of running power flow simulations and type conversion of data that follows the CIM standard.

Julia: A powerful alternative to Python

Almost three years ago, I started on my journey towards an industrial PhD in reinforcement learning applied to power systems. At around that time, I had used Python daily for 6-7 years. But one day during the autumn of 2018, Thomas, the founder of our Data Science team, threw out a new idea to me, like he used to do at the time: "Have you ever tried Julia? You really should! Some guys at MIT were tired of the slowness and problems with Python so they decided to create a new language. And there’s a module for power systems too, the PowerModels.jl." So there it started for me. And at the start my PhD project I thought it was a good opportunity for me to learn a new programming language from scratch along the way.

Getting started with Julia

If you don’t have Julia installed already, I suggest you install it now so you can follow along through the rest of this post. If you’re on Linux, the easiest way to get the installation of Julia up and running is to use juliaup:

curl -fsSL https://install.julialang.org | sh

or have a look at the Julia homepage.

Once you’re up and running, the easiest way to install a package when you’re in
a Julia session is to just start using it by typing using <package name>, as
for example in

julia> using Plots
 │ Package Plots not found, but a package named Plots is available from a registry. 
 │ Install package?
 │   (@v1.8) pkg> add Plots 
 └ (y/n/o) [y]: 

Julia seeks to solve the two-language problem

Julia is an open-source, high-level, dynamically-typed language. It was started back in 2009 by a group of people at MIT: Stefan Karpinski, Jeff Bezanson, Viral B. Shah and Alan Edelmann. The first version was released in 2012. In their blogpost from february 2012, the authors stated they wanted a programming language "… [with] the speed of C… as usable for general programming as Python…, [and] as easy for statistics as Matlab". Following this, Julia seeks to solve the two-language problem. If you want a language that is dynamic, flexible and easy to prototype in, you usually have to sacrifice speed, as in Python and R. On the other hand, if you want performance-critical code you have to resort to fast, static and compiled languages such as C and C++.

The way Julia solves this is by having a native, feature-rich REPL (Read-Eval-Print Loop) and by having JIT (just in time)-compilation together with a flexible, parametric type system along with multiple dispatch. More on this later!

The Julia REPL is your best friend

When prototyping in Julia, the natural starting point is the Julia REPL, which in many ways behaves pleasantly like the iPython interface but also with so much more. The Julia REPL has four main modes (together with full history search):

  • The Julia code prompt
  • Help mode
  • Package manager
  • Shell mode

1) The standard Julia code prompt

This mode is invoked at startup and is the mode where you do all your prototyping. For example, illustrating 3 different ways to write a function:

add_one_line(x) = x + 1 # one-line
add_one_anon = x -> x + 1 # anonymous function
function add_one_full(x)
    return x + 1

2) The help mode

This is invoked by typing ? and give you quick access to the docstring of a function. So if we are in Julia prompt mode and type


Add x and y.
function myadd(x,y)
    return x + y

we can activate the help mode by typing ? and then type the function name:

help?> myadd
search: myadd


  Add x and y.

3) The package manager

Apart from the speed, this is perhaps my favourite feature of the Julia language. In Python, there are number of different environment and package managers, like "pip", "pipenv", "virtualenv" and "poetry". Choosing between them and understanding how they work together can be confusing and time-consuming. In Julia, the package manager is built into the language and it just works. The package manager is invoked by typing ] and you leave it by typing backspace. In an empty directory you can create a new environment like this:

(@v1.7) pkg> activate .
  Activating new project at `~/JuliaProjects/sandbox/blog`

(blog) pkg> add UnicodePlots
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/JuliaProjects/sandbox/blog/Project.toml`
  [b8865327] + UnicodePlots v3.1.0
    Updating `~/JuliaProjects/sandbox/blog/Manifest.toml`

(blog) pkg> 

This can also be done programatically in the Julia REPL like this:

julia> using Pkg; Pkg.add("UnicodePlots")
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [b8865327] + UnicodePlots v3.1.0
    Updating `~/.julia/environments/v1.7/Manifest.toml`

An environment is defined by two .toml-files. The Project.toml contains the packages you want in the environment together with version restrictions. The Project.toml is the file to commit to github. The file Manifest.toml is the detailed machine-generated file which contains all the information necessary to exactly reproduce a specific environment. Both files are constructed when you instantiate an environment and updated automatically when you add new packages.

4) The shell mode

By typing ; in the REPL you get instant access to a shell that behaves like a bash shell, which in many occasions could be really handy. But in practice I usually just open a new terminal with my regular bash shell.

Julia is not object-oriented – it relies on multiple dispatch

On the surface, the Julia syntax doesn’t differ too much from Python, the main differences being 1-based indexing and that functions and control structures in Julia are always closed by the end keyword. And as in Python, you don’t need to annotate the type of the variable. In this context, an important part of Julia, is the type inference. This means that the types are inferred from the actual variable values. For example,

julia> a = 1; typeof(a)

julia> a = 1.0; typeof(a)

julia> a = "Hello, word"; typeof(a)

julia> a = 1 + 1.0f0; typeof(a)

One main difference from Python is that Julia is not objected-oriented, so there are no classes and inheritance. Instead, Julia has a very flexible type hiearchy, with C-like structs as the main building block. This means that functions don’t belong to certain types or objects as they do in Python. Instead, Julia has polymorphism where functions can have the same name but which function to be used at runtime, or dispatched, depends on the types of the arguments of the function call. I write arguments in plural, because Julia has so called multiple dispatch, meaning that the language considers the types of all the arguments when choosing which function to execute. This is a remarkable feature. Let me show an example:

struct MyNumber

f(x) = "I am the fallback for one argument" 
f(x::Integer) = "I'm dispatched on integers"
f(x::Float64) = "I'm dispatched on floats"
f(x, y) = "Fallback for two arguments"
f(x::Int, y::Float64) = "First is Integer, second argument is Float64"
f(x::MyNumber) = "Hello, mynumber!"

mynumber = MyNumber(1.0)

f("Hello, world!")  # I am the fallback for one argument
f(1)  # I'm dispatched on integers
f(1.0)  # I'm dispatched on floats
f("Hello", 1)  # Fallback for two arguments
f(1, 2.0)  # First is Integer, second argument is Float64
f(mynumber)  # "Hello, mynumber!"

Julia is compiled just-in-time

This brings me to the second main difference: Julia is compiled. The compilation is done at runtime by a so-called JIT (just-in-time) compiler. JIT-compilation is also found in for example the Numba library in Python. In the example above, we have six different versions of the function called f. These versions are called methods in Julia. We can examine what methods of the function f above are actually defined by calling the methods function on f:

julia> methods(f)
# 6 methods for generic function "f":
[1] f(x::Int64, y::Float64) in Main at REPL[6]:1
[2] f(x::Integer) in Main at REPL[3]:1
[3] f(x::Real) in Main at REPL[4]:1
[4] f(x::MyNumber) in Main at REPL[7]:1
[5] f(x) in Main at REPL[2]:1
[6] f(x, y) in Main at REPL[5]:1

So at runtime, when the program is about to execute a specific function call, the compiler checks the types of all the actual input arguments of the function. If it has been called and compiled before with this combination of argument types, it uses a cached version, which of course is very fast. If not, a new version based on all the argument types of the current call is compiled and stored. This means we only need to compile functions actually used in the code.

Julia is fast because of code specialization

In principle, Julia code can get the speed of C for any user-defined type. This is in contrast to for example numpy in Python, where the speedups only apply to a limited collection of predefined datatypes. Compared to native Python, the speedup naturally varies a lot depending on the task, but figures of 2-10 times faster are not unheard of.

Even though modern data science tools in Python like Numpy and Tensorflow are wrappers around C/C++ and with that bring speed to the table, it is still useful to understand why pure Python is slow.

Coming from Python to Julia, the biggest change for me was to become more aware of the types of the variables and objects you are actually doing things with. "Is this an array of 64-bit floats?" "What are the types of the class members?" Working in pure Python, none of these questions seemed relevant because in Python a variable could bascially be of any type and things just still work.

So for example, when you want to sum an array of values in Python, each and every time you add two values the langauge needs to check the type of the values and at runtime find the right +-function to use in the summation and store the resulting value together with its type. There is no way to inform the language that all the values of an array are the same type, for example 64-bit precision floats.

How slow this is can actually be illustrated in Julia, where the type Any is the mother of all types and of which all other types is a subtype. The Julia type Any is equivalent to the the Python generic type. By using the Julia package BenchmarkTools we can compute statistics on the time spent on summing 10^7 random numbers in an array.

julia> using BenchmarkTools
julia> N = 10^7
julia> arr_any = Any[rand() for i in 1:N];
julia> arr_float64 = Float64[rand() for i in 1:N];
julia> @benchmark sum($arr_any)
BenchmarkTools.Trial: 31 samples with 1 evaluation.
 Range (min … max):  135.170 ms … 169.061 ms  ┊ GC (min … max):  0.00%16.89%
 Time  (median):     165.452 ms               ┊ GC (median):    16.91%
 Time  (mean ± σ):   163.840 ms ±   7.513 ms  ┊ GC (mean ± σ):  16.03% ±  4.23%

                                                       ▂█ ▄      
  ▄▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▆▄████▆▄▄▄ ▁
  135 ms           Histogram: frequency by time          169 ms <

 Memory estimate: 152.59 MiB, allocs estimate: 9999999.

julia> @benchmark sum($arr_float64)
BenchmarkTools.Trial: 1193 samples with 1 evaluation.
 Range (min … max):  4.047 ms …  4.706 ms  ┊ GC (min … max): 0.00%0.00%
 Time  (median):     4.165 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.181 ms ± 78.071 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▂▂▃▃▆▅█████████████████▇█▇▆▅▆▄▄▄▇▃▅▃▃▄▄▃▃▃▂▃▂▁▂▁▂▁▂▂▁▁▁▁▁ ▄
  4.05 ms        Histogram: frequency by time        4.42 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

As we can see, the compiler does its job and the median execution time is about 40 times slower with the generic type where the compiler cannot specialize on the type of the argument.

SINTEF and viralinstruction have written two very helpful blogposts that elaborate on how to get performant Julia code. In the Julia manual, there is a dedicated section on performance tips. I also highly recommend Steven Johson’s MIT lecture on Julia.

Julia has a growing ecosystem

The ecosystem of Python is impressive and mature. And the current era of Data Science has more or less evolved hand in hand with Python and its machine learning libraries such as SciPy and Tensorflow. Python was first released in 1991, more than twenty years before Julia. And as of september 2022, there are 400 000 projects in PyPi. In comparison, Julia has at the same time about 8 000 registered packages. In fact, already in 2017, Stefan Karpinski stated that:

Even NumPy itself was a bit rough back then. It’s entirely possible that if the SciPy ecosystem had been as well developed in 2009 as it is today, we never would have started Julia.

But the number of users and packages in Julia is growing, and I’ve definitely found what I need. To name a few I’ve used, there’s the Plots.jl and Makie.jl for plotting. There’s the Pandas equivalent DataFrames.jl for dataframes. There’s Flux.jl for machine learning. There’s ReinforcementLearning.jl for reinforcement learning, obviously. And there’s the PowerModels.jl for dealing with power systems.

Even though not all packages are as far developed as in Python, this has not been a major problem for me. And there are some benefits too of entering a package in its early phase. Your issues and requests have a better chance of being taken care of and it’s quite possible you are able to contribute to the project yourself, either with code or documentation. The two packages I’ve been deepest involved with, ReinforcementLearning.jl and GraphNeuralNetworks.jl, both have very friendly and helpful core developers.

A nice little Julia package: UnicodePlots.jl

I have to mention this neat little package in Julia, UnicodePlots.jl, which allows you to do simple plotting in the command line. For example,

using UnicodePlots

lineplot([cos, sin], 0:0.01:2pi)
barplot(["Oslo", "Stavanger", "Bergen", "Trondheim"],
        [2.244, 8.406, 4.92, 0.1],
        title = "Population")

which in my terminal renders as

How can Julia be used in power systems research?

My PhD project is about how to use reinforcement learning to handle failures in the power system. This involves a lot of load flow calculations. This is super easy in PowerModels.jl:

using PowerModels
# Casefile "case.14m" downloaded from https://egriddata.org/sites/default/files/case14.m

julia> results = compute_dc_pf("case14.m")
Dict{String, Any} with 5 entries:
  "optimizer"          => "\\"
  "termination_status" => true
  "objective"          => 0.0
  "solution"           => Dict{String, Any}("bus"=>Dict{String, Any}("4"=>Dict("va"=>-0.200698), "1"=>Dict("va"=>-0.0), "12"=>Dic…
  "solve_time"         => 0.000584841

Perhaps even more useful is to do an optimal power flow using the Ipopt.jl optimizer:

julia> using PowerModels
julia> using Ipopt

julia> solve_dc_opf("case14.m", Ipopt.Optimizer)


Number of nonzeros in equality constraint Jacobian...:      106
Number of nonzeros in inequality constraint Jacobian.:       80
Number of nonzeros in Lagrangian Hessian.............:        5

Total number of variables............................:       39
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        5
                     variables with only upper bounds:        0
Total number of equality constraints.................:       35
Total number of inequality constraints...............:       40
        inequality constraints with only lower bounds:       20
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:       20

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.6032287e+02 9.32e-01 1.89e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  8.6948059e+03 1.46e-16 6.86e+01  -1.0 7.93e-01    -  1.41e-02 1.00e+00h  1
   2  7.6460391e+03 6.66e-16 2.59e+00  -1.0 1.43e+00    -  5.95e-01 9.60e-01f  1
   3  7.6605209e+03 5.48e-16 3.39e+00  -1.0 1.48e-01    -  9.76e-01 5.00e-01f  2
   4  7.6544392e+03 6.66e-16 1.00e-06  -1.0 3.40e-02    -  1.00e+00 1.00e+00f  1
   5  7.6457228e+03 5.76e-16 2.83e-08  -2.5 5.25e-02    -  1.00e+00 1.00e+00f  1
   6  7.6432829e+03 4.44e-16 2.83e-08  -2.5 1.85e-02    -  1.00e+00 1.00e+00f  1
   7  7.6426423e+03 3.89e-16 1.50e-09  -3.8 5.43e-03    -  1.00e+00 1.00e+00f  1
   8  7.6425922e+03 6.66e-16 1.84e-11  -5.7 4.35e-04    -  1.00e+00 1.00e+00f  1
   9  7.6425918e+03 3.37e-16 2.51e-14  -8.6 3.75e-06    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 9

                                   (scaled)                 (unscaled)
Objective...............:   1.9106479435807935e+02    7.6425917743231739e+03
Dual infeasibility......:   2.5059035640133008e-14    1.0023614256053203e-12
Constraint violation....:   3.3653635433950058e-16    3.3653635433950058e-16
Variable bound violation:   8.9305948524944555e-09    8.9305948524944555e-09
Complementarity.........:   2.6421734022358593e-09    1.0568693608943436e-07
Overall NLP error.......:   2.6421734022358593e-09    1.0568693608943436e-07

Number of objective function evaluations             = 11
Number of objective gradient evaluations             = 10
Number of equality constraint evaluations            = 11
Number of inequality constraint evaluations          = 11
Number of equality constraint Jacobian evaluations   = 1
Number of inequality constraint Jacobian evaluations = 1
Number of Lagrangian Hessian evaluations             = 1
Total seconds in IPOPT                               = 0.005

EXIT: Optimal Solution Found.
Dict{String, Any} with 8 entries:
  "solve_time"         => 0.00622511
  "optimizer"          => "Ipopt"
  "termination_status" => LOCALLY_SOLVED
  "dual_status"        => FEASIBLE_POINT
  "primal_status"      => FEASIBLE_POINT
  "objective"          => 7642.59
  "solution"           => Dict{String, Any}("baseMVA"=>100, "branch"=>Dict{String, Any}("4"=>Dict{String, Any}("qf"=>NaN, "qt"=>N…
  "objective_lb"       => -Inf

Deep learning in Julia? Use Flux!

Flux.jl is the go-to library for deep learning in Julia. It is the Tensorflow and Pytorch equivalent.

Let’s do a simple example on training a linear regression model with intercept at 1.5 and slope 2.0. We first start by importing the necessary library and defining our data:

julia> using Flux
julia> x = rand(Float32, 1, 100)
#1×100 Matrix{Float32}:
# 0.971723  0.279388  0.718561  0.580433  0.319538  0.571858  0.808591  0.967042  0.511453  0.824858  0.0246731  0.924845  0.804781  0.0334803  0.864933  0.561797  0.459436  0.134477  0.397105  …  0.885082  0.444496  0.891089  0.452616  0.0905207  0.258379  0.736683  0.28399  0.624088  0.604748  0.275982  0.696864  0.735082  0.959392  0.580974  0.75722  0.763027  0.0576547

julia> intercept = 1.5
julia> slope = 2.0
julia> linear_noise(x, ϵ=0.1) = intercept + slope*x + randn()*ϵ
# linear_noise (generic function with 2 methods)

julia> y = linear_noise.(x) .|> Float32
#1×100 Matrix{Float32}:
# 3.25849  2.26917  2.8143  2.85554  2.12951  2.77449  3.19595  3.5636  2.56589  3.22538  1.59178  3.22211  3.02087  1.59712  3.15416  2.68378  2.3113  1.82769  2.43546  3.04249  2.02023  3.20036  …  1.91827  1.88766  3.24948  2.37444  3.17674  2.45275  1.57015  2.00188  2.83694  2.09291  2.79805  2.75575  2.06506  2.73183  2.99427  3.38747  2.55995  3.17439  3.12668  1.64066

The piping operator |> is very handy in Julia, the same is the broadcasting done by the .-operator. See the manual for more informations about these.

So now we have linear data with some noise added. Let’s also define a very simple neural network, with 1 layer and only one node. As we can see, there are two parameters in this model:

julia> model = Dense(1,1)
# Dense(1 => 1)       # 2 parameters

We can get the parameters of the model, a.k.a. the weights and the biases, by

julia> ps = Flux.params(model)
# Params([Float32[0.819971;;], Float32[0.0]])

We also define a loss function by

julia> loss(input, output) = Flux.Losses.mse(model(input), output)
# loss (generic function with 1 method)

julia> loss(x, y)
# 11.150247f0

So with random weight and bias, the loss is initially quite large. Let’s implement a simple gradient descent, loop over the data and train on the cpu (which is the default):

julia> epochs = 10_000

julia> opt = Flux.Descent()  # 0.1 is the default step size
# Descent(0.1)

julia> for epoch in 1:epochs
           ps = Flux.params(model)
           Flux.train!(loss, ps, [(x,y)], opt)

julia> loss(x,y)
# 0.009441698f0

julia> Flux.params(model)
# Params([Float32[1.9716092;;], Float32[1.5128096]])

Finally let’s plot the results using the standard Plots.jl library using the GR backend:

julia> using LaTeXStrings
julia> using Plots

julia> gr() # to activate the GR backend (not necessary as it is the default)
# Plots.GRBackend()

julia> p = scatter(x[1,:],y[1,:], label="observations", legend=:topleft, xlabel=L"x", ylabel=L"y")
julia> plot!(p, x[1,:], model(x)[1,:], label="model")

In the last function call, we see an example of a Julia convention: Any function that changes the argument in place has an "!" at the end of the identifier.

What’s not to like?

Time to first plot

The first time a Julia method is called it is compiled and this obviously prolongs the execution time. Sometimes this can lead to some delays (and frustration) when you’re typing interactively. Every time you close the Julia session and start a new one, you have to go throuh the compilations again. This manifests itself also in the so-called Time To First Plot (TTFP). Although this has been improved in almost every release of Julia, you still have two wait some long seconds. For example, these lines of code takes about 8s on my terminal from a fresh Julia session:

using Plots
x = range(0, 2pi, length=100)
plot(x, sin.(x))

Of course, the second time these lines are run from the same session it runs instantaneously.

In Python, the equivalent code took much less than 1s:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2*np.pi, num=100, endpoint=True)
plt.plot(x, np.sin(x))


The Julia debugger is also one part of Julia that has not brought forth the best of my feelings. Although it also has improved recently, to be frank I don’t use it much. In Python I used to step through the code in for example the Pycharm IDE. This gave much insight. In Julia, I code much more interactively through the REPL. I also implement shorter functions that are more testable and in a functional programming style. In addition, I use tools like Infiltrator.jl. After a year of coding like this, I hardly miss a debugger.

Closing remarks

To me, learning Julia has been a very positive experience. Julia is fast, fun to read and write, and it is applicable to a wide range of computational problems. But maybe more importantly, by learning a new language I have learned a lot on how computers and programming languages work together. And I have learned that there are alternatives to the Pythonic way. So if in doubt: Give Julia a try!

Is bid filtering effective against network congestion?

Earlier this year, I wrote an introduction to the bid filtering problem, and explained how my team at Statnett are trying to solve it. The system we’ve built at Statnett combines data from various sources in its attempt to make the right call. But how well is it doing its job? Or, more precisely, what is the effect on network congestion of applying our bid filtering system in its current form?

Kyoto. Photo: Belle Co

Without calling it a definitive answer, a paper I wrote for the CIGRE Symposium contains research results that provide new insight. The symposium was in Kyoto, but a diverse list of reasons (including a strict midwife) forced me to leave the cherry blossom to my imagination and test my charming Japanese phrases from a meeting room in Trondheim.

A quick recap

European countries are moving toward a new, more integrated way of balancing their power systems. In a country with highly distributed electricity generation, we want to automatically identify power reserves that should not be used in a given situation due to their location in the grid. If you would like to learn the details about the approach, you are likely to enjoy reading the paper. Here is the micro-version:

To identify bids in problematic locations, we need a detailed network model, we try to predict the future situation in the power grid, and then we apply a nodal market model which gives us the optimal plan for balancing activations for a specific situation. But since we don’t really know how much is going to flow into or out of the country, we optimize many times with different assumptions on cross-border flows. Each of the exchange scenarios tells its own story about which bids should -and shouldn’t- be activated. The scenarios don’t always agree, but in the aggregate they let us form a consensus result, determining which bids will be made unavailable for selection in the balancing market.

An unfair competition

Today, human operators at Statnett select power reserves for activation when necessary to balance the system, always mindful of their locations in the grid and potential bottlenecks. Their decisions on which balancing bids to activate – and not activate – often build years of operational experience and an abundance of real-time data.

Before discussing whether our machine can beat the human operators, it’s important to keep in mind that the bid filtering system will take part in a different context: the new balancing market, where everyday balancing will take place without the involvement of human operators. This will change the rules of the balancing game completely. While human operators constantly make a flow of integrated last-minute decisions, the new automatic processes are distinct in their separation of concerns and must often act much earlier to respect strict timelines.

Setting up simulations

The quantitative results in our paper come from simulating one day in the Norwegian power grid, using our detailed, custom-built Python model together with recorded data. The balancing actions -and the way they are selected- are different between the simulations.

The first simulation is Historical operation. Here, we simply replay the historical balancing decisions of the human operators.

The second simulation is Bid filtering. Here, we replace the historical human decisions with balancing actions selected by a zonal market mechanism that doesn’t see the internal network constraints or respect the laws of physics. The balancing decisions will often be different from the human ones in order to save some money. But before the market selects any bids, some of them are removed from the list by our bid filtering machine in order to prevent network congestion. We try not to cheat, the bid filtering takes place using data and forecasts available 30 minutes before the balancing actions take effect.

The third simulation is No filtering. Here we try to establish the impact on congestion of moving from today’s manual, but flexible operation to zonal, market-based balancing. This simulation is a parallel run of the market-based selection, but without pre-filtering any bids, and it provides a second, possibly more relevant benchmark.

Example from 09:30 on August 25, 2021. Red cells are balancing bids made unavailable in the bid filtering simulation. As a result, the market-based balancing will not select exactly the same bids in the Bid filtering scenario (black dots) and the No filtering scenario (white dots).

Power flow analyses

The interesting part of the simulation is when we inject the balancing decisions into the historical system state and calculate all power flows in the network. Comparing these flows to the operational limits reveals which balancing approaches are doing a better job at avoiding overloads in the network.

Example from 09:30 on August 25, 2021 showing reliability limits. Reliability limits in Norway restrict the flow on a combination of transmission lines, so-called Power Transfer Corridors (PTCs). These 13 PTC constraints are violated in one or more of the simulations.

The overloads are similar between the simulation, but they are not the same. To better understand the big picture, we created a congestion index that summarizes the resulting overload situation in a single value. The number doesn’t have any physical interpretation, but gives a relative indication of how severe the overload situation is.

Congestion index for reliability limits in the Norwegian system from August 25, 2021

When we run the simulation for 24 historical hours, we see that with market-based balancing, there would be overloads throughout the day. When we apply bid filtering and remove the bids expected to be problematic, overloads are reduced in 9 of the 24 hours, and we’re able to avoid the most serious problems in the afternoon.

No matter the balancing mechanism, the congestion index virtually never touches zero. Even the human operators with all their extra information and experience run into many of the same congestion problems. This shows that balancing activations play a role in the amount of congestion, but they are just one part of the story, along with several other factors.

With that in mind, if you’re going to let a zonal market mechanism decide your balancing decisions, it seems that bid filtering can have a clear, positive effect in reducing network overloads.

What do you think? Do you read the results differently? Don’t be afraid to get in touch, my team and I are always happy to discuss.


Being a trainee on the forecasting team, including some secret tips

I’ve been a trainee at Statnett’s Data Science unit over the past months and learned a lot. In this post I will give you a quick look at what I have been working on, and some helpful advice for how to survive and thrive as a data science trainee.

I was put on a team together with colleagues from both Statnett and Svenska kraftnät, producing short-term time series forecasts. These forecasts automate several system operation tasks which are handled manually today, and are essential in the transition to a new Nordic balancing model. Specifically, my team is responsible for predicting power consumption and power imbalance (i.e. the mismatch between production and consumption). Every five minutes, we need to provide predictions for the next two hours, with high demands on robustness and data quality. This is not a one-man job, but here are some of the areas where I put my effort.

Building apps and developing models

Our goal is to forecast the development of the system imbalance and power consumption in the near future. Statnett provides time series data describing power consumption, power production, power imbalance, weather forecasts and other relevant information. Utilizing these data, we can build mathematical models with machine learning algorithms that in the end are used to generate the predictions.

A comparison of our forecasted imbalance (orange) and the recorded, actual imbalance (blue).

Currently, a linear regression with a ridge penalty has proven to be the superior choice at Statnett, but we always aim to discover new models to keep improving our forecasts.

As a fun way of developing new models, the team recently set up a classic clash. The Norwegians on the team competed against the Swedes in improving the current model. For a week, communication across the border dropped to a bare minimum as we worked with high intensity to win the competition. Most importantly, we learned a lot on how to improve performance. And almost as important, the Norwegians won…

Live monitoring of apps and data streams

When a prediction model is fit for purpose, we deploy it in our production environment. At Statnett, we orchestrate our application containers with OpenShift, a PaaS equivalent to Kubernetes, but with stronger security policies. We queue our data with Apache Kafka, not only to maintain data flow between components, but also to deliver our end-product. Other than that, we also rely on PostgreSQL, Kotlin and GitLab in our everyday development.

To make the transition into more automatic operation of the power grid as smooth as possible, we need to make sure that our services are running robustly. In an effort to keep the required uptime, we monitor our services, including the full machine learning pipeline, with Grafana dashboards.

This dashboard provides a convenient overview of the system end-to-end. The dashboard shows the flow between different components, and also the status of streams and apps, indicated by colors. This makes it easy to identify and deal with issues when they occur.

Happy team, happy life

I’ve discovered how to cope with the daily struggles of working as a programmer, especially when working from home. The most important is that you end up laughing (or at least smiling) a couple of times each day while working. If you find that hard to achieve, feel free to apply these well-proven steps:

Sync each other’s Spotify playlists

and regularly dance in your office. I think this is how silent disco was invented.

Speak Swedish to your Swedish colleagues

Say things like tjenare and grabben to Swedish people in meetings. They will laugh, I don’t really know why.

Put sunglasses on while programming

Ask about other people’s day

take 5 minutes just to talk, while performing some paint art

and last, but not least,

Do not push to the main production branch as a trainee

Please be careful. If you do this, and something goes wrong, some of the more senior employees may not fix it right away, and might even enjoy seeing you stress to fix the issue. It has not happened to me, but if it did, it would be difficult to recover from.

Using data to handle intra-zonal constraints in the upcoming balancing market

Together with almost half of the Data science group at Statnett, I spend my time building automatic systems for congestion management. This job is fascinating and challenging at the same time, and I would love to share some of what our cross-functional team has done so far. But before diving in, let me first provide some context.

A good day at Statnett’s control centre. Photo: Trond Isaksen

The balancing act

Like other European transmission system operators (TSOs), Statnett is keeping the frequency stable at 50 Hz by making sure generation always matches consumption. The show is run by the human operators in Statnett’s control centre. They monitor the system continuously and instruct flexible power producers or consumers to increase or decrease their power levels when necessary.

These balancing adjustsments are handled through a balancing energy market. Flexible producers and consumers offer their reserve capacity as balancing energy bids (in price-volume pairs). The operators select and and activate as many of them as needed to balance the system. To minimize balancing costs, they try to follow the price order, utilizing the least expensive resources first.

While busy balancing the system, control centre operators also need to keep an eye on the network flows. If too much power is injected in one location, the network will be congested, meaning there are overloads that could compromise reliable operation of the grid. When an operator realizes a specific bid will cause congestion, she will mark it as unavailable and move on to use more expensive bids to balance the system.

In the Norwegian system, congestion does not only occur in a few well-known locations. Due to highly distributed generation and a relatively weak grid, there are hundreds of network constraints that could cause problems, and the Norwegian operators often need to be both careful and creative when selecting bids for activation.

A filtering problem

The Nordic TSOs are transitioning to a new balancing model in the upcoming year. A massive change is that balancing bids will no longer be selected by humans, but by an auction-like algorithm, just as in many other electricity markets. This algorithm (unfortunately, but understandably) uses a highly aggregated zonal structure, meaning that it will consider capacity restrictions between the 12 Nordic bidding zones, but not within them.

Consequently, the market algorithm will disregard all of the more obscure (but still important) intra-zonal constraints . This will -of course- lead to market results that simply do not fit inside the transmission grid, and there is neither time nor opportunity after the market clearing to modify the outcome in any substantial way.

My colleagues and I took the task of creating a bid filtering system. This means predicting which bids would cause congestion, and mark them as unavailable to prevent them from being selected by the market algorithm.

Filtering the correct bids is challenging. Network congestions depend on the situation in the grid, which is anything but constant due to variations in generation, consumption, grid topology and exchange with other countries. The unavailability decision must be made something like 15-25 minutes before the bid is to be used, and there is plenty of room for surprises during that period.

How it works

Although I enjoy discussing the details, I will give a only a short summary of the system works. To decide the availability of each bid in the balancing market, we have created a Python universe with custom-built libraries and microservices that follows these steps

  1. Assemble a detailed model of the Norwegian power system in its current state. Here, we combine the grid model from Statnett’s equpment database and combine it with fresh data from Statnett’s SCADA system.
  2. Adjust the model to reflect the expected state.
    Since we are looking up to 30 minutes ahead, we offset the effect of current balancing actions, and apply generation schedules and forecasts to update all injections in the model.
  3. Prepare to be surprised.
    To make more robust decisions in the face of high uncertainty in exchange volumes, we even apply a hundred or more scenarios representing different exchange patterns on the border.
  4. Find the best balancing actions for each scenario of the future.
    Interpreting the results of an optimal power flow calculation provides lots of insight into which bids should be activated (and which should not) in each exchange scenario.
  5. Agree on a decision.
    In the final step, the solution from each scenario is used to form a consensus decision on which bids to make unavailable for the balancing market algorithm.

An example result and how to read it

My friend and mentor Gerard Doorman recently submitted a paper for the 2022 CIGRE session in Paris, explaining the bid filtering system in more detail. I will share one important figure here to illustrate the final step of the bid filtering method. The figure shows the simulated result of running the bid filtering system at 8 AM on August 23, 2021.

Simulated bid filtering results from August 23, 2021. Bids are sorted horizontally, according to price, and grouped by their bidding zone (NO1 through 5) and direction (up/down).

Before you cringe from information overload, let me assist you by explaining that the abundance of green cells in the horizontal bar on top shows that the vast majority of balancing bids were decided to be available.

There are also yellow cells, showing bids that likely need to be activated to keep the system operating within its security limits, no matter what happens.

The red cells are bids that have been made unavailable to prevent network congestion. To understand why, we need to look at the underlying results in the lower panel. Here, each row presents the outcome of one scenario, and purple cells show the bids that were rejected, i.e. not activated in the optimal solution, although being less expensive than other ones that were activated for balancing in the same scenario (in pink).

The different scenarios often do not tell the same story, a bid that is rejected in one scenario can be perfectly fine in the next, it all depends on the situation in the grid and which other bids are also activated. Because of this ambiguity, business rules are necessary to create a reasonable aggregate result, and the final outcome will generally be imperfect.

So, does this filtering system have any postive impact on network congestions at Statnett? I will leave the answer for later, but if you’re curious to learn more, don’t hesitate to leave a comment.

Retrofitting the Transmission Grid with Low-cost Sensors

In Statnett, we collect large amounts of sensing data from our transmission grid. This includes both electric parameters such as power and current, and parameters more directly related to the individual components, such as temperatures, gas concentrations and so on.

Nevertheless, the state and behaviour of many of our assets are to a large extent not monitored and to some extent unobservable outside of regular maintenance and inspection rounds. We believe that more data can give us a better estimate of the health of each component, allowing for better utilization of the grid, more targeted maintenance, and reduced risk of component failure.

About a year ago, we therefore acquired a Pilot Kit from Disruptive Technologies, packed with a selection of miniature sensors that are simple to deploy and well-suited for retrofitting on existing infrastructure. We set up a small project in collaboration with Statnett R&D, where we set about testing the capabilities of this technology, and its potential value for Statnett.

Since then we’ve experimented with deploying these tiny IOT-enabled devices on a number of things, ranging from coffee machines to 420 kV power transformers.

To gauge the value and utility of these sensors in our transmission grid, we had to determine what to measure, how to measure, and how to gather and analyze the data.

This blog post summerizes what we’ve learnt so far, and evaluates some of the main use cases we’ve identified for instrumenting the transmission grid. The process is by no means finished, but we will describe the steps we have taken so far.

Small, Low-cost IoT-Sensors

Disruptive Technologies is a fairly young company whose main product is a range of low-cost, IoT-enabled sensors. Their lineup includes temperature sensors, touch sensors, proximity sensors, and more. In addition to sensors, you need one or more cloud connectors per area you are looking to instrument. The sensors transmit their signals to a cloud connector, which in turn streams the data to the cloud through mobile broadband or the local area network. Data from the sensors are encrypted, so a sensor can safely transmit through any nearby cloud connector without prior pairing or configuration.

The devices and the accompanying technology have some characteristics that make them interesting for power grid instrumentation, in particular for retrofitting sensors to already existing infrastructure:

  • Cost: The low price of sensors makes experimental or redundant installation a low-risk endeavour.
  • Size: Each sensor is about the size of a coin, including battery and the wireless communication layer.
  • Simplicity: Each sensor will automatically connect to any nearby cloud connector, so installation and configuration amounts to simply sticking the sensor onto the surface you want to monitor, and plugging the cloud connector into a power outlet.
  • Battery life: expected lifetime for the integrated battery is 15 years with 100 sensor readings per day.
  • Security: Data is transmitted with full end-to-end encryption from sensor to Disruptive’s cloud service.
  • Open API: Data are readily available for download or streaming to an analytics platform via a REST API.

Finding Stuff to Measure

Disruptive develops several sensor types, including temperature, proximity and touch. So far we have chosen to focus primarily on the temperature sensors, as this is the area where we see the most potential value for the transmission grid. We have considered use cases in asset health monitoring, where temperature is a well-established indicator of weaknesses or incipient failure. Depending on the component being monitored, unusual heat patterns may indicate poor electrical connections, improper arc quenching in switchgear, damaged bushings, and a number of other failure modes.

Asset management and thermography experts in Statnett helped us compile an initial list of components where we expect temperature measurements to be useful:

  • Transformers and transformer components. At higher voltage levels, transformers typically have built-in sensors for oil temperature, and modern transformers also tend to monitor winding and hotspot temperatures. Measurement on sub-components such as bushings and fans may however prove to be very valuable.
  • Ciruit breakers. Ageing GIS facilities are of particular importance both due their importance in the grid, and to the risk of environmental consequences in case of SF6 leakage. Other switchgear may also be of interest, since intermittent heat development during breaker operation will most likely not be uncovered by traditional thermography.
  • Disconnectors. Disconnectors (isolator switches) come in a number of flavors, and we often see heat development in joints and connection points. However, we know from thermography that hotspots are often very local, and it may be hard to predict in advance where on the disconnector the sensor should be placed.
  • Voltage and current transformers. Thermographic imaging has shown heat development in several of our instrument transformers. Continuous monitoring of temperature would enable us to better track this development and understand the relationship between power load, air temperature and transformer heating.
  • Capacitor banks. Thermography often reveals heat development at one or more capacitor in capacitor banks. However, it would require a very large number of sensors required to fully monitor all potential weak spots of a capacitor bank.

A typical use cases for the proximity sensors in the power system is open door or window alarms. Transmission level substations are typically equipped with alarms and video surveillance, but it might be relevant for other types of equipment in the field, or at lower voltage levels.

The touch sensors may for instance be used to confirm operator presence at regular inspection or maintenance intervals. Timestamping and georeferencing as part of an integrated inspection reporting application is a more likely approach for us, so we have not pursued this further.

Deployment on Transformer

Our first pilot deployment (not counting the coffee machine) was on three transformers and one reactor in a 420 kV substation. The sensors were deployed in winter when all components were energized, so we could only access the lower part of the main transformer and reactor bodies. This was acceptable, since the primary intention of the deployment was to gain experience with the process and hopefully avoid a few pitfalls in the future.

Moreover, the built-in temperature sensors in these components gave us a chance to compare readings from Disruptive sensors with the “true” inside temperature, giving us an impression of both the reliability of readings from Disruptive sensors and the ability to estimate oil temperature based on measurements on the outside of the transformer housing. We also experimented with different types of insulation covering the sensor, in order to gauge the effect of air temperature variations on sensor readings.

Deployment in Indoor Substation

Following the initial placement on the transformer bodies, we instrumented an indoor GIS facility, where we deployed sensors on both circuit breakers and disconnectors; plus one additional sensor to measure ambient temperature in the room. Since the facility is indoors and all energized components are fully insulated, this deployment was fairly straightforward. Our main challenge was that the cloud connector had a hard time connecting finding a cellular signal, but with a bit of fiddling we eventually found a few locations in the room with sufficient signal strength.

Concrete buildings can make it hard to find a good signal for mobile broadband. In this case we raised the cloud connector to the ceiling in search of a signal.

Deployment on Air Insulated Breakers

Finally, we took advantage of a planned disconnection of one of the busbars at a 300 kV facilty to instrument all poles of an outdoor SF6 circuit breaker. As mentioned above, disconnectors and instrument transformers were other instrument transformers. However, due to the layout of the substation, these were still energized so the circuit breakers were the only components we could gain access to.

Apart from monitoring the breakers, this deployment enabled us to test how the sensors reacted to being placed directly on uninsulated high-voltage equipment, and to check for any negative side-effects such as corona discharges.

Developing a Microservice for Data Ingestion

The sensors from Disruptive Technologies work by transmitting data from the sensor, via one or more cloud connectors, to Disruptive’s cloud software solution. The data are encrypted end-to-end, so the sensors may use any reachable cloud connector to transmit data.

As a precautionary measure, we opted to maintain an in-house mapping between sensor device ID and placement in the grid. This way, there is nothing outside Statnett’s systems to identify where the sensor data are measured.

Disruptive provides various REST APIs for streaming and downloading data from their cloud solution. For internal technical reasons, we chose to use a “pull” architecture, where we download new sensory readings every minute and pass them on to our internal data platform. We therefore developed a microservice that:

  1. Pulls data from Disruptive’s web service at regular intervals.
  2. Enriches the data with information about which component the sensor is placed on and how it is positioned.
  3. Produces each sensor reading as a message to our internal Kafka cluster.

From Kafka, the data are consumed and stored in a TimescaleDB database. Finally, we display and analyze the data using a combination of Grafana and custom-built dashboards.

The microservice runs on our internal Openshift Container Platform (PaaS).

We wrote a custom adapter that ingests data from Disruptive’s web services and produces messages on our internal Kafka cluster. From here, the data flows to Timescale and dashboards. The data are anonymous on Disruptive’s web service, so the adapter also adds contextual information to the messages.

The Value of Data

Do these newly acquired data help us take better decisions in the operation of the grid, and hence operate the grid in a smarter, safer, and more cost-effective way? This is really the litmus test for the value of retrofitting sensors and gathering more data about the components in the grid.

In this pilot project, we consulted field personnel and component experts regularly for advice on where and how to place sensors. However, it was the FRIDA project, a large cross-disciplinary digitalization project at Statnett, that really enabled and inspired the relevant switchgear expert to analyze the data we had collected in more detail.

Once he looked at the data, he discovered heat generation in one of the breakers, with temperatures that significantly exceeded what would be expected under normal operation. A thermographic imaging inspection was immediately ordered, which confirmed the readings from the Disruptive sensors.

The temperature sensors made us aware of high temperatures in one of the breakers, indicating a possible incipient failure of a critical component. As a result, the control centre changed the operating pattern on the substation and planned for mainentance of the unhealthy component. The figure shows the temperature on each phase of the breaker, with busbar A in the top panel and busbar B in the bottom one. The room temperature is shown as a dotted orange line.

Based on the available data, the breaker and thermography experts concluded that the breaker, altough apparently operating normally, shows signs of weakness and possibly incipient failure. This in turn lead to new parts being ordered and maintenance work planned for the near future.

While waiting for the necessary maintenance work to be performed, the operation of the substation has been adapted to reduce the stress on the weakened equipment. Until maintenance is performed, the limits for maximum amount of power flowing through the switchgear are now updated on a regular (and frequent) basis, based on the latest temperature readings from our sensors. Having access to live component state monitoring has also made the control centre able to make other changes to the operating pattern on the substation.

The control centre now continuously monitors the heat development in the substation, using the sensors from this pilot project. The new data has thus not only helped discovered an incipient failure in a critical component, it also allows us to keep operating the substation in a safe and controlled way while we are waiting for an opportunity to repair or replace the troublesome components.

Lessons Learned in the Field

Under optimal conditions, the wireless range, i.e. the maximum distance between sensors and cloud connectors, is 1000 meters with line of sight. Indoor, signals are reliably transmitted over ranges of 20+ meters in normal mode when the conditions are favorable. A weak signal will make the sensor transmit in “Boost mode”, and this quickly drains the battery.

High-voltage power transformer are huge oil-filled steel constructions, often surrounded by thick concrete blast walls. We quickly learned that these are not the best conditions for low-power wireless signal transfer. When attaching the sensors to the main body of the transformer, we observed that the communication distance was reduced to less than 10 meters and required line-of-sight between the sensors and the cloud connector.

One reason for the short transmission distance is that the metal body of the transformer absorbs most of the RF signal from the sensor. Disruptive therefore adviced us to use a bracket so that the sensors could be mounted at a 90 degree angle to the surface when mounting the sensors on large metal bodies. We used Lego blocks for this purpose. Disruptive have since developed a number of range extenders that are arguably better-looking.

Lego bracket vs. surface range extender. If you roll your own, keep in mind where on the sensor the actual temperature measurement is made, and rotate it accordingly. On the sensors in our Pilot Kit, this was the upper right corner of the device (not in the corner where the small orange dot is located).

Although we did experience an improvement in signal transmission range when mounting the sensors at an angle to the transformer body, sending signals around the corner of the transformer still turned out to be very challenging.

Disruptive have yet to develop a ruggedized cloud connector. The need to position the cloud connector such that line of sight was maintained, limited our ability to place the cloud connector inside existing shelters such as fuse cabinets. We therefore developed a weather-proof housing for outdoor use, so we could position the cloud connectors for optimal transmission conditions.

A weather-proof housing allowed us to position the cloud connectors for optimal transmission conditions.

All sensors have their device ID printed on the device itself. However, the text is tiny and the ID is long and complex. We opted to put a short label on each sensor using a magic marker in order to simplify the deployment process in the field. This simplistic approach was satisfactory for our pilot project, and required minimal support system development, but obviously does not scale to larger deployments.

As mentioned above, we have deployed a number of sensors directly on energized, unisolated, 300 kV components. We were quite curious to see how the sensors would cope with being mounted directly on high-voltage equipment. So far, the measurements from the sensors seem to be unaffected by the voltage. However, we have lost about 25 % of these sensors in less than a year due to high battery drainage. We suspect that this may be related to the environment in which they operate, but it may also be bad luck or related to the fact that our sensors are part of a pre-production pilot kit.

Finally, sticking coin-sized sensors onto metal surfaces is easy in the summer. It’s not always equally easy on rainy days or in winter, with cold fingers and icy or snow-covered components.

Other Things we Learned Along the Way

So far, our impression is the the data quality is good. The sensor readings are precise, and the sensors are mostly reliable. The snapshot from Grafana gives an impression of the data quality: as can be seen, there is very good correspondence between the temperature readings from the three different phases on busbar A after switching, when it has been disconnected.

However, both the cloud connectors and the SaaS-based architecture are weak spots where redundancy is limited. If a cloud connector fails, we risk loosing data from all the sensors communicating with that connector. This can to some extent be alleviated by using more cloud connectors. The SaaS architecture is more challenging: downtime on Disruptive’s servers sometimes affects the entire data flow from all their sensors.

Deployment is super easy, but an automated link to our ERP system would ease installation further, and significantly reduce the risk of human error when mapping sensors to assets.

The sensors are very small. This is cool, but if they were 10x larger with 10x stronger wireless signal, this would probably be better for many of our use cases.

Finally, communication can be challenging when dealing with large metal and concrete constructions. This goes for both the communication between sensor and cloud connector, and for the link between the cloud connector and the outside world. This is in most cases solvable, but may require some additional effort during installation.

Next Step: Monitor All Ageing GIS Facilities?

This pilot project has demonstrated that increased component monitoring can have a high value in the transmission grid.

Our main focus has been on temperature monitoring to assess component health, but the project has spurred quite a lot of enthusiasm in Statnett, and a number of other application areas have been suggested to us during the project.

One of the prime candidates for further rollout is to increase the monitoring of GIS substations, either by adding sensors to other parts of the facility, by selecting other substations for instrumentation, or both.

A further deployment of sensors must be aligned with other ongoing activities at Statnett, such as rollout of improved wireless communication at the substations. Nonetheless, we have learned that there are many valuable use cases for retrofitting sensors to the transmission grid, and we expect to take advantage of this kind of technology in the years to come.

%d bloggers like this: