Writing readable mathematical code

In Statnett, we solve problems that involve a significant amount of mathematics. We need code to solve these problems, and writing readable, maintainable, testable and efficient code that computes intricate mathematical equations can be challenging. This blog post will look at one way to tackle this challenge. As a practical example, we will discuss the architectural and style choices behind the (soon to be open source) linerate Python package, which implements the conductor heat model described in CIGRE TB 601.

Dynamic line rating and CIGRE TB 601

Let’s first look briefly at the usecase we consider:  CIGRE’s Technical Brochure number 601, or CIGRE601. CIGRE601 is an essential part of our dynamic line rating work, which you can read more about in one of our previous blog posts. Here, we focus on the architectural and programming style choices we made when we developed code that implements the CIGRE 601 thermal heating model.

The mathematics of CIGRE 601 is not overly complicated, but there is a lot of it, which can make it difficult to follow if you’re not careful. At its core, CIGRE601 is a heat balance equation:

$$P_J(I) + P_s(\mathbf{w}) = P_c(T, \mathbf{w}) + P_r(T, \mathbf{w}). $$

In other words, the heating from the current ($P_J$) plus the solar heating ($P_s$) is equal to the convective cooling ($P_c$) plus the radiative (black body) cooling ($P_r$). In other words: the heating from the sun and the electricity flowing through is equal to the cooling from the wind and the radiated heat waves when the line has a constant temperature. In our case, we want to provide a current, $I$, and weather conditions, $\mathbf{w}$, and use these to solve the above equation with respect to the line temperature, $T$.

To evaluate only a single term of the above equation, we must evaluate up to another 20 other “sub-equations”. In the code, we need to keep track of all these sub-equations: what should be their input and what should be their output? We also want the implementations to be readable correct, and efficient.


To get readable correct, and efficient code, we need to consider the software architecture. We need an architecture that strives for readability and testability without sacrificing speed (too much). How can we accomplish this?

A user-friendly frontend with a functional backend

As a basic user of a toolkit, I want to run code on the form

model = Cigre601(line_parameters, weather_parameters, time)
convective_cooling = model.compute_convective_cooling(temperature)

However, we also want access to the equations themselves. While the code above is excellent for computing properties for specific lines (like the total convective cooling or ampacity rating), it could be better structured for exploring how the equations behave. To facilitate such exploration, we also want access to the sub-terms themselves.

What’s more, if we use an object-oriented approach to implement these sub-terms, then it might be appealing to store their values as properties or attributes. By doing so, we can save time on repeated computations. However, we also hide what each function depends on and provides. We want an interface that makes it clear what information we need for each computation, and if the methods implementing the equations have access to self.anything_at_all, then it is unclear what they need.

Instead, I think that a functional approach, where each mathematical equation is implemented as a pure function— a deterministic function without state — is better. Therefore, linerate has a single function for (more or less) each equation in CIGRE 601. This makes is very easy to see how the code ties together with the equations. For example, the method that computes the Joule heating is defined by

def compute_joule_heating(current, resistance):
    I = current  # noqa
    R = resistance
    return I * I * R

And yes, this approach will create a minor additional computational overhead, but I found it insignificant compared to the benefits such an approach gave.

Still, I’m not saying that an object-oriented approach is only a bad thing. For linerate, we have the best of both worlds: a functional “backend” that makes it easy to explore each equation in CIGRE601 and an object-oriented “frontend” that makes it easy to compute ampacity ratings for a given power line.

Performant Python code

An essential part of mathematical code is its performance. We do not want to use native Python code for any numerical operation. Instead, we want to use the vectorisation capabilities of NumPy.

Specifically, we only use basic binary operators and NumPy universal functions (ufuncs). In the cases when we cannot use builtin ufuncs or binary operators (for example table lookups), I recommend using numba.vectorize to create your own blazing fast JIT-compiled ufuncs. By structuring the code this way, we never need to write another loop!

Say, for example, that we want to compute the line rating for all integer air temperatures between $0^\circ \text{C}$ and $6^\circ \text{C}$ and five linearly spaced wind speeds between $0 \frac{\text{m}}{\text{s}}$ and $1 \frac{\text{m}}{\text{s}}$. If we make sure that all code only uses binary operators and numpy ufuncs, then we only need to write


air_temperature = np.arange(0, 7).reshape(7, 1)
wind_speed = np.linspace(0, 1, 5).reshape(1, 5)
weather = Weather(
model = Cigre601(weather=weather, [...])
line_rating = model.compute_steady_state_ampacity(current=2000)

The magic happens in the highlighted lines. By adding singleton axes to the array, we enable NumPy array broadcasting. Broadcasting is almost magic! Typically, when we do array computations, we need array dimensions that match up. However, singleton dimensions are “wildcards” and will match any other dimension size by repeating values in the singleton dimension until they match up. I think the best way to understand broadcasting is by computing your own multiplication table:

x = np.arange(0, 5).reshape(1, 5)
y = np.arange(0, 5).reshape(5, 1)

print("x * y:")
print(x * y)
[[0 1 2 3 4]]
x * y:
[[ 0  0  0  0  0]
 [ 0  1  2  3  4]
 [ 0  2  4  6  8]
 [ 0  3  6  9 12]
 [ 0  4  8 12 16]]

We see that with broadcasting we can very easily compute the product of all possible combinations of entries in x and y. By only using ufuncs in our numerical code, we enable similar approaches, for example, to compute the ampacity rating for all possible combinations of weather- and line parameters.

Automatic testing of mathematical code

Okay, we got the code architecture sorted and the code super fast, so now we only need tests. Fortunately, another benefit of using pure functions for the equations is that they become straightforward to test.

Here, we will discuss two types of tests that we can make for our code: acceptance tests and property-based unit tests. The acceptance test defines a minimal set of requirements for the implementation and is provided by the standard. In CIGRE 601, we can find these under Annex E. Examples of calculation.

The more interesting part is the property-based unit tests. Property-based testing is a beautiful way to write elegant tests on the form:

given input that satisfies X, output should satisfy Y.

In Python, we use the Hypothesis library for property based testing, and I think the best way to understand it is with an example.

Consider the sorted-function. Once we sort a list, its numbers should be in ascending order, which means that lst[i] <= lst[i+1], and a test for this property could look like this:

from itertools import pairwise
import hypothesis.strategies as st
from hypothesis import given

def test_sorted_is_ascending(lst):
    lst = sorted(lst)
    for l1, l2 in pairwise(lst):
        assert l1 <= l2

Do you think this test will pass? Click here to try it out and see for yourself!

In our case, we have mathematical equations that we want to test. And the nice thing with mathematical equations is that they have a vast amount of properties we can test! For example, the equation to compute the convective cooling, $P_c$, states that

$$ P_c = \pi \lambda_f (T_s – T_a) \text{Nu}, $$

where $\lambda_f$ is the thermal conductivity of air, $\text{Nu}$ is the Nusselt number, and $T_s$ and $T_a$ is the temperature of the conductor (surface) and air, respectively. We know that this function is linear with respect to $\lambda_f$. Moreover, if we set $T_s – T_a = \text{Nu} = 1$, then $P_c = \pi \lambda_f$, which gives us the following property-based test

def test_convective_cooling_scales_linearly_with_thermal_conductivity_of_air(
    Nu = 1
    T_s = 1
    T_a = 0
    lambda_f = thermal_conductivity_of_air
    P_c = convective_cooling.compute_convective_cooling(T_s, T_a, Nu, lambda_f)
    assert P_c == np.pi * approx(lambda_f)

Here, we let Hypothesis select arbitrary values for $\lambda_f$, and only test whether whether $P_c = \pi \ \lambda_f$ when $\text{Nu} = T_s – T_a = 1$. If we didn’t use Hypothesis, we would either have to hard-code the test cases (e.g. with pytest.mark.parametrize) or iterate over random values for $\lambda_f$, which would make it less obvious what the code actually tests.

Style considerations

Selecting good variable names

One of the first style considerations we encounter when implementing mathematical equations with code is what we should name our variables. Do we use mathematical symbols or semantic names for the variable names?

Recall the equation for the convective cooling:

$$ P_c = \pi \lambda_f (T_s – T_a) \text{Nu}. $$

With Python, we could write the following equivalent function:

def compute_convective_cooling(
    lambda_f: float,
    T_s: float,
    T_a: float,
    Nu: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    return pi * lambda_f * (T_s - T_a) * Nu

However, while this makes it easy to see how the mathematics works, it could be more user-friendly. For example, do you remember what lambda_f represents?

We could, of course, switch it around and use argument names with semantic meaning:

def compute_convective_cooling(
    thermal_conductivity_of_air: float,
    surface_temperature: float,
    air_temperature: float,
    nusselt_number: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    return pi * thermal_conductivity_of_air * (surface_temperature - air_temperature) * nusselt_number

but then it becomes challenging to connect the code and the equations.

To get code where the variables both have semantic meaning, and the equations are easy to follow, we can combine these two approaches:

def compute_convective_cooling(
    thermal_conductivity_of_air: float,
    surface_temperature: float,
    air_temperature: float,
    nusselt_number: float,
) -> float:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    lambda_f = thermal_conductivity_of_air
    T_s = surface_temperature
    T_a = air_temperature
    Nu = nusselt_number
    return pi * lambda_f * (T_s - T_a) * Nu

This way, the API clarifies what we need to know to compute the convective cooling, while it is also easy to compare the code with the reference equations.

Selecting good function names

One of my main pet peeves when I deal with mathematical code is function names. All functions (unless you have a very good reason) should have a verb in their name. After all, a function does do something, so why not explicitly state what it does?

In mathematical code, naming functions with verbs is even more critical. Take, for example, the convective_cooling function we defined above. The function is very readable. We see what the inputs are, what it computes and how. Still, if we want to use this function to compute some convective cooling, we should also give that variable a good name. Any suggestions for a good name? I would probably want to go with convective_cooling. But wait, that’s the name of the function, so we might end up writing

convective_cooling = convective_cooling([...])

This is not ideal.

What is the solution to the naming problem? Simply add a verb to the function name. For example, “compute”:

def compute_convective_cooling(
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`.
    lambda_f = thermal_conductivity_of_air
    T_s = surface_temperature
    T_a = air_temperature
    Nu = nusselt_number
    return lambda * (T_s - T_a) * Nu

convective_cooling = compute_convective_cooling(

Sure, the function names get longer, but the code also becomes much more pleasant to read!

Using typing.annotated to provide units

A handy new Python feature (3.9) is annotated types. Annotated types allow us to provide extra information to our type hints. This is very useful for physical equations, where we can use the annotations to represent units. In linerate, we do this in the units.py-file

from typing import Union

    from typing import Annotated
except ImportError:  # Python version <3.9
    from typing_extensions import Annotated

import numpy as np
import numpy.typing as npt

FloatOrFloatArray = Union[float, np.float64, npt.NDArray[np.float64]]

OhmPerMeter = Annotated[FloatOrFloatArray, "Ω/m"]
Ampere = Annotated[FloatOrFloatArray, "A"]
Radian = Annotated[FloatOrFloatArray, "rad"]

We use these annotations in the functions that implement the mathematical equations:

def compute_joule_heating(current: Ampere, resistance: OhmPerMeter) -> WattPerMeter:
    I = current  # noqa
    R = resistance
    return I * I * R

Using annotations like this makes it extraordinarily easy to know what units we use in the different functions. But unfortunately, the units are not type-checked, so we can still write the wrong units by accident. I’m hopeful that a package like Pint can provide static checking of units in the future, but that is not possible (at least for now).

Extensive use of documentation strings

I believe docstrings are particularly important when we implement mathematical equations. A bare minimum is listing the source of the equations (e.g. Equation (17) on page 24 of CIGRE TB 601.). Even better, if you’re using Sphinx for the documentation, you can include the sphinxcontrib.bibtex plugin, which enables you to link the documentation against a BibTeX database (e.g. Equation (17) on page 24 of :cite:p:`cigre601`.).

After listing which equation(s) each function represents, we may want also to describe the notation. I like using the function arguments and returns field of the docstring for this, preferably formatted with LaTeX and using either the NumPy docstring style or the Google docstring style. A nice benefit we get by including type hints is that Sphinx can automatically parse these to provide the types in the docstring, also. To enable this, set

autodoc_typehints = "description"
autodoc_typehints_description_target = "documented"

in the Sphinx conf.py file. Unfortunately, I have yet to figure out how to get it to include the return types properly, but at least we don’t need to worry about the input types.

A final and somewhat optional step is to also include information about the quantities in the docstring so that the user does not need to look at the references. If we combine all these steps, we get a docstring kind of like the one below

def compute_convective_cooling(
    surface_temperature: Celsius,
    air_temperature: Celsius,
    nusselt_number: Unitless,
    thermal_conductivity_of_air: WattPerMeterPerKelvin,
) -> WattPerMeter:
    r"""Compute the convective cooling of the conductor.

    Equation (17) on page 24 of :cite:p:`cigre601`, given by
    .. math::
        P_c = \lambda_f \ Nu \ (T_a - T_s).

        :math:`T_s~\left[^\circ\text{C}\right]`. The conductor surface temperature.
        :math:`T_a~\left[^\circ\text{C}\right]`. The ambient air temperature.
        :math:`Nu`. The nusselt number.
        :math:`\lambda_f~\left[\text{W}~\text{m}^{-1}~\text{K}^{-1}\right]`. The thermal
        conductivity of air at the given temperature.

    Union[float, float64, ndarray[Any, dtype[float64]]]
        :math:`P_c~\left[\text{W}~\text{m}^{-1}\right]`. The convective cooling of the conductor.
        Either due to wind, or passive convection, whichever is largest.

Closing notes

There is no single solution to writing maintainable and readable mathematical code. This blog post demonstrates the choices behind linerate, but I am sure there are ways to improve these decisions. My main pain point is units, and for a future project, I hope to check if Pint might be part of a solution to ensuring consistent type usage. Moreover, I would test the code more thoroughly if given infinite time. For example, instead of checking that a function is linear by fixing all parameters but one, we could numerically differentiate it and check that the derivative is constant almost everywhere. If you have any ideas for writing maintainable and readable mathematical code, please share them in the comments below!

Cimsparql: Loading power system data into pandas dataframes in Python

In 2019, we started working on a model that should be able to handle intra-zonal constraints in the upcoming balancing market. That methodology has been presented in a previous post in January 2022. In this post, we will focus on an open source Python library called cimsparql that we have developed to support this model. For the model to be able to perform any analysis, it needs data that describe the state of the power system. At Statnett, these data are available as CIM (Common Information Model) profiles. The data is made available through a triple store (GraphDB/Blazegraph/rdf4j), using a resource description framework which is a standard model for data interchange.

The information about the power system available in these CIM profiles can be used for different purposes, and what information should be extracted depends on the requirement of your model. In the previously presented post, a DC optimal power flow model is used. Thus we need data on generation, demand and transmission lines. The purpose of the cimsparql package is to extract this information from the triple store, through a set of predefined sparql queries, and loading them into Python as pandas dataframes. Cimsparql will also make sure that columns in the dataframes have correct types, either string, float or integer, as defined by the CIM standard.

Cimsparql uses the SPARQLWrapper library to remotely execute sparql queries, and extends it with extra functionality, assuming the data conform to the CIM standard. Even though the package is an important part of the balancing market model, it is open source available from github and can be installed using pip.

~/pip install cimsparql

Once the library is installed, it must be configured to query a triple store using the ServiceConfig class in cimsparql.graphdb. The example below assumes you have a graphdb server with a CIM model in a repository called “micro_t1_nl”. This test case, available at the cimsparql repository on github, is used to test the development of the predefined queries.

  >>> service_cfg = ServiceConfig(repo="micro_t1_nl")
  >>> model = get_cim_model(service_cfg)

If you need to provide other configurations such as server, username and password, this can be done with the same ServiceConfig class.

Once the model is configured, the data can be loaded into a pandas dataframe using the predefined queries. In the example below, topological node information is extracted from the triple store.

>>> bus = model.bus_data()
>>> print(bus.to_string())
                                           busname      un
795a117d-7caf-4fc2-a8d9-dc8f4cf2344a  NL_Busbar__4  220.00
6bdc33de-d027-49b7-b98f-3b3d87716615   N1230822413   15.75
81b0e447-181e-4aec-8921-f1dd7813bebc   N1230992195  400.00
afddd60d-f7e6-419a-a5c2-be28d29beaf9   NL-Busbar_2  220.00
97d7d14a-7294-458f-a8d7-024700a08717    NL_TR_BUS2   15.75

Here the values in the nominal voltage column has been converted to float values as defined by the CIM standard, while node and bus names are strings.

All the predefined queries can be executed using the cimsparql.model.CimModel class. Examples are the already shown bus_data as well as loads, synchronous_machines, ac_lines and coordinates. The latter extracts coordinates of all equipment in the model from the CIM Geographical Location profile. Cimsparql orders the rows in the dataframe such that it is straightforward to use with plotly’s map functionality. The example below was made in a Jupyter notebook.

df = model.coordinates()
lines = df.query("rdf_type == 'http://iec.ch/TC57/2013/CIM-schema-cim16#ACLineSegment'")
stations = df.query("rdf_type == 'http://iec.ch/TC57/2013/CIM-schema-cim16#Substation'")
center_x, center_y = df["x"].mean(), df["y"].mean()

fig = px.line_mapbox(lines, lon="x", lat="y", color="mrid", height=1000)
fig2 = px.scatter_mapbox(stations, lon="x", lat="y", color="mrid", size=[1]*len(stations))
fig.update_geos(countrycolor="black", showcountries=True, showlakes=True, showrivers=True, fitbounds="locations")

all_fig = go.Figure(data=fig.data + fig2.data, layout = fig.layout)
AC line segments and substations included in the model

The main goal of cimsparql is to read data for the purpose of running power flow analysis using sparql queries to read data from triple store into pandas dataframes in Python. Currently the package is used internally at Statnett, where we also have some data which is yet not covered by the CIM standard. Thus some of the queries contains a namespace which will probably only be used by Statnett. However, this should not pose any problem for the use of this package elsewhere, as these namespaces or columns have been made optional. So any query towards a data set that does not contain these, will just produce a column for the given namespace with NaN values.

The package can also be uses in cases where the predefined queries does not produce data for a specific purpose. In this case, the user can provide their own queries as a string argument to the get_table_and_convert method. The example below list out the numbers of ac line segments for each voltage level in your data.

>>> query='''
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?un (count(?mrid) as ?n) where { 
?mrid rdf:type cim:ACLineSegment;
   cim:ConductingEquipment.BaseVoltage/cim:BaseVoltage.nominalVoltage ?un.
} group by ?un'''
>>> df = model.get_table_and_convert(query)

So to summarize, the main contribution of cimsparl is a set of predefined queries for the purpose of running power flow simulations and type conversion of data that follows the CIM standard.

Julia: A powerful alternative to Python

Almost three years ago, I started on my journey towards an industrial PhD in reinforcement learning applied to power systems. At around that time, I had used Python daily for 6-7 years. But one day during the autumn of 2018, Thomas, the founder of our Data Science team, threw out a new idea to me, like he used to do at the time: "Have you ever tried Julia? You really should! Some guys at MIT were tired of the slowness and problems with Python so they decided to create a new language. And there’s a module for power systems too, the PowerModels.jl." So there it started for me. And at the start my PhD project I thought it was a good opportunity for me to learn a new programming language from scratch along the way.

Getting started with Julia

If you don’t have Julia installed already, I suggest you install it now so you can follow along through the rest of this post. If you’re on Linux, the easiest way to get the installation of Julia up and running is to use juliaup:

curl -fsSL https://install.julialang.org | sh

or have a look at the Julia homepage.

Once you’re up and running, the easiest way to install a package when you’re in
a Julia session is to just start using it by typing using <package name>, as
for example in

julia> using Plots
 │ Package Plots not found, but a package named Plots is available from a registry. 
 │ Install package?
 │   (@v1.8) pkg> add Plots 
 └ (y/n/o) [y]: 

Julia seeks to solve the two-language problem

Julia is an open-source, high-level, dynamically-typed language. It was started back in 2009 by a group of people at MIT: Stefan Karpinski, Jeff Bezanson, Viral B. Shah and Alan Edelmann. The first version was released in 2012. In their blogpost from february 2012, the authors stated they wanted a programming language "… [with] the speed of C… as usable for general programming as Python…, [and] as easy for statistics as Matlab". Following this, Julia seeks to solve the two-language problem. If you want a language that is dynamic, flexible and easy to prototype in, you usually have to sacrifice speed, as in Python and R. On the other hand, if you want performance-critical code you have to resort to fast, static and compiled languages such as C and C++.

The way Julia solves this is by having a native, feature-rich REPL (Read-Eval-Print Loop) and by having JIT (just in time)-compilation together with a flexible, parametric type system along with multiple dispatch. More on this later!

The Julia REPL is your best friend

When prototyping in Julia, the natural starting point is the Julia REPL, which in many ways behaves pleasantly like the iPython interface but also with so much more. The Julia REPL has four main modes (together with full history search):

  • The Julia code prompt
  • Help mode
  • Package manager
  • Shell mode

1) The standard Julia code prompt

This mode is invoked at startup and is the mode where you do all your prototyping. For example, illustrating 3 different ways to write a function:

add_one_line(x) = x + 1 # one-line
add_one_anon = x -> x + 1 # anonymous function
function add_one_full(x)
    return x + 1

2) The help mode

This is invoked by typing ? and give you quick access to the docstring of a function. So if we are in Julia prompt mode and type


Add x and y.
function myadd(x,y)
    return x + y

we can activate the help mode by typing ? and then type the function name:

help?> myadd
search: myadd


  Add x and y.

3) The package manager

Apart from the speed, this is perhaps my favourite feature of the Julia language. In Python, there are number of different environment and package managers, like "pip", "pipenv", "virtualenv" and "poetry". Choosing between them and understanding how they work together can be confusing and time-consuming. In Julia, the package manager is built into the language and it just works. The package manager is invoked by typing ] and you leave it by typing backspace. In an empty directory you can create a new environment like this:

(@v1.7) pkg> activate .
  Activating new project at `~/JuliaProjects/sandbox/blog`

(blog) pkg> add UnicodePlots
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/JuliaProjects/sandbox/blog/Project.toml`
  [b8865327] + UnicodePlots v3.1.0
    Updating `~/JuliaProjects/sandbox/blog/Manifest.toml`

(blog) pkg> 

This can also be done programatically in the Julia REPL like this:

julia> using Pkg; Pkg.add("UnicodePlots")
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [b8865327] + UnicodePlots v3.1.0
    Updating `~/.julia/environments/v1.7/Manifest.toml`

An environment is defined by two .toml-files. The Project.toml contains the packages you want in the environment together with version restrictions. The Project.toml is the file to commit to github. The file Manifest.toml is the detailed machine-generated file which contains all the information necessary to exactly reproduce a specific environment. Both files are constructed when you instantiate an environment and updated automatically when you add new packages.

4) The shell mode

By typing ; in the REPL you get instant access to a shell that behaves like a bash shell, which in many occasions could be really handy. But in practice I usually just open a new terminal with my regular bash shell.

Julia is not object-oriented – it relies on multiple dispatch

On the surface, the Julia syntax doesn’t differ too much from Python, the main differences being 1-based indexing and that functions and control structures in Julia are always closed by the end keyword. And as in Python, you don’t need to annotate the type of the variable. In this context, an important part of Julia, is the type inference. This means that the types are inferred from the actual variable values. For example,

julia> a = 1; typeof(a)

julia> a = 1.0; typeof(a)

julia> a = "Hello, word"; typeof(a)

julia> a = 1 + 1.0f0; typeof(a)

One main difference from Python is that Julia is not objected-oriented, so there are no classes and inheritance. Instead, Julia has a very flexible type hiearchy, with C-like structs as the main building block. This means that functions don’t belong to certain types or objects as they do in Python. Instead, Julia has polymorphism where functions can have the same name but which function to be used at runtime, or dispatched, depends on the types of the arguments of the function call. I write arguments in plural, because Julia has so called multiple dispatch, meaning that the language considers the types of all the arguments when choosing which function to execute. This is a remarkable feature. Let me show an example:

struct MyNumber

f(x) = "I am the fallback for one argument" 
f(x::Integer) = "I'm dispatched on integers"
f(x::Float64) = "I'm dispatched on floats"
f(x, y) = "Fallback for two arguments"
f(x::Int, y::Float64) = "First is Integer, second argument is Float64"
f(x::MyNumber) = "Hello, mynumber!"

mynumber = MyNumber(1.0)

f("Hello, world!")  # I am the fallback for one argument
f(1)  # I'm dispatched on integers
f(1.0)  # I'm dispatched on floats
f("Hello", 1)  # Fallback for two arguments
f(1, 2.0)  # First is Integer, second argument is Float64
f(mynumber)  # "Hello, mynumber!"

Julia is compiled just-in-time

This brings me to the second main difference: Julia is compiled. The compilation is done at runtime by a so-called JIT (just-in-time) compiler. JIT-compilation is also found in for example the Numba library in Python. In the example above, we have six different versions of the function called f. These versions are called methods in Julia. We can examine what methods of the function f above are actually defined by calling the methods function on f:

julia> methods(f)
# 6 methods for generic function "f":
[1] f(x::Int64, y::Float64) in Main at REPL[6]:1
[2] f(x::Integer) in Main at REPL[3]:1
[3] f(x::Real) in Main at REPL[4]:1
[4] f(x::MyNumber) in Main at REPL[7]:1
[5] f(x) in Main at REPL[2]:1
[6] f(x, y) in Main at REPL[5]:1

So at runtime, when the program is about to execute a specific function call, the compiler checks the types of all the actual input arguments of the function. If it has been called and compiled before with this combination of argument types, it uses a cached version, which of course is very fast. If not, a new version based on all the argument types of the current call is compiled and stored. This means we only need to compile functions actually used in the code.

Julia is fast because of code specialization

In principle, Julia code can get the speed of C for any user-defined type. This is in contrast to for example numpy in Python, where the speedups only apply to a limited collection of predefined datatypes. Compared to native Python, the speedup naturally varies a lot depending on the task, but figures of 2-10 times faster are not unheard of.

Even though modern data science tools in Python like Numpy and Tensorflow are wrappers around C/C++ and with that bring speed to the table, it is still useful to understand why pure Python is slow.

Coming from Python to Julia, the biggest change for me was to become more aware of the types of the variables and objects you are actually doing things with. "Is this an array of 64-bit floats?" "What are the types of the class members?" Working in pure Python, none of these questions seemed relevant because in Python a variable could bascially be of any type and things just still work.

So for example, when you want to sum an array of values in Python, each and every time you add two values the langauge needs to check the type of the values and at runtime find the right +-function to use in the summation and store the resulting value together with its type. There is no way to inform the language that all the values of an array are the same type, for example 64-bit precision floats.

How slow this is can actually be illustrated in Julia, where the type Any is the mother of all types and of which all other types is a subtype. The Julia type Any is equivalent to the the Python generic type. By using the Julia package BenchmarkTools we can compute statistics on the time spent on summing 10^7 random numbers in an array.

julia> using BenchmarkTools
julia> N = 10^7
julia> arr_any = Any[rand() for i in 1:N];
julia> arr_float64 = Float64[rand() for i in 1:N];
julia> @benchmark sum($arr_any)
BenchmarkTools.Trial: 31 samples with 1 evaluation.
 Range (min … max):  135.170 ms … 169.061 ms  ┊ GC (min … max):  0.00%16.89%
 Time  (median):     165.452 ms               ┊ GC (median):    16.91%
 Time  (mean ± σ):   163.840 ms ±   7.513 ms  ┊ GC (mean ± σ):  16.03% ±  4.23%

                                                       ▂█ ▄      
  ▄▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▆▄████▆▄▄▄ ▁
  135 ms           Histogram: frequency by time          169 ms <

 Memory estimate: 152.59 MiB, allocs estimate: 9999999.

julia> @benchmark sum($arr_float64)
BenchmarkTools.Trial: 1193 samples with 1 evaluation.
 Range (min … max):  4.047 ms …  4.706 ms  ┊ GC (min … max): 0.00%0.00%
 Time  (median):     4.165 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.181 ms ± 78.071 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▂▂▃▃▆▅█████████████████▇█▇▆▅▆▄▄▄▇▃▅▃▃▄▄▃▃▃▂▃▂▁▂▁▂▁▂▂▁▁▁▁▁ ▄
  4.05 ms        Histogram: frequency by time        4.42 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

As we can see, the compiler does its job and the median execution time is about 40 times slower with the generic type where the compiler cannot specialize on the type of the argument.

SINTEF and viralinstruction have written two very helpful blogposts that elaborate on how to get performant Julia code. In the Julia manual, there is a dedicated section on performance tips. I also highly recommend Steven Johson’s MIT lecture on Julia.

Julia has a growing ecosystem

The ecosystem of Python is impressive and mature. And the current era of Data Science has more or less evolved hand in hand with Python and its machine learning libraries such as SciPy and Tensorflow. Python was first released in 1991, more than twenty years before Julia. And as of september 2022, there are 400 000 projects in PyPi. In comparison, Julia has at the same time about 8 000 registered packages. In fact, already in 2017, Stefan Karpinski stated that:

Even NumPy itself was a bit rough back then. It’s entirely possible that if the SciPy ecosystem had been as well developed in 2009 as it is today, we never would have started Julia.

But the number of users and packages in Julia is growing, and I’ve definitely found what I need. To name a few I’ve used, there’s the Plots.jl and Makie.jl for plotting. There’s the Pandas equivalent DataFrames.jl for dataframes. There’s Flux.jl for machine learning. There’s ReinforcementLearning.jl for reinforcement learning, obviously. And there’s the PowerModels.jl for dealing with power systems.

Even though not all packages are as far developed as in Python, this has not been a major problem for me. And there are some benefits too of entering a package in its early phase. Your issues and requests have a better chance of being taken care of and it’s quite possible you are able to contribute to the project yourself, either with code or documentation. The two packages I’ve been deepest involved with, ReinforcementLearning.jl and GraphNeuralNetworks.jl, both have very friendly and helpful core developers.

A nice little Julia package: UnicodePlots.jl

I have to mention this neat little package in Julia, UnicodePlots.jl, which allows you to do simple plotting in the command line. For example,

using UnicodePlots

lineplot([cos, sin], 0:0.01:2pi)
barplot(["Oslo", "Stavanger", "Bergen", "Trondheim"],
        [2.244, 8.406, 4.92, 0.1],
        title = "Population")

which in my terminal renders as

How can Julia be used in power systems research?

My PhD project is about how to use reinforcement learning to handle failures in the power system. This involves a lot of load flow calculations. This is super easy in PowerModels.jl:

using PowerModels
# Casefile "case.14m" downloaded from https://egriddata.org/sites/default/files/case14.m

julia> results = compute_dc_pf("case14.m")
Dict{String, Any} with 5 entries:
  "optimizer"          => "\\"
  "termination_status" => true
  "objective"          => 0.0
  "solution"           => Dict{String, Any}("bus"=>Dict{String, Any}("4"=>Dict("va"=>-0.200698), "1"=>Dict("va"=>-0.0), "12"=>Dic…
  "solve_time"         => 0.000584841

Perhaps even more useful is to do an optimal power flow using the Ipopt.jl optimizer:

julia> using PowerModels
julia> using Ipopt

julia> solve_dc_opf("case14.m", Ipopt.Optimizer)


Number of nonzeros in equality constraint Jacobian...:      106
Number of nonzeros in inequality constraint Jacobian.:       80
Number of nonzeros in Lagrangian Hessian.............:        5

Total number of variables............................:       39
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        5
                     variables with only upper bounds:        0
Total number of equality constraints.................:       35
Total number of inequality constraints...............:       40
        inequality constraints with only lower bounds:       20
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:       20

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.6032287e+02 9.32e-01 1.89e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  8.6948059e+03 1.46e-16 6.86e+01  -1.0 7.93e-01    -  1.41e-02 1.00e+00h  1
   2  7.6460391e+03 6.66e-16 2.59e+00  -1.0 1.43e+00    -  5.95e-01 9.60e-01f  1
   3  7.6605209e+03 5.48e-16 3.39e+00  -1.0 1.48e-01    -  9.76e-01 5.00e-01f  2
   4  7.6544392e+03 6.66e-16 1.00e-06  -1.0 3.40e-02    -  1.00e+00 1.00e+00f  1
   5  7.6457228e+03 5.76e-16 2.83e-08  -2.5 5.25e-02    -  1.00e+00 1.00e+00f  1
   6  7.6432829e+03 4.44e-16 2.83e-08  -2.5 1.85e-02    -  1.00e+00 1.00e+00f  1
   7  7.6426423e+03 3.89e-16 1.50e-09  -3.8 5.43e-03    -  1.00e+00 1.00e+00f  1
   8  7.6425922e+03 6.66e-16 1.84e-11  -5.7 4.35e-04    -  1.00e+00 1.00e+00f  1
   9  7.6425918e+03 3.37e-16 2.51e-14  -8.6 3.75e-06    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 9

                                   (scaled)                 (unscaled)
Objective...............:   1.9106479435807935e+02    7.6425917743231739e+03
Dual infeasibility......:   2.5059035640133008e-14    1.0023614256053203e-12
Constraint violation....:   3.3653635433950058e-16    3.3653635433950058e-16
Variable bound violation:   8.9305948524944555e-09    8.9305948524944555e-09
Complementarity.........:   2.6421734022358593e-09    1.0568693608943436e-07
Overall NLP error.......:   2.6421734022358593e-09    1.0568693608943436e-07

Number of objective function evaluations             = 11
Number of objective gradient evaluations             = 10
Number of equality constraint evaluations            = 11
Number of inequality constraint evaluations          = 11
Number of equality constraint Jacobian evaluations   = 1
Number of inequality constraint Jacobian evaluations = 1
Number of Lagrangian Hessian evaluations             = 1
Total seconds in IPOPT                               = 0.005

EXIT: Optimal Solution Found.
Dict{String, Any} with 8 entries:
  "solve_time"         => 0.00622511
  "optimizer"          => "Ipopt"
  "termination_status" => LOCALLY_SOLVED
  "dual_status"        => FEASIBLE_POINT
  "primal_status"      => FEASIBLE_POINT
  "objective"          => 7642.59
  "solution"           => Dict{String, Any}("baseMVA"=>100, "branch"=>Dict{String, Any}("4"=>Dict{String, Any}("qf"=>NaN, "qt"=>N…
  "objective_lb"       => -Inf

Deep learning in Julia? Use Flux!

Flux.jl is the go-to library for deep learning in Julia. It is the Tensorflow and Pytorch equivalent.

Let’s do a simple example on training a linear regression model with intercept at 1.5 and slope 2.0. We first start by importing the necessary library and defining our data:

julia> using Flux
julia> x = rand(Float32, 1, 100)
#1×100 Matrix{Float32}:
# 0.971723  0.279388  0.718561  0.580433  0.319538  0.571858  0.808591  0.967042  0.511453  0.824858  0.0246731  0.924845  0.804781  0.0334803  0.864933  0.561797  0.459436  0.134477  0.397105  …  0.885082  0.444496  0.891089  0.452616  0.0905207  0.258379  0.736683  0.28399  0.624088  0.604748  0.275982  0.696864  0.735082  0.959392  0.580974  0.75722  0.763027  0.0576547

julia> intercept = 1.5
julia> slope = 2.0
julia> linear_noise(x, ϵ=0.1) = intercept + slope*x + randn()*ϵ
# linear_noise (generic function with 2 methods)

julia> y = linear_noise.(x) .|> Float32
#1×100 Matrix{Float32}:
# 3.25849  2.26917  2.8143  2.85554  2.12951  2.77449  3.19595  3.5636  2.56589  3.22538  1.59178  3.22211  3.02087  1.59712  3.15416  2.68378  2.3113  1.82769  2.43546  3.04249  2.02023  3.20036  …  1.91827  1.88766  3.24948  2.37444  3.17674  2.45275  1.57015  2.00188  2.83694  2.09291  2.79805  2.75575  2.06506  2.73183  2.99427  3.38747  2.55995  3.17439  3.12668  1.64066

The piping operator |> is very handy in Julia, the same is the broadcasting done by the .-operator. See the manual for more informations about these.

So now we have linear data with some noise added. Let’s also define a very simple neural network, with 1 layer and only one node. As we can see, there are two parameters in this model:

julia> model = Dense(1,1)
# Dense(1 => 1)       # 2 parameters

We can get the parameters of the model, a.k.a. the weights and the biases, by

julia> ps = Flux.params(model)
# Params([Float32[0.819971;;], Float32[0.0]])

We also define a loss function by

julia> loss(input, output) = Flux.Losses.mse(model(input), output)
# loss (generic function with 1 method)

julia> loss(x, y)
# 11.150247f0

So with random weight and bias, the loss is initially quite large. Let’s implement a simple gradient descent, loop over the data and train on the cpu (which is the default):

julia> epochs = 10_000

julia> opt = Flux.Descent()  # 0.1 is the default step size
# Descent(0.1)

julia> for epoch in 1:epochs
           ps = Flux.params(model)
           Flux.train!(loss, ps, [(x,y)], opt)

julia> loss(x,y)
# 0.009441698f0

julia> Flux.params(model)
# Params([Float32[1.9716092;;], Float32[1.5128096]])

Finally let’s plot the results using the standard Plots.jl library using the GR backend:

julia> using LaTeXStrings
julia> using Plots

julia> gr() # to activate the GR backend (not necessary as it is the default)
# Plots.GRBackend()

julia> p = scatter(x[1,:],y[1,:], label="observations", legend=:topleft, xlabel=L"x", ylabel=L"y")
julia> plot!(p, x[1,:], model(x)[1,:], label="model")

In the last function call, we see an example of a Julia convention: Any function that changes the argument in place has an "!" at the end of the identifier.

What’s not to like?

Time to first plot

The first time a Julia method is called it is compiled and this obviously prolongs the execution time. Sometimes this can lead to some delays (and frustration) when you’re typing interactively. Every time you close the Julia session and start a new one, you have to go throuh the compilations again. This manifests itself also in the so-called Time To First Plot (TTFP). Although this has been improved in almost every release of Julia, you still have two wait some long seconds. For example, these lines of code takes about 8s on my terminal from a fresh Julia session:

using Plots
x = range(0, 2pi, length=100)
plot(x, sin.(x))

Of course, the second time these lines are run from the same session it runs instantaneously.

In Python, the equivalent code took much less than 1s:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2*np.pi, num=100, endpoint=True)
plt.plot(x, np.sin(x))


The Julia debugger is also one part of Julia that has not brought forth the best of my feelings. Although it also has improved recently, to be frank I don’t use it much. In Python I used to step through the code in for example the Pycharm IDE. This gave much insight. In Julia, I code much more interactively through the REPL. I also implement shorter functions that are more testable and in a functional programming style. In addition, I use tools like Infiltrator.jl. After a year of coding like this, I hardly miss a debugger.

Closing remarks

To me, learning Julia has been a very positive experience. Julia is fast, fun to read and write, and it is applicable to a wide range of computational problems. But maybe more importantly, by learning a new language I have learned a lot on how computers and programming languages work together. And I have learned that there are alternatives to the Pythonic way. So if in doubt: Give Julia a try!

%d bloggers like this: