Laurent Hoeltgen

Notebook Interfaces for Data Science Reports

2023-01-15  ·  6 min read  ·  Programming

A comparison of Mathematica, Jupyter, and Pluto

I compete at Hyrox races and work out a lot. I have already written about it here, where I advocated the ease of use of Mathematica for data evaluation tasks. I want to elaborate on this topic a bit further.

I keep a log of my workout performance with a fitness tracker, and the results from all Hyrox races are publicly available as well. Hence there is a lot of data readily available, and a few months ago I started doing some statistical analysis on my workouts. My original motivation was to improve my knowledge of statistics and data science techniques, plus some curiosity about where I stand with my fitness compared to other people. By now, however, I am really trying to extract helpful information to further improve my performance.

In this post I want to discuss the ease of use for evaluating and visualizing the data. Hence, the focus of this article is on how fast I can generate a plot of the relevant quantities with a given programming language or framework. I won't discuss things such as gathering data from the web. If anybody knows how to collect race results from the Hyrox website in a scripted way, I would be very interested. I have a semi-automated way, but even that one is not suitable for grabbing results from more than one race without spending a considerable amount of time.

Mathematica is really nice for importing, evaluating, and visualizing data. Importing the data is straightforward: you can read dozens of file formats with a simple call to the Import[] function. You also have a huge choice of statistical methods readily available to analyze your data.

However, I noticed one significant drawback: performance. I do a lot of sports. I had an important race at the beginning of December 2022 and in the months before it, I did up to twelve workouts per week. By now (i.e. January 2023) my whole collection of recorded workouts consists of roughly 400 data sets in CSV format. Processing all this data with Mathematica has been tedious. I want a tool that I can launch easily to have a quick look at the results, but the Mathematica notebook that I've used so far needed a few minutes to get things done.

The performance bottleneck has been the conversion and processing of data to quantities with physical units. I wanted 2022-11-14 to be a Time object and my body weight to be a quantity in kilograms. Having physical units attached to your numbers avoids all sorts of computational issues (e.g. comparing kilograms with seconds), but it massively increases run time. Hence I started looking for alternatives. I came up with the following setups:

It is actually possible to store data in a binary format that Mathematica can read pretty fast (cf. LocalObject). However, I found the documentation to be rather subpar and I wasn't very keen on having my data in another proprietary format either. One of the reasons why I like my current sports tracker so much is that I have access to the raw data in plain CSV format, which I can process with pretty much any tool of my choice. My CSV files are also stored in the cloud, so I can access them from any of my computers. LocalObject stores data at a fixed location on your machine. If I remember correctly, there's a counterpart to store things in the cloud, but that would cause additional costs.
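The trade-off with unit-tagged quantities described above can be sketched in a few lines. The following is a minimal, language-agnostic illustration written in Python. It is not the Mathematica code from the notebook, the class name Quantity and all numbers are made up for the example, and it says nothing about how Mathematica implements units internally. It only shows the principle: every operation first checks the unit, which catches nonsensical comparisons but adds work to each call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical unit-tagged number (illustration only)."""
    value: float
    unit: str

    def _check(self, other):
        # This check runs on every operation: it is the safety net,
        # and also the per-operation overhead.
        if not isinstance(other, Quantity) or other.unit != self.unit:
            other_unit = getattr(other, "unit", "no unit")
            raise TypeError(f"incompatible units: {self.unit} vs {other_unit}")

    def __add__(self, other):
        self._check(other)
        return Quantity(self.value + other.value, self.unit)

    def __lt__(self, other):
        self._check(other)
        return self.value < other.value


# Illustrative values, not real measurements.
weight_a = Quantity(80.0, "kg")
weight_b = Quantity(78.5, "kg")
run_time = Quantity(245.0, "s")

print(weight_b < weight_a)  # comparing kilograms with kilograms works

try:
    weight_a < run_time     # comparing kilograms with seconds does not
except TypeError as err:
    print(err)
```

With roughly 400 CSV files, a check like this runs for every value that is converted or compared, which gives an idea of why the convenience is not free.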

Python and Jupyter were discarded pretty fast as well. I have to use Python at work and I used Jupyter in the past. I like neither of them. Parsing the data is probably best done with Pandas and the visualizations with Matplotlib. From a technical point of view, there are probably few arguments against this solution. However, this is a hobby project and I want to have some fun. Therefore, Python is out.

In my opinion, Rust is a very interesting programming language that I would really like to learn at some point. However, for this kind of project I'd like to have some kind of GUI that can display plots and tables. It would probably have become too much work for somebody who is completely inexperienced with the language. Alas! I have to find something else where I can dig into Rust.

The last choice on my list is Julia and Pluto.jl. I know a bit of Julia and it is definitely a suitable tool for this kind of task. Furthermore, I've been reading lots of good things about Pluto.jl. So I've given it a try.

Pluto.jl is a notebook interface for Julia. It's very similar to Jupyter or the notebook interface for Mathematica.

Pluto.jl notebooks are simple Julia scripts that can be edited and run in a web browser. There's PlutoUI.jl, which enhances the notebooks with GUI elements such as checkboxes, sliders, and text fields. Plots.jl integrates well, too. Hence, the most basic requirements are fulfilled.

Pluto.jl tracks dependencies between your code cells. This means that if you define a variable a in one cell and assign it the value 5, then all cells that reference this variable will automatically be updated. If at some later point in time you change the value to 7, then all dependent cells get updated again. This also works if you break things. If you change the signature of a method, e.g. by adding an additional parameter, then all cells that call this method will tell you automatically that they need to be updated as well. As far as I know, this feature is available neither in Jupyter nor in Mathematica notebooks. However, it is incredibly helpful for keeping your notebook in a correct and functional state.
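The reactive behaviour described above can be imitated with a toy model. The sketch below is written in Python purely for illustration and is not how Pluto.jl is implemented: the Notebook class and its set_cell method are hypothetical, and each cell has to declare its inputs by hand, whereas Pluto.jl works the dependencies out from the code itself. It also does not handle cyclic dependencies.

```python
class Notebook:
    """Toy reactive notebook: each cell defines one variable from named inputs."""

    def __init__(self):
        self.cells = {}   # variable name -> (input names, function)
        self.values = {}  # variable name -> current value

    def set_cell(self, name, inputs, func):
        # Defining or editing a cell immediately re-runs it.
        self.cells[name] = (tuple(inputs), func)
        self._run(name)

    def _run(self, name):
        inputs, func = self.cells[name]
        if not all(i in self.values for i in inputs):
            return  # some input is not defined yet
        self.values[name] = func(*(self.values[i] for i in inputs))
        # Propagate the change to every cell that reads this variable.
        for other, (other_inputs, _) in self.cells.items():
            if name in other_inputs:
                self._run(other)


nb = Notebook()
nb.set_cell("a", [], lambda: 5)
nb.set_cell("b", ["a"], lambda a: a + 1)
nb.set_cell("c", ["a", "b"], lambda a, b: a * b)
print(nb.values)  # {'a': 5, 'b': 6, 'c': 30}

# Changing a from 5 to 7 updates b and c without touching their cells.
nb.set_cell("a", [], lambda: 7)
print(nb.values)  # {'a': 7, 'b': 8, 'c': 56}
```

In a notebook without this kind of tracking, b and c would keep their old values until their cells are re-evaluated by hand, which is exactly the kind of stale state the reactive model avoids.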

Furthermore, Pluto.jl does encourage proper coding of your notebook. Docstrings are nicely formatted, calls to @debug or @info are highlighted below your cells and if you write tests with @test, then you are notified about their status as well. Due to the reactive nature of the notebooks all this happens in a seamless way. You do not need to reevaluate your cells by hand all the time. It happens while you code.

The biggest gripe that I have with Pluto.jl is that you can't really include your own local files or packages. Basically all your code has to be in the notebook. On the one hand this is good, because if you change something, you can easily spot all locations where things broke. The downside is that your notebook file becomes rather large. This deficit is not unique to Pluto.jl. It's a design issue with notebook interfaces that I have experienced in the past with Jupyter and Mathematica as well.

Another minor annoyance that I've encountered is that you have to wrap multiple statements in a single cell in a begin/end block. Fortunately, if you forget to do so, the notebook reminds you about it with a concise error message.

Working with Pluto.jl has been mostly positive. I love the fact that the interface tracks the dependencies between the cells and updates them immediately (or tells you when something breaks). Also, PlutoUI.jl is an amazing extension for interactive usage.

How to cite this page

Hoeltgen, Laurent: Notebook Interfaces for Data Science Reports, 2023-01-15.

BibLaTeX code:
@online{pluto,
  author   = {Hoeltgen, Laurent},
  title    = {Notebook Interfaces for Data Science Reports},
  date     = {2023-01-15},
  language = {english},
  url      = {https://www.laurenthoeltgen.name/content/blog/pluto}
}

CC BY-SA 4.0 Laurent Hoeltgen. Last modified: September 19, 2025.
Website built with Franklin.jl and the Julia programming language.