The Story of Flent: The Flexible Network Tester

A lot of my research involves running various types of network tests to measure performance issues with a view to improving them. For instance, I have been measuring queue management algorithms that alleviate bufferbloat, and working on improving WiFi performance. All of this involves lots of experiments which are easy to get wrong in subtle ways, and tedious to repeat manually.

To handle this and stay sane, automation is key. I am the author of an open source testing tool that helps with this, called The Flexible Network Tester (or Flent). Since I first released it more than four years ago, it has been widely adopted in the bufferbloat community, and will even be included in the next release of Debian.

Flent was recently featured in Google’s Open Source Peer Bonus programme, which I am using as an occasion to write this post to tell the story of Flent, and highlight some of its notable features.

How Flent came to be

I first heard about the bufferbloat phenomenon almost five years ago, when I came across Jim Gettys' blog post. At the time, I was trying to figure out what to focus on for my master’s thesis, and thought that bufferbloat would be an excellent subject. So I started doing experiments, and it soon became clear to me that it was going to take a lot of experiments to get useful results, and that there were many subtle and not so subtle ways to mess up these experiments and thus invalidate the results.

Fortunately, the bufferbloat community (especially Dave Taht and Jim Gettys) was very helpful in pointing out all my mistakes, and eventually my work converged towards something that was actually useful. And since I have long been of the opinion that tedious and repetitive tasks should be ruthlessly automated, I started pouring all my newly acquired knowledge into Python code, and eventually released this as open source. Originally the tool was called “netperf-wrapper”, since Netperf was the main tool used to actually run the experiments, and I am terrible at coming up with names. I didn’t come up with the Flent name until two and a half years later, in the spring of 2015, when it had been clear for quite some time that the tool had outgrown the “netperf-wrapper” name.

Of course, once one starts automating a task, things invariably go as predicted by XKCD:

However, as more people have started using Flent, I have also received a steady stream of bug reports and feature requests, which has no doubt helped improve the quality of the tool. And so, working on Flent allows me to not only make my own work easier, but also to contribute something to the community and to help solve an important problem. And knowing that others find it useful makes this work incredibly rewarding.

Since beginning to work on Flent, I have continued working with bufferbloat and network performance, now as part of my PhD research. This means that Flent has grown features as I have needed them for my own research, as well as due to feature requests from the community. Consequently, Flent has now reached sufficient maturity that I decided to bump the version number to the big 1.0 earlier this year. Around the same time, the first major contribution from another contributor was merged: the TCP statistics parsing code written by Matthias Tafelmeier.

Notable Flent features

Flent has grown a number of features over the last four and a half years. The following gives an overview of the most notable features of the current version of Flent. For instructions on how to get started with Flent, see the quick start guide in the documentation.

Test tool orchestration and plotting

Of course, the main feature of Flent is test orchestration and plotting. The tests defined in Flent use other network benchmarking tools, most notably Netperf, to do the heavy lifting of actually sending packets across the network. Flent orchestrates these tools by starting the right number of instances of each, and parsing the results of all of them into a single JSON data file. From this data file, Flent offers a variety of plotting options to visualise the results.
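Because the results end up in a single JSON file, they are also easy to inspect outside of Flent. The following is a minimal sketch, assuming a gzip-compressed data file with a top-level "results" object that maps series names to lists of data points (where missing points are stored as null). The function names are made up for this illustration and are not part of Flent.

```python
import gzip
import json


def load_flent_data(path):
    """Load a gzip-compressed JSON data file into a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as fp:
        return json.load(fp)


def series_means(data):
    """Return the mean of each data series, skipping missing points.

    Assumes data["results"] maps series names to lists of numbers,
    with None standing in for missing data points.
    """
    means = {}
    for name, values in data.get("results", {}).items():
        points = [v for v in values if v is not None]
        means[name] = sum(points) / len(points) if points else None
    return means
```

Calling series_means(load_flent_data(path)) on a data file would then give one average per series, which can be a quick sanity check before opening the plots.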

This approach of composing existing test tools is surprisingly powerful, and it is possible to express a great variety of tests through it. One of the early tests developed by Dave Taht to expose bufferbloat is called the Realtime Response Under Load (RRUL) test, which was first implemented in Flent. It runs four TCP flows in both the upstream and downstream directions, while simultaneously measuring latency via UDP and ICMP.

An example plot of the RRUL test on a 10Mbps link with plenty of bufferbloat looks like this:

Example plot of the RRUL test.

Bufferbloat is clearly seen: Latency increases to around a second as soon as the TCP flows start up (after 5 seconds).

The top two graphs in the figure show the TCP download and upload flows, while the bottom graph shows the latency. The TCP flows start five seconds later than the latency measurement, and the impact of bufferbloat is clearly seen in the sharp increase in latency after the TCP flows start up.
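For scripted use, an RRUL run can be launched from Python. The sketch below only builds the command line, using the options shown in the quick start guide (-H for the Netperf server, -l for the test length in seconds, -t for a title, -p and -o for the plot type and output file). The helper name and the host name in the usage note are placeholders, not anything Flent defines.

```python
def rrul_command(host, length=60, title=None, plot=None, output=None):
    """Build an argument list for running the RRUL test against host."""
    cmd = ["flent", "rrul", "-l", str(length), "-H", host]
    if title:
        cmd += ["-t", title]   # free-form text stored with the results
    if plot:
        cmd += ["-p", plot]    # plot type, e.g. "all_scaled"
    if output:
        cmd += ["-o", output]  # file to write the plot to
    return cmd
```

The resulting list can be passed to subprocess.run(rrul_command("netserver.example.org"), check=True), assuming flent is installed and a Netperf server is reachable at that (hypothetical) address.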

The RRUL test is only one of the (at the time of writing) 78 tests shipped with Flent. And creating a new test is simply a matter of creating a configuration file that defines which instances of the supported test tools to start. There is support for dependencies among test tools (such as starting one after another exits), watchdog timers to start or stop tools after a set amount of time, and support for parameterising tests from the command line. Together, this makes Flent capable of torturing your network in a surprising number of ways.

Batch automation

A typical workflow when testing a network is to vary one or more parameters in the environment (such as the bandwidth of an emulated link, or the queue management algorithm on the bottleneck) and re-run the same test. Flent has support for automating this by way of a batch run facility, where a series of tests can be specified in a configuration file, along with parameters to vary. Flent will then repeat the tests as configured and save the resulting data files with appropriately tagged file names.

While running a batch of tests, Flent can optionally execute arbitrary commands before and after each test run, which allows configuration of the environment. The commands executed can be fed information about the current settings of the variables being varied for the test run, and Flent will abort the test if a command fails.
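The general shape of such a batch run can be sketched in a few lines of Python. This is not Flent's batch facility or its configuration format, just an illustration of the idea under simple assumptions: commands are given as argument lists, "{name}" placeholders are filled in with the current variable values, and any failing command stops the whole run. All names here are hypothetical.

```python
import itertools
import subprocess


def run_batch(test_cmd, variables, pre_cmd=None, post_cmd=None,
              run=subprocess.run):
    """Run test_cmd once for every combination of the given variables.

    variables maps a variable name to the list of values to try.
    Each command is a list of strings that may contain "{name}"
    placeholders. The run argument exists so the runner can be
    swapped out, for example when testing.
    """
    names = sorted(variables)
    completed = []
    for values in itertools.product(*(variables[n] for n in names)):
        settings = dict(zip(names, values))
        for template in (pre_cmd, test_cmd, post_cmd):
            if template is None:
                continue
            argv = [part.format(**settings) for part in template]
            # With check=True, subprocess.run raises CalledProcessError
            # on a non-zero exit status, which aborts the batch here.
            run(argv, check=True)
        completed.append(settings)
    return completed
```

A call such as run_batch(test_cmd, {"qdisc": ["pfifo", "fq_codel"], "rate": ["10mbit"]}, pre_cmd=setup_cmd) would then configure the environment and run the test once per combination of queueing discipline and rate.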

In my own experiments, I have used this batch facility to do things like configure bottleneck links, configure wireless testbeds and manage automatic packet dumps of the traffic while a test is running. This allows me to run a week-long test series completely unattended, and come back to a treasure trove of test data. Which brings me to the next important feature of Flent… The interactive GUI.

Interactive GUI for data exploration

With automated test orchestration, suddenly producing data is no longer the tedious part of doing experimental network analysis. Rather, time can be spent analysing the data. This is a good thing; but it requires good tools for data analysis.

To aid in this, Flent includes an interactive GUI that makes it easy to load a series of data files and flip through plots of them. Multiple data files can be combined into a single plot, metadata can be explored, and data series are highlighted when hovering over the plot legends, to make it easy to see what one is looking at. There is also limited support for running tests from the GUI.

The GUI looks like this:

The Flent GUI.

This shows the same RRUL test as above, but with the FQ-CoDel queue management algorithm installed on the bottleneck link (and so almost no bufferbloat). The plot from above is open in a different tab in the GUI, which allows for easy comparison.

Metadata capture and storage

One of the challenges when running many tests is keeping track of which data sets correspond to which configuration, and verifying that the configuration has actually been applied correctly. Flent aids in this by capturing metadata and storing it in the data files. The metadata includes a rich selection of information about the network environment on the host running Flent, and there is support for capturing additional metadata from remote hosts and for adding custom data. The GUI has an interface to browse the metadata, and selected items can be shown in the overview of open data files (the right-most column in the screenshot above).
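With the metadata stored alongside the results, checking that a run used the intended configuration can be automated as well. The following is a small sketch that compares a metadata dictionary against expected values. The function name and the keys in the usage note are hypothetical and chosen only for illustration.

```python
def metadata_mismatches(metadata, expected):
    """Return {key: (expected, actual)} for every item that differs.

    A key that is missing from metadata is reported with an actual
    value of None.
    """
    return {
        key: (want, metadata.get(key))
        for key, want in expected.items()
        if metadata.get(key) != want
    }
```

For example, metadata_mismatches(meta, {"QDISC": "fq_codel"}) returns an empty dict when the stored value matches, and otherwise shows what was expected next to what was recorded.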

Secondary datasets

Sometimes it is useful to capture a secondary dataset along with the primary one generated by the test tool, to gain insight into how other parts of the system are behaving. Flent has support for this, and can be configured to capture one or more of:

The TCP socket stats for the active connections (window size and RTT as reported by the ss tool).
Linux qdisc statistics (as reported by tc -s qdisc) on the interface carrying the test traffic.
WiFi aggregation and airtime statistics from Linux debugfs entries.
CPU usage statistics.

All of these can be captured either from the host running Flent, or from a configured remote host (such as the bottleneck router in the test setup). This facility has been instrumental in my WiFi work.
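To give an idea of what capturing such a dataset involves, here is a minimal sketch of parsing the counters printed by tc -s qdisc. It is not the parser Flent uses. It assumes the common output layout of a "qdisc ..." header line followed by "Sent ..." and "backlog ..." lines, with the backlog given as plain digits, and ignores everything else.

```python
import re

_QDISC_RE = re.compile(r"^qdisc (?P<kind>\S+) (?P<handle>\S+) dev (?P<dev>\S+)")
_SENT_RE = re.compile(
    r"Sent (?P<bytes>\d+) bytes (?P<pkts>\d+) pkt "
    r"\(dropped (?P<dropped>\d+), overlimits (?P<overlimits>\d+) "
    r"requeues (?P<requeues>\d+)\)"
)
_BACKLOG_RE = re.compile(r"backlog (?P<backlog_bytes>\d+)b (?P<backlog_pkts>\d+)p")


def parse_tc_stats(output):
    """Parse the text output of tc -s qdisc into a list of dicts."""
    qdiscs = []
    for line in output.splitlines():
        line = line.strip()
        header = _QDISC_RE.match(line)
        if header:
            # A new qdisc section starts; later counters belong to it.
            qdiscs.append(header.groupdict())
            continue
        if not qdiscs:
            continue
        for regex in (_SENT_RE, _BACKLOG_RE):
            match = regex.match(line)
            if match:
                counters = {k: int(v) for k, v in match.groupdict().items()}
                qdiscs[-1].update(counters)
    return qdiscs
```

Sampling this at regular intervals during a test gives a time series of drops and backlog that can be lined up against the latency measurements.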

Future directions

Even though Flent has recently reached the important (if somewhat arbitrary) 1.0 milestone, development is by no means standing still. Features are added as I need them for my own experimental work, or as they are suggested or contributed by others. And of course, as with all software projects, there is ongoing maintenance work and bug fixing. Currently planned or in-progress work includes (in no particular order):

A revamp of the plotting code to simplify it and make it more flexible.
The ability to capture TCP statistics for flows not initiated by Flent itself.
Full support for starting tests from the GUI.
Making the features for aggregating multiple datasets more accessible.
Improving the documentation.

Of course, contributions are always welcome. The code and issue tracker are available on GitHub.

Summary

Flent has been a great help in my own research by making it easy to run experiments and process the data gathered in them. Developing the tool as open source software is a rewarding experience, and I am happy that others find the tool useful. I hope the overview given above has succeeded in conveying the usefulness of Flent, and that you will give it a shot. The quick start guide is a good starting point if you want to take Flent for a spin.