Bufferbloat and Beyond: Removing Performance Barriers in Real-World Networks
Høiland-Jørgensen, Toke
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). (DISCO)
ORCID iD: 0000-0001-5241-6815
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The topic of this thesis is the performance of computer networks. While network performance has generally improved with time, over the last several years we have seen examples of performance barriers limiting network performance. In this work we explore such performance barriers and look for solutions.

The problem of excess persistent queueing latency, known as bufferbloat, serves as our starting point: we examine its prevalence in the public internet, evaluate solutions for better queue management, and explore how to improve existing solutions to make them easier to deploy.

Since an increasing number of clients access the internet through WiFi networks, examining WiFi performance is a natural next step. Here we also look at bufferbloat, as well as the so-called performance anomaly, where stations with poor signal strengths can severely impact the performance of the whole network. We present solutions for both of these issues, and additionally design a mechanism for assigning policies for distributing airtime between devices on a WiFi network. We also analyse the “TCP Small Queues” latency minimisation technique implemented in the Linux TCP stack and optimise its performance over WiFi networks.

Finally, we explore how high-speed network processing can be enabled in software, by looking at the eXpress Data Path framework that has been gradually implemented in the Linux kernel as a way to enable high-performance programmable packet processing directly in the operating system’s networking stack.

A special focus of this work has been to ensure that the results are carried forward to the implementation stage, which is achieved by releasing implementations as open source software. This includes parts that have been accepted into the Linux kernel, as well as a separate open source measurement tool, called Flent, which is used to perform most of the experiments presented in this thesis, and also used widely in the bufferbloat community.

Abstract [en]

The topic of this thesis is the performance of computer networks in general, and the internet in particular. While network performance has generally improved with time, over the last several years we have seen examples of performance barriers limiting network performance. In this work we explore such performance barriers and look for solutions.

Our exploration takes us through three areas where performance barriers are found: the bufferbloat phenomenon of excessive queueing latency, the performance anomaly in WiFi networks and related airtime resource sharing problems, and the problem of implementing high-speed programmable packet processing in an operating system. In each of these areas we present solutions that significantly advance the state of the art.

The work in this thesis spans all three aspects of the field of computing, namely mathematics, engineering and science. We perform mathematical analysis of algorithms, engineer solutions to the problems we explore, and perform scientific studies of the network itself. All our solutions are implemented as open source software, including both contributions to the upstream Linux kernel and the Flent test tool, developed to support the measurements performed in the rest of the thesis.

Place, publisher, year, edition, pages
Karlstad: Karlstads universitet, 2018.
Series
Karlstad University Studies, ISSN 1403-8099 ; 2018:42
Keywords [en]
Bufferbloat, AQM, WiFi, XDP, TSQ, Flent, network measurement, performance evaluation, fairness, queueing, programmable packet processing
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kau:diva-69416
ISBN: 978-91-7063-878-7 (print)
ISBN: 978-91-7063-973-9 (electronic)
OAI: oai:DiVA.org:kau-69416
DiVA, id: diva2:1251705
Public defence
2018-11-23, 21A342, Eva Erikssonsalen, Karlstad, 09:15 (English)
Opponent
Supervisors
Projects
HITS, 4707
Funder
Knowledge Foundation
Note

Paper 6 was published as manuscript in the thesis.

The revised fulltext is identical to the original version with the exception that printing errors have been removed.

Available from: 2018-10-26 Created: 2018-09-27 Last updated: 2020-06-09 Bibliographically approved
List of papers
1. Measuring Latency Variation in the Internet
2016 (English). In: Proceedings of the 12th International Conference on emerging Networking EXperiments and Technologies, 2016, p. 473-480. Conference paper, Published paper (Refereed)
Abstract [en]

We analyse two complementary datasets to quantify the latency variation experienced by internet end-users: (i) a large-scale active measurement dataset (from the Measurement Lab Network Diagnostic Tool) which sheds light on long-term trends and regional differences; and (ii) passive measurement data from an access aggregation link which is used to analyse the edge links closest to the user.

The analysis shows that variation in latency is both common and of significant magnitude, with two thirds of samples exceeding 100 ms of variation. The variation is seen within single connections as well as between connections to the same client. The distribution of experienced latency variation is heavy-tailed, with the most affected clients seeing an order of magnitude larger variation than the least affected. In addition, there are large differences between regions, both within and between continents. Despite consistent improvements in throughput, most regions show no reduction in latency variation over time, and in one region it even increases.

We examine load-induced queueing latency as a possible cause for the variation in latency and find that both datasets readily exhibit symptoms of queueing latency correlated with network load. Additionally, when this queueing latency does occur, it is of significant magnitude, more than 200 ms in the median. This indicates that load-induced queueing contributes significantly to the overall latency variation.
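
As a rough illustration of the headline statistic in this abstract, the sketch below computes a per-connection latency variation and the share of connections that exceed 100 ms. It assumes that variation means the spread between the largest and smallest RTT sample of a connection; the abstract does not state the exact metric, and the function names and sample values are hypothetical rather than taken from the paper.

```python
def latency_variation_ms(rtt_samples_ms):
    """Spread between the largest and smallest RTT sample of one connection.

    Assumption: 'variation' means max minus min; the paper may use another metric.
    """
    if not rtt_samples_ms:
        raise ValueError("need at least one RTT sample")
    return max(rtt_samples_ms) - min(rtt_samples_ms)


def share_exceeding(variations_ms, threshold_ms=100.0):
    """Fraction of connections whose variation is strictly above the threshold."""
    if not variations_ms:
        raise ValueError("need at least one connection")
    return sum(v > threshold_ms for v in variations_ms) / len(variations_ms)


# Hypothetical RTT samples (ms) for three connections; not data from the paper.
connections = [
    [20.0, 25.0, 240.0],
    [30.0, 32.0, 35.0],
    [15.0, 180.0, 400.0],
]
variations = [latency_variation_ms(c) for c in connections]
fraction_above_100ms = share_exceeding(variations)
```

In this made-up sample, two of the three connections exceed the 100 ms threshold.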

Keywords
Latency, Bufferbloat, Access Network Performance
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-46999 (URN)
10.1145/2999572.2999603 (DOI)
978-1-4503-4292-6 (ISBN)
Conference
ACM CoNEXT 2016
Projects
SIDUS READY
Funder
Knowledge Foundation, 317700
Available from: 2016-11-01 Created: 2016-11-01 Last updated: 2019-11-04 Bibliographically approved
2. The Good, the Bad and the WiFi: Modern AQMs in a residential setting
2015 (English). In: Computer Networks, ISSN 1389-1286, E-ISSN 1872-7069, Vol. 89, p. 90-106. Article in journal (Refereed) Published
Abstract [en]

Several new active queue management (AQM) and hybrid AQM/fairness queueing algorithms have been proposed recently. They seek to ensure low queueing delay and high network goodput without requiring parameter tuning of the algorithms themselves. However, extensive experimental evaluations of these algorithms are still lacking. This paper evaluates a selection of bottleneck queue management schemes in a test-bed representative of residential Internet connections of both symmetrical and asymmetrical bandwidths as well as WiFi. Latency under load and the performance of VoIP and web traffic patterns are evaluated under steady state conditions. Furthermore, the impact of the algorithms on fairness between TCP flows with different RTTs, and also the transient behaviour of the algorithms at flow startup, are examined. The results show that while the AQM algorithms can significantly improve steady state performance, they exacerbate TCP flow unfairness. In addition, the evaluated AQMs severely struggle to quickly control queueing latency at flow startup, which can lead to large latency spikes that hurt the perceived performance. The fairness queueing algorithms almost completely alleviate the algorithm performance problems, providing the best balance of low latency and high throughput in the tested scenarios. However, on WiFi the performance of all the tested algorithms is hampered by large amounts of queueing in lower layers of the network stack inducing significant latency outside of the algorithms’ control.

Place, publisher, year, edition, pages
Elsevier, 2015
Keywords
Active queue management, Fairness queueing, Bufferbloat, Latency, Performance measurement, Wireless networks
National Category
Communication Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-37954 (URN)
10.1016/j.comnet.2015.07.014 (DOI)
000361403600007 ()
Projects
HITS, 4707
Funder
Knowledge Foundation
Available from: 2015-09-16 Created: 2015-09-16 Last updated: 2019-11-07 Bibliographically approved
3. Analyzing the Latency of Sparse Flows in the FQ-CoDel Queue Management Algorithm
2018 (English). In: IEEE Communications Letters, ISSN 1089-7798, E-ISSN 1558-2558, Vol. 22, no 11, p. 2266-2269. Article in journal (Refereed) Published
Abstract [en]

The FQ-CoDel queue management algorithm was recently published as an IETF RFC. It achieves low latency especially for low-volume (or sparse) traffic flows competing with bulk flows. However, the exact conditions under which a particular flow is considered to be sparse have not been well explored.

In this work, we analyse the performance characteristics of the sparse flow optimisation of FQ-CoDel, formulating the constraints that flows must satisfy to be considered sparse in a given scenario. We also formulate expressions for the expected queueing latency for sparse flows.

Then, using a numerical example, we show that for a given link and a given type of sparse flows (VoIP traffic), the number of sparse flows that a given bottleneck can service with low sparse flow latency is only dependent on the number of backlogged bulk flows at the bottleneck. Furthermore, as long as the maximum number of sparse flows is not exceeded, all sparse flows can expect a very low queueing latency through the bottleneck.

Place, publisher, year, edition, pages
IEEE, 2018
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-69414 (URN)
10.1109/LCOMM.2018.2871457 (DOI)
000449977700022 ()
Available from: 2018-09-27 Created: 2018-09-27 Last updated: 2018-11-29 Bibliographically approved
4. Piece of CAKE: A Comprehensive Queue Management Solution for Home Gateways
2018 (English). In: 2018 IEEE INTERNATIONAL SYMPOSIUM ON LOCAL AND METROPOLITAN AREA NETWORKS (LANMAN), IEEE, 2018, p. 37-42. Conference paper, Published paper (Refereed)
Abstract [en]

The last several years have seen a renewed interest in smart queue management to curb excessive network queueing delay, as people have realised the prevalence of bufferbloat in real networks.

However, for an effective deployment at today's last mile connections, an improved queueing algorithm is not enough in itself, as often the bottleneck queue is situated in legacy systems that cannot be upgraded. In addition, features such as per-user fairness and the ability to de-prioritise background traffic are often desirable in a home gateway.

In this paper we present Common Applications Kept Enhanced (CAKE), a comprehensive network queue management system designed specifically for home Internet gateways. CAKE packs several compelling features into an integrated solution, thus easing deployment. These features include: bandwidth shaping with overhead compensation for various link layers; reasonable DiffServ handling; improved flow hashing with both per-flow and per-host queueing fairness; and filtering of TCP ACKs.

Our evaluation shows that these features offer compelling advantages, and that CAKE has the potential to significantly improve performance of last-mile internet connections.

Place, publisher, year, edition, pages
IEEE, 2018
Series
IEEE Workshop on Local and Metropolitan Area Networks, E-ISSN 1944-0375
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-68630 (URN)
000447699400007 ()
Conference
24th IEEE International Symposium on Local and Metropolitan Area Networks (IEEE LANMAN)
Available from: 2018-07-30 Created: 2018-07-30 Last updated: 2019-11-14 Bibliographically approved
5. Ending the Anomaly: Achieving Low Latency and Airtime Fairness in WiFi
2017 (English). In: Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC ’17), USENIX - The Advanced Computing Systems Association, 2017, p. 139-151. Conference paper, Published paper (Refereed)
Abstract [en]

With more devices connected, delays and jitter at the WiFi hop become more prevalent, and correct functioning during network congestion becomes more important. However, two important performance issues prevent modern WiFi from reaching its potential: increased latency under load caused by excessive queueing (i.e. bufferbloat) and the 802.11 performance anomaly.

To remedy these issues, we present a novel two-part solution. We design a new queueing scheme that eliminates bufferbloat in the wireless setting. Leveraging this queueing scheme, we then design an airtime fairness scheduler that operates at the access point and doesn't require any changes to clients.

We evaluate our solution using both a theoretical model and experiments in a testbed environment, formulating a suitable analytical model in the process. We show that our solution achieves an order of magnitude reduction in latency under load, large improvements in multi-station throughput, and nearly perfect airtime fairness for both TCP and downstream UDP traffic. Further experiments with application traffic confirm that the solution provides significant performance gains for real-world traffic.

We develop a production quality implementation of our solution in the Linux kernel, the platform powering most access points outside of the managed enterprise setting. The implementation has been accepted into the mainline kernel distribution, making it available for deployment on billions of devices running Linux today.
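
The 802.11 performance anomaly named in this abstract can be illustrated with simple arithmetic. The sketch below is an idealised model and not the scheduler from the paper: it ignores protocol overhead, frame aggregation and contention, and the station rates are hypothetical. It compares giving every station the same amount of data per round (which produces the anomaly) with giving every station the same share of airtime.

```python
def throughput_equal_packets(rates_mbps):
    """Per-station throughput when every station sends the same amount of data per round.

    Sending one megabit at rate r takes 1/r seconds, so a round in which each
    station sends one megabit lasts sum(1/r) seconds.
    """
    round_time = sum(1.0 / r for r in rates_mbps)
    return [1.0 / round_time for _ in rates_mbps]


def throughput_equal_airtime(rates_mbps):
    """Per-station throughput when every station gets the same share of airtime."""
    share = 1.0 / len(rates_mbps)
    return [share * r for r in rates_mbps]


# Hypothetical PHY rates in Mbit/s: one fast station and one slow station.
rates = [100.0, 10.0]
anomaly = throughput_equal_packets(rates)
fair = throughput_equal_airtime(rates)
```

With these inputs, the equal-data model gives both stations roughly 9.1 Mbit/s, while equal airtime gives 50 and 5 Mbit/s: the slow station loses some throughput, but the total rises about threefold.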

Place, publisher, year, edition, pages
USENIX - The Advanced Computing Systems Association, 2017
Series
2017 Usenix Annual Technical Conference (Usenix Atc '17)
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-47000 (URN)
000428763500011 ()
978-1-931971-38-6 (ISBN)
Conference
2017 USENIX Annual Technical Conference (USENIX ATC 17). July 12–14, 2017, Santa Clara, CA, USA
Available from: 2016-11-01 Created: 2016-11-01 Last updated: 2018-11-13 Bibliographically approved
6. PoliFi: Airtime Policy Enforcement for WiFi
2019 (English). In: IEEE Wireless Communications and Networking Conference, WCNC, IEEE, 2019, p. 1-6, article id 8885440. Conference paper, Published paper (Refereed)
Abstract [en]

As WiFi grows ever more popular, airtime contention becomes an increasing problem. One way to alleviate this is through network policy enforcement. Unfortunately, WiFi lacks protocol support for configuring policies for its usage, and since network-wide coordination cannot generally be ensured, enforcing policy is challenging. However, as we have shown in previous work, an access point can influence the behaviour of connected devices by changing its scheduling of transmission opportunities, which can be used to achieve airtime fairness. In this work, we show that this mechanism can be extended to successfully enforce airtime usage policies in WiFi networks. We implement this as an extension of our previous airtime fairness work, and present PoliFi, the resulting policy enforcement system. Our evaluation shows that PoliFi makes it possible to express a range of useful policies. These include prioritisation of specific devices; balancing groups of devices for sharing between different logical networks or network slices; and limiting groups of devices to implement guest networks or other low-priority services. We also show how these can be used to improve the performance of a real-world DASH video streaming application.
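
One way to picture device prioritisation is as a weighted split of airtime. The sketch below assumes that a policy can be written as relative weights per station; the abstract does not describe how PoliFi actually represents or enforces policies, so the function name, station names and weights are all hypothetical.

```python
def weighted_airtime_shares(weights):
    """Split airtime among stations in proportion to their positive weights."""
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    total = sum(weights.values())
    return {station: w / total for station, w in weights.items()}


# Hypothetical policy: the laptop gets three times the airtime of each guest device.
shares = weighted_airtime_shares({"laptop": 3, "guest-1": 1, "guest-2": 1})
```

Under this hypothetical policy the laptop receives 60% of the airtime and each guest device 20%.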

Place, publisher, year, edition, pages
IEEE, 2019
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-69641 (URN)
10.1109/WCNC.2019.8885440 (DOI)
000519086300020 ()
9781538676462 (ISBN)
Conference
2019 IEEE Wireless Communications and Networking Conference, WCNC 2019; Marrakesh; Morocco; 15 April 2019 through 19 April 2019
Available from: 2018-10-16 Created: 2018-10-16 Last updated: 2020-04-23 Bibliographically approved
7. Adapting TCP Small Queues for IEEE 802.11 Networks
2018 (English). In: 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), IEEE, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

In recent years, the Linux kernel has adopted an algorithm called TCP Small Queues (TSQ) for reducing queueing latency by controlling buffering in the networking stack. This solution consists of a back-pressure mechanism that limits the number of TCP segments within the sender TCP/IP stack, waiting for packets to actually be transmitted onto the wire before enqueueing further segments. Unfortunately, TSQ prevents the frame aggregation mechanism in the IEEE 802.11n/ac standards from achieving its maximum aggregation, because not enough packets are available in the queue to build aggregates from, which severely limits achievable throughput over wireless links. This paper demonstrates this limitation of TSQ in wireless networks and proposes Controlled TSQ (CoTSQ), a solution that improves TSQ so that it controls the amount of data buffered while allowing the IEEE 802.11n/ac aggregation logic to fully exploit the available channel and achieve high throughput. Results on a real testbed show that CoTSQ leads to a doubling of throughput on 802.11n and up to an order of magnitude improvement in 802.11ac networks, with a negligible latency increase.

Place, publisher, year, edition, pages
IEEE, 2018
Series
IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications workshops, ISSN 2166-9589, E-ISSN 2166-9570
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-68631 (URN)
10.1109/PIMRC.2018.8581048 (DOI)
000457761900215 ()
978-1-5386-6009-6 (ISBN)
978-1-5386-6010-2 (ISBN)
Conference
29th IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). Bologna, Italy. 9-12 September 2018.
Available from: 2018-07-30 Created: 2018-07-30 Last updated: 2019-07-17 Bibliographically approved
8. The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel
2018 (English). In: CoNEXT '18 Proceedings of the 14th International Conference on emerging Networking EXperiments and Technologies, Association for Computing Machinery (ACM), 2018, p. 54-66. Conference paper, Published paper (Refereed)
Abstract [en]

Programmable packet processing is increasingly implemented using kernel bypass techniques, where a userspace application takes complete control of the networking hardware to avoid expensive context switches between kernel and userspace. However, as the operating system is bypassed, so are its application isolation and security mechanisms; and well-tested configuration, deployment and management tools cease to function.

To overcome this limitation, we present the design of a novel approach to programmable packet processing, called the eXpress Data Path (XDP). In XDP, the operating system kernel itself provides a safe execution environment for custom packet processing applications, executed in device driver context. XDP is part of the mainline Linux kernel and provides a fully integrated solution working in concert with the kernel's networking stack. Applications are written in higher level languages such as C and compiled into custom byte code which the kernel statically analyses for safety, and translates into native instructions.

We show that XDP achieves single-core packet processing performance as high as 24 million packets per second, and illustrate the flexibility of the programming model through three example use cases: layer-3 routing, inline DDoS protection and layer-4 load balancing.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2018
Keywords
XDP, BPF, Programmable Networking, DPDK
National Category
Computer Sciences
Identifiers
urn:nbn:se:kau:diva-69639 (URN)
10.1145/3281411.3281443 (DOI)
000455383800006 ()
978-1-4503-6080-7 (ISBN)
Conference
CoNEXT '18: International Conference on emerging Networking EXperiments and Technologies
Available from: 2018-10-16 Created: 2018-10-16 Last updated: 2019-11-09 Bibliographically approved
9. Flent: The FLExible Network Tester
2017 (English). In: VALUETOOLS 2017: Proceedings of 11th EAI International Conference on Performance Evaluation Methodologies and Tools, New York, NY: Association for Computing Machinery (ACM), 2017, p. 1-6, article id 271973. Conference paper, Published paper (Refereed)
Abstract [en]

Running network performance experiments on real systems is essential for a complete understanding of protocols and systems connected to the internet. However, the process of running experiments can be tedious and error-prone. In particular, ensuring reproducibility across different systems is difficult, and comparing different test runs from an experiment can be non-trivial.

In this paper, we present a tool, called Flent, designed to make experimental evaluations of networks more reliable and easier to perform. Flent works by composing well-known benchmarking tools to, e.g., run tests consisting of several bulk data flows combined with simultaneous latency measurements. Tests are specified in source code, and several common tests are included with the tool. In addition, Flent contains features to automate test runs, collect relevant metadata and interactively plot and explore datasets.

We showcase Flent's capabilities by performing a set of experiments evaluating the new BBR congestion control algorithm, using Flent's capabilities to reproduce experiments both in a controlled testbed and across the public internet. Our evaluation reveals several interesting features of BBR's performance.

Place, publisher, year, edition, pages
New York, NY: Association for Computing Machinery (ACM), 2017
Keywords
Network experimentation, Network performance analysis, Network measurement, Measurement tools
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-64766 (URN)
10.1145/3150928.3150957 (DOI)
978-1-4503-6346-4 (ISBN)
Conference
VALUETOOLS 2017, December 5–7, 2017, Venice, Italy
Available from: 2017-10-23 Created: 2017-10-23 Last updated: 2019-11-09 Bibliographically approved

Open Access in DiVA

Summary_KUS_2018_42_original (527 kB), 2131 downloads
File information
File name: FULLTEXT01.pdf; File size: 527 kB; Checksum SHA-512:
a1eabc497ac2854e43e87160201dad9d652cd30384b1182ce6f7dcf646fb017dfec696a36f4f1ab8b6d2621cb77369ffccbbce04821b333e5bf43c978b58739f
Type: fulltext; Mimetype: application/pdf
Forskningspodden with Toke Høiland-Jørgensen (30334 kB), 359 downloads
File information
File name: AUDIO01.mp3; File size: 30334 kB; Checksum SHA-512:
89136e8c421bd7d0f9faed21b520d91881fb70cbc2a9e76bc3592a1ad7ce1bd872a6dc74e8c00d8afd543090a647bfa948cb8baed39a869580c32c37df236653
Type: audio; Mimetype: audio/mpeg
KUS_2018_42_revised (595 kB), 259 downloads
File information
File name: FULLTEXT02.pdf; File size: 595 kB; Checksum SHA-512:
45f3df00041503fa93c4577c29db3f509b6bbd1851d9697bce23a1f5201dd7142a3a635df0ddac5021eb1d2371fb0ab26f80b12680f820cbe371e7f3e13991b1
Type: fulltext; Mimetype: application/pdf

Authority records

Høiland-Jørgensen, Toke

Total: 2390 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 3597 hits