A Dynamic Resource Controller for Resolving Quality of Service Issues in Modern Streaming Processing Engines
University of Sydney, AUS.
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). ORCID iD: 0000-0001-9194-010X
University of Sydney, AUS.
RMIT University, School of Science, AUS.
2020 (English)
In: 2020 IEEE 19th International Symposium on Network Computing and Applications, NCA 2020, Institute of Electrical and Electronics Engineers (IEEE), 2020
Conference paper, Published paper (Refereed)
Abstract [en]

Devising an elastic resource allocation controller for data analytical applications in virtualized data centers has received great attention recently, mainly because even a slight performance improvement can translate into huge monetary savings in practical large-scale execution. Apache Flink is among the modern streamed data processing run-times that can provide both low-latency and high-throughput computation to execute processing pipelines over high-volume and high-velocity data items under tight latency constraints. However, a yet-to-be-answered challenge in a large-scale platform with tens of worker nodes is how to resolve run-time violations of the quality of service (QoS) level in a multi-tenant data streaming platform, particularly when the amount of workload generated by different users fluctuates. Studies have shown that the static resource allocation algorithm (round-robin) used by default in Apache Flink suffers from a lack of responsiveness to sudden traffic surges that happen unpredictably at run-time. In this paper, we address the problem of resource management in a Flink platform for ensuring different QoS enforcement levels in a platform with shared computing resources. The proposed solution applies theoretical principles borrowed from closed-loop control theory to design a CPU and memory adjustment mechanism whose primary goal is to fulfill the different QoS levels requested by submitted applications, while resource interference is considered the critical performance-limiting factor. The performance evaluation is carried out by comparing the proposed resource allocation mechanism with two static heuristics (round-robin and class-based weighted fair queuing) in an 80-core cluster under multiple traffic patterns resembling sudden changes in the incoming workloads of low-priority streaming applications.
The experimental results confirm the stability of the proposed controller in regulating the underlying platform resources to smoothly follow the target values (QoS violation rates). In particular, the proposed solution achieves higher efficiency than the other heuristics, reducing the response time of high-priority applications by 53% while maintaining the enforced QoS levels during burst traffic periods.
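
The abstract does not give the controller's equations, so the following is only a minimal sketch of the general closed-loop idea it describes: measure an application's QoS violation rate, compare it with the target value, and adjust the CPU share accordingly. The function name `adjust_share`, the gain, and the share bounds are hypothetical choices for illustration, not the paper's design, and memory adjustment is left out.

```python
def adjust_share(current_share, measured_rate, target_rate,
                 gain=0.5, min_share=0.05, max_share=1.0):
    """One step of a simple incremental feedback controller (illustrative only).

    current_share  -- fraction of CPU currently given to the application
    measured_rate  -- QoS violation rate observed in the last interval
    target_rate    -- QoS violation rate the application is allowed
    gain, min_share, max_share -- assumed tuning values, not from the paper
    """
    # A positive error means the application violates QoS more often than allowed,
    # so it should receive more CPU; a negative error releases CPU.
    error = measured_rate - target_rate
    new_share = current_share + gain * error
    # Keep the result inside the feasible range of the shared platform.
    return max(min_share, min(max_share, new_share))


if __name__ == "__main__":
    share = 0.4
    # Hypothetical sequence of measured violation rates during a traffic burst.
    for rate in (0.30, 0.25, 0.15, 0.10):
        share = adjust_share(share, rate, target_rate=0.10)
        print(f"measured={rate:.2f} -> cpu share={share:.3f}")
```

Each step moves the share in proportion to the error and stops changing once the measured rate equals the target, which is the basic behaviour a feedback loop of this kind relies on.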

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020.
Keywords [en]
Apache Flink Streaming Platform, Elastic Auto-Tuning, Performance Modeling of Computer System, Quality of Services (QoS) Issues, Closed loop control systems, Computation theory, Controllers, Data handling, Economics, Pipeline processing systems, Resource allocation, Adjustment mechanisms, Analytical applications, Large scale platforms, Performance limiting factor, Quality of service issues, Streaming applications, Virtualized data centers, Weighted fair queuing, Quality of service
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kau:diva-83355
DOI: 10.1109/NCA51143.2020.9306697
ISI: 000661912700018
Scopus ID: 2-s2.0-85099730655
ISBN: 9781728183268 (print)
OAI: oai:DiVA.org:kau-83355
DiVA, id: diva2:1534385
Conference
19th IEEE International Symposium on Network Computing and Applications, NCA 2020, 24 November 2020 through 27 November 2020
Available from: 2021-03-05 Created: 2021-03-05 Last updated: 2021-08-05 Bibliographically approved


Authority records

Taheri, Javid
