Unchaining Microservice Chains: Machine Learning Driven Optimization in Cloud Native Systems
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). ORCID iD: 0000-0002-4825-8831
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

As the cloud native landscape flourishes, microservices emerge as a central pillar for contemporary software development, enabling agility, resilience, and scalability in modern computing environments. While these modular services promise opportunities, particularly in the transformative ecosystem of 5G and beyond, they also introduce a myriad of complexities. Notably, the migration from hardware-centric to software-defined environments, culminating in Virtual Network Functions (VNF), has facilitated dynamic deployments across cloud data centers. In this transition, VNFs are often deployed within cloud native environments as independent services, mirroring the microservices model. However, the advantage of flexibility in cloud native systems is shadowed by bottlenecks in computational resource allocation, sub-optimal service chain placements, and the perpetual quest for performance enhancement. Addressing these concerns is not just pivotal but indispensable for harnessing the true potential of microservice chains.

In this thesis, the inherent challenges presented by cloud native microservice chains are addressed through the development and application of various tools and methodologies. The NFV-Inspector is introduced as a foundational tool that employs a systematic approach to profile and analyze Virtual Network Functions and extracts the system KPIs essential for further modeling. Subsequent research introduces a Machine Learning (ML) based SLA-aware resource recommendation system for cloud native functions, which leverages regression modeling techniques to correlate key performance metrics. Following this, PerfSim is proposed as a performance simulation tool designed specifically for cloud native computing environments, aiming to improve the accuracy of microservice chain simulations. Further research is conducted on Service Function Chain (SFC) placement, emphasizing the balance between cost-efficiency and latency optimization. The thesis concludes by integrating Deep Learning (DL) techniques for service chain optimization, employing both Graph Attention Networks (GAT) and Deep Q-Learning (DQN), highlighting the intersection of DL techniques and SFC performance optimization.

Abstract [en]

In the dynamic cloud native landscape, microservices stand out as pivotal for modern software development, enhancing agility, resilience, and scalability. These services, crucial in the transformative 5G era, introduce complexities such as resource allocation, service chain placement, and performance optimization challenges. This thesis delves into these challenges, emphasizing the development and application of tools and methodologies specific to microservice chains.

Key contributions include the NFV-Inspector, which, while focusing on Virtual Network Functions, is instrumental in profiling and analyzing microservices, extracting vital KPIs for advanced modeling. Further, a Machine Learning-based SLA-aware system is introduced for resource recommendation in cloud native functions, utilizing regression modeling to link performance metrics. PerfSim, a simulation framework, is proposed for simulating microservice chains in cloud environments. The thesis also explores Service Function Chain (SFC) placement, aiming to balance cost-efficiency with latency optimization. The thesis concludes by integrating Deep Learning (DL) for service chain optimization, employing both Graph Attention Networks (GAT) and Deep Q-Learning (DQN), showcasing the potential of DL in SFC optimization.

Place, publisher, year, edition, pages
Karlstad: Karlstads universitet, 2023, p. 36
Series
Karlstad University Studies, ISSN 1403-8099 ; 2023:35
Keywords [en]
Cloud Native Computing, Service Mesh, Performance Modelling, Performance Optimization, Performance Simulation, Machine Learning, Resource Allocation
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kau:diva-97377; ISBN: 978-91-7867-420-6 (print); ISBN: 978-91-7867-421-3 (print); OAI: oai:DiVA.org:kau-97377; DiVA, id: diva2:1811762
Public defence
2024-01-17, 1B309, Sjöströmsalen, Universitetsgatan 2, Karlstad, 08:30 (English)
Available from: 2023-12-04 Created: 2023-11-14 Last updated: 2026-06-09. Bibliographically approved
List of papers
1. NFV-Inspector: A Systematic Approach to Profile and Analyze Virtual Network Functions
2018 (English). In: 2018 IEEE 7th International Conference on Cloud Networking (CloudNet), IEEE, 2018, p. 1-7. Conference paper, Published paper (Refereed)
Abstract [en]

Network Function Virtualization (NFV) focuses on decoupling network functions from proprietary hardware (i.e., middleboxes) by leveraging virtualization technology. Combining it with Software Defined Networking (SDN) enables us to chain network services much more easily and quickly. The main idea of using these technologies is to consolidate several Virtual Network Functions (VNFs) onto a smaller number of commodity servers to reduce costs, increase VNF fluidity and improve resource efficiency. However, the resource allocation and placement of VNFs in the network is a multifaceted decision problem that depends on many factors, including VNF resource demand characteristics, arrival rate, configuration of underlying infrastructure, available resources and agreed Quality of Service (QoS) in Service Level Agreements (SLAs). This paper presents a bottom-up open-source NFV analysis platform (NFV-Inspector) to (1) systematically profile and classify VNFs based on resource capacities, traffic demand rate, underlying system properties, placement of VNFs in the network, etc. and (2) extract/calculate the correlation among the QoS metrics and resource utilization of VNFs. We evaluated our approach using an emulated virtual Evolved Packet Core platform (Open5GCore) to showcase how complex relations among various NFV service chains can be systematically profiled and analyzed.
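
As an illustration of the correlation step named in this abstract, the following is a minimal sketch that computes a Pearson correlation coefficient between one resource-utilization series and one QoS series. The abstract does not state which correlation measure NFV-Inspector uses, so Pearson is an assumption here, and the function name and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements, not taken from the paper.
cpu_utilization = [0.20, 0.35, 0.50, 0.65, 0.80]
latency_ms = [4.1, 5.0, 6.2, 8.9, 13.5]
r = pearson(cpu_utilization, latency_ms)
print(f"correlation between CPU utilization and latency: {r:.2f}")
```

A coefficient near 1 or -1 suggests a strong linear relationship between the two series; it does not by itself establish that one causes the other.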

Place, publisher, year, edition, pages
IEEE, 2018
Series
IEEE International Conference on Cloud Networking, ISSN 2374-3239
Keywords
Classification, Network Function Virtualization, Profiling, Quality of Service, Software Defined Networking, Classification (of information), Open source software, Open systems, Outsourcing, Transfer functions, Virtual reality, Decoupling network, Evolved packet cores, Resource efficiencies, Resource utilizations, Service level agreement (SLAs), Software defined networking (SDN), Virtualization technologies
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-71277 (URN); 10.1109/CloudNet.2018.8549333 (DOI); 000465081600016 (); 2-s2.0-85060215258 (Scopus ID); 9781538668313 (ISBN)
Conference
7th IEEE International Conference on Cloud Networking, CloudNet 2018, 22 October 2018 through 24 October 2018
Projects
NFV Optimizer, 5276
Funder
Knowledge Foundation, 20160182
Available from: 2019-02-21 Created: 2019-02-21 Last updated: 2026-06-09. Bibliographically approved
2. Automated Analysis and Profiling of Virtual Network Functions: the NFV-Inspector Approach
2018 (English). In: 2018 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), IEEE, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Discovering insights about Virtual Network Function (VNF) resource demand characteristics will enable cloud vendors to optimize their underlying Network Function Virtualization (NFV) system orchestration and dramatically reduce CapEx and OpEx spending. However, analyzing large-scale NFV systems, especially in mobile network environments, is a challenging task and requires tailor-made approaches for each particular application. In this demo, we showcase NFV-Inspector, an open source and extensible VNF analysis platform that is capable of systematically benchmarking and profiling NFV deployments. Based on its pluggable framework, NFV-Inspector classifies VNF resource demand characteristics and correlates their Key Performance Indicators (KPIs) with system-level Quality of Service (QoS) measurements.

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
Classification, Network Function Virtualization, Platform, Profiling, Quality of Service
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-71388 (URN); 10.1109/NFV-SDN.2018.8725697 (DOI); 000475896900023 (); 978-1-5386-8281-4 (ISBN); 978-1-5386-8282-1 (ISBN)
Conference
IEEE Conference on Network Function Virtualization and Software Defined Networks, Verona, Italy, 27-29 November 2018
Projects
NFV Optimizer, 5276
Funder
Knowledge Foundation, 20160182
Available from: 2019-02-28 Created: 2019-02-28 Last updated: 2026-06-09. Bibliographically approved
3. A Performance Modelling Approach for SLA-Aware Resource Recommendation in Cloud Native Network Functions
2020 (English). In: 2020 6th IEEE Conference on Network Softwarization (NetSoft), IEEE, 2020, p. 292-300. Conference paper, Published paper (Refereed)
Abstract [en]

Network Function Virtualization (NFV) becomes the primary driver for the evolution of 5G networks, and in recent years, Network Function Cloudification (NFC) proved to be an inevitable part of this evolution. Microservice architecture also becomes the de facto choice for designing a modern Cloud Native Network Function (CNF) due to its ability to decouple components of each CNF into multiple independently manageable microservices. Even though taking advantage of microservice architecture in designing CNFs solves specific problems, this additional granularity makes estimating resource requirements for a Production Environment (PE) a complex task and sometimes leads to an over-provisioned PE. Traditionally, performance engineers dimension each CNF within a Service Function Chain (SFC) in a smaller Performance Testing Environment (PTE) through a series of performance benchmarks. Then, considering the Quality of Service (QoS) constraints of a Service Provider (SP) that are guaranteed in the Service Level Agreement (SLA), they estimate the required resources to set up the PE. In this paper, we used a machine learning approach to model the impact of each microservice's resource configuration (i.e., CPU and memory) on the QoS metrics (i.e. serving throughput and latency) of each SFC in a PTE. Then, considering an SP's Service Level Objectives (SLO), we proposed an algorithm to predict each microservice's resource capacities in a PE. We evaluated the accuracy of our prediction on a prototype of a cloud native 5G Home Subscriber Server (HSS). Our model showed 95%-78% accuracy in a PE that has 2–5 times more computing resources than the PTE.
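
The idea of modelling QoS as a function of resource configuration and then sizing for a service level objective can be sketched as follows. This is a deliberately simplified stand-in, not the paper's model: it fits a single-feature least-squares line (throughput against CPU cores) and inverts it for a target throughput. All function names and measurements are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def min_cpu_for_throughput(slope, intercept, target_rps):
    """Smallest CPU allocation whose predicted throughput meets the target."""
    if slope <= 0:
        raise ValueError("model predicts no throughput gain from more CPU")
    return max(0.0, (target_rps - intercept) / slope)


# Hypothetical benchmark results from a small test environment.
cpu_cores = [0.5, 1.0, 1.5, 2.0]
throughput_rps = [240.0, 490.0, 760.0, 1010.0]
slope, intercept = fit_line(cpu_cores, throughput_rps)
required_cores = min_cpu_for_throughput(slope, intercept, target_rps=2000.0)
print(f"predicted CPU needed for 2000 req/s: {required_cores:.2f} cores")
```

The target in this example lies well outside the measured range, so the prediction is an extrapolation; the accuracy range reported in the abstract for larger production environments is a reminder that such estimates degrade with distance from the measured data.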

Place, publisher, year, edition, pages
IEEE, 2020
Keywords
NFV, SDN, performance modeling, cloud, network, optimization
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-80081 (URN); 10.1109/NetSoft48620.2020.9165482 (DOI); 000623436400048 (); 2-s2.0-85091994999 (Scopus ID)
Conference
IEEE NetSoft 2020, 29 June-3 July 2020, Ghent, Belgium
Funder
Knowledge Foundation, 5276
Note

Virtual conference

Available from: 2020-09-06 Created: 2020-09-06 Last updated: 2026-06-09. Bibliographically approved
4. Performance benchmarking of virtualized network functions to correlate key performance metrics with system activity
2020 (English). In: Proceedings of the 11th International Conference on Network of the Future, NoF 2020, IEEE, 2020, p. 73-81, article id 9249199. Conference paper, Published paper (Refereed)
Abstract [en]

Industry is set to enter a new revolution (Industry 4.0) backed by high inter-connectivity. Therefore, leveraging virtualization technology to deploy networks as virtualized network functions (VNFs) has garnered attention. It helps network operators and service providers consolidate several VNFs on fewer off-the-shelf servers. This reduces capital and operational expenditures while improving resource efficiency. However, moving network functions from proprietary devices to standard servers comes with the profound cost of performance degradation. To overcome performance issues and meet service level agreement (SLA) requirements before taking solutions to the real world, sufficient verification and validation of VNFs is required. This is where Network Service Benchmarking (NSB) plays a crucial role. NSB identifies performance-compromising bottlenecks by systematically evaluating the capacity of general-purpose hardware resources, also known as network function virtualization infrastructure (NFVI), used to host single or multiple VNF instances. This paper presents a benchmarking methodology and framework to extract the correlation among the VNF quality of service (QoS) metrics and NFVI key performance indicators (KPIs). For evaluation, the VoerEir Touchstone platform is used to execute an iPerf-based benchmarking application that generates UDP-based workload between VNFs. The results demonstrated that CPU utilization and L1-L3 cache memory are statistically correlated with packets dropped (0.43 and 0.47, respectively) and bandwidth utilization (0.99 and 0.92, respectively).

Place, publisher, year, edition, pages
IEEE, 2020
Keywords
Bandwidth Utilization, Benchmarking, Cache Memory, Correlation, CPU, NFVI, Packets Dropped, UDP, VNF, Function evaluation, Quality of service, Transfer functions, Benchmarking methodology, Key performance indicators, Operational expenditures, Performance benchmarking, Performance degradation, Service Level Agreements, Verification-and-validation, Virtualization technologies, Network function virtualization
National Category
Computer Sciences Communication Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-83129 (URN); 10.1109/NoF50125.2020.9249199 (DOI); 2-s2.0-85097615214 (Scopus ID); 9781728180557 (ISBN)
Conference
11th International Conference on Network of the Future, NoF 2020, 12 October 2020 through 14 October 2020
Available from: 2021-02-21 Created: 2021-02-21 Last updated: 2026-06-09. Bibliographically approved
5. PerfSim: A Performance Simulator for Cloud Native Microservice Chains
2023 (English). In: IEEE Transactions on Cloud Computing, ISSN 2168-7161, no 2, p. 1395-1413. Article in journal (Refereed) Published
Abstract [en]

Cloud native computing paradigm allows microservice-based applications to take advantage of cloud infrastructure in a scalable, reusable, and interoperable way. However, in a cloud native system, the vast number of configuration parameters and highly granular resource allocation policies can significantly impact the performance and deployment cost of such applications. For understanding and analyzing these implications in an easy, quick, and cost-effective way, we present PerfSim, a discrete-event simulator for approximating and predicting the performance of cloud native service chains in user-defined scenarios. To this end, we proposed a systematic approach for modeling the performance of microservices endpoint functions by collecting and analyzing their performance and network traces. With a combination of the extracted models and user-defined scenarios, PerfSim can simulate the performance behavior of service chains over a given period and provides an approximation for system KPIs, such as requests' average response time. Using the processing power of a single laptop, we evaluated both simulation accuracy and speed of PerfSim in 104 prevalent scenarios and compared the simulation results with the identical deployment in a real Kubernetes cluster. We achieved ~81-99% simulation accuracy in approximating the average response time of incoming requests and ~16-1200 times speed-up factor for the simulation.
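
To make the notion of discrete-event simulation of a service chain concrete, here is a toy sketch under strong assumptions: each service has a single worker, a fixed processing time, and FIFO queueing, and network delay is ignored. It is not PerfSim's model or API; the function and its parameters are hypothetical.

```python
import heapq


def simulate_chain(arrival_times, service_times):
    """Average response time of requests crossing a chain of FIFO services.

    arrival_times: arrival time of each request at the first service.
    service_times: fixed processing time of each service in the chain.
    """
    if not arrival_times:
        raise ValueError("need at least one request")
    free_at = [0.0] * len(service_times)  # when each service next becomes idle
    # Each event is (time, request id, index of the service being entered).
    events = [(t, req_id, 0) for req_id, t in enumerate(arrival_times)]
    heapq.heapify(events)
    finished = {}
    while events:
        now, req_id, stage = heapq.heappop(events)
        if stage == len(service_times):
            finished[req_id] = now  # request has left the last service
            continue
        start = max(now, free_at[stage])  # wait if the service is busy
        free_at[stage] = start + service_times[stage]
        heapq.heappush(events, (free_at[stage], req_id, stage + 1))
    total = sum(finished[i] - t for i, t in enumerate(arrival_times))
    return total / len(arrival_times)


# Two requests arriving together at a two-service chain (hypothetical values).
print(simulate_chain([0.0, 0.0], [1.0, 1.0]))
```

Because the simulator only advances a clock between events instead of running real workloads, it can evaluate a scenario far faster than a real deployment, which is the kind of speed-up the abstract reports for PerfSim.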

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
performance simulator, performance modeling, cloud native computing, service chains, simulation platform
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-83686 (URN); 10.1109/TCC.2021.3135757 (DOI); 001004238600023 (); 2-s2.0-85121842188 (Scopus ID)
Funder
Knowledge Foundation, 20200067
Note

Article published as manuscript entitled "PerfSim: A Performance Simulator for Cloud Native Computing" in Gokan Khan's (2021) licentiate thesis: Performance Modelling and Simulation of Service Chains for Telecom Clouds

Available from: 2021-04-16 Created: 2021-04-16 Last updated: 2026-06-09. Bibliographically approved
6. Service Function Chain Placement for Joint Cost and Latency Optimization
2020 (English). In: Mobile Networks and Applications, ISSN 1383-469X, E-ISSN 1572-8153, Vol. 25, no 6, p. 2191-2205. Article in journal (Refereed) Published
Abstract [en]

Network Function Virtualization (NFV) is an emerging technology to consolidate network functions onto high volume storages, servers and switches located anywhere in the network. Virtual Network Functions (VNFs) are chained together to provide a specific network service, called a Service Function Chain (SFC). With regard to Quality of Service (QoS) requirements and network features and states, SFCs are served by performing two tasks: VNF placement and link embedding on the substrate networks. Reducing deployment cost is a desired objective for all service providers in cloud/edge environments to increase their profit from demanded services. However, increasing resource utilization in order to decrease deployment cost may increase service latency and consequently increase SLA violations and decrease user satisfaction. To this end, we formulate a multi-objective optimization model for joint VNF placement and link embedding in order to reduce deployment cost and service latency with respect to a variety of constraints. We then solve the optimization problem using two heuristic-based algorithms that perform close to optimum for large scale cloud/edge environments. Since the optimization model involves conflicting objectives, we also investigate Pareto optimal solutions that optimize multiple objectives as much as possible. The efficiency of the proposed algorithms is evaluated using both simulation and emulation. The evaluation results show that the proposed optimization approach succeeds in minimizing both cost and latency while the results are as accurate as the optimal solution obtained by Gurobi (5%).
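
The Pareto optimality mentioned in this abstract can be illustrated with a small sketch that filters candidate placements down to those not dominated in both cost and latency. The candidate names and numbers are hypothetical, and this brute-force filter is not the heuristic the paper proposes.

```python
def pareto_front(candidates):
    """Return the candidates not dominated in (cost, latency); lower is better.

    candidates: list of (name, cost, latency) tuples.
    """
    front = []
    for name, cost, latency in candidates:
        # Dominated: some other candidate is no worse on both objectives
        # and strictly better on at least one of them.
        dominated = any(
            c <= cost and l <= latency and (c < cost or l < latency)
            for _, c, l in candidates
        )
        if not dominated:
            front.append((name, cost, latency))
    return front


# Hypothetical placements with deployment cost and end-to-end latency (ms).
placements = [("A", 10, 50), ("B", 12, 40), ("C", 15, 45), ("D", 20, 30)]
print(pareto_front(placements))
```

Here placement C is dropped because B is both cheaper and faster, while A, B and D each represent a different trade-off between the two conflicting objectives.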

Place, publisher, year, edition, pages
Springer, 2020
Keywords
Cloud/edge computing, Network function virtualization, Optimization, Service chain placement, Cost reduction, Embeddings, Heuristic algorithms, Multiobjective optimization, Optimal systems, Pareto principle, Quality of service, Transfer functions, Conflicting objectives, Emerging technologies, Latency optimizations, Multi-objective optimization models, Optimization approach, Optimization modeling, Pareto optimal solutions, Quality-of-service requirement (QoS)
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-83100 (URN); 10.1007/s11036-020-01661-w (DOI); 000591258800001 (); 2-s2.0-85096385460 (Scopus ID)
Available from: 2021-02-21 Created: 2021-02-21 Last updated: 2026-06-09. Bibliographically approved
7. Graph Attention Networks and Deep Q-Learning for Service Mesh Optimization: A Digital Twinning Approach
2024 (English). In: Proceedings- IEEE International Conference on Communications / [ed] Valenti M., Reed D., Torres M., Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 2913-2918. Conference paper, Published paper (Refereed)
Abstract [en]

In the realm of cloud native environments, Kubernetes has emerged as the de facto orchestration system for containers, and the service mesh architecture, with its interconnected microservices, has become increasingly prominent. Efficient scheduling and resource allocation for these microservices play a pivotal role in achieving high performance and maintaining system reliability. In this paper, we introduce a novel approach for container scheduling within Kubernetes clusters, leveraging Graph Attention Networks (GATs) for representation learning. Our proposed method captures the intricate dependencies among containers and services by constructing a representation graph. The deep Q-learning algorithm is then employed to optimize scheduling decisions, focusing on container-to-node placements, CPU request-response allocation, and adherence to node affinity and anti-affinity rules. Our experiments demonstrate that our GATs-based method outperforms traditional scheduling strategies, leading to enhanced resource utilization, reduced service latency, and improved overall system throughput. The insights gleaned from this study pave the way for a new frontier in cloud native performance optimization and offer tangible benefits to industries adopting microservice-based architectures.
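
For readers unfamiliar with Q-learning, the core update that deep Q-learning approximates with a neural network can be sketched as below. The abstract does not describe the paper's network, state encoding, or reward, so everything here (function names, reward, discount factor, learning rate) is an illustrative assumption, shown on scalar values only.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Temporal-difference target: reward plus discounted best next value."""
    if done or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction lr of the way to the target."""
    return q_value + lr * (target - q_value)


# Hypothetical step: placing a container on a node yields reward 1.0, and the
# estimated values of the placements available next are 0.5 and 2.0.
target = td_target(1.0, [0.5, 2.0], gamma=0.5)
print(q_update(0.0, target, lr=0.5))
```

In a deep Q-learning setup the same target would serve as the regression label for the network's prediction of the chosen action's value, with the state here presumably derived from the graph representation the abstract mentions.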

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
component, formatting, insert, style, styling
National Category
Computer Sciences Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kau:diva-97430 (URN); 10.1109/ICC51166.2024.10622616 (DOI); 2-s2.0-85202817543 (Scopus ID); 978-1-7281-9055-6 (ISBN); 978-1-7281-9054-9 (ISBN)
Conference
IEEE International Conference on Communications (ICC), Denver, USA, June 9-13, 2024.
Note

This article was included as a manuscript in the doctoral thesis entitled "Unchaining Microservice Chains: Machine Learning Driven Optimization in Cloud Native Systems" KUS 2023:35.

Available from: 2023-11-20 Created: 2023-11-20 Last updated: 2026-06-09. Bibliographically approved

Open Access in DiVA

fulltext (9363 kB), 750 downloads
File information
File name: FULLTEXT02.pdf. File size: 9363 kB. Checksum (SHA-512):
a3c9745709d5f3bc8071cf6c86499999afa58b2bf5f07ded7bcf2021b36203f57c0d1c30ed8135ec11127f84b4726fc60de72064b8b10e0df967eecc731037ba
Type: fulltext. Mimetype: application/pdf
Forskningspodden with Michel Gokan Khan (27796 kB), 46 downloads
File information
File name: AUDIO01.mp3. File size: 27796 kB. Checksum (SHA-512):
59f07226ed14bf75fa83b2b70f6c3d78cd453756c2d44a91df99a254430824b028b2cc7ef99318ad6b3a7418c5ead44c5453b177da8602f9329017ea59ecb7f5
Type: audio. Mimetype: audio/mpeg

Authority records

Gokan Khan, Michel

Total: 751 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 2410 hits