Quality Assurance for LLM-RAG Systems: Empirical Insights from Tourism Application Testing
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). ORCID iD: 0000-0001-9051-7609
Ludwig Maximilian University Munich, Germany.
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). ORCID iD: 0000-0003-0683-2783
Karlstad University, Faculty of Arts and Social Sciences (starting 2013), Karlstad Business School (from 2013); Karlstad University, Faculty of Arts and Social Sciences (starting 2013), Service Research Center (from 2013). ORCID iD: 0000-0002-3281-7942
2025 (English). In: Proceedings of International Conference on Software Testing, Verification and Validation Workshops / [ed] Fasolino A.R., Panichella S., Aleti A., Mesbah A., Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 200-207. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a comprehensive framework for testing and evaluating quality characteristics of Large Language Model (LLM) systems enhanced with Retrieval-Augmented Generation (RAG) in tourism applications. Through systematic empirical evaluation of three different LLM variants across multiple parameter configurations, we demonstrate the effectiveness of our testing methodology in assessing both functional correctness and extra-functional properties. Our framework implements 17 distinct metrics that encompass syntactic analysis, semantic evaluation, and behavioral evaluation through LLM judges. The study reveals how different architectural choices and parameter configurations affect system performance, particularly highlighting the impact of temperature and top-p parameters on response quality. The tests were carried out on a tourism recommendation system for the Värmland region in Sweden, utilizing standard and RAG-enhanced configurations. The results indicate that the newer LLM versions show modest improvements in performance metrics, though the differences are more pronounced in response length and complexity than in semantic quality. The research contributes practical insights for implementing robust testing practices in LLM-RAG systems, providing valuable guidance to organizations deploying these architectures in production environments.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. p. 200-207
Keywords [en]
Application programs, Recommender systems, Redundancy, Reliability analysis, Reliability theory, Software testing, AI quality assurance, Extra-functional properties, Language model, Large language model, ML system testing, Quality testing, Retrieval-augmented generation, Software quality, Software quality testing, System testing
National Category
Computer Systems; Computer Sciences
Research subject
Business Administration; Computer Science
Identifiers
URN: urn:nbn:se:kau:diva-104820
DOI: 10.1109/ICSTW64639.2025.10962487
ISI: 001483187700031
Scopus ID: 2-s2.0-105004730418
ISBN: 979-8-3315-3467-7 (electronic)
ISBN: 979-8-3315-3468-4 (print)
OAI: oai:DiVA.org:kau-104820
DiVA, id: diva2:1965076
Conference
International Conference on Software Testing, Verification and Validation Workshops, ICSTW, Naples, Italy, March 31-April 4, 2025.
Funder
Region Värmland; Karlstad University
Available from: 2025-06-06. Created: 2025-06-06. Last updated: 2025-10-16. Bibliographically approved.

Authority records

Ahmed, Bestoun S.; Bayram, Firas; Jagstedt, Siri; Magnusson, Peter
