On the relationship between similar requirements and similar software: A case study in the railway domain
RISE Res Inst Sweden, Västerås, Sweden.
CNR ISTI, Pisa, Italy.
Berger Levrault, Montpellier, France.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0003-2416-4205
2023 (English). In: Requirements Engineering, ISSN 0947-3602, E-ISSN 1432-010X, Vol. 28, p. 23-47. Article in journal (Refereed). Published.
Abstract [en]

Recommender systems for requirements are typically built on the assumption that similar requirements can be used as proxies to retrieve similar software. When a stakeholder proposes a new requirement, natural language processing (NLP)-based similarity metrics can be exploited to retrieve existing requirements and, in turn, identify previously developed code. Several NLP approaches for similarity computation between requirements are available. However, there is little empirical evidence on their effectiveness for code retrieval. This study compares different NLP approaches, from lexical ones to semantic, deep-learning techniques, and correlates the similarity among requirements with the similarity of their associated software. The evaluation is conducted on real-world requirements from two industrial projects at a railway company. Specifically, the most similar pairs of requirements across the two projects are automatically identified using six language models. Then, the trace links between requirements and software are used to identify the software pairs associated with each requirements pair. The software similarity between pairs is then automatically computed with JPLag. Finally, the correlation between requirements similarity and software similarity is evaluated to see which language model shows the highest correlation and is thus more appropriate for code retrieval. In addition, we perform a focus group with members of the company to collect qualitative data. Results show a moderately positive correlation between requirements similarity and software similarity, with the pre-trained deep learning-based BERT language model with preprocessing outperforming the other models. Practitioners confirm that requirements similarity is generally regarded as a proxy for software similarity. However, they also highlight that additional aspects come into play when deciding on software reuse, e.g., domain/project knowledge, information coming from test cases, and trace links. Our work is among the first to explore the relationship between requirements and software similarity from a quantitative and qualitative standpoint. This can be useful not only in recommender systems but also in other requirements engineering tasks in which similarity computation is relevant, such as tracing and change impact analysis.
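
The evaluation pipeline described in the abstract (score requirement pairs, score the linked software pairs, then correlate the two series) can be illustrated with a minimal sketch. This is not the study's implementation: a simple token-set Jaccard measure stands in for the six language models, the software similarity scores are hypothetical placeholders for what a tool such as JPLag would report, and Spearman rank correlation is an assumption, since the abstract does not name the coefficient used. All function names and example data are invented for illustration.

```python
import math
import re


def tokens(text: str) -> set[str]:
    """Lowercase a requirement and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two requirement texts, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical requirement pairs from two projects, each with a made-up
# similarity score for the software traced to the two requirements.
pairs = [
    ("The system shall open the doors when the train stops",
     "The system shall open the doors when the train has stopped", 0.82),
    ("The system shall log every brake command",
     "The system shall record each brake command in the log", 0.55),
    ("The display shall show the current speed",
     "The system shall close the doors before departure", 0.10),
]

requirement_similarity = [jaccard(a, b) for a, b, _ in pairs]
software_similarity = [score for _, _, score in pairs]
rho = spearman(requirement_similarity, software_similarity)
print(f"Spearman correlation on the toy data: {rho:.2f}")
```

Repeating the last three lines with a different similarity function in place of `jaccard` would give one correlation value per model, which is the kind of comparison the abstract describes.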

Place, publisher, year, edition, pages
Springer, 2023. Vol. 28, p. 23-47
Keywords [en]
Requirements similarity, Software similarity, Correlation, Perception of similarity, Language models
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:mdh:diva-57193
DOI: 10.1007/s00766-021-00370-4
ISI: 000744367400001
Scopus ID: 2-s2.0-85123067513
OAI: oai:DiVA.org:mdh-57193
DiVA, id: diva2:1634304
Available from: 2022-02-02. Created: 2022-02-02. Last updated: 2025-10-10. Bibliographically approved.
In thesis
1. Enhancing Industrial Requirements Processing and Reuse
2025 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

We live in a world that depends on software. Whether we log in to a banking system or take the bus to work, we are surrounded by software-intensive systems. These systems are often not built from scratch, but as further iterations of existing systems, adapted for different customers and market segments.

The development of such complex software and variant-intensive systems is centered around customer needs that are usually described in long documents, full of detail, and written in natural language. Companies must read through, interpret, and extract the relevant requirements, decide which teams should develop and test them, and simultaneously identify what can be reused from earlier projects. This process is often manual, carries a risk of mistakes, and demands great experience and precision.

This thesis explores how Artificial Intelligence (AI), and in particular natural language processing (NLP), can help make the process both faster and more reliable. The work is based on six scientific articles, which make four contributions, as follows. First, we study how requirements management and reuse are handled today to identify opportunities for enhancement. Next, we focus on automating the identification and allocation of requirements, so that correct requirements are identified and directed to the right teams from the start. We also develop methods for discovering which parts of previous projects can be reused, to avoid redundant development efforts. Finally, we create a pedagogical resource that enables teachers, students, and professionals to apply the technical solutions in practice.

Through these contributions, the thesis demonstrates how AI can become a powerful support in processing requirements and supporting reuse in complex software development.

Abstract [sv, translated to English]

We live in a world that depends on software. From the moment we log in to the bank to when we take the bus to work, we are surrounded by software-intensive systems. These systems are often not built from scratch but as further developments of existing solutions, adapted for different customers and markets.

Customer needs are usually described in long documents, full of detail and written in ordinary language. Companies must read through, interpret, and pick out the relevant requirements, decide which teams should develop and test them, and at the same time see what can be reused from earlier projects. This saves time and money, but it is also a puzzle that demands great experience and precision. In practice it often takes a long time, carries a risk of mistakes, and depends on a handful of experts.

This thesis investigates how artificial intelligence (AI), and in particular natural language processing (NLP), can help make the process both faster and more reliable.

The work builds on six scientific articles and contributes in four areas. First, we map how requirements management and reuse are carried out today, and where the greatest potential for improvement lies. Next, we focus on automating the identification and allocation of requirements, so that they reach the right team from the start. We also develop methods for discovering which parts of previous projects can be reused, to avoid reinventing the wheel. Finally, we create a pedagogical resource that enables teachers, students, and professionals to use the technical solutions in practice.

With these contributions, the thesis shows how AI can become a powerful support in the work of understanding, organizing, and reusing the knowledge contained in complex software development.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2025. p. 290
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 438
National Category
Software Engineering
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-72983 (URN)
978-91-7485-715-3 (ISBN)
Public defence
2025-10-27, Alfa, Mälardalens universitet, Västerås, 13:15 (English)
Funder
Vinnova; Knowledge Foundation; European Commission
Available from: 2025-08-20. Created: 2025-08-19. Last updated: 2025-10-10. Bibliographically approved.

Authority records

Abbas, Muhammad; Enoiu, Eduard Paul; Saadatmand, Mehrdad; Sundmark, Daniel