A Framework for Clone Detection in UML Models (UMCD)
2024 (English). In: ACM International Conference Proceeding Series, Association for Computing Machinery, 2024, p. 7-14. Conference paper, Published paper (Refereed)
Abstract [en]
Clone detection plays a vital role in managing the quality and maintainability of software systems. In software development, the initial phase is to specify and visualize the software design using UML models. These models serve as a blueprint that guides all subsequent phases of the development process. Therefore, clones in these UML models induce clones in later stages of development as well, where they propagate and amplify clone-related issues throughout the process. For this reason, detecting, tracking, and removing clones in UML models is as crucial as in code. Furthermore, Model Driven Software Engineering (MDSE) aims to automatically generate code from models such as UML models, which further increases the importance of model clone detection. This study focuses on the application of Natural Language Processing (NLP) to detect clones within UML models, especially targeting UML state-machine models. Initially, a UML model is created and exported in XML format to represent the model in textual form. Because the XML code of UML diagrams carries a lot of structural information that is irrelevant for clone detection and is also not balanced, the XML code is parsed to extract the relevant features of the model. The extracted features are then preprocessed to represent them in a suitable format, and the extracted data is labeled to represent clone and non-clone pairs. NLP techniques are used for clone detection because UML models contain a lot of textual information, e.g., names of elements and constraints; NLP techniques can therefore efficiently identify duplicates in UML models. The proposed framework is applied to several case studies, which validate the effectiveness of our approach in model clone detection.
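The pipeline outlined in the abstract (export to XML, parse out the relevant features, preprocess, compare textual content) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the simplified XML layout (real XMI exports use namespaces and `xmi:type` attributes), the function names, the use of Jaccard token overlap as a stand-in for whatever NLP technique the authors apply, and the 0.6 threshold.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified exports of two state machines.
MODEL_A = """
<model>
  <stateMachine name="DoorController">
    <state name="Opened"/>
    <state name="Closed"/>
    <transition name="closeDoor" source="Opened" target="Closed"/>
    <transition name="openDoor" source="Closed" target="Opened"/>
  </stateMachine>
</model>
"""

MODEL_B = """
<model>
  <stateMachine name="GateController">
    <state name="Opened"/>
    <state name="Closed"/>
    <transition name="closeGate" source="Opened" target="Closed"/>
    <transition name="openGate" source="Closed" target="Opened"/>
  </stateMachine>
</model>
"""


def tokenize(identifier):
    """Split a camelCase identifier into lowercase word tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", identifier)
    return [p.lower() for p in parts]


def extract_features(xml_text):
    """Keep only the textual attributes of states and transitions."""
    root = ET.fromstring(xml_text)
    tokens = []
    for elem in root.iter():
        if elem.tag in ("state", "transition"):
            for attr in ("name", "source", "target"):
                value = elem.get(attr)
                if value:
                    tokens.extend(tokenize(value))
    return tokens


def jaccard(tokens_a, tokens_b):
    """Token-set overlap: |A and B| / |A or B| (0.0 if both are empty)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_clone_pair(xml_a, xml_b, threshold=0.6):
    """Label a pair as clone / non-clone using an assumed threshold."""
    return jaccard(extract_features(xml_a), extract_features(xml_b)) >= threshold


if __name__ == "__main__":
    score = jaccard(extract_features(MODEL_A), extract_features(MODEL_B))
    print(f"similarity = {score:.3f}, clone = {is_clone_pair(MODEL_A, MODEL_B)}")
```

A set-based measure is used here only because it is easy to verify by hand; a learned model or embedding-based similarity would slot into the same place as `jaccard`.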
Place, publisher, year, edition, pages
Association for Computing Machinery, 2024. p. 7-14
Keywords [en]
Extensible Markup Language (XML), MDSE (Model Driven Software Engineering), NLP (Natural Language Processing), State Machine (SM), UMCD (UML Model Clone Detection), UML (Unified modeling language), Application programs, Computer software selection and evaluation, Problem oriented languages, Software quality, Unified Modeling Language, XML, Clone detection, Language model, Language processing, Model driven software engineering, Model-driven software engineerings, Natural language processing, Natural languages, State machine, State-machine, Unified Modeling, Unified modeling language model clone detection, Software design
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-68648; DOI: 10.1145/3686812.3686814; Scopus ID: 2-s2.0-85205341761; ISBN: 9798400717215 (print); OAI: oai:DiVA.org:mdh-68648; DiVA, id: diva2:1904891
Conference
16th International Conference on Computer Modeling and Simulation, ICCMS 2024. Dalian, 21 June 2024 through 23 June 2024. Code 202801
2024-10-10, 2024-10-10, 2025-10-10. Bibliographically approved