Robotics development depends on large, varied, and well-labelled datasets for training, testing, and evaluating perception and automation systems. However, collecting real-world data can be costly, time-consuming, difficult to scale, and sometimes unsafe or impractical, especially when rare scenarios, varied object poses, lighting conditions, cluttered environments, or safety-critical situations are required. Synthetic data pipelines address this challenge by enabling engineers to generate, configure, inspect, annotate, and evaluate artificial data in a controlled and repeatable workflow.
This thesis examines engineers’ trust in Large Language Model (LLM)-based support within the SCAILAB synthetic data pipeline. The study aims to understand how engineers experience existing LLM-supported functions, identify mismatches between expectations and actual use, explore where additional support could be introduced, and develop human-centred design recommendations for more trustworthy integration. It takes the form of a qualitative case study with four internal engineers who had direct experience of the pipeline and its LLM-supported features. Data were collected through semi-structured interviews, task-based think-aloud sessions, and screen-recorded workflow interactions. Interview and think-aloud data were analysed thematically, while screen recordings were examined through behavioural coding to identify verification effort, workflow friction, and possible support opportunities.
The findings show that engineers’ trust is conditional, task-dependent, and closely connected to verification and control. Trust increased when outputs were easy to inspect, when the system provided clear context or explanations, and when engineers could decide how changes were applied. Trust decreased when outputs were difficult to verify, when the system misunderstood task context, when results changed unpredictably, or when errors could affect later pipeline stages. The study also identified possible future support areas, including requirement configuration, pipeline construction, parameter tuning, validation, output quality checking, dataset preparation, annotation review, evaluation, and monitoring.
The thesis concludes that trustworthy LLM support in engineering workflows should be staged, reviewable, transparent, and designed around engineer control. Rather than replacing human judgement, such support should reduce effort while preserving verification, accountability, and confidence in downstream data quality.