Human Activity Recognition (HAR) using radar has emerged as a promising
alternative to vision- and wearable-based systems, particularly for privacy-preserving
and robust monitoring in indoor environments. This thesis explores early-level
feature fusion of multi-domain radar data collected using synchronized Frequency-
Modulated Continuous Wave (FMCW) sensors. The radar data was processed
into three key representations, Range-Doppler (RD), Range-Azimuth (RA), and
Range-Elevation (RE) maps, and fed into a deep learning pipeline composed of
TimeDistributed CNN blocks and Bidirectional LSTM layers with attention.
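
As an illustration of what early-level fusion can look like in practice, the following minimal Python sketch stacks the per-frame maps from each sensor into a single multi-channel frame before any network layer processes them. It assumes that all maps have already been resized to a common grid and that fusion is done by channel stacking; neither detail is specified here, and the names `fuse_frame`, `fuse_sequence`, and the `"RD"`/`"RA"`/`"RE"` dictionary keys are hypothetical.

```python
from typing import Dict, List, Sequence

Map2D = List[List[float]]   # one H x W radar map (assumed layout)
Frame = List[Map2D]         # C stacked maps, channels-first


def fuse_frame(sensor_maps: Sequence[Dict[str, Map2D]],
               domains: Sequence[str] = ("RD", "RA", "RE")) -> Frame:
    """Stack the selected maps of every sensor into one multi-channel frame."""
    channels: Frame = []
    for maps in sensor_maps:          # one dict per sensor
        for name in domains:          # fixed domain order within each sensor
            channels.append(maps[name])
    if not channels:
        raise ValueError("no maps to fuse")

    # Channel stacking only makes sense if every map shares one grid.
    height, width = len(channels[0]), len(channels[0][0])
    for ch in channels:
        if len(ch) != height or any(len(row) != width for row in ch):
            raise ValueError("all maps must share the same H x W shape")
    return channels


def fuse_sequence(frames: Sequence[Sequence[Dict[str, Map2D]]],
                  domains: Sequence[str] = ("RD", "RA", "RE")) -> List[Frame]:
    """Fuse every time step of a segmented sequence (T frames -> T x C x H x W)."""
    return [fuse_frame(sensor_maps, domains) for sensor_maps in frames]
```

With two sensors and three domains, each fused frame has six channels; passing `domains=("RD", "RE")` yields four, which is one way to build a reduced-input variant.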
A custom dataset was collected from eleven participants with six classes in a
realistic room setup, using two FMCW sensors mounted on orthogonal walls.
The data was preprocessed, segmented into frame sequences, and used to train
an early fusion model evaluated with a Leave-One-Participant-Out (L1PO) strategy.
The final model achieved an accuracy of 81.27% and a weighted F1 score
of 81.04%. Class-wise analysis revealed strong performance for dynamic activities
like walking and in-place motion, while static postures such as lying down
were more prone to confusion, particularly between visually similar classes.
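
The evaluation protocol and the reported metric can be sketched as follows. This is a minimal, pure-Python illustration rather than the thesis code: `leave_one_participant_out` and `weighted_f1` are hypothetical helper names, and any participant IDs or activity labels passed to them are placeholders. Each fold holds out every sequence from one participant, and the weighted F1 score averages the per-class F1 values weighted by class support.

```python
from collections import Counter
from typing import Hashable, Iterator, List, Sequence, Tuple


def leave_one_participant_out(
    participants: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_id, train_indices, test_indices), one fold per participant."""
    for held_out in sorted(set(participants)):
        train_idx = [i for i, p in enumerate(participants) if p != held_out]
        test_idx = [i for i, p in enumerate(participants) if p == held_out]
        yield held_out, train_idx, test_idx


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 averaged with weights equal to each class's true support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is > 0 because n > 0.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * n
    return total / len(y_true)
```

Splitting by participant rather than by sample keeps all sequences from one person on the same side of the split, so the score reflects generalization to unseen people.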
An additional evaluation was performed using only RD and RE features from
both sensors, reducing the input dimensionality while maintaining a high
accuracy of 81.69%. This result suggests that azimuthal data may not always be
necessary for effective HAR, although further testing is required due to signs of
overfitting observed in both fusion setups.
Overall, the findings demonstrate that early-level fusion of radar features from
multiple spatial perspectives can significantly enhance HAR performance,
offering a viable path toward robust, non-intrusive activity monitoring in smart
environments. The study also highlights the need for continued research in data
balancing, sensor placement, model regularization, and scalable deployment for
real-world applications.
2025, p. 54