Analysis and Improvement of Deep Reinforcement Learning-Based Mobile Robot Navigation in Dynamic Environments

Wilson, Sona Preethy

dc.contributor.author	Wilson, Sona Preethy
dc.contributor.faculty	fi=Tekniikan ja innovaatiojohtamisen yksikkö\|en=School of Technology and Innovations\|
dc.contributor.organization	fi=Vaasan yliopisto\|en=University of Vaasa\|
dc.date.accessioned	2026-06-08T13:30:00Z
dc.date.issued	2026-05-13
dc.description.abstract	Autonomous mobile robot navigation in dynamic environments has been studied widely because of the growing need for safe and efficient movement in complex settings such as warehouses, hospitals, transport terminals, and public spaces. Although classical navigation approaches perform strongly in controlled static environments, continuous real-world operation forces them to rely on repeated replanning, map updates, and simplifying assumptions about obstacle positions and dynamics. These requirements often limit the applicability of classical approaches in complex scenarios. Deep reinforcement learning is therefore increasingly adopted as a promising approach for navigation tasks that require continuous control and adaptation to changing environments. This thesis analyzes and improves deep reinforcement learning-based mobile robot navigation in dynamic environments, with a focus on safety, stability, task completion, and generalization. The research problem is the difficulty of learning reliable navigation policies in dynamic environments, where performance can degrade because of sparse rewards, unsafe exploration, and limited transferability to unseen settings. The study is grounded in reinforcement learning theory, Markov decision processes, continuous control, and safety-aware reward function design. DDPG, TD3, and SAC are the algorithms used for continuous control. A simulation-based experimental framework was developed using structured navigation environments. The baseline DDPG was evaluated first, and its failure behavior was analyzed both qualitatively and quantitatively. The reward function was then refined iteratively to improve collision avoidance, goal reaching, and motion stability.
Next, algorithmic enhancement was carried out: TD3 and SAC were compared under controlled static and dynamic settings, and SAC was selected for curriculum-based training because of its generalization behavior. Finally, curriculum-based training was implemented with progressively more complex environments, followed by three generalization assessments: zero-shot evaluation, warm-up fine-tuning, and an out-of-distribution (OOD) stress sweep. The results show that the improved model is more stable and exhibits effective navigation behavior. The curriculum-trained policy completed tasks consistently and showed reasonable transferability to unseen environments, although performance degrades in more difficult out-of-distribution cases. Overall, the study demonstrates that a combination of reward engineering, stable learning, and curriculum-based training can improve safe navigation in dynamic environments.
dc.description.notification	fi=Opinnäytetyö kokotekstinä PDF-muodossa.\|en=Thesis fulltext in PDF format.\|sv=Lärdomsprov tillgängligt som fulltext i PDF-format\|
dc.format.content	fi=kokoteksti\|en=fulltext\|
dc.format.extent	79
dc.identifier.uri	https://osuva.uwasa.fi/handle/11111/20739
dc.identifier.urn	URN:NBN:fi-fe2026051344747
dc.language.iso	eng
dc.rights	CC BY 4.0
dc.subject.degreeprogramme	Master’s Programme in Smart Energy
dc.subject.discipline	Automation and Robotics
dc.subject.yso	reinforcement learning
dc.subject.yso	machine learning
dc.subject.yso	deep learning
dc.subject.yso	algorithms
dc.subject.yso	learning environment
dc.subject.yso	evaluation
dc.subject.yso	robotics
dc.subject.yso	navigation
dc.subject.yso	simulation
dc.subject.yso	efficiency (properties)
dc.title	Analysis and Improvement of Deep Reinforcement Learning-Based Mobile Robot Navigation in Dynamic Environments
dc.type.ontasot	fi=Pro gradu -tutkielma\|en=Master's thesis\|sv=Pro gradu -avhandling\|

Files

Name: Uwasa_2026_Wilson_Sona-Preethy.pdf
Size: 2.35 MB
Format: Adobe Portable Document Format

Collections

Master's theses and diploma theses