№1, 2025
Lane-keeping is a vital function in autonomous driving, essential for vehicle safety, stability, and adherence to traffic flow. The difficulty of lane-keeping control lies in balancing precision and responsiveness across varied driving conditions. This article presents a comparative study of two reinforcement learning (RL) algorithms, Double Deep Q-Network (Double DQN) and Proximal Policy Optimization (PPO), for lane-keeping in discrete and continuous action spaces. Double DQN, an extension of the standard Deep Q-Network, mitigates overestimation bias in Q-values, which makes it well suited to discrete action spaces. The method excels in low-dimensional settings such as highways, where lane-keeping requires frequent, discrete corrections. In contrast, PPO, a robust policy-gradient method designed for continuous control, performs well in high-dimensional settings, such as urban roads and curved highways, where continuous, precise steering adjustments are necessary. Both methods were evaluated in MATLAB/Simulink simulations of highway and urban driving conditions. Each model combines a vehicle dynamics model with a neural network architecture to learn its control policy. Results show that Double DQN consistently maintains lane position in highway settings, exploiting its reduced Q-value overestimation to achieve stable lane centering. PPO excels in dynamic and unpredictable settings, handling continuous control adjustments well, especially under demanding traffic conditions and on curved roads. This study underscores the importance of matching RL algorithms to the action-space requirements of specific driving environments, with Double DQN excelling at discrete tasks and PPO at continuous adaptive control, and it contributes insights toward improving the flexibility and safety of autonomous vehicles (pp. 12-25).
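For orientation, the two objectives being compared take their standard published forms; the notation below is the textbook one (a sketch, not reproduced from the article itself), with \theta_t the online network, \theta_t^{-} the target network, \hat{A}_t the advantage estimate, and \epsilon the clipping parameter:

Y_t^{\text{DoubleDQN}} = R_{t+1} + \gamma \, Q\big(S_{t+1}, \operatorname*{arg\,max}_a Q(S_{t+1}, a; \theta_t); \theta_t^{-}\big)

L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\Big[\min\big(r_t(\theta)\hat{A}_t,\ \operatorname{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\hat{A}_t\big)\Big], \quad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}

In the first target, the online network selects the greedy action while the target network evaluates it; this decoupling is what suppresses the max-operator overestimation mentioned above. In the second, clipping the probability ratio r_t(\theta) bounds each policy update, which is what makes PPO stable for continuous steering commands (Schulman et al., cited below).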
- Akrour R., Abdolmaleki A., Abdulsamad H., Peters J. and Neumann G. (2018). Model-free trajectory-based policy optimization with monotonic improvement. J. Mach. Learn. Res., 19(1): 565-589.
- Chen G., Peng Y., and Zhang M. (2018). An Adaptive Clipping Approach for Proximal Policy Optimization. http://arxiv.org/abs/1804.06461
- Chen W., Xiao H., Wang Q., Zhao L., and Zhu M. (2016). Lateral Vehicle Dynamics and Control. doi: 10.1002/9781118380000.ch5.
- Criens C. et al. (2008). Chapter 2: Vehicle Dynamics Modeling, Simulation, 86(13): 10–28. https://vtechworks.lib.vt.edu/bitstream/handle/10919/36615/Chapter2a.pdf?sequence=4
- Duan Y., Chen X., Houthooft R., Schulman J. and Abbeel P. (2016). Benchmarking deep reinforcement learning for continuous control. Proc. Int. Conf. Mach. Learn., 1329-1338.
- Dunn A.M., Hofmann O.S., Waters B., and Witchel E. (2011). Proximal policy optimization via enhanced exploration efficiency. https://huggingface.co/blog/deep-rl-ppo
- Jayakody D. (2024). Double Deep Q-Networks - A Quick Intro (with Code). https://dilithjay.com/blog/ddqn
- Kamat S. (2019). Lane Keeping of Vehicle Using Model Predictive Control. IEEE 5th Int. Conf. Converg. Technol. (I2CT), 1–6. doi: 10.1109/I2CT45611.2019.9033958.
- Kamat S. and Junnuri R. (2016). Model Predictive Control approaches for permanent magnet synchronous motor in virtual environment. 2016 IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems.
- Kiran B. R. et al. (2022). Deep Reinforcement Learning for Autonomous Driving: A Survey. IEEE Trans. Intell. Transp. Syst., 23(6): 4909–4926. doi: 10.1109/TITS.2021.3054625.
- Kothari P., Perone C., Bergamini L., Alahi A., and Ondruska P. (2021). DriverGym: Democratising Reinforcement Learning for Autonomous Driving. http://arxiv.org/abs/2111.06889
- Küçükoğlu B., Borkent W., Rueckauer B., Ahmad N., Güçlü U., van Gerven M. (2024). Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization. Neurons, Behav. Data Anal. Theory, 1–24. doi: 10.51628/001c.123366.
- Lane Keeping Assist System Using Model Predictive Control - MATLAB & Simulink - MathWorks 中国. Accessed: Nov. 04, 2024. https://ww2.mathworks.cn/help/mpc/ug/lane-keeping-assist-system-using-model-predictive-control.html
- Li C. (2019). Deep Reinforcement Learning. Frontiers of Artificial Intelligence.
- Liu G., Ren H., Chen S., and Wang W. (2013). The 3-DoF bicycle model with the simplified piecewise linear tire model. Proc. 2013 Int. Conf. Mechatron. Sci. Electr. Eng. Comput. (MEC), 3530–3534. doi: 10.1109/MEC.2013.6885617.
- Lugner P. (2019). Vehicle Dynamics of Modern Passenger Cars.
- Meng W., Zheng Q., Pan G., and Yin Y. (2023). Off-Policy Proximal Policy Optimization. Proc. 37th AAAI Conf. Artif. Intell. (AAAI 2023), 37: 9162–9170. doi: 10.1609/aaai.v37i8.26099.
- Petrazzini I. G. B. and Antonelo E.A. (2021). Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution, 2021 IEEE Symp. Ser. Comput. Intell. SSCI, doi: 10.1109/SSCI50451.2021.9660123.
- Prasad A., Gupta S.S., and Tyagi R.K. (2019). Advances in Engineering Design: Select Proceedings of FLAME, 101–102.
- Proximal Policy Optimization — Spinning Up documentation (2024). https://spinningup.openai.com/en/latest/algorithms/ppo.html
- Schulman J., Wolski F., Dhariwal P., Radford A., and Klimov O. (2017). Proximal Policy Optimization Algorithms, 1–12. http://arxiv.org/abs/1707.06347
- Shi H., Chen J., Zhang F., Liu M., and Zhou M. (2024). Achieving Robust Learning Outcomes in Autonomous Driving with Dynamic Noise Integration in Deep Reinforcement Learning. Drones, 8(9): 470. doi: 10.3390/drones8090470.
- Simonini T. (2022). Proximal Policy Optimization (PPO), Hugging Face.
- Song Z., Parr R.E., and Carin L. (2019). Revisiting the softmax Bellman operator: New benefits and new perspective. 36th Int. Conf. Mach. Learn. (ICML), 10368–10383.
- Train DQN Agent for Lane Keeping Assist - MATLAB & Simulink - MathWorks 中国. 2024. https://ww2.mathworks.cn/help/reinforcement-learning/ug/train-dqn-agent-for-lane-keeping-assist.html
- Vehicle Body 3DOF - 3DOF rigid vehicle body to calculate longitudinal, lateral, and yaw motion - Simulink - MathWorks 中国. Accessed: Nov. 04, 2024. https://ww2.mathworks.cn/help/vdynblks/ref/vehiclebody3dof.html
- Wu C., Chen X., Feng J., and Wu Z. (2017). Mobile Networks and Management, vol. 191. http://link.springer.com/10.1007/978-3-319-52712-3
- Yoon Ch. (2024). Dueling Deep Q Networks. Towards Data Science. https://towardsdatascience.com/dueling-deep-q-networks-81ffab672751
- Zhu W. and Rosendo A. (2021). A Functional Clipping Approach for Policy Optimization Algorithms. IEEE Access, 9: 96056–96063. doi: 10.1109/ACCESS.2021.3094566.