A Study of Proximal Policy Optimization (PPO) Algorithm in Carla

WONG, IOK KEONG(黃育強)

Please use this identifier to cite or link to this item: http://oaps.umac.mo/handle/10692.1/247

Full metadata record

DC Field	Value	Language
dc.contributor.author	WONG, IOK KEONG(黃育強)	-
dc.date.accessioned	2021-07-05T03:47:19Z	-
dc.date.available	2021-07-05T03:47:19Z	-
dc.date.issued	2021	-
dc.identifier.citation	Wong, I. K. (2021). A Study of Proximal Policy Optimization (PPO) Algorithm in Carla (OAPS). Retrieved from University of Macau, Outstanding Academic Papers by Students Repository.	en_US
dc.identifier.uri	http://oaps.umac.mo/handle/10692.1/247	-
dc.description.abstract	In this report, we study the use of recent deep reinforcement learning algorithms to train self-driving cars to drive safely. Self-driving has become an important topic for researchers, governments, and enterprises. As one of the most densely populated cities in the world, Macau has many traffic problems. Examples include traffic congestion, long travel times, and problems caused by private and government urban construction, all of which contribute to a high number of car accidents. With the current development of autonomous driving technology, many organizations in the industry have built autonomous driving algorithms to address this problem, and their vehicles have driven several kilometers on public roads without any accidents. However, whether driving is manual or automated, real driving scenes are very complicated. Therefore, this report uses reinforcement learning algorithms to train vehicles to drive autonomously in a simulated environment, with the aim of better solving traffic problems in the real world. In our work, we use the Carla environment and provide three operating models. The first is a new algorithm based on PPO, which sets up a fixed route from a starting point to an end point and brings the car's speed toward a target speed. The second is a PPO-based reinforcement learning algorithm whose training is optimized through reward design, so that the vehicle can drive safely along the route. The third model avoids vehicles ahead on the route. When training these models, we designed different decision-making methods for the various environments, such as training with different reward formulas or using environments with or without checkpoints, choices that affect the learning results of the resulting agent.	en_US
dc.language.iso	en	en_US
dc.title	A Study of Proximal Policy Optimization (PPO) Algorithm in Carla	en_US
dc.type	OAPS	en_US
dc.contributor.department	Department of Computer and Information Science	en_US
dc.description.instructor	Prof. Leong Hou U	en_US
dc.contributor.faculty	Faculty of Science and Technology	en_US
dc.description.course	Bachelor of Science in Computer Science	en_US
dc.description.programme	Bachelor of Science in Computer Science	en_US
Appears in Collections:	FST OAPS 2021

Files in This Item:

File	Description	Size	Format
OAPS_2021_FST_DB726106_Wong IokKeong_A Study of Proximal Policy Optimization (PPO) Algorithm in Carla.pdf		29.96 MB	Adobe PDF
