Kasra Mazarei Saadabadi

Title

Nonlinear optimal control and path tracking of a four-wheeled Mecanum omnidirectional robot using reinforcement learning

Degree

Master's

Field of study

Electrical Engineering, Control

Years of study

1397-1400

Defense date

25/8/1400

Supervisor

Dr. Soheil Ganjefar

Faculty

Electrical Engineering

Abstract

This thesis presents a reinforcement-learning-based approach for online optimal tracking of a desired path by an omnidirectional mobile robot, a nonlinear system with highly complex dynamics. The main challenge in implementing an optimal controller on four-wheeled Mecanum omnidirectional mobile robots is the complexity of the system and the resulting significant uncertainty in the mathematical model of the robot's motion. In addition, most optimal control approaches are designed offline, which makes them impractical for a highly uncertain system. Q-learning can find the solution of the Hamilton-Jacobi-Bellman optimization equation for a nonlinear system online and in real time, without a mathematical model of the system and using only data measured from the environment. To avoid the need for a model, a neural network called the critic neural network is used to estimate the Q function. The weights of this network are computed at every instant with the recursive least squares algorithm. The optimal control law is obtained from the neural network's approximation of the optimal Q function and is applied to the robot to track the predetermined desired path. At each step, the Q function and the control law are computed until convergence using the policy iteration algorithm in an off-policy manner. That is, an initial stabilizing control law is applied to the system and the mobile robot begins tracking the desired path. Meanwhile, the learning process continues according to the Q-learning algorithm until the optimal control law converges; this policy then replaces the initial control law in the control system, and from that moment on the path is tracked optimally. Simulation results show that in this case the energy of the control signal is far lower than that of other conventional controllers, including the proportional-integral controller. Path tracking is also more accurate, which indicates that the approach can be suitable for practical implementation.

Record entry date

1400/10/12

Title in English

Four-wheeled Mecanum-wheel Mobile Robot Optimal Path Tracking using Q-learning-based Control

Release date

11/16/2022 12:00:00 AM

Record entered by (student)

Kasra Mazarei Saadabadi

Abstract in English

This thesis presents an adaptive optimal path-tracking controller, based on Q-learning, for a four-wheeled Mecanum-wheel mobile robot, an omnidirectional mobile robot with highly complex dynamics. The main challenge in implementing optimal controllers on Mecanum-wheel mobile robots is the significant uncertainty in the mathematical model of the robot's motion. In addition, most optimal control approaches are designed offline, which makes them impractical for a highly uncertain system. Q-learning can find the solution of the Hamilton-Jacobi-Bellman optimization equation for a nonlinear system online and in real time, using only data measured from the environment and without requiring a mathematical model of the system. A neural network, called the critic neural network, is used to estimate the Q function. The weights of this network are computed with the recursive least squares method at every learning stage. The optimal control policy is obtained from the neural network's approximation of the optimal Q function and is then applied to the robot to track the predetermined desired path. At each step, the Q function and the control policy are computed until convergence using the policy iteration algorithm in an off-policy manner: an initial stabilizing control policy is applied to the system and the mobile robot begins to track the desired path. At the same time, the learning process continues according to the Q-learning algorithm until the optimal control policy converges. This policy then replaces the initial control policy in the control system, so the path is tracked optimally from that moment on. The simulation results show that, in this case, the energy of the control signal is much lower than that of conventional controllers, including the proportional-integral controller. In addition, path tracking is more accurate and smoother, which indicates that the approach can be suitable for practical implementation.
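
The learning loop described in the abstract (a critic that approximates the Q function, recursive least squares for its weights, and off-policy policy iteration starting from a stabilizing policy) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the thesis: the plant is a small hypothetical linear system rather than the Mecanum robot, the critic is a quadratic basis rather than a neural network, the task is regulation to the origin rather than path tracking, and the cost weights, noise level, and all function names are invented for the example.

```python
import numpy as np

# Hypothetical linear stand-in plant (NOT the Mecanum robot model):
# x_{k+1} = A x_k + B u_k, with stage cost x'Qc x + u'Rc u (assumed weights).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)
n, m = 2, 1
IU = np.triu_indices(n + m)
SCALE = np.where(IU[0] == IU[1], 1.0, 2.0)  # off-diagonal terms appear twice


def phi(z):
    """Quadratic critic basis, so that Q(x, u) = phi([x, u]) @ w = z' H z."""
    return np.outer(z, z)[IU] * SCALE


def weights_to_H(w):
    """Rebuild the symmetric matrix H from its upper-triangular weights."""
    H = np.zeros((n + m, n + m))
    H[IU] = w
    return H + H.T - np.diag(np.diag(H))


def collect_data(K_behavior, steps=400, seed=0):
    """Run the behavior policy u = -K x plus exploration noise and log data."""
    rng = np.random.default_rng(seed)
    x = np.array([1.0, -1.0])
    data = []
    for _ in range(steps):
        u = -K_behavior @ x + rng.normal(size=m)
        x_next = A @ x + B @ u  # the learner only sees this as a measurement
        data.append((x, u, x @ Qc @ x + u @ Rc @ u, x_next))
        x = x_next
    return data


def evaluate_policy_rls(data, K_target, p0=1e6):
    """Fit the Q function of the target policy with recursive least squares.

    Bellman equation used: Q(x_k, u_k) - Q(x_{k+1}, -K x_{k+1}) = cost_k.
    """
    w = np.zeros(len(SCALE))
    P = p0 * np.eye(len(SCALE))
    for x, u, cost, x_next in data:
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K_target @ x_next])
        h = phi(z) - phi(z_next)
        g = P @ h / (1.0 + h @ P @ h)   # RLS gain
        w = w + g * (cost - h @ w)      # correct weights by prediction error
        P = P - np.outer(g, h @ P)      # covariance update
    return weights_to_H(w)


def improve_policy(H):
    """Greedy policy for a quadratic Q: u = -H_uu^{-1} H_ux x."""
    return np.linalg.solve(H[n:, n:], H[n:, :n])


def q_learning_policy_iteration(iterations=10):
    K_behavior = np.zeros((m, n))     # stabilizing here because A is stable
    data = collect_data(K_behavior)   # off-policy: data come from K_behavior
    K = K_behavior.copy()
    for _ in range(iterations):
        K = improve_policy(evaluate_policy_rls(data, K))
    return K


def riccati_gain(iterations=2000):
    """Model-based reference gain, used only to check the learned policy."""
    P = Qc.copy()
    for _ in range(iterations):
        K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ (A - B @ K)
    return K
```

Calling `q_learning_policy_iteration()` returns a feedback gain learned only from logged transitions; `riccati_gain()` uses the model matrices and serves purely as a check. For a nonlinear plant the quadratic basis would no longer be exact, which is where a richer approximator such as the critic neural network mentioned in the abstract would come in.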

Keywords in Persian

Four-wheeled Mecanum mobile robot, Q-learning-based control, path tracking

Keywords in English

Mecanum-wheel mobile robot, Q-learning-based control, path tracking

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=25800&Field=0&DTC=6