Mohammadreza Sohrabi

Title

Tracking Control of Discrete-Time Nonlinear Multi-Agent Systems Using a Model-Free Reinforcement Learning Method

Degree level

Master's

Field of study

Electrical Engineering

Year of study

1400

Defense date

1403/07/15

Supervisor

Hossein Bolandi

Advisor

Faculty

Electrical Engineering

Abstract

Tracking control in multi-agent systems is the branch of control theory concerned with coordinating several independent agents so that they collectively follow a desired trajectory or reference signal. Its main objective is to guarantee that the whole group follows the prescribed trajectory or reaches a desired formation while preserving stability and coordination among its agents. This research investigates a novel data-driven, model-free approach to leader-follower consensus control, in which the follower agents track the leader agent, for multi-agent systems with nonlinear dynamics and a unidirectional communication topology. The approach combines reinforcement learning, neural networks, and dynamic event-triggered mechanisms; because it is data-driven, it needs no dynamic model of the system and can therefore be more effective for systems with complex dynamics. Specifically, we propose a distributed control framework in which each agent uses a reinforcement learning algorithm to optimize its own control policy and tune the controller parameters. Neural networks approximate the optimal state-action value functions, the optimal control policies, and the reward (penalty) function, which improves the efficiency of training and the performance of the controller when model information is absent and the system is nonlinear. Incorporating the event-triggered mechanism significantly reduces the computational load while preserving controller accuracy. By offering a scalable, adaptive, and efficient solution for consensus and tracking in nonlinear and uncertain environments, this research contributes to the field of multi-agent systems and paves the way for applications in complex real-world scenarios where accurate system models are unavailable or difficult to obtain.

Record entry date

1403/10/22

Title in English

Tracking Control of Discrete-Time Multi-Agent Nonlinear Systems Using Model Free Reinforcement Learning Method

Student who entered the record

Mohammadreza Sohrabi

Abstract in English

Tracking control in multi-agent systems is a branch of control theory that coordinates multiple independent agents so that they collectively follow a desired trajectory or reference signal. Its primary goal is to ensure that the entire group follows a predefined path or achieves a desired formation while maintaining stability and coordination among the constituent agents. In this study, we examine a novel data-driven, model-free approach to leader-follower consensus control that enables follower agents to track a leader agent in multi-agent systems with nonlinear dynamics and a unidirectional communication topology. The approach integrates reinforcement learning, neural networks, and dynamic event-triggered mechanisms. Because it is data-driven, it does not require a dynamic model of the system, which can make it more effective for systems with complex dynamics. Specifically, we propose a distributed control framework in which each agent uses a reinforcement learning algorithm to optimize its control policy and adjust its controller parameters. Neural networks approximate the optimal state-action value functions, the optimal control policies, and the reward (or penalty) function, which improves the efficiency of training and the performance of the controller in the absence of model information and in the presence of nonlinearities. Integrating the event-triggered mechanism significantly reduces the computational load while maintaining control accuracy. This research offers a scalable, adaptive, and efficient solution for achieving consensus and tracking in uncertain and nonlinear environments. It contributes to the field of multi-agent systems by paving the way for applications in complex real-world scenarios where precise system models are unavailable or difficult to obtain.
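
The two ingredients named in the abstract, leader-follower tracking over a unidirectional topology and event-triggered updating, can be illustrated with a minimal sketch. This is not the thesis's algorithm: it uses the local neighborhood tracking error that is common in the consensus literature, and it swaps in a simple static threshold in place of the dynamic event-triggered mechanism. All function names, the chain topology, and the numbers are assumptions chosen for illustration.

```python
# Illustrative sketch only; names, topology, and values are assumptions,
# not taken from the thesis.

def local_tracking_errors(states, leader_state, adjacency, pinning):
    """Local neighborhood tracking error of each follower (scalar states).

    e_i = sum_j a_ij * (x_i - x_j) + b_i * (x_i - x_0)
    adjacency[i][j] = a_ij > 0: follower i receives data from follower j.
    pinning[i] = b_i > 0: follower i receives data from the leader.
    """
    n = len(states)
    errors = []
    for i in range(n):
        e = sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        e += pinning[i] * (states[i] - leader_state)
        errors.append(e)
    return errors


def should_update(current_error, last_sampled_error, threshold):
    """Static event-trigger test (assumed form): recompute the control only
    when the error has moved more than `threshold` since the last event."""
    return abs(current_error - last_sampled_error) > threshold


# Directed chain: leader -> follower 0 -> follower 1 -> follower 2.
adjacency = [[0, 0, 0],
             [1, 0, 0],
             [0, 1, 0]]
pinning = [1, 0, 0]

errors = local_tracking_errors([1.0, 2.0, 4.0], 0.5, adjacency, pinning)
print(errors)  # [0.5, 1.0, 2.0]
print(should_update(errors[1], 0.9, 0.05))   # True: error moved by about 0.1
print(should_update(errors[1], 0.98, 0.05))  # False: error moved by about 0.02
```

In a learning-based design of the kind the abstract describes, each follower would feed its own local error into its value and policy approximators, and the trigger test would decide when the control input is recomputed; those parts are omitted here because the record does not specify them.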

Persian keywords

Consensus, tracking, Q-learning, nonlinear systems, unidirectional communication topology, event-triggered mechanism

English keywords

Consensus , tracking , Q-Learning , Nonlinear systems , unidirectional communication topology , event-triggered mechanism

Author

Mohammadreza Sohrabi

Supervisor

Hossein Bolandi

Link to this document

https://dl.iust.ac.ir/dl/search/default.aspx?Term=31938&Field=0&DTC=6