Ali Allahbakhshi

Title

Planar trajectory design for a low-thrust spacecraft in an orbital rendezvous mission using reinforcement learning

Degree level

Master's

Field of study

Aerospace

Year of study (Solar Hijri)

1399

Defense date (Solar Hijri)

1402/07/12

Supervisor

Majid Bakhtiari

School

New Technologies

Abstract (translated from Persian)

This research investigates a novel method for solving a problem related to the orbital rendezvous of two spacecraft. In this problem, the chaser spacecraft, which is of the low-thrust type, approaches the target spacecraft so that it is ready, at a suitable distance, for the docking mission. The target spacecraft may be a space station, a satellite in another orbit, or two satellites with a phase difference. The presented method addresses this problem through reinforcement learning. In this method, a gradient-based reinforcement learning algorithm, proximal policy optimization, is used to optimize the trajectory of the low-thrust spacecraft. The algorithm comprises two neural networks, an actor and a critic. Given the current state of the spacecraft and the thrust magnitude (which is constant in this research), the actor sets the thrust angle so that the spacecraft reaches a specified point on the target spacecraft's orbit in near-optimal time. The critic network evaluates the actor's performance. During training in a simulated environment, the algorithm became able to find a suitable trajectory for rendezvous with the target spacecraft without prior knowledge of the environment dynamics. This research uses the proximal policy optimization algorithm in reinforcement learning to solve the orbital rendezvous problem for the first time. It examines the variation of the thrust angle and its effect on trajectory optimality, as well as the effect of thrust magnitude on mission duration, and finally presents the best values of the reinforcement learning algorithm's parameters for training the algorithm and solving the problem.

Data entry date (Solar Hijri)

1402/08/29

Title in English

Low-thrust spacecraft planar trajectory design in orbital rendezvous mission using reinforcement learning method

Date of use

10/3/2024 12:00:00 AM

Student who entered the data

Ali Allahbakhshi Hafshejani

Abstract in English

This thesis investigates a novel approach to a problem in the orbital rendezvous of two spacecraft. In this problem, a low-thrust chaser spacecraft approaches a target spacecraft until it is at a distance suitable for the docking mission. The target spacecraft can be a space station, a satellite in a different orbit, or two satellites with a phase difference. The proposed method tackles this problem through reinforcement learning: a reinforcement learning algorithm based on proximal policy optimization is employed to optimize the trajectory of the chaser spacecraft. The algorithm consists of two neural networks, an actor and a critic. Based on the current state of the spacecraft and the thrust magnitude, which is assumed constant in this study, the actor adjusts the thrust angle so that the spacecraft reaches a specific point in the orbit of the target spacecraft in near-optimal time. The critic network evaluates the performance of the actor. During training in a simulated environment, the algorithm demonstrated the ability to find a suitable rendezvous trajectory to the target spacecraft without prior knowledge of the environment dynamics. This research applies the proximal policy optimization algorithm to the orbital rendezvous problem for the first time. The study investigates changes in the thrust angle and their impact on trajectory optimization, as well as the effect of thrust magnitude on mission completion time. Finally, it presents the best values found for the parameters of the reinforcement learning algorithm for training and for solving the problem.
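
The abstract names proximal policy optimization but does not state the objective the actor is trained on. As a minimal sketch, assuming the standard clipped form of that objective, the per-sample term could be computed as below. The function name, the argument names, and the default clip range of 0.2 are illustrative assumptions, not values taken from the thesis.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample clipped surrogate term of proximal policy optimization.

    ratio     -- new-policy probability of the action divided by the
                 old-policy probability of the same action
    advantage -- advantage estimate for that action (in an actor-critic
                 setup this is derived from the critic's value estimates)
    epsilon   -- clip range; 0.2 is an assumed, commonly used default
    """
    # Restrict the probability ratio to the interval [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the unclipped and clipped terms, which
    # removes the incentive to move the policy far from the old one.
    return min(ratio * advantage, clipped_ratio * advantage)
```

In training, this term would be averaged over a batch of sampled steps and maximized with respect to the actor's parameters, here the network that outputs the thrust angle.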

Keywords (translated from Persian)

orbital rendezvous, low thrust, reinforcement learning, proximal policy optimization, neural network, actor-critic

Keywords in English

orbital rendezvous, low-thrust, reinforcement learning, proximal policy optimization, neural network, actor-critic

Author

Ali Allahbakhshi

Supervisor

Majid Bakhtiari

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=30313&Field=0&DTC=6