Hamed Soleymani

Title

Optimal low-thrust trajectory design from low Earth orbit to geostationary orbit using reinforcement learning

Degree level

Master's

Field of study

Aerospace Engineering

Year of study

1399

Defense date

1401/09/23

Supervisor

Dr. Majid Bakhtiari - Dr. Kamran Daneshjo

Faculty

Advanced Technologies

Abstract

The main advantage of the low-thrust architecture over traditional chemical systems is that it enables the design of smaller and lighter satellites, which can potentially reduce launch costs. In recent years, with broad advances in computing hardware, much of the research and design work in the space field, especially on the orbit-raising problem, has moved toward new methods based on artificial intelligence and machine learning. Reinforcement learning is among the most important classes of machine learning algorithms. It is an emerging and popular approach used in various autonomous systems, such as cars and industrial robotics. Because of its interactive nature, it most closely resembles the learning from nature seen in humans and animals. Given the nature of the problem, an agent is needed that operates in an environment with continuous state variables and a continuous action space. In this research, two-body dynamics serve as the environment with which the agent interacts. A continuous space is defined for the problem variables, which are the six equinoctial orbital elements of the spacecraft, actions are applied to the environment under a policy, and the agent is trained with an actor-critic neural network so that it can perform the desired orbit transfer. The dynamics of the orbit transfer phase, that is, the orbit transfer block, are first designed and implemented in MATLAB. The optimal trajectory is then searched for by a reinforcement learning algorithm based on the actor-critic network model, subject to the initial conditions and mission constraints (such as the thrust magnitude). Finally, optimal thrust angle profiles are obtained for two orbit transfer maneuvers and one orbit inclination change maneuver. The effect of changing some of the main parameters of the algorithm on the learning process is also analyzed. In closing, based on the results of these analyses, the main advantages of this approach for modeling complex dynamics are pointed out, and the approach is found to perform acceptably in simulating such problems.
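The abstract names the equinoctial orbital elements as the agent's state but does not list them. As a minimal sketch, assuming the widely used modified equinoctial set (p, f, g, h, k, L), the conversion from classical elements could look like the following. The function name, the 400 km altitude, and the 28.5 degree inclination are illustrative assumptions, not values taken from the thesis.

```python
import math

def classical_to_equinoctial(a, e, inc, raan, argp, nu):
    """Convert classical elements to modified equinoctial elements.

    a is the semi-major axis (any length unit); all angles are in radians.
    Returns the tuple (p, f, g, h, k, L).
    """
    p = a * (1.0 - e**2)                      # semi-latus rectum
    f = e * math.cos(argp + raan)             # eccentricity vector, x component
    g = e * math.sin(argp + raan)             # eccentricity vector, y component
    h = math.tan(inc / 2.0) * math.cos(raan)  # inclination vector, x component
    k = math.tan(inc / 2.0) * math.sin(raan)  # inclination vector, y component
    L = raan + argp + nu                      # true longitude
    return p, f, g, h, k, L

R_EARTH_KM = 6378.137  # Earth equatorial radius

# Hypothetical boundary states: a circular, inclined LEO and an equatorial GEO.
leo = classical_to_equinoctial(R_EARTH_KM + 400.0, 0.0, math.radians(28.5), 0.0, 0.0, 0.0)
geo = classical_to_equinoctial(42164.0, 0.0, 0.0, 0.0, 0.0, 0.0)
```

One reason this element set suits a continuous-state agent is that it stays well defined for circular and equatorial orbits, where the classical argument of perigee and node angle are not.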

Data entry date

1401/10/11

Title in English

Optimal Low-Thrust Trajectory Design from LEO to GEO via Reinforcement Learning

Release date

12/14/2023 12:00:00 AM

Student who entered the data

Hamed Soleymani

Name: Hamed Soleymani
Author: Hamed Soleymani

Abstract in English

The main advantage of the low-thrust architecture over traditional chemical systems is that it allows the design of smaller and lighter satellites, which can potentially reduce launch costs. In recent years, with extensive progress in computing hardware, much research and design work in the space field, especially on the orbit-raising problem, has moved toward new methods based on artificial intelligence and machine learning. One of the most important types of machine learning algorithms is reinforcement learning. Reinforcement learning is an emerging and popular type of machine learning used in various autonomous systems such as cars and industrial robotics. Due to its interactive nature, it is the type of learning most similar to the learning from nature seen in humans and animals. Given the nature of the problem, it is necessary to use an agent whose environment has continuous state variables and a continuous action space. In this study, two-body dynamics are taken as the environment with which the agent interacts. A continuous space is defined for the problem variables, which are the six equinoctial orbital elements of the spacecraft, actions are applied to the environment under a policy, and the agent is trained with an actor-critic neural network to perform the desired orbit transfer. First, the dynamics of the orbit transfer stage, that is, the orbit transfer block, are designed and implemented in MATLAB. Then the optimal trajectory is searched for by a reinforcement learning algorithm based on the actor-critic network model, subject to the initial conditions and mission constraints (such as the thrust magnitude). Finally, optimal thrust angle profiles are obtained for two orbit-raising maneuvers and one orbit inclination change maneuver. The effect of changing some of the main parameters of the algorithm on the learning process is also investigated. In closing, based on the results of the analysis, the main advantages of this approach for modeling complex dynamics are pointed out, and the approach is found to perform acceptably in the simulation of such problems.
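The environment described in the abstract (two-body motion with a thrust acceleration steered by angle profiles) is implemented by the author in MATLAB and is not reproduced in this record. As a rough Python sketch, assuming the standard Gauss variational equations in modified equinoctial elements and a hypothetical two-angle steering convention, the state derivative that such an environment would integrate at each step might be written as follows. The names mee_rates and thrust_accel, the angle convention, and the km and second units are assumptions for illustration.

```python
import math

MU_EARTH = 398600.4418  # Earth gravitational parameter, km^3/s^2

def thrust_accel(magnitude, alpha, beta):
    """Split a thrust acceleration into (radial, tangential, normal) parts.

    Assumed convention: alpha is the in-plane angle measured from the
    tangential direction, beta is the out-of-plane angle.
    """
    a_r = magnitude * math.sin(alpha) * math.cos(beta)
    a_t = magnitude * math.cos(alpha) * math.cos(beta)
    a_n = magnitude * math.sin(beta)
    return a_r, a_t, a_n

def mee_rates(state, accel, mu=MU_EARTH):
    """Time derivatives of (p, f, g, h, k, L) under a perturbing acceleration."""
    p, f, g, h, k, L = state
    a_r, a_t, a_n = accel
    sin_l, cos_l = math.sin(L), math.cos(L)
    w = 1.0 + f * cos_l + g * sin_l
    s2 = 1.0 + h * h + k * k
    root = math.sqrt(p / mu)
    hk = h * sin_l - k * cos_l
    dp = 2.0 * p / w * root * a_t
    df = root * (a_r * sin_l + ((w + 1.0) * cos_l + f) * a_t / w - hk * g * a_n / w)
    dg = root * (-a_r * cos_l + ((w + 1.0) * sin_l + g) * a_t / w + hk * f * a_n / w)
    dh = root * s2 * a_n / (2.0 * w) * cos_l
    dk = root * s2 * a_n / (2.0 * w) * sin_l
    dL = math.sqrt(mu * p) * (w / p) ** 2 + root * hk * a_n / w
    return dp, df, dg, dh, dk, dL
```

In a reinforcement learning loop of the kind the abstract outlines, the policy would output the steering angles, thrust_accel would turn them into an acceleration, and a numerical integrator would advance the state with mee_rates before the reward is computed. With zero thrust on a circular orbit only L changes, at the mean motion.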

Persian keywords

Low Thrust - Reinforcement Learning - Equinoctial Orbital Elements - Agent - Policy - Actor-Critic Network

English keywords

Low Thrust, Reinforcement Learning, Modified Equinoctial Orbital Elements, Agent, Policy, Actor-Critic Network

Author

Hamed Soleymani

Supervisor

Dr. Majid Bakhtiari, Dr. Kamran Daneshjo

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=27628&Field=0&DTC=6