Zahra Jalalian

Title

Autonomous Task Scheduling in Distributed Systems for Fast Data Processing

Degree level

Doctorate (PhD)

Field of study

Computer Engineering - Software Systems

Academic year

1392

Defense date

1401/8/22

Supervisor

Mohsen Sharifi

School

Computer Engineering

Abstract

With the rapid growth in the production and dissemination of big data from diverse sources, the speed of data processing must inevitably increase as well. In distributed systems for big data processing, such as cloud computing, the task scheduler is responsible for assigning a large set of diverse tasks to a large number of computing nodes (which may also be heterogeneous). The scheduler's choice of computing node for dispatching and quickly executing a task must serve several objectives (such as resource utilization, reducing the execution time of a set of tasks, reducing data exchange between processing nodes, and balancing load across computing nodes). Schedulers today try to meet most of these objectives. Scheduling strategies that try to meet them in a single stage perform worse than multi-stage strategies. This thesis presents a task scheduler for fast big data processing that performs better with respect to the objectives above. A hierarchical multi-objective task scheduling scheme is proposed: in the first stage, the resource requirements of tasks obtained from previous executions are used with the k-means clustering algorithm and a load-balancing equation to increase resource efficiency; then the differential evolution algorithm is applied to reduce the execution time of the clusters. For better resource utilization, the dynamic state of the computing nodes is used when dispatching and executing a set of tasks. In addition, dependent tasks are sent to the same computing node, which avoids data transfer between computing nodes. The proposed scheme was implemented and evaluated in the CloudSim software. In these experiments, compared with Mai's reinforcement learning approach and Bugerya's parallel execution method, the proposed scheme shows approximately a 10% reduction in the execution time of the task set and a 4% increase in processor efficiency. The cost of transferring information between consecutive tasks also decreased by 10% compared with the other methods. Given these results, the proposed task scheduling scheme, which is inspired by the iHadoop method for parallel execution, is better suited to distributed systems for fast big data processing.
Information about previous task executions and the current state of the computing nodes strongly influences the efficient mapping of tasks to computing nodes. The proposed scheme shows that considering the resources tasks require during execution, clustering the tasks, and optimizing the clusters to reduce the total execution time of the tasks in them, while taking into account the capacity available on the computing nodes, can help select computing nodes optimally and, as a result, process data faster.
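The two stages described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the task resource vectors, the node speeds, the makespan objective, and all function names below are assumptions made for illustration, and the thesis's load-balancing equation is not reproduced here. Stage one groups tasks by resource demand with a plain k-means; stage two uses a basic differential evolution (DE/rand/1/bin) to map clusters to nodes so that the slowest node finishes as early as possible.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on resource vectors; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels

def makespan(assign, loads, speeds):
    """Finish time of the busiest node (assumed objective, not from the thesis)."""
    busy = [0.0] * len(speeds)
    for cluster, node in enumerate(assign):
        busy[node] += loads[cluster] / speeds[node]
    return max(busy)

def decode(vec, n_nodes):
    """Map a real-valued DE vector to integer node indices."""
    return [min(int(x), n_nodes - 1) for x in vec]

def differential_evolution(loads, speeds, pop_size=20, gens=100, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin over cluster-to-node mappings; returns the best mapping found."""
    rng = random.Random(seed)
    n, m = len(loads), len(speeds)

    def cost(vec):
        return makespan(decode(vec, m), loads, speeds)

    pop = [[rng.uniform(0, m) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = [
                min(max(a[j] + f * (b[j] - c[j]), 0.0), m - 1e-9)
                if (rng.random() < cr or j == j_rand) else pop[i][j]
                for j in range(n)
            ]
            # Greedy selection: keep the trial only if it is no worse.
            if cost(trial) <= cost(pop[i]):
                pop[i] = trial
    return decode(min(pop, key=cost), m)

# Hypothetical data: (cpu, memory) demand per task, recorded from earlier runs.
tasks = [(1.0, 2.0), (1.2, 2.1), (8.0, 1.0), (7.5, 1.2), (4.0, 6.0), (4.2, 5.8)]
labels = kmeans(tasks, k=3)
loads = [sum(t[0] for t, lab in zip(tasks, labels) if lab == c) for c in range(3)]
speeds = [1.0, 2.0]  # relative node speeds (heterogeneous nodes)
assignment = differential_evolution(loads, speeds)
```

Here `assignment[c]` is the node chosen for cluster `c`. Sending a whole cluster to one node is a simplification; the thesis additionally uses the dynamic state of the nodes and keeps dependent tasks together.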

Record entry date

1401/12/06

Title in English

Autonomous Task Scheduling in Fast Data Processing Distributed Systems

Release date

11/13/2023 12:00:00 AM

Student who entered the record

Zahra Jalalian

Abstract in English

Advances in technology and communications equipment mean that large volumes of diverse data are produced at high speed every moment. Storing, processing, and managing this volume of data are major challenges. Much of the data contains valuable information that can be extracted by processing it. Given the huge volume of data and the capacity and power of current individual computers, storing data and then retrieving it for processing is complicated and time-consuming. One solution is to process data as soon as it is received. Because data is produced and disseminated at high velocity, fast data processing is needed. Because individual computers have limited throughput and capacity, fast data processing systems soon turn to distributed processing on a set of connected computers or clusters of processing nodes. The data set is divided into subsets, which are distributed to the processing nodes in the cluster along with the tasks. One of the challenges in distributed systems is assigning data and tasks to processing nodes. Existing distributed systems use various methods, generally global algorithms, to allocate tasks to nodes so as to use the resources available in clusters. Alternatively, to use the resources in the cluster, applications can be allocated to processing nodes dynamically rather than statically. As a result, different applications execute simultaneously on the cluster's processing nodes. If the nodes in a cluster are not all the same and differ in throughput and capacity, applying a single algorithm to schedule various tasks on different nodes does not work efficiently. The aim of this thesis proposal is to offer a new task-scheduling mechanism for fast data processing.
This mechanism uses autonomous task scheduling to assign tasks to processing nodes based on the internal state of the processing nodes and the type of task and data. The allocation of tasks to nodes is therefore not a universal, one-size-fits-all solution; it is based on current conditions, the internal state of the node, and the type of task to be executed. Because allocation takes place with greater awareness of the current situation and of what must be supplied, higher performance and faster processing are expected.
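As a rough illustration of allocation driven by each node's internal state and the task type, the sketch below lets every node estimate for itself whether and how fast it could run a task, and gives the task to the node with the best estimate. The classes, fields, and the bidding rule are hypothetical and chosen only for illustration; the abstract does not specify the actual decision procedure.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    kind: str    # hypothetical task type, e.g. "cpu" or "io"
    cpu: float   # CPU demand
    mem: float   # memory demand

@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem: float
    # Hypothetical per-type speed factor: task kind -> relative speed.
    affinity: dict = field(default_factory=dict)

    def bid(self, task):
        """Estimated run time from local state, or None if the task does not fit now."""
        if task.cpu > self.free_cpu or task.mem > self.free_mem:
            return None
        return task.cpu / self.affinity.get(task.kind, 1.0)

    def accept(self, task):
        """Reserve the task's resources on this node."""
        self.free_cpu -= task.cpu
        self.free_mem -= task.mem

def dispatch(task, nodes):
    """Give the task to the node with the lowest bid; return None if no node can take it."""
    bids = [(node.bid(task), i) for i, node in enumerate(nodes)]
    bids = [(b, i) for b, i in bids if b is not None]
    if not bids:
        return None
    _, best = min(bids)
    nodes[best].accept(task)
    return nodes[best].name

# Hypothetical cluster: one node favours CPU-bound work, the other I/O-bound work.
cluster = [
    Node("node-a", free_cpu=4, free_mem=8, affinity={"cpu": 2.0}),
    Node("node-b", free_cpu=4, free_mem=8, affinity={"io": 2.0}),
]
placed = dispatch(Task("t1", "cpu", cpu=2, mem=1), cluster)
```

The point of the design is that the decision uses only information each node already has about itself, so no global view of the cluster is needed at dispatch time.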

Persian keywords

Fast big data processing, task scheduling, optimal task assignment, task clustering

English keywords

Fast Big Data Processing, Task Scheduling, Optimizing Task Allocation, Task Clustering

Author

Zahra Jalalian

Supervisor

Dr. Mohsen Sharifi

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=27899&Field=0&DTC=6