Sina Nourian

Title

A Distributed Graph Processing Technique for Making Data Stream Processing Systems Adaptable

Degree level

Master's

Field of study

Software

Year of study

1395 (Solar Hijri)

Defense date

1397/08/22 (Solar Hijri)

Supervisor

Dr. Mohsen Sharifi

School

Computer

Abstract

As the volume and speed of data generation and transfer grow, traditional methods of storing and processing data no longer meet today's needs, and many domains increasingly require solutions that process data in real time with high speed and accuracy. Data stream management and processing systems must therefore be able to process an unbounded number of data items in real time with high throughput. These systems usually combine several data sources, processing and computing nodes, and data analysis components. A data stream processing system can be modeled as a directed acyclic graph. In this graph, edges represent the flow of data from one node to another, and each node holds a query that performs operations on its input data. Centralized approaches cannot adapt to varying conditions such as a changing data arrival rate and processing time, so they may fail to deliver the expected quality and accuracy. Distributed solutions are therefore needed to use multiple processing resources and improve quality and speed, together with automatic adaptation mechanisms that manage and adjust to changing environmental conditions. The solution proposed in this thesis is a technique for processing the data flow graph in a distributed manner, using control methods to monitor system quality, which makes these systems adaptable to varying data arrival rates and service times. To achieve this, the system's control unit examines the performance metrics that it continuously receives from the computing nodes and, when needed, adapts the system to the upcoming workload by dynamically changing the number of computing nodes in the graph. When the arrival rate or service time increases or decreases, the number of computing nodes increases or decreases accordingly and the system's workload is balanced; because the amount of resources matches the workload, energy consumption is reduced as well. In cloud systems, efficient use of processing resources is one of the most important principles. The proposed technique therefore makes these systems better suited for use in cloud environments.
Experiments compare the adaptation technique proposed in this thesis with non-adaptive methods under four different load distribution methods and two kinds of computing units, stateless and stateful. The results show that the proposed technique increases throughput, reduces response time, and uses resources efficiently.

Data entry date

1397/11/09

Title in English

A Distributed Graph Processing Technique for Adaptability of Data Stream Processing Systems

Date made available

11/13/2018 12:00:00 AM

Student who entered the data

Sina Nourian

Name: Sina Nourian
Author: Sina Nourian

Abstract in English

The rapid growth of sensing devices that continuously generate large amounts of raw data creates unprecedented opportunities for developing new, pervasive services that can improve the quality of people's everyday lives. Examples can be found in many domains, including energy management systems, financial markets, transportation, health care, and IoT sensors. Real-time processing of high-volume data streams makes traditional store-and-process strategies impractical. A Data Stream Processing (DSP) system typically consists of a set of middleware, tools, and control algorithms that process unbounded, often high-volume, data streams as soon as the inputs are generated. High throughput and low latency are the key requirements of DSP applications. A data stream job in a DSP system can be modeled as a directed acyclic graph (DAG) in which streams of data flow only along the edges between vertices. The vertices of the graph are processing elements that continuously receive incoming streams from other elements and generate new outgoing streams after executing the requested processing logic. Another important challenge in such platforms is fulfilling the quality of service (QoS) specified in the service level agreement (SLA). Centralized and non-adaptable deployments of DSP applications can become a bottleneck when dealing with high volumes of data, which results in high latency. Distributed deployment of DSP applications across several processing units is therefore necessary to avoid violating QoS constraints and the SLA. In addition, because the data arrival rate and data service rate are unpredictable, adaptable approaches are needed that allocate resources to the DSP application dynamically, adapting to the varying workload and avoiding the over- and under-provisioning of resources that leads to violations of QoS constraints.
The technique proposed in this research is a distributed graph processing technique that makes DSP systems adaptable. To achieve adaptability, controller units in the DSP system receive performance metrics from the processing elements and decide whether a reconfiguration is needed to cope with the current situation and workload. The controller unit sends its decision as a command to the processing elements: scale up when the data arrival rate or service time increases and the current number of processing elements cannot handle the workload, or scale down when the current workload is lower than expected, removing unnecessary processing elements to use fewer resources and reduce power consumption. In cloud computing, optimizing resource usage is an important concern. With the proposed adaptable technique, DSP systems can be deployed in cloud environments more easily and with less configuration than non-adaptable DSP systems. Our results show a substantial improvement in response time and resource usage compared with non-adaptable solutions.
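
The controller's scale-up/scale-down decision described above can be illustrated with a minimal sketch. This is not the control law used in the thesis, which the abstract does not specify. It assumes a simple utilization estimate (arrival rate times service time, divided by the number of processing elements) and hypothetical thresholds; the names `Metrics` and `decide_replicas` are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Metrics:
    """Metrics reported by the processing elements (hypothetical shape)."""
    arrival_rate: float  # data items per second entering the operator
    service_time: float  # average seconds needed to process one item


def decide_replicas(current: int, m: Metrics,
                    high: float = 0.8, low: float = 0.4,
                    target: float = 0.6) -> int:
    """Return the number of processing elements the controller requests.

    The thresholds `high`, `low`, and `target` are illustrative
    assumptions, not values taken from the thesis.
    """
    # Offered load, expressed as the number of fully busy elements.
    load = m.arrival_rate * m.service_time
    utilization = load / current
    if low <= utilization <= high:
        return current  # workload is within bounds: no reconfiguration
    # Otherwise resize so that utilization returns to about `target`.
    return max(1, math.ceil(load / target))
```

For example, under these assumptions two elements that receive 10 items per second at 0.5 seconds per item are overloaded, so the sketch asks for more elements; if the arrival rate later drops to 2 items per second, it asks for fewer.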

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=20005&Field=0&DTC=6