Abstract
The rapid growth of sensing devices that continuously generate large amounts of raw data creates unprecedented opportunities for developing new, pervasive services that can improve the quality of everyday human life. Examples can be found in many domains, including energy management systems, financial markets, transportation, health care, and IoT sensors. Real-time processing of high-volume data streams makes traditional store-and-process strategies impractical. A Data Stream Processing (DSP) system typically consists of a set of middleware, tools, and control algorithms that process unbounded, often high-volume, data streams as soon as the inputs are generated. High throughput and low latency are the key requirements of DSP applications. A data stream job in DSP can be modeled as a directed acyclic graph (DAG) in which streams of data flow only along the edges between vertices. The vertices are processing elements that continuously receive incoming streams from other elements and generate new outgoing streams after executing the requested processing logic. Another important challenge in such platforms is fulfilling the quality of service (QoS) specified in the service level agreement (SLA). Centralized, non-adaptable deployments of DSP applications can become a bottleneck when dealing with high volumes of data, which results in high latency. Therefore, distributing a DSP application across several processing units is necessary to avoid violating QoS constraints and the SLA. Moreover, because the data arrival rate and the service rate are unpredictable, adaptable approaches are needed to allocate resources to the DSP application dynamically, so that it adapts to varying workloads and avoids both over-provisioning, which wastes resources, and under-provisioning, which causes QoS violations.
This research proposes a distributed graph processing technique for making DSP systems adaptable. To achieve adaptability, the controller units in a DSP system must receive performance metrics from the processing elements and decide whether a reconfiguration is needed to cope with the current conditions and workload. The controller unit sends its decision as a command to the processing elements: scale up when the data arrival rate or service time increases and the current number of processing elements cannot handle the workload, or scale down when the workload is lower than expected, removing unnecessary processing elements to use fewer resources and reduce power consumption. Optimizing resource usage is an important concern in cloud computing. With the proposed adaptable technique, DSP systems can be deployed in cloud environments easily and with less configuration than non-adaptable DSP systems. Our results show a substantial improvement in response time and resource usage compared with non-adaptable solutions.
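
The controller's scale-up and scale-down decision described above can be illustrated with a minimal Python sketch. It is not the technique evaluated in this work, only one plausible threshold-based rule under stated assumptions: the names `Metrics` and `decide`, the utilization thresholds `high` and `low`, and the `target` utilization are all hypothetical, and utilization is approximated as arrival rate times service time divided by the number of processing elements.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Metrics:
    """Performance metrics reported by one operator (hypothetical structure)."""
    arrival_rate: float  # tuples per second arriving at the operator
    service_time: float  # average seconds needed to process one tuple
    replicas: int        # current number of processing elements


def decide(m: Metrics, high: float = 0.8, low: float = 0.3,
           target: float = 0.6) -> Tuple[str, int]:
    """Return a (command, replica_count) pair for the operator.

    Assumption: the thresholds and the target utilization are
    illustrative values, not values taken from this work.
    """
    # Offered load: how many processing elements would be fully busy.
    offered_load = m.arrival_rate * m.service_time
    utilization = offered_load / m.replicas

    # Inside the comfort band: no reconfiguration is needed.
    if low <= utilization <= high:
        return ("keep", m.replicas)

    # Otherwise size the operator so utilization lands near the target.
    desired = max(1, math.ceil(offered_load / target))
    if desired > m.replicas:
        return ("scale_up", desired)
    if desired < m.replicas:
        return ("scale_down", desired)
    return ("keep", m.replicas)


# Overloaded operator: utilization is about 1.0, so replicas are added.
print(decide(Metrics(arrival_rate=1000, service_time=0.004, replicas=4)))
# Underloaded operator: utilization is about 0.1, so replicas are removed.
print(decide(Metrics(arrival_rate=100, service_time=0.004, replicas=4)))
```

Keeping a band between `low` and `high` rather than a single threshold is one common way to avoid oscillating between scale-up and scale-down commands when the workload hovers near a boundary.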