Abstract
Advances in technology and communications equipment mean that large volumes of diverse data are produced at high speed at every moment. Storing, processing, and managing this volume of data are major challenges. Most of the data contains valuable information that can be extracted only by processing it. Given the huge volume of data and the limited capacity and power of current individual computers, storing the data and then retrieving it for processing is complicated and time-consuming. One solution is to process the data as soon as it is received. Because data is produced and disseminated at high velocity, this processing must itself be fast. However, because of the limited throughput and capacity of individual computers, fast data processing systems quickly turn to distributed processing on a set of connected computers, or clusters of processing nodes. The data set is divided into subsets, which are distributed to the processing nodes in the cluster along with the tasks. One of the challenges in distributed systems is assigning data and tasks to processing nodes. To use the resources available in a cluster, existing distributed systems allocate tasks to nodes with a variety of methods, which are generally global algorithms. In addition, to make better use of cluster resources, applications can be allocated to processing nodes dynamically rather than statically. As a result, different applications run simultaneously on the processing nodes of the cluster. If the nodes in a cluster are not identical and differ in throughput and capacity, applying the same algorithm to schedule various tasks on different nodes is inefficient. The aim of this thesis proposal is to offer a new task-scheduling mechanism for fast data processing.
The mechanism uses autonomous task scheduling: tasks are assigned to processing nodes according to the internal state of each node and the type of task and data. The allocation of tasks to nodes therefore does not follow a single global scheme; it is based on current conditions, the internal state of the node, and the type of task to be executed. Because allocation takes place with greater awareness of the current situation and of the requirements to be met, higher performance and faster processing are expected.
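The idea of nodes deciding locally, from their own state and the type of task, can be illustrated with a minimal sketch. It is not the mechanism proposed in this thesis: the `Task` and `Node` classes, the two task types, the weights, and the `assign` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    kind: str     # hypothetical task types: "cpu" or "memory"
    data_mb: int  # size of the data subset shipped with the task


@dataclass
class Node:
    name: str
    cpu_free: float   # fraction of CPU currently idle, 0.0 to 1.0
    mem_free_mb: int  # memory currently available on this node

    def bid(self, task: Task) -> Optional[float]:
        """Decide locally whether to offer to run the task (None = decline)."""
        if self.mem_free_mb <= 0 or task.data_mb > self.mem_free_mb:
            return None  # the data subset does not fit on this node
        mem_headroom = 1.0 - task.data_mb / self.mem_free_mb
        # Weight the resource that this type of task stresses most
        # (the 0.8 / 0.2 weights are arbitrary illustrative values).
        if task.kind == "cpu":
            return 0.8 * self.cpu_free + 0.2 * mem_headroom
        return 0.2 * self.cpu_free + 0.8 * mem_headroom


def assign(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Give the task to the node with the highest bid, if any node bids."""
    offers = []
    for node in nodes:
        score = node.bid(task)  # each node uses only its own state
        if score is not None:
            offers.append((score, node))
    if not offers:
        return None
    return max(offers, key=lambda offer: offer[0])[1]


if __name__ == "__main__":
    cluster = [
        Node("fast-cpu", cpu_free=0.9, mem_free_mb=1000),
        Node("big-mem", cpu_free=0.2, mem_free_mb=8000),
    ]
    for task in (Task("cpu", 500), Task("memory", 500), Task("memory", 9000)):
        chosen = assign(task, cluster)
        print(task.kind, task.data_mb, "->", chosen.name if chosen else "no node")
```

In this sketch the same 500 MB data subset is placed on different nodes depending on the task type, and a task whose data fits on no node is left unassigned, so no single global rule applies to every task.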