Abstract
Nowadays, with the growing use of social networks, people who are directly or indirectly affected by a disaster often post large amounts of data (text, images, speech, video) on social networks. Social media has become a major channel through which people report events to one another and to crisis management centers. In recent years, crisis management centers and emergency response organizations have increasingly turned to this vast and accessible resource to improve situational awareness when responding to disasters. However, within minutes of a catastrophe, social media platforms fill with many different types of data that these centers must handle, and much of that data is redundant or irrelevant. This makes it difficult to reason about the available data and to make decisions for effective crisis management. Despite recent advances in technology, processing and analyzing disaster-related big data remains a challenging task. Therefore, in this dissertation, we present a framework for analyzing Twitter data during and after disasters, using state-of-the-art data processing and machine learning methods for social media big data, to support efficient crisis management. First, the raw Twitter data is divided into two parts, the text and the remaining features, and each part is preprocessed. In the next step, a social network (graph) is built from the non-textual data based on retweet relationships, users' locations are inferred from the location stated on their profile pages, and Twitter features such as the number of followers and likes are extracted. Together with the cleaned text, these form four types of information sources. The research then enters the data mining and knowledge discovery phase.
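As an illustration of the graph-building step, the following is a minimal sketch and not the dissertation's actual implementation. It assumes each tweet is a dictionary; the field names `user` and `retweeted_user` are hypothetical and do not correspond to any specific Twitter API schema.

```python
from collections import defaultdict


def build_retweet_graph(tweets):
    """Return {retweeter: {original_author: retweet_count}}.

    The keys "user" and "retweeted_user" are hypothetical field names.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for tweet in tweets:
        source = tweet.get("user")
        target = tweet.get("retweeted_user")
        # Skip original tweets (no retweeted user) and self-retweets.
        if source is None or target is None or source == target:
            continue
        graph[source][target] += 1
    return {user: dict(targets) for user, targets in graph.items()}


def in_degrees(graph):
    """Count how many distinct users retweeted each author."""
    degrees = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            degrees[target] += 1
    return dict(degrees)


# Illustrative records only, not data from the study.
sample = [
    {"user": "a", "retweeted_user": "c"},
    {"user": "b", "retweeted_user": "c"},
    {"user": "a", "retweeted_user": "c"},
    {"user": "c", "retweeted_user": None},
]
graph = build_retweet_graph(sample)
```

The in-degree counts returned by `in_degrees` are the kind of quantity whose distribution could then be examined for heavy-tailed behavior.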
Using a reference data set, incident-specific models were built for three tasks: emotion, informativeness (informative or not), and 12 classes of humanitarian content. The structure of the retweet network was found to follow a power-law distribution, which was confirmed by statistical testing. The networks formed each day were analyzed dynamically with community detection algorithms, and a community tracking algorithm was developed and validated against a reference data set. Finally, all data mining outputs were analyzed both statically and dynamically.
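The abstract does not specify how communities are tracked from day to day. As a rough sketch only, one common heuristic matches each community on a given day to the previous day's community with the highest Jaccard similarity of members. The function names, the threshold of 0.3, and the sample data below are all assumptions made for illustration, not the algorithm developed in this work.

```python
def jaccard(a, b):
    """Jaccard similarity of two member collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def track_communities(previous, current, threshold=0.3):
    """Match each current community to its most similar predecessor.

    previous, current: dicts mapping community id -> set of user ids.
    Returns a dict mapping current id -> previous id, or None when no
    predecessor reaches the (assumed) similarity threshold.
    """
    matches = {}
    for cur_id, cur_members in current.items():
        best_id, best_score = None, 0.0
        for prev_id, prev_members in previous.items():
            score = jaccard(cur_members, prev_members)
            if score > best_score:
                best_id, best_score = prev_id, score
        matches[cur_id] = best_id if best_score >= threshold else None
    return matches


# Hypothetical communities on two consecutive days.
day1 = {"A": {"u1", "u2", "u3"}, "B": {"u4", "u5"}}
day2 = {"X": {"u1", "u2", "u6"}, "Y": {"u7", "u8"}}
matches = track_communities(day1, day2)
```

Here community `X` shares two of four distinct members with `A` (similarity 0.5) and is treated as its continuation, while `Y` has no overlap with any earlier community and is treated as newly formed.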