Mohamad Hadi Dadizadeh Dargiri

Title

An efficient, scalable, and distributed method for community detection in large graphs

Degree level

Master's

Field of study

Software Engineering

Academic year

1400

Defense date

30/10/1403

Supervisor

Dr. Mohammad Reza Kangavari

Advisor

None

Faculty

Computer Engineering

Abstract

In today's world, networks play an important role in areas such as social networks, communications, recommender systems, and biology. One of the most important tasks in network analysis is community detection, which has many applications in the analysis of network data. However, because of the complexity of graphs and the nature of community detection algorithms, many of these algorithms run sequentially and are not efficient enough for large graphs. Growing data volumes and network complexity have made the need for scalable, distributed algorithms more pressing. This motivated a framework that makes it easier to parallelize community detection algorithms on large graphs. This research introduces a framework based on partitioning the graph and creating overlap between the resulting subgraphs. Through local, independent processes, the framework enables parallel and distributed processing. The main goal is to make sequential algorithms easier to parallelize while keeping the results close to the output of the non-parallel run. The model is designed to be independent of the algorithm type and applies to a wide range of methods. Experiments on two datasets, DBLP and Amazon, showed that the overlap strategy produced subgraphs one-seventh and one-twentieth the size of the original graph. These subgraphs were processed with the FOX algorithm in 64 independent, concurrent processes. Parallel execution made computation three times faster on DBLP and six times faster on Amazon. After the results of the local processes were merged, the model was compared with the output of the original algorithm run in non-parallel form. The F1-Score was 98% for DBLP and 97% for Amazon. The percentage of identically detected communities was 93% for DBLP and 90% for Amazon. These results show that the proposed framework both increases speed and preserves the accuracy of the algorithms relative to the non-parallel run.
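The abstract reports an F1-Score and a percentage of identical communities, measured against the sequential output, but this record does not say how either figure was computed. The sketch below assumes one common convention for comparing two sets of communities: a symmetric average of best-match F1 scores, plus the fraction of reference communities that reappear exactly. The function names `f1`, `avg_best_f1`, and `identical_fraction` are hypothetical and are not taken from the thesis.

```python
def f1(a, b):
    """F1 between two communities given as sets of node ids."""
    inter = len(a & b)
    if inter == 0:
        return 0.0
    precision = inter / len(b)
    recall = inter / len(a)
    return 2 * precision * recall / (precision + recall)


def avg_best_f1(reference, detected):
    """Symmetric average of best-match F1 (assumed convention).

    Both arguments are non-empty lists of non-empty sets.
    """
    def one_way(xs, ys):
        # For each community in xs, take its best F1 against ys.
        return sum(max(f1(x, y) for y in ys) for x in xs) / len(xs)

    return 0.5 * (one_way(reference, detected) + one_way(detected, reference))


def identical_fraction(reference, detected):
    """Fraction of reference communities reproduced exactly."""
    found = {frozenset(c) for c in detected}
    return sum(frozenset(c) in found for c in reference) / len(reference)


# Toy data: the sequential run is treated as the reference.
sequential = [{1, 2, 3}, {4, 5, 6}]
merged_parallel = [{1, 2, 3}, {4, 5}]
score = avg_best_f1(sequential, merged_parallel)
exact = identical_fraction(sequential, merged_parallel)
```

In this toy case one of the two reference communities is recovered exactly and the other loses one node, which illustrates how an F1-style score can stay high while the exact-match percentage is lower.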

Data entry date

1404/07/02

Title in English

A scalable and efficient distributed method for community detection in large graphs

Date of availability

1/1/1900 12:00:00 AM

Student who entered the data

Mohamad Hadi Dadizadeh Dargiri

Name: Mohamad Hadi Dadizadeh Dargiri
Author: Mohamad Hadi Dadizadeh Dargiri

Abstract in English

In today's world, networks play a fundamental role in many fields such as social networks, communications, recommendation systems, and biology. One of the most important tasks in network analysis is community detection, which has widespread applications in the analysis of network data. However, due to the complex structure of graphs and the nature of community detection algorithms, many of these algorithms operate sequentially, which is inefficient for large graphs. The ever-increasing volume of data and complexity of networks highlight the need for scalable and distributed algorithms. This challenge has motivated the development of new approaches to enhance scalability and efficiency in processing large graphs. In this study, a scalable approach for executing community detection algorithms is presented, based on graph partitioning and creating overlap among subgraphs. By creating independent, local processes, this approach enables parallel and distributed processing. The main goal of the method is a highly scalable model whose results stay as close as possible to the output of the algorithm in its sequential form. The model is designed to work independently of the type of community detection algorithm and can be applied to a wide range of algorithms. The performance evaluation of the proposed method on the DBLP and Amazon datasets shows that, with a 2-hop overlap strategy, subgraphs equivalent to one-seventh and one-twentieth of the original graph size were generated for these two datasets. These subgraphs were processed using the FOX algorithm in 64 independent local processes running in parallel. Parallel execution resulted in a threefold speedup for the DBLP dataset and a sixfold speedup for the Amazon dataset. After merging the results of the local processes, the model's performance was evaluated using the results of the original algorithm in its sequential form as the ground truth. The F1-Score for the DBLP and Amazon datasets reached 98% and 97%, respectively. Additionally, the percentage of identically detected communities was 93% for DBLP and 90% for Amazon. These results highlight the efficiency of the proposed method in reducing computation time while maintaining accuracy and result quality compared to the sequential form.
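The partition-with-overlap idea described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the record does not say how the graph is partitioned or how the overlap is built, so the sketch assumes the node partition is already given and that "2-hop overlap" means each part is extended by every node within two edges of it. The names `k_hop_expand` and `overlapping_subgraphs` and the adjacency-dict graph format are hypothetical.

```python
from collections import deque


def k_hop_expand(adj, core, hops=2):
    """Return `core` plus every node within `hops` edges of it."""
    seen = set(core)
    frontier = deque((v, 0) for v in core)  # multi-source BFS
    while frontier:
        v, dist = frontier.popleft()
        if dist == hops:
            continue  # do not grow past the overlap radius
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, dist + 1))
    return seen


def overlapping_subgraphs(adj, parts, hops=2):
    """Build one induced subgraph per part, widened by the overlap."""
    subgraphs = []
    for core in parts:
        nodes = k_hop_expand(adj, core, hops)
        # Keep only edges whose endpoints both lie in the widened part.
        sub = {v: [w for w in adj[v] if w in nodes] for v in nodes}
        subgraphs.append(sub)
    return subgraphs


# Toy graph: the path 0-1-2-3-4-5-6-7, split into two halves.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
parts = [{0, 1, 2, 3}, {4, 5, 6, 7}]
subs = overlapping_subgraphs(adj, parts, hops=2)
```

Each resulting subgraph could then be handed to an independent worker running an unmodified sequential algorithm, since the overlap gives every worker the local neighborhood around its boundary nodes. How the per-worker results are merged is not described in this record and is left out of the sketch.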

Keywords (Persian)

Graph, Community detection, Scalability, Parallel processing, Distributed processing

Keywords (English)

Graph, Community detection, Scalability, Parallel processing, Distributed processing

Author

Mohamad Hadi Dadizadeh

Supervisor

Dr. Mohammad Reza Kangavari

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=33676&Field=0&DTC=6