چكيده به لاتين
Not too long ago, the wide adoption of social media and the ready availability of the internet have altered the scene of text data for many languages, and in this case, Arabic, it is quite evident because the volume of users from different industries has tremendously increased along with the Arabic spoken in different dialects. Morphology, syntax, vocabulary, and pronunciation differ for each dialect, which makes each of them distinct. Thus, researchers working in the area of language identification and natural language processing face a great challenge in classifying the different Arabic dialects. The diversity among Arabic dialects poses a significant challenge for identification. In our study, Which aims to identify the Iraqi dialect from the rest of the Arabic dialects, a set of tests were completed on data and comments derived from the Twitter website, which consists of Arabic dialects such as Egyptian, Gulf, Jordanian, Yemeni and Iraqi, and included 18 countries in the north AFRICA Region and MIDDLE east . Our study also introduced a new approach to dialect recognition, specifically targeting the Iraqi Arabic dialect compared to other Arabic dialects using LONG Short-term memory (LSTM) networks. The proposed system achieved an F1 Accuracy of 81.14%, indicating stable performance without further optimization. We also used SVM Model on the same data and the accuracy was low compared to LSTM and was only 60%. Therefore, LSTM was preferred for its high accuracy in results for test data and unknown data. The combination of LSTM and a dictionary-based model significantly improved accuracy, when the strengths of the two models were combined, while canceling out the weaknesses of each, and thus served as a good candidate in text classification when combined with techniques to prevent over fitting .The model achieved 96% accuracy, 96.2% f1 score, and 96.4% precision.