Abstract
The widespread use of WordNet as a human-readable, online lexical database has significantly influenced countless Natural Language Processing (NLP) tasks in recent years, and it remains a center of attention. Nevertheless, many researchers are reluctant to use WordNet because of two drawbacks. First, WordNet is too fine-grained, which can reduce the efficiency of some downstream tasks; researchers therefore seek methods that conceptualize (coarsen) WordNet. Second, WordNet is incomplete, so researchers turn to methods that enrich WordNet with external resources. In this thesis, we offer two approaches to address these drawbacks. In the first approach, we propose a new method for creating supersenses of WordNet. WordNet itself already provides supersenses, in which all synsets were clustered by linguists into 45 categories; the number of supersenses produced by our method is the same as WordNet's. In the second approach, we introduce a new method for enriching WordNet without any external resources, taking full advantage of the texts within each synset instead. To evaluate the quality of our supersenses and of the enriched WordNet, we use UKB, a state-of-the-art knowledge-based Word Sense Disambiguation (WSD) system. Since the first approach uses our 45 supersenses instead of the 117K synsets of WordNet, UKB runs 556 times as fast as UKB with standard WordNet, while the F1-score decreases only marginally, by approximately 1%. Furthermore, our results show that the F1-score increases by 0.1% after enriching WordNet by injecting a specific number of relations in the second approach.