English Abstract
In recent years, the infertility rate among young couples in Iran has increased considerably, and the cost of infertility treatment is very high for these couples. Data mining techniques have been shown to be very effective in extracting novel patterns from medical data. On the other hand, most real data sets are imperfect: in practice, data usually contain outliers, which are likely to degrade classification performance. Ensemble methods are commonly employed to overcome classification problems in the presence of outliers. In this study, we propose a model for predicting the best infertility treatment for any given case. The model consists of three steps. First, the proposed method uses discriminant analysis to find the factors that contribute to choosing the best infertility treatment. Second, it detects outlier samples and computes the correlation between these samples and the choice of treatment method. Third, it uses ensemble methods to increase the precision of the classifiers. The study used data on 527 infertile couples, collected by the Avicenna specialized infertility center. The model succeeded in identifying the effective factors, which included male age, infertility duration, immotile sperm, decreased sperm concentration, total sperm count, morphology, sperm motility, sperm with rapid progressive (grade a) motility, and sperm with slow progressive (grade b) motility. Additionally, the model shows that if any one of four features, (1) sperm concentration, (2) IgM toxoplasma, (3) T3 hormone, and (4) TPO, is an outlier, the prediction of treatment is more accurate. Finally, using ensemble methods increased the F-measure of the model up to 76%.
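
The outlier-detection and ensemble steps, together with the F-measure used for evaluation, can be illustrated with a minimal sketch. The abstract does not state which outlier criterion or which ensemble scheme was used, so the sketch assumes Tukey's interquartile-range rule and simple majority voting. The function names, the class labels "A" and "B", and the sample values are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from statistics import quantiles


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, assumed here)."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def majority_vote(predictions):
    """Combine one label list per classifier into one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical feature values for nine couples; the last one is extreme.
flags = iqr_outlier_flags([1, 2, 3, 4, 5, 6, 7, 8, 100])

# Hypothetical predictions from three classifiers for three couples.
combined = majority_vote([["A", "B", "B"], ["A", "A", "B"], ["B", "A", "B"]])
```

A per-feature outlier flag of this kind could then be compared against the chosen treatment, which is the relationship the second step examines. Majority voting is only one way of combining classifiers; bagging and boosting are other common ensemble schemes.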