Abstract
Software debugging is one of the most difficult, tedious, and time-consuming steps of software development. Several automated techniques have therefore been developed to reduce the burden on the developer during debugging. Recent research has shown that statistical techniques in many cases perform better than other techniques in terms of the amount of code the developer must examine to locate the fault. However, these techniques still face major limitations. Statistical methods of automatic fault localization are biased by the data collected from different executions of the program. This bias can result in unstable statistical models that vary depending on the test data provided for trial executions of the program. These methods also assume that all portions of the code are equally fault-prone, whereas the location of an entity inside the program and the fault-proneness of the programming structures can strongly affect the suspiciousness assessment of program elements. Moreover, statistical methods attempt to find failure-correlated statements, while fault localization is a causal problem that calls for a statistical causal method. The overall goal of this thesis is to apply statistical causal analysis combined with program analysis, and to consider the program structure and the static fault-proneness likelihood of statements while locating the causes of failures. To this end, in the first phase of the dissertation, two new methods, called FPA-FL and Inforence, are proposed. They locate program faults efficiently by taking into account the static structure of the program and the static fault-proneness likelihoods of statements. Both methods use the static structure of the program as a roadmap in order to avoid building a blind and inaccurate model that relies solely on dynamic runtime data.
In the second phase of the dissertation, we investigate methods for reducing the dependence of statistical fault localization on data obtained from test executions. First, a probabilistic method based on program slicing is proposed to identify and handle coincidentally correct test cases in both single-bug and multiple-bug settings. Since coincidentally correct tests cannot be recognized with full accuracy, a new method based on cooperative game theory is then presented that effectively diminishes the negative impact of these tests on fault localization effectiveness and can pinpoint the causes of failure in their presence. Finally, we propose a novel statistical technique for automatic test case generation, Bayes-TDG, to assist the fault localization model in finding the location of unknown faults. To verify the effectiveness of the proposed methods, we report the results of our experiments with different subject programs containing seeded and real faults. The experimental results are compared with those of different fault localization techniques for both single-fault and multiple-fault programs. They show that our proposed methods outperform the state-of-the-art techniques. Owing to its combinatorial analysis capability, our proposed method remains highly effective even with inadequate test suites that contain a large number of coincidentally correct test cases.