Abstract
In the fault localization process, the presence of a statement in a faulty execution can be affected by errors in other statements. Fault localization must therefore be investigated from a causal point of view: a statistical correlation between a statement and the incorrect output can indicate, but does not by itself establish, a causal relationship between the two. Previous research on causal-statistical analysis estimates the fault suspiciousness of a statement by building a regression model for each statement of the program. These methods have high computational overhead and high memory consumption, which makes them unscalable. Therefore, to reduce the cost of analysis, causal-statistical inference should be performed only on a small subset of the program statements.
In this thesis, we first narrow the range of suspicious statements down to a single execution branch. For this purpose, we negate the conditions of the faulty execution path from the end to the beginning and generate test data for each resulting path using the Z3 solver. We then execute the program again, as a concolic execution, with the generated test data. Depending on whether the program execution passes or fails, we determine which branch is suspicious. In this way, we minimize the range of statements to which the causal-statistical approach is applied. In fact, for the first time, we have decomposed the problem of automatic fault localization into three steps: finding the faulty execution path, finding the faulty branch, and finally estimating the fault suspiciousness of the statements in the faulty branch. In this approach, test data is generated purposefully, and in the smallest possible amount, to determine the faulty branch and then the fault suspiciousness of the statements within it. In this way, we were able to eliminate, for the first time, the dependence of the statistical approach on test data.
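The branch-narrowing step can be illustrated with a minimal sketch. It is not the TD-CAFL implementation: the toy program, the seeded fault, the oracle `expected`, and all function names are hypothetical, and a brute-force search over a small integer domain stands in for the Z3 solver. The sketch also assumes one particular reading of the pass/fail rule, namely that a passing run after flipping a condition marks the originally taken branch of that condition as suspicious.

```python
from itertools import product

# Hypothetical branch conditions of a toy program under test.
CONDITIONS = [
    ("x > 0", lambda x, y: x > 0),
    ("y > 5", lambda x, y: y > 5),
]

def run(x, y):
    """Execute the toy program; return (output, list of branch outcomes)."""
    outcomes = []
    result = 0
    taken = CONDITIONS[0][1](x, y)
    outcomes.append(taken)
    result += x if taken else -x
    taken = CONDITIONS[1][1](x, y)
    outcomes.append(taken)
    if taken:
        result += y + 1  # seeded fault: the correct statement is result += y
    return result, outcomes

def expected(x, y):
    """Hypothetical test oracle giving the correct output."""
    return abs(x) + (y if y > 5 else 0)

def solve(constraints, domain=range(-10, 11)):
    """Brute-force stand-in for Z3: find an input satisfying every constraint."""
    for x, y in product(domain, repeat=2):
        if all(pred(x, y) == wanted for pred, wanted in constraints):
            return x, y
    return None  # no input found in the domain (path treated as infeasible)

def locate_faulty_branch(failing_input):
    """Negate path conditions from last to first until a flipped run passes."""
    _, outcomes = run(*failing_input)
    for i in reversed(range(len(outcomes))):
        # Keep the path prefix unchanged and negate condition i.
        constraints = [(CONDITIONS[j][1], outcomes[j]) for j in range(i)]
        constraints.append((CONDITIONS[i][1], not outcomes[i]))
        test = solve(constraints)
        if test is None:
            continue
        output, _ = run(*test)
        if output == expected(*test):
            # Avoiding the original branch makes the run pass.
            return CONDITIONS[i][0], outcomes[i]
    return None

print(locate_faulty_branch((3, 7)))
```

For the failing input `(3, 7)`, negating the last condition already yields a passing run, so the sketch reports the taken branch of `y > 5` and never needs to negate `x > 0`. Only the statements in that branch would then be passed to the causal-statistical analysis.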
The proposed approach, TD-CAFL, was evaluated on the Defects4J benchmark. To show the degree of improvement of the approach, we used common criteria such as the number of statements examined and the accuracy of fault localization. The results show that the proposed approach performs better than other related strategies on the selected evaluation criteria. Finally, we showed that using causal-statistical analysis together with the proposed approach increases the accuracy of fault localization.