Inferring test models from user bug reports using multi-objective search

Empir Softw Eng. 2023;28(4):95. doi: 10.1007/s10664-023-10333-8. Epub 2023 Jun 20.

Abstract

Bug reports are used by software testers to identify abnormal software behaviour. In this paper, we propose a multi-objective evolutionary approach to automatically generate finite state machines (FSMs) based on bug reports written in natural language, to automatically capture incorrect software behaviour. These FSMs can then be used by testers to both exercise the reported bugs and create tests that can potentially reveal new bugs. The FSM generation is guided by a Multi-Objective Evolutionary Algorithm (MOEA) that simultaneously minimises three objectives: size of the models, number of unrealistic states (over-generalisation), and number of states not covered by the models (under-generalisation). We assess the feasibility of our approach for 10 real-world software programs by exploiting three different MOEAs (NSGA-II, NSGA-III and MOEA/D) and benchmarking them with the baseline tool KLFA. Our results show that KLFA is not practical to be used with real-world software, because it generates models that over generalise software behaviour. Among the three MOEAs, NSGA-II obtained significantly better results than the other two for all 10 programs, detecting a greater number of bugs for 90% of the programs. We also studied the differences in quality and model performance when MOEAs are guided by only two objectives rather than three during the evolution. We found that the use of under-approximation (or over-approximation) and size as objectives generates infeasible solutions. On the other hand, using as objectives over-approximation and under-approximation generates feasible solutions yet still worse than those obtained using all three objectives for 100% of the cases. The size objective acts as a diversity factor. As a consequence, an algorithm guided by all three objectives avoids local optima, controls the size of the models, and makes the results more diverse and closer to the optimal Pareto set.

Keywords: Model inference; Multi-objective optimisation; Search-based software engineering.