Extended Feature Set Construction for Efficient Triaging of Bug Reports of Open-Source Software


Kulbhushan Bansal, et al.


Bug report triaging has been an active area of research in the last few years. This is due to the rapid proliferation of software and apps that the market demands within a short period and that are therefore released without thorough testing. Short release cycles matter because the “time-to-market” has a profound effect on profit margins, whereas software testing is a time-consuming process. However, without thorough testing, software occasionally fails during operation. When it does, the user is prompted to write a bug report or, in other cases, a report is generated automatically and sent to a central database. This report contains the details of what happened that caused the software to malfunction. This form of testing, which is done by software users, is popularly known as beta testing, in contrast to alpha testing, which is done by software testing companies or departments. A triager is a person or program that reads the bug reports and classifies them according to the bug identified. This classification is a central issue of beta testing and has been studied in detail by many researchers. Automatic classification involves machine learning applied to natural language processing. Because the machine must be trained beforehand, a training data set has to be constructed to calibrate it. The mathematical models of machine learning classifiers work on feature sets extracted from raw data. In this research, techniques are presented for the extraction of textual and contextual features of the reports, which lead to the construction of a feature set with 40 features. A Support Vector Machine (SVM), implemented using an R package, serves as the machine learning classifier, and the results are compared with those of the benchmark techniques.
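
To make the idea of a combined feature set concrete, the following minimal Python sketch turns one bug report into a numeric vector that a classifier such as an SVM could consume. It is an illustration under stated assumptions, not the authors' method: the abstract does not list the 40 features, so the vocabulary `VOCAB`, the report fields `text` and `auto_generated`, and the three contextual signals are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical vocabulary; the abstract does not name the actual features.
VOCAB = ["crash", "error", "install", "login", "slow"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def textual_features(text, vocab=VOCAB):
    """Count how often each vocabulary term occurs in the report text."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]


def contextual_features(report):
    """Illustrative non-textual signals from assumed report fields."""
    text = report["text"]
    return [
        len(tokenize(text)),                        # report length in tokens
        1 if "Exception" in text else 0,            # hint of a stack trace
        1 if report.get("auto_generated") else 0,   # automatic vs. user-written
    ]


def build_feature_vector(report):
    """Concatenate textual and contextual features into one vector."""
    return textual_features(report["text"]) + contextual_features(report)


report = {
    "text": "App crash on login. Error: NullPointerException at login screen.",
    "auto_generated": False,
}
vector = build_feature_vector(report)
print(vector)
```

Each report in a training set would be converted this way and paired with its triage label before being passed to the classifier; in this sketch the vector has one entry per vocabulary term plus three contextual entries.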
