Learning to rank relevant files for bug reports using domain knowledge


Abstract

When a new bug report is received, developers usually need to reproduce the bug and perform code reviews to find the cause, a process that can be tedious and time-consuming. A tool for ranking all the source files of a project with respect to how likely they are to contain the cause of the bug would enable developers to narrow down their search and could potentially lead to a substantial increase in productivity. This paper introduces an adaptive ranking approach that leverages domain knowledge through functional decompositions of source code files into methods, API descriptions of library components used in the code, the bug-fixing history, and the code change history. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features encoding domain knowledge, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluated our system on six large-scale open-source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the newly introduced learning-to-rank approach significantly outperforms two recent state-of-the-art methods in recommending relevant files for bug reports. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70% of the bug reports in the Eclipse Platform and Tomcat projects.
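
The scoring step described in the abstract, a weighted combination of per-file features followed by a top-10 check, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the three-feature layout, and the weight values are hypothetical, and the weights are set by hand here, whereas the paper learns them from previously solved bug reports with a learning-to-rank technique.

```python
from typing import Dict, List, Sequence, Set


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Weighted combination of feature values for one (bug report, file) pair."""
    return sum(w * x for w, x in zip(weights, features))


def rank_files(weights: Sequence[float],
               file_features: Dict[str, Sequence[float]]) -> List[str]:
    """Return file names ordered from highest to lowest ranking score."""
    return sorted(file_features,
                  key=lambda name: score(weights, file_features[name]),
                  reverse=True)


def top_k_hit(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> bool:
    """True if at least one relevant file appears among the top k ranked files."""
    return any(name in relevant for name in ranked[:k])


# Hypothetical feature vectors for a single bug report. The columns stand in
# for the kinds of domain knowledge the abstract lists (for example textual
# similarity, API-description similarity, bug-fixing history); the actual
# feature set and values are assumptions made for illustration.
weights = [0.5, 0.3, 0.2]  # placeholder values, not learned weights
file_features = {
    "A.java": [0.9, 0.1, 0.0],
    "B.java": [0.2, 0.8, 0.5],
    "C.java": [0.1, 0.1, 0.1],
}

ranked = rank_files(weights, file_features)
print(ranked)
print(top_k_hit(ranked, {"B.java"}, k=2))
```

Averaging `top_k_hit` over many bug reports gives the kind of top-10 accuracy figure the abstract reports for Eclipse Platform and Tomcat.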




Citation (APA)

Ye, X., Bunescu, R., & Liu, C. (2014). Learning to rank relevant files for bug reports using domain knowledge. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering (Vol. 16-21-November-2014, pp. 689–699). Association for Computing Machinery. https://doi.org/10.1145/2635868.2635874
