Enhancing Change Prediction Models using Developer-Related Factors

Version 2 2018-02-06, 10:20
Version 1 2017-07-25, 15:31
dataset
posted on 2018-02-06, 10:20 authored by Gemma Catolino, Fabio Palomba, Andrea De Lucia, Filomena Ferrucci, Andy Zaidman

Continuous changes applied by developers during software maintenance and evolution risk deteriorating the internal structure of a system and threatening its maintainability. In this context, predicting the portions of source code on which specific maintenance operations, such as peer code review and refactoring, should be focused may be crucial for developers who want to prevent the introduction of maintainability issues. In the past, researchers devoted effort to defining change prediction models based on structural properties extracted from the source code, while recent papers have shown that process metrics can be successfully adopted to predict change-prone classes. Despite the progress made by recent work, we believe that existing approaches still miss an important piece of information, i.e., developer-related factors that capture how complex the development process is from different perspectives. In this paper, we first investigate three change prediction models that exploit developer-related factors (e.g., the number of developers working on a class) as predictors of the change-proneness of classes, and then we compare them with existing models relying on product and evolution metrics. Our findings reveal that developer-based factors might, in some cases, improve the ability of change prediction models to identify the classes of a software system that are more likely to be changed in the future. Moreover, we observed interesting complementarities among the investigated prediction models. Based on these findings, we devised a novel change prediction model that combines developer-related factors with product and evolution metrics. The empirical evaluation shows that this combined model is up to 20% more effective than the five single models in identifying change-prone classes.
