Automated patch assessment for program repair at scale

Abstract: In this paper, we perform automatic correctness assessment of patches generated by program repair systems. We consider the human-written patch as the ground-truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al. that we call Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, and we evaluate automated patch assessment on this dataset. The results of this study are novel and significant. First, we improve the state-of-the-art performance of automatic patch assessment with RGT by 190% by improving the oracle. Second, we show that RGT is reliable enough to help scientists perform overfitting analysis when they evaluate program repair systems. Third, we improve the external validity of program repair knowledge with the largest study of its kind to date.
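
To make the RGT idea concrete, below is a minimal, self-contained sketch of the assessment loop. It is not the paper's implementation: the toy absolute-value bug and the names human_patch, generated_patch, and rgt_assess are invented for illustration, and the actual study runs automatically generated Java test suites against Defects4J patches rather than sampling integer inputs.

    import random

    # Toy sketch of RGT (Random testing with Ground Truth) patch assessment.
    # All names and the example bug are hypothetical stand-ins; the paper
    # applies this idea to Java programs from Defects4J.

    def human_patch(x: int) -> int:
        """Human-written patch: the ground-truth oracle (correct absolute value)."""
        return x if x >= 0 else -x

    def generated_patch(x: int) -> int:
        """Machine-generated patch: passes the project's original tests
        (which, say, only exercised negative inputs) but overfits them."""
        return -x

    def rgt_assess(candidate, oracle, trials: int = 1000) -> bool:
        """Return True if the candidate agrees with the ground-truth oracle
        on randomly generated inputs; any disagreement classifies the
        candidate patch as overfitting (incorrect)."""
        for _ in range(trials):
            x = random.randint(-100, 100)
            if candidate(x) != oracle(x):
                print(f"discrepancy on input {x}: got {candidate(x)}, expected {oracle(x)}")
                return False
        return True

    if __name__ == "__main__":
        verdict = rgt_assess(generated_patch, human_patch)
        print("correct" if verdict else "overfitting")

The point the sketch captures is that no formal specification is needed: the human patch's output on each random input serves as the expected value, so any behavioral difference is evidence that the generated patch overfits the original test suite.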
Document type: Journal articles

https://hal-uphf.archives-ouvertes.fr/hal-03396298
Contributor: Kathleen Torck
Submitted on: Friday, October 22, 2021 - 4:31:01 PM
Last modification on: Wednesday, October 27, 2021 - 1:16:08 PM

Citation

He Ye, Matias Martinez, Martin Monperrus. Automated patch assessment for program repair at scale. Empirical Software Engineering, Springer Verlag, 2021, 26 (2), ⟨10.1007/s10664-020-09920-w⟩. ⟨hal-03396298⟩
