Automated patch assessment for program repair at scale

Abstract: In this paper, we perform automated correctness assessment of patches generated by program repair systems. We consider the human-written patch as the ground-truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al. and called Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, and we evaluate automated patch assessment on this dataset. The results of this study are novel and significant. First, we improve the state-of-the-art performance of automated patch assessment with RGT by 190% by improving the oracle. Second, we show that RGT is reliable enough to help scientists perform overfitting analysis when they evaluate program repair systems. Third, we improve the external validity of program repair knowledge with the largest study of its kind to date.
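The core idea of RGT described above can be sketched in a few lines: the human-written patch acts as an oracle, randomly generated inputs are fed to both patches, and any behavioral difference flags the machine-generated patch as overfitting. The following is a minimal illustrative sketch, not the paper's implementation; the function and variable names (`rgt_assess`, `input_gen`, the toy `abs` repair) are hypothetical, and real RGT uses generated test suites (e.g. from tools like EvoSuite or Randoop) rather than plain random inputs.

```python
import random

def rgt_assess(machine_patch, human_patch, input_gen, n_tests=1000, seed=0):
    """Random testing with Ground Truth (RGT), minimal sketch:
    the human-written patch serves as the oracle; randomly generated
    inputs probe for behavioral differences in the machine patch."""
    rng = random.Random(seed)
    for _ in range(n_tests):
        x = input_gen(rng)
        if machine_patch(x) != human_patch(x):
            return "overfitting"    # behavior diverges from the ground truth
    return "plausibly correct"      # no divergence observed (not a proof)

# Toy example: a buggy absolute-value function "repaired" two ways.
human = abs                               # ground-truth (human) patch
overfit = lambda x: x if x != -1 else 1   # only handles the one failing input
gen = lambda rng: rng.randint(-100, 100)

print(rgt_assess(overfit, human, gen))    # "overfitting"
print(rgt_assess(human, human, gen))      # "plausibly correct"
```

Note that a "plausibly correct" verdict is only as strong as the generated tests: RGT can prove a patch overfitting by exhibiting a diverging input, but it cannot prove correctness, which is why the paper treats it as an assessment aid rather than a definitive oracle.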
Document type :
Journal articles
Contributor: Kathleen Torck
Submitted on : Monday, April 25, 2022 - 11:00:43 AM
Last modification on : Tuesday, April 26, 2022 - 6:22:21 PM
Long-term archiving on: Tuesday, July 26, 2022 - 6:47:20 PM
He Ye, Matias Martinez, Martin Monperrus. Automated patch assessment for program repair at scale. Empirical Software Engineering, Springer Verlag, 2021, 26 (2), pp.20. ⟨10.1007/s10664-020-09920-w⟩. ⟨hal-03396298⟩