
Automated patch assessment for program repair at scale

Abstract: In this paper, we perform automated correctness assessment of patches generated by program repair systems. We take the human-written patch as the ground-truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al. that we call Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, and we evaluate automated patch assessment on this dataset. The results of this study are novel and significant: first, by improving the oracle, we improve the state-of-the-art performance of automated patch assessment with RGT by 190%; second, we show that RGT is reliable enough to help researchers perform overfitting analysis when they evaluate program repair systems; third, we improve the external validity of program repair knowledge with the largest study of its kind to date.
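The core RGT idea from the abstract — sample random inputs, record the outputs of the human-written (ground-truth) patch as expected values, then run the resulting tests against a machine-generated patch — can be sketched as follows. This is a minimal illustration, not the paper's actual tooling: real RGT uses a test generator such as EvoSuite on Java programs, whereas here a "program" is simply a Python function and all names are hypothetical.

```python
import random

def human_patch(x):
    # Ground-truth oracle: the human-written fix (toy example: absolute value).
    return abs(x)

def machine_patch(x):
    # A machine-generated patch that passes the original failing input
    # (x == -1) but overfits: it is wrong on other negative inputs.
    return 1 if x == -1 else x

def generate_random_tests(oracle, n=100, seed=0):
    # RGT step 1: sample random inputs and record the oracle's output
    # for each one, yielding (input, expected-output) test cases.
    rng = random.Random(seed)
    return [(i, oracle(i)) for i in (rng.randint(-1000, 1000) for _ in range(n))]

def assess(candidate, tests):
    # RGT step 2: a candidate patch is labeled overfitting if any
    # ground-truth-derived test fails; otherwise it is deemed correct
    # with respect to the sampled behavior.
    for inp, expected in tests:
        if candidate(inp) != expected:
            return "overfitting"
    return "correct"

tests = generate_random_tests(human_patch)
print(assess(machine_patch, tests))  # overfitting
print(assess(human_patch, tests))    # correct
```

Note the asymmetry the paper's reliability analysis hinges on: a failing RGT test proves a behavioral difference from the ground truth, while an all-pass verdict only shows agreement on the sampled inputs.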
Document type :
Journal articles

https://hal-uphf.archives-ouvertes.fr/hal-03396298
Contributor: Kathleen Torck
Submitted on: Monday, April 25, 2022 - 11:00:43 AM
Last modification on: Tuesday, April 26, 2022 - 6:22:21 PM
Long-term archiving on: Tuesday, July 26, 2022 - 6:47:20 PM

File

1909.13694.pdf
Files produced by the author(s)

Citation

He Ye, Matias Martinez, Martin Monperrus. Automated patch assessment for program repair at scale. Empirical Software Engineering, Springer Verlag, 2021, 26 (2), pp.20. ⟨10.1007/s10664-020-09920-w⟩. ⟨hal-03396298⟩
