Journal article in Empirical Software Engineering, 2021

Automated patch assessment for program repair at scale

Abstract

In this paper, we perform automated correctness assessment of patches generated by program repair systems. We consider the human-written patch as the ground-truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al. and called Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, and we evaluate automated patch assessment on this dataset. The results of this study are novel and significant. First, by improving the oracle, we improve the state-of-the-art performance of automated patch assessment with RGT by 190%. Second, we show that RGT is reliable enough to help scientists perform overfitting analysis when evaluating program repair systems. Third, we improve the external validity of program repair knowledge with the largest study of its kind to date.
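To illustrate the idea behind RGT, here is a minimal, purely illustrative sketch (not the paper's actual tooling, which generates tests for Java programs in Defects4J): the human-written patch serves as the ground-truth oracle, random inputs are sampled, and any behavioral disagreement with a machine-generated patch marks it as overfitting. All function names (`human_patch`, `machine_patch`, `rgt_assess`) are hypothetical.

```python
import random

# Hypothetical example: the specification is absolute value.
def human_patch(x):
    # Ground-truth behavior (the human-written patch is the oracle).
    return x if x >= 0 else -x

def machine_patch(x):
    # Candidate patch from a repair tool; overfits: wrong for x < -100.
    return x if x >= 0 else (-x if x >= -100 else x)

def rgt_assess(oracle, candidate, trials=1000, seed=0):
    """Random testing with Ground Truth (sketch): sample random inputs and
    compare the candidate's output against the oracle's. Any disagreement
    classifies the candidate patch as incorrect/overfitting."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randint(-10_000, 10_000)
        if candidate(x) != oracle(x):
            return "incorrect"
    return "plausibly correct"

print(rgt_assess(human_patch, machine_patch))
```

As in the paper's setting, absence of a detected difference only means the patch is plausibly correct: random testing can expose overfitting but cannot prove correctness.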
Main file: 1909.13694.pdf (1.06 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03396298, version 1 (25-04-2022)

Identifiers

Cite

He Ye, Matias Martinez, Martin Monperrus. Automated patch assessment for program repair at scale. Empirical Software Engineering, 2021, 26 (2), pp.20. ⟨10.1007/s10664-020-09920-w⟩. ⟨hal-03396298⟩