
Valuing free-form text data from maintenance logs through transfer learning with CamemBERT

Abstract : Coupling a production scheduling process with maintenance logs offers important advantages; for instance, it allows planning to be adapted to the reality of the shop floor. However, maintenance logs are often highly unstructured, as they rely mainly on free-form text comments from operators, and imbalanced, as commonplace issues occur more often than critical problems. This hinders the application of machine learning methods to exploit these data. This study therefore explores the use of a recent model named CamemBERT to tackle these difficulties through transfer learning. More specifically, the aim is to predict the criticality and duration of a maintenance issue from the description provided. Findings suggest that fine-tuning CamemBERT outperforms classical and feature-based approaches. Furthermore, the class imbalance problem is addressed from both a data pre-processing and a training perspective: first, k-means with silhouette diagrams allowed the creation of more homogeneous classes; second, resampling improved the model's performance.
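
The abstract mentions resampling as a way to counter class imbalance but does not say which variant was used. As an illustration only, the sketch below assumes plain random oversampling (duplicating minority-class examples until all classes are the same size). The function name `random_oversample` and the toy log entries are hypothetical and are not taken from the article or its dataset.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class has as many examples as the largest class.

    Assumption: this is simple random oversampling, chosen for
    illustration; the article may rely on a different scheme.
    """
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy maintenance comments (not from the paper's data).
texts = ["fuite huile", "capteur HS", "nettoyage", "graissage", "controle"]
labels = ["critical", "critical", "routine", "routine", "routine"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind would normally be applied to the training split only, so that duplicated comments do not leak into the evaluation data.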
Document type : Journal articles
Contributor : Kathleen TORCK
Submitted on : Friday, November 12, 2021 - 3:34:03 PM
Last modification on : Saturday, November 13, 2021 - 3:53:18 AM




Juan Pablo Usuga Cadavid, Bernard Grabot, Samir Lamouri, Robert Pellerin. Valuing free-form text data from maintenance logs through transfer learning with CamemBERT. Enterprise Information Systems, Taylor & Francis, 2020, pp.1-29. ⟨10.1080/17517575.2020.1790043⟩. ⟨hal-03426807⟩


