West China Journal of Stomatology ›› 2023, Vol. 41 ›› Issue (6): 686-693. doi: 10.7518/hxkq.2023.2023124

• Clinical Research •

Prediction model of dental caries in 12-year-old children in Sichuan Province based on machine learning

Yan Xinmiao1, Sun Taolan1, Lu Yuhang1, Tan Xin1, Wang Zhuo2, Li Miaojing3

    1.School of Public Health, Chengdu Medical College, Chengdu 610500, China
    2.Sichuan Center for Disease Control and Prevention, Chengdu 610500, China
    3.College of Health and Intelligent Engineering, Chengdu Medical College, Chengdu 610500, China
  • Received:2023-04-19 Revised:2023-08-14 Online:2023-12-01 Published:2023-11-27
  • Contact: Li Miaojing E-mail:yanxinmiao@cmc.edu.cn;limiaojing@aliyun.com
  • Supported by:
    Scientific Research Project of Sichuan Provincial Health Commission (20PJ122)

Abstract:

Objective To construct a machine learning prediction model of dental caries in children, identify the risk factors for caries, and propose targeted measures and policy suggestions to improve children’s oral health.
Methods Stratified cluster random sampling was used. In accordance with the different policies and measures in Sichuan Province, 12-year-old students from 3-4 middle schools in eight cities of Sichuan Province were randomly selected for a questionnaire survey, oral examination, and physical examination. Risk factors for dental caries in 12-year-old children were analyzed by multivariate logistic regression. The dataset was randomly divided into a training set and a validation set at a ratio of 7∶3. Four prediction models (random forest, decision tree, extreme gradient boosting (XGBoost), and logistic regression) were built in R version 4.1.1, and their predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC).
Results A total of 4 439 children aged 12 years were included in this study. The prevalence of caries in permanent teeth was 50.93%. Multivariate logistic regression showed that body mass index, the father’s highest educational level, the mother’s highest educational level, whether the child brushed teeth, daily brushing frequency, use of toothpaste when brushing, brushing duration, mouth rinsing after meals, eating before bed after brushing, consumption of sweet drinks, consumption of snacks, dental clinic visits for examination, and the age at which brushing began were factors influencing dental caries in children (P<0.05). The AUC values of the random forest, decision tree, logistic regression, and XGBoost models were 0.840, 0.755, 0.799, and 0.794, respectively. In the random forest model, the variable with the highest contribution was eating before bed after brushing.
Conclusion The random forest-based prediction model of dental caries in children showed good predictive performance. Preventive measures that target the main factors influencing the occurrence of dental caries in children are beneficial.
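
The two evaluation steps named in the Methods (a random 7∶3 split and comparison by AUC) can be illustrated with a minimal sketch. This is not the authors' R code: it is written in plain Python, the function names `split_7_3` and `auc` are hypothetical, the rounding rule at the split point is an assumption, and the labels and scores are made-up toy values rather than study data. AUC is computed here as the proportion of (caries, no-caries) pairs in which the child with caries receives the higher predicted score, with ties counted as one half.

```python
import random


def split_7_3(records, seed=0):
    """Randomly split records into a 70% training set and a 30% validation set.

    The seed and the rounding of the cut point are illustrative assumptions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation set: 1 = caries, 0 = no caries; scores are invented model outputs.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.3]
print(round(auc(toy_labels, toy_scores), 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.755 to 0.840 should be read. The abstract does not report the actual sizes of the training and validation sets.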

Key words: children caries, machine learning, random forest, influencing factor, prediction model
