BERKALA SAINSTEK

Classification of Nutritional Status of Toddlers on Imbalanced Data Using SMOTE with Decision Tree and Random Forest Models
(Klasifikasi Status Gizi Balita pada Data Tidak Seimbang Menggunakan SMOTE dengan Model Decision Tree dan Random Forest)

Firda Fadri

Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Jember, Jl. Kalimantan 37, Jember, Indonesia

Adiliya Itsari Khoirunnisa

Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Jember, Jl. Kalimantan 37, Jember, Indonesia

DOI: https://doi.org/10.19184/bst.v14i1.60005

ABSTRACT

The nutritional status of toddlers is a key indicator for assessing nutritional problems among children. According to the Banyuwangi Regency Health Profile, the prevalence of undernutrition based on the weight-for-height index at the Singotrunan Community Health Center increased from 0.7% in 2022 to 2.3% in 2023. This increase indicates the need for more effective approaches to monitoring and classifying the nutritional status of toddlers. Conventional classification methods have several limitations, including inefficiency and the risk of human error. A machine learning approach is considered more effective for improving classification accuracy and reliability. The dataset used in this research had an imbalanced class distribution: several nutritional status categories contained significantly fewer samples than others. To address this issue, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the class distribution and improve model performance. This research aimed to compare the performance of the Decision Tree and Random Forest algorithms in classifying the nutritional status of toddlers into five categories: undernutrition, normal nutrition, risk of overnutrition, overnutrition, and obesity. The results showed that the Random Forest model performed better than the Decision Tree model. Random Forest achieved an accuracy of 0.92, a precision of 0.88, a recall of 0.87, and an F1-score of 0.87. The Decision Tree model obtained an accuracy of 0.83, a precision of 0.73, a recall of 0.87, and an F1-score of 0.78. These findings indicate that the Random Forest model produced more stable and accurate classification results for identifying the nutritional status of toddlers. The application of SMOTE also helped handle the imbalanced class distribution, which improved the reliability of the classification model. The combination of SMOTE and Random Forest can thus be considered a reliable approach for supporting the early identification of toddlers’ nutritional status.
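
The oversampling step described in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the weight and height values below are hypothetical and chosen only for illustration; a real study would more likely rely on an established library implementation and would typically oversample the training split only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority-class feature vectors.

    Illustrative sketch of the SMOTE interpolation step, not a full implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical [weight_kg, height_cm] rows for an under-represented class.
undernutrition = [[7.1, 72.0], [7.4, 74.5], [6.9, 71.0], [7.8, 76.0]]
majority_size = 10  # assumed size of the largest class

new_points = smote(undernutrition, n_new=majority_size - len(undernutrition), k=3)
balanced_minority = undernutrition + new_points
print(len(balanced_minority))
```

Because every synthetic row is an interpolation between two real rows, it stays within the range of the observed minority values rather than duplicating existing records, which is the main difference from simple random oversampling.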

Keywords: Classification; Nutritional status of toddlers; SMOTE; Decision Tree; Random Forest.

Published

31-03-2026

Issue

Vol. 14 No. 1 2026: BERKALA SAINSTEK

Pages

39-45
