Classification of Nutritional Status of Toddlers on Imbalanced Data Using SMOTE with Decision Tree and Random Forest Models
(Klasifikasi Status Gizi Balita pada Data Tidak Seimbang Menggunakan SMOTE dengan Model Decision Tree dan Random Forest)
Firda Fadri
Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Jember, Jl. Kalimantan 37, Jember, Indonesia
Adiliya Itsari Khoirunnisa
Department of Mathematics, Faculty of Mathematics and Natural Sciences, University of Jember, Jl. Kalimantan 37, Jember, Indonesia
DOI: https://doi.org/10.19184/bst.v14i1.60005
ABSTRACT
The nutritional status of toddlers is a key indicator for assessing nutritional problems among children. According to the Banyuwangi Regency Health Profile, the prevalence of undernutrition based on the weight-for-height index at the Singotrunan Community Health Center increased from 0.7% in 2022 to 2.3% in 2023. This increase indicates the need for more effective approaches to monitoring and classifying the nutritional status of toddlers. Conventional classification methods have several limitations, including inefficiency and the risk of human error. A machine learning approach was therefore considered more effective for improving classification accuracy and reliability. The dataset used in this research showed an imbalanced class distribution, in which several nutritional status categories contained significantly fewer samples than others. To address this issue, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the class distribution and improve model performance. This research aimed to compare the performance of the Decision Tree and Random Forest algorithms in classifying the nutritional status of toddlers into five categories: undernutrition, normal nutrition, risk of overnutrition, overnutrition, and obesity. The results showed that the Random Forest model performed better than the Decision Tree model. Random Forest achieved an accuracy of 0.92, a precision of 0.88, a recall of 0.87, and an F1-score of 0.87, whereas the Decision Tree model obtained an accuracy of 0.83, a precision of 0.73, a recall of 0.87, and an F1-score of 0.78. These findings indicate that the Random Forest model produced more stable and accurate classification results for identifying the nutritional status of toddlers. The application of SMOTE also helped handle the imbalanced class distribution, which improved the reliability of the classification model. The combination of SMOTE and Random Forest can thus be considered a reliable approach for supporting the early identification of toddlers' nutritional status.
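The oversampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which software, neighbour count, or features were used, so the function names `smote` and `balance`, the default `k=5`, and the toy weight and height values below are all assumptions made for illustration. The sketch shows only the core idea of SMOTE, which is to create each synthetic minority sample by interpolating between an existing minority sample and one of its nearest minority-class neighbours.

```python
import math
import random
from collections import Counter


def smote(samples, k=5, n_new=0, seed=0):
    """Return n_new synthetic points built from one class's samples.

    Each synthetic point lies on the line segment between a randomly
    chosen sample and one of its k nearest neighbours in the same class.
    """
    if n_new <= 0:
        return []
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two samples in the class")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the other samples of the class by Euclidean distance to base.
        others = [j for j in range(len(samples)) if j != i]
        others.sort(key=lambda j: math.dist(base, samples[j]))
        neighbour = samples[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        new_points = smote(members, k=k, n_new=target - count, seed=seed)
        X_out.extend(new_points)
        y_out.extend([label] * len(new_points))
    return X_out, y_out


# Hypothetical toy data (weight in kg, height in cm); not from the study.
X = [[10.0, 80.0], [10.5, 82.0], [11.0, 84.0], [11.5, 85.0],
     [12.0, 87.0], [12.5, 88.0], [7.0, 80.0], [7.5, 83.0]]
y = ["normal"] * 6 + ["undernutrition"] * 2

X_bal, y_bal = balance(X, y)
print(Counter(y_bal))  # both classes now have the same number of samples
```

In common practice, oversampling of this kind is applied only to the training split so that synthetic samples do not leak into the evaluation data; whether the study followed that protocol is not stated in the abstract. The balanced training set would then be passed to the Decision Tree and Random Forest classifiers being compared.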
Keywords: Classification; Nutritional status of toddlers; SMOTE; Decision Tree; Random Forest.
Published: 31-03-2026
Issue: Vol. 14 No. 1 2026: BERKALA SAINSTEK
Pages: 39-45
License: Copyright (c) 2026 BERKALA SAINSTEK