OpenHub Repository

Clustering and classifying global food insecurity index and crop production using machine learning algorithms

dc.contributor.author Pieterse, Jaden
dc.date.accessioned 2025-08-11T12:42:07Z
dc.date.available 2025-08-11T12:42:07Z
dc.date.issued 2024
dc.identifier.uri http://hdl.handle.net/20.500.12821/578
dc.description Food production efficiency, scientific principles en_US
dc.description.abstract Food insecurity still affects millions of people worldwide, requiring researchers to identify and address its underlying factors. Despite numerous mitigation efforts, significant gaps remain in understanding how socioeconomic and climatic factors affect the production of major crops such as maize, which in turn affects food insecurity. Numerous statistical and machine learning approaches have been applied to the problem; however, some of them cannot accurately and robustly model the underlying structure of the data. For example, while machine learning approaches may produce accurate predictions, they are less interpretable than traditional statistical models, so there is a need to identify alternative, more interpretable models that still produce accurate and robust predictions. The study therefore compares the performance of the K-Means algorithm and the Gaussian Mixture Model (GMM) for clustering, and of the K-Nearest Neighbor (KNN) and Random Forest (RF) algorithms for classifying global maize production. K-Means and GMM cluster countries based on maize production and food insecurity and are evaluated with the Silhouette Coefficient, Dunn’s Index, and the Davies-Bouldin Index, while KNN and RF classify production categories and are assessed with accuracy, precision, recall, F1-score, ROC, and AUC. Features such as agricultural land, food insecurity level, population, and climatic factors including CO2 emissions, temperature, and precipitation were collected from multiple online datasets and databases for the year 2022. The findings indicate that K-Means outperformed GMM and that RF produced better results than KNN in predicting maize production categories.
Furthermore, of the two distinct country clusters identified, cluster one has higher maize yields and lower food insecurity, while cluster two has lower maize yields and higher food insecurity. These results might inform policy interventions to mitigate climate impacts and suggest sustainable agricultural practices in high-risk regions such as Sub-Saharan Africa, Southern Asia, and Central America. en_US
dc.language.iso en en_US
dc.publisher Sol Plaatje University en_US
dc.subject Food insecurity level en_US
dc.subject Agricultural yield en_US
dc.subject Maize production en_US
dc.subject Clustering en_US
dc.subject K-Means en_US
dc.subject Gaussian mixture model en_US
dc.subject Random forest en_US
dc.subject K-nearest neighbor en_US
dc.subject Agricultural food production, technological innovations en_US
dc.subject Agricultural food production, computer applications en_US
dc.subject Computer modelling software, maize production en_US
dc.subject Food production efficiency, technological Innovation agriculture en_US
dc.subject Food production efficiency, computer applications en_US
dc.title Clustering and classifying global food insecurity index and crop production using machine learning algorithms en_US
dc.type Thesis en_US
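
Illustrative note: the comparison described in the abstract (K-Means versus GMM for clustering, KNN versus RF for classification) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the thesis code: it uses scikit-learn, a synthetic dataset from `make_blobs` standing in for the country-level features, and hypothetical helper names `compare_clusterers` and `compare_classifiers`. Dunn's Index is omitted because scikit-learn does not provide it, and the hyperparameters are arbitrary defaults rather than values reported in the thesis.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, davies_bouldin_score,
                             f1_score, silhouette_score)
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the country-level features (assumption: NOT the thesis data).
X, y = make_blobs(n_samples=200, centers=2, n_features=6, random_state=0)


def compare_clusterers(X, n_clusters=2, seed=0):
    """Fit K-Means and a GMM on scaled features and score both partitions."""
    X_scaled = StandardScaler().fit_transform(X)
    labels = {
        "kmeans": KMeans(n_clusters=n_clusters, n_init=10,
                         random_state=seed).fit_predict(X_scaled),
        "gmm": GaussianMixture(n_components=n_clusters,
                               random_state=seed).fit_predict(X_scaled),
    }
    return {
        name: {
            # Higher silhouette is better; lower Davies-Bouldin is better.
            "silhouette": silhouette_score(X_scaled, lab),
            "davies_bouldin": davies_bouldin_score(X_scaled, lab),
        }
        for name, lab in labels.items()
    }


def compare_classifiers(X, y, seed=0):
    """Train KNN and RF on one stratified split and score them on the held-out part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    models = {
        # KNN is distance-based, so it is scaled inside a pipeline.
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "rf": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        pred = model.fit(X_train, y_train).predict(X_test)
        scores[name] = {"accuracy": accuracy_score(y_test, pred),
                        "f1": f1_score(y_test, pred)}
    return scores


cluster_scores = compare_clusterers(X)
classifier_scores = compare_classifiers(X, y)
print(cluster_scores)
print(classifier_scores)
```

On real data the ranking of the methods depends on the features and preprocessing, so the scores printed here say nothing about the thesis results; the sketch only shows how such a comparison might be set up.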

