Machine Learning
Obesity Classification Using Machine Learning
This project involved building a multi-class classification pipeline to predict obesity levels from
a dataset of 2,111 individuals. The workflow covered exploratory data analysis, statistical feature
selection, one-hot encoding of categorical variables, and comparison of five ML algorithms.
The final model, an optimized Random Forest classifier, achieved 95.3% accuracy on the holdout test
set. Key engineering decisions included handling class imbalance with SMOTE and tuning
hyperparameters via GridSearchCV.
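
The encoding and tuning steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`age`, `weight_kg`, `height_m`, `transport`, `level`) and the parameter grid are hypothetical, and the SMOTE step is left out so the sketch depends only on scikit-learn. With imbalanced-learn installed, a `SMOTE` sampler would typically sit between preprocessing and the model inside an `imblearn.pipeline.Pipeline`, so that resampling is applied only to the training folds.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data; column names and values are hypothetical,
# not the project's 2,111-row dataset.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age": rng.integers(18, 60, n),
    "weight_kg": rng.normal(80, 15, n),
    "height_m": rng.normal(1.7, 0.1, n),
    "transport": rng.choice(["walking", "car", "public"], n),
})
# Derive a three-class label from BMI purely so the example has a target.
bmi = df["weight_kg"] / df["height_m"] ** 2
df["level"] = pd.cut(
    bmi, bins=[-np.inf, 25, 30, np.inf],
    labels=["normal", "overweight", "obese"],
).astype(str)

X = df.drop(columns="level")
y = df["level"]
# Stratify so the holdout set keeps the class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# One-hot encode the categorical column; pass numeric columns through.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["transport"])],
    remainder="passthrough",
)
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(random_state=0)),
])

# Small illustrative grid; the project's actual grid is not stated.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [None, 5],
}
search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Accuracy of the best refit pipeline on the holdout set.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Wrapping the encoder and the classifier in one pipeline means each cross-validation fold fits the encoder on its own training portion, which avoids leaking information from validation rows into preprocessing.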
Python
Pandas
Scikit-learn
Matplotlib
SMOTE
Random Forest