This is one of my earlier machine learning projects. The Wine Quality Dataset encompasses a rich collection of features related to the chemical composition of wines, offering a real-world context for honing data analysis and modeling skills. In this portfolio section, I share the story of dissecting this dataset, conducting thorough exploratory data analysis, and developing machine learning models to predict wine quality. I collected this dataset from Kaggle, found here.
The primary goal of my portfolio is to identify the wine quality using features like fixed acidity, volatile acidity, citric acidity, pH level and others. I used this project to enhance my skills as a data analyst/scientist and it was a great learning curve.
To predict, I have used various ML algorithms including, Gradient Boosting Classifier, Random Forest Classifier, K Neighbors Classifier. I also used the test train split function to split the dataset to two halves, where 20 percent of the data set is test set.
Following are the results of the models:
Random Forest Classifier:
Accuracy Score: 90 percent
Weighted Average: 0.89
Gradient Boosting Classifier:
Accuracy Score: 88 percent
Weighted Average: 0.87
K Nearest Neighbors:
Accuracy Score: 88 percent
Weighted Average: 0.87