Red wine quality matters to both consumers and producers because it shapes purchasing decisions and guides product improvement. This study aims to improve red wine quality prediction through feature selection and model optimization. Feature engineering is used to construct new features and assess each feature's contribution, which identifies the best-performing feature combinations. Candidate models are screened with a five-metric evaluation framework: accuracy, precision, recall, F1 score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The selected feature combinations are then integrated with the best-performing model, and performance before and after feature selection is compared through cross-validation, with a focus on stability and generalization. The findings show that the Random Forest model, when combined with feature selection, outperforms models trained on the original features in both generalization and stability. Key features such as alcohol content and free sulphur dioxide significantly improve prediction accuracy. However, newly constructed features do not always improve model performance and may introduce noise. These results offer practical guidance for wine production and quality control and underscore the importance of careful feature selection in predictive modeling.
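To make the five evaluation measures named above concrete, the following is a minimal sketch of how they could be computed for a single model. It is not the study's implementation: it assumes wine quality has been reduced to a binary label (1 = good, 0 = not good), that the model outputs a score in [0, 1], and that a 0.5 decision threshold is used. The function name `five_metric_report` and the toy data are hypothetical.

```python
from typing import Dict, Sequence


def five_metric_report(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, not taken from the study
) -> Dict[str, float]:
    """Return accuracy, precision, recall, F1 and AUC-ROC for binary labels."""
    # Turn scores into hard predictions for the threshold-based metrics.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # AUC-ROC via its rank interpretation: the probability that a randomly
    # chosen positive is scored above a randomly chosen negative (ties = 0.5).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
        )
        auc_roc = wins / (len(pos) * len(neg))
    else:
        auc_roc = float("nan")  # undefined when only one class is present

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc_roc": auc_roc,
    }


# Toy labels and scores, invented purely for illustration.
toy_true = [1, 0, 1, 1, 0, 0]
toy_score = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
toy_report = five_metric_report(toy_true, toy_score)
```

In a cross-validation setting, a report like this would be computed once per fold, and the spread of each metric across folds gives a rough indication of stability. Note that the pairwise AUC computation is quadratic in the number of samples, which is acceptable for a sketch but not for large datasets.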
Research Article
Open Access