Combining YOLO and Scikit-Learn Improves Real-World Audience Classification

Tri Luu; Thinh Tong; Tuong Dang; Vuong Pham; Minh Phan

DOI: https://doi.org/10.59232/AIR-V3I2P101

Research Article | Open Access

Volume 3 | Issue 2 | Year 2025 | Article Id: AIR-V3I2P101

Received: 02 Apr 2025 | Revised: 04 May 2025 | Accepted: 06 Jun 2025 | Published: 30 Jun 2025

Citation

Tri Luu, Thinh Tong, Tuong Dang, Vuong Pham, Minh Phan. “Combining YOLO and Scikit-Learn Improves Real-World Audience Classification.” DS Journal of Artificial Intelligence and Robotics, vol. 3, no. 2, pp. 1-25, 2025.

Abstract

This study addresses automatic object classification by combining the strengths of deep learning and traditional machine learning. The goal is a prototype application that recognizes and classifies objects in images efficiently and accurately. The YOLOv10 model is used for object detection, and features such as bounding-box size [3] and average color are then extracted from each detection. If an image is of poor quality or YOLOv10 fails to detect any object, PCA is applied to enhance image quality. The extracted features are used to train a Random Forest classifier from the scikit-learn library, whose hyperparameters are tuned with GridSearchCV [2] and whose performance is evaluated with StratifiedKFold [5] cross-validation. The combined YOLO + Random Forest system achieved an overall accuracy of 93%, with a higher average Precision and F1-score than YOLOv10 alone. The combined model significantly improves the classification of glass and organic objects, even though few samples of these two classes are available. The study concludes that combining YOLOv10 with Random Forest is an effective approach to building an automated object classification system: it pairs the detection speed of deep learning with the feature-based classification of traditional machine learning, contributing to intelligent object management.
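The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature layout (width, height, area, mean color), the `(x1, y1, x2, y2)` box format, the synthetic two-class data, and the hyperparameter grid are all assumptions chosen for illustration, and the detector itself is replaced by randomly generated boxes so the example runs without YOLOv10 weights.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def box_features(image, box):
    """Return [width, height, area, mean_ch0, mean_ch1, mean_ch2] for one box.

    `image` is an H x W x 3 array; `box` is (x1, y1, x2, y2) in pixels.
    This feature layout is an assumption, not taken from the paper.
    """
    x1, y1, x2, y2 = (int(v) for v in box)
    crop = image[y1:y2, x1:x2]
    width, height = x2 - x1, y2 - y1
    mean_color = crop.reshape(-1, 3).mean(axis=0)
    return np.array([width, height, width * height, *mean_color], dtype=float)


rng = np.random.default_rng(0)


def synthetic_sample(label):
    """Hypothetical stand-in for a detection: each class has its own base color."""
    base = np.array([40, 80, 200]) if label == 0 else np.array([60, 180, 60])
    image = np.clip(base + rng.normal(0, 25, size=(64, 64, 3)), 0, 255)
    x1, y1 = rng.integers(0, 20, size=2)
    w, h = rng.integers(10, 40, size=2)
    return box_features(image, (x1, y1, x1 + w, y1 + h))


# Balanced toy dataset: 30 samples per class, 6 features per sample.
labels = np.array([0, 1] * 30)
X = np.stack([synthetic_sample(y) for y in labels])

# Tune the Random Forest with a small, illustrative grid; each candidate is
# scored with stratified 5-fold cross-validation so class ratios are kept.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [None, 5]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
search.fit(X, labels)
print(search.best_params_, round(search.best_score_, 3))
```

In a real setting, the boxes passed to `box_features` would come from the detector's output for each image rather than from `synthetic_sample`, and stratified folds matter most when some classes have few samples, as the abstract notes for glass and organic objects.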

Keywords

Feature extraction, Image processing, Object classification, Random forest, YOLOv10.

References

[1] GeeksforGeeks, Random Forest Algorithm in Machine Learning, 2024. [Online]. Available: https://www.geeksforgeeks.org/random-forest-algorithm-in-machine-learning/

[2] Mark Everingham et al., “The Pascal Visual Object Classes (VOC) Challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010.

[3] Principal Component Analysis (PCA), GeeksforGeeks, 2025. [Online]. Available: https://www.geeksforgeeks.org/principal-component-analysis-pca/

[4] Ankan Ghosh, YOLOv10: The Dual-Head OG of YOLO Series, 2024. [Online]. Available: https://learnopencv.com/yolov10/

[5] Rafael Gonzalez, and Richard Woods, Digital Image Processing, 4th ed., Pearson, New York, 2017.

[6] Histograms - 2: Histogram Equalization, OpenCV Documentation, 2025. [Online]. Available: https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html

[7] P. Potrimba, What is YOLOv10? An Architecture Deep Dive, Roboflow Blog, 2024. [Online]. Available: https://blog.roboflow.com/what-is-yolov10/

[8] GridSearchCV, Scikit-learn, 2025. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

[9] Principal Component Analysis (PCA), Scikit-learn, 2025. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

[10] StratifiedKFold, Scikit-learn, 2025. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html

[11] Yolov 10, Scribd, 2025. [Online]. Available: https://www.scribd.com/document/738483444/Yolov10

[12] Ao Wang et al., “YOLOv10: Real-Time End-to-End Object Detection,” arXiv Preprint, pp. 1-21, 2024.

[13] Matthew D. Zeiler, and Rob Fergus, “Visualizing and Understanding Convolutional Networks,” Computer Vision - ECCV 2014, Zurich, Switzerland, pp. 818-833, 2014.

[14] YOLOv10: Real-time Endpoint Object Recognition, Ultralytics, 2024. [Online]. Available: https://docs.ultralytics.com/vi/models/yolov10/