FINE-GRAINED VISUAL ART ANALYSIS VIA YOLOV9 AND K-MEANS

Authors

  • Huafei Ma, Faculty of Architecture Art and Design, Naresuan University, Phitsanulok, Thailand
  • Tatiya Theppituck, Faculty of Architecture Art and Design, Naresuan University, Phitsanulok, Thailand

DOI:

https://doi.org/10.29121/shodhkosh.v7.i7s.2026.8199

Keywords:

Fine-Grained Visual Art, YOLOv9, Feature Quantification, K-Means Clustering, Cultural Heritage Technology

Abstract [English]

This research proposes a universal framework for fine-grained visual art analysis that combines the recent YOLOv9 model with optimized K-means clustering for adaptive feature extraction and objective classification. Aiming to eliminate the subjectivity of conventional visual art analysis, the framework converts qualitative artistic qualities into quantifiable parameters by measuring 12 fundamental morphological features, such as line curvature (σ=0.85), symmetry index (using Hu moments), and contour regularity (using Fourier descriptors). Tested on a diverse set of 1,000 high-resolution images (including jade carvings, ceramic patterns, and mural fragments), the framework partitions samples into five categories: Normative Geometric, Free-Form Geometric, Natural Bionic, Minimalist Line, and Symbolic Pattern. The automated clustering achieves an overall accuracy of 92.7% and a silhouette coefficient of 0.91. Cross-validation by three art historians shows 94.8% consistency with expert classifications, substantially exceeding conventional AI techniques (e.g., ResNet50, with 84.2% accuracy on the same dataset). The work provides a robust technical tool for large-scale digital analysis of visual arts, supporting applications in digital archiving, stylistic evolution research, and cultural heritage preservation.
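The clustering and validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the farthest-point initialization, and the function names (`init_centroids`, `kmeans`, `silhouette_score`) are assumptions chosen for illustration, and the sketch uses only the Python standard library. It runs plain K-means (Lloyd's algorithm) on toy feature vectors and then computes the mean silhouette coefficient as defined by Rousseeuw (1987).

```python
import math


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an illustrative assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def silhouette_score(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so divide by len(own) - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical, already-normalized features: (line curvature, symmetry index).
features = [
    (0.10, 0.90), (0.15, 0.85), (0.20, 0.95),  # e.g. highly symmetric motifs
    (0.80, 0.20), (0.85, 0.10), (0.90, 0.15),  # e.g. free-form motifs
]
labels, centroids = kmeans(features, k=2)
print("labels:", labels)
print("silhouette:", round(silhouette_score(features, labels), 3))
```

A silhouette value close to 1 indicates compact, well-separated clusters, which is how the reported coefficient of 0.91 should be read. In a setting like the one described, each vector would hold the 12 quantified morphological features and `k` would be 5.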

References

Ali, M. L., & Zhang, Z. (2024). The YOLO framework: A comprehensive review of evolution, applications, and benchmarks in object detection. Computers, 13(12), 336. https://doi.org/10.3390/computers13120336

Fang, A. (2024, December 31). Carving out a symbol of virtue: Modern jade craftsmanship. People’s Daily Online. https://en.people.cn/n3/2024/1231/c90000-20260543.html

Florin, G. (2024). An analysis of research trends using artificial intelligence in cultural heritage. Electronics, 13(18), 3738. https://doi.org/10.3390/electronics13183738

Gao, C., Zhang, Q., Tan, Z., Zhao, G., Gao, S., Kim, E., & Shen, T. (2024). Applying optimized YOLOv9 for heritage conservation: Enhanced object detection in Jiangnan traditional private gardens. Heritage Science, 12(1), 31. https://doi.org/10.1186/s40494-024-01144-1

Hu, M. K. (1962). Visual pattern recognition by moment invariants. IRE Transactions on Information Theory, 8(2), 179–187. https://doi.org/10.1109/TIT.1962.1057692

Hu, W., & Pei, C. (2025). Digital presentation of intangible cultural heritage: A case study of Beijing jade carving. Frontiers in Art Research, 7(3), 11–17. https://doi.org/10.25236/FAR.2025.070303

Lam, H. L. E. (2019). Representation of heaven and beyond: The Bi disc imagery in the Han burial context. Asian Studies, 7(2), 115–151. https://doi.org/10.4312/as.2019.7.2.115-151

Li, J., Yao, L., Hendriks, E., & Wang, J. Z. (2011). Rhythmic brushstrokes distinguish van Gogh from his contemporaries: Findings via automated brushstroke extraction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(6), 1159–1176. https://doi.org/10.1109/TPAMI.2011.203

Liang, M., & Tan, C. (2024). Jade and algorithm: Toward digital taxonomy of Chinese ornaments. Digital Applications in Archaeology and Cultural Heritage, 32, e00351. https://doi.org/10.1016/j.daach.2024.e00351

Lim, W. M. (2024). A typology of validity: Content, face, convergent, discriminant, nomological and predictive validity. Journal of Trade Science, 12(3), 155–179. https://doi.org/10.1108/JTS-03-2024-0016

Liu, R., Zhang, H., & Wang, L. (2022). A survey on deep learning in art and cultural heritage. ACM Computing Surveys, 55(9), 1–35. https://doi.org/10.1145/3510426

Lu, Y., & Liu, F. (2023). Evaluating motif density in digital jade carvings. Heritage Science, 11(1), 27. https://doi.org/10.1186/s40494-023-00811-w

Ma, H., & Theppituck, T. (2025). Classification of Chinese traditional jade carvings. International Journal of Sociologies and Anthropologies Science Reviews, 5(6), 657–666. https://doi.org/10.60027/ijsasr.2025.7579

Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7

Sun, S., Dustdar, S., Ranjan, R., Morgan, G., Dong, Y., & Wang, L. (2022). Remote sensing image interpretation with semantic graph-based methods: A survey. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 4544–4558. https://doi.org/10.1109/JSTARS.2022.3176612

Terven, J., Córdova-Esparza, D. M., & Romero-González, J. (2023). A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv9 and YOLO-NAS. Machine Learning and Knowledge Extraction, 5(4), 1680–1716. https://doi.org/10.3390/make5040083

Wang, M., & Shi, G. (2020). The evolution of Chinese jade carving craftsmanship. Gems & Gemology, 56(1), 30–53. https://doi.org/10.5741/GEMS.56.1.30

Westlake, N., Cai, H., & Hall, P. (2016). Detecting people in artwork with CNNs. In European Conference on Computer Vision (pp. 825–841). Springer International Publishing. https://doi.org/10.1007/978-3-319-46604-0_57

Wu, X., Yuan, Q., Qu, P., et al. (2025). Image-driven batik product knowledge graph construction. NPJ Heritage Science, 13, 20. https://doi.org/10.1038/s40494-025-01823-4

Xie, H., Liu, X., Chen, Y., & Sun, F. (2021). Combining CNNs and handcrafted features for ancient character recognition. Information Sciences, 562, 437–449. https://doi.org/10.1016/j.ins.2021.02.059

Yao, S., & Wu, Q. (2023). Neural-based generation of jade motif patterns. Pattern Recognition Letters, 169, 150–159. https://doi.org/10.1016/j.patrec.2023.03.007

Yu, T., Lin, C., Zhang, S., et al. (2022). Artificial intelligence for Dunhuang cultural heritage protection: The project and the dataset. International Journal of Computer Vision, 130, 2646–2673. https://doi.org/10.1007/s11263-022-01665-x

Zafeiropoulos, C., Tzortzis, I. N., Rallis, I., et al. (2021). Evaluating unsupervised clustering in cultural heritage monitoring. Journal on Computing and Cultural Heritage, 14(4), 1–17. https://doi.org/10.1145/3469006

Zhan, J., Meng, Y., Zhang, L., Li, K., & Yan, F. (2025). Research on computer vision in intelligent damage monitoring of heritage conservation: The case of Yungang cave paintings. NPJ Heritage Science, 13, 45. https://doi.org/10.1038/s40494-025-01945-9

Zhang, K., Liu, J., & Ma, Y. (2022). Research on clustering techniques for cultural object classification. Pattern Analysis and Applications, 25(3), 639–654. https://doi.org/10.1007/s10044-022-01011-w

Zhao, L., & Liu, T. (2023). Style transfer and preservation of Chinese jade carvings using neural networks. Journal of Cultural Heritage, 62, 195–204. https://doi.org/10.1016/j.culher.2022.10.005

Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020). Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12993–13000. https://doi.org/10.1609/aaai.v34i07.6999

Published

2026-05-19

How to Cite

Ma, H., & Theppituck, T. (2026). FINE-GRAINED VISUAL ART ANALYSIS VIA YOLOV9 AND K-MEANS. ShodhKosh: Journal of Visual and Performing Arts, 7(7s), 516–536. https://doi.org/10.29121/shodhkosh.v7.i7s.2026.8199