FINE-GRAINED VISUAL ART ANALYSIS VIA YOLOV9 AND K-MEANS
DOI: https://doi.org/10.29121/shodhkosh.v7.i7s.2026.8199
Keywords: Fine-Grained Visual Art, YOLOv9, Feature Quantification, K-Means Clustering, Cultural Heritage Technology
Abstract [English]
This research proposes a universal framework for fine-grained visual art analysis that uses the YOLOv9 model and optimized K-means clustering for adaptive feature extraction and objective classification. With the aim of eliminating subjectivity in conventional visual art analysis, the framework converts qualitative artistic qualities into measurable parameters by quantifying 12 fundamental morphological features, such as line curvature (σ=0.85), symmetry index (computed from Hu moments), and contour regularity (computed from Fourier descriptors). Tested on a varied set of 1,000 high-resolution images (including jade carvings, ceramic patterns, and mural fragments), the framework partitions the samples into five categories: Normative Geometric, Free-Form Geometric, Natural Bionic, Minimalist Line, and Symbolic Pattern. It achieves automated clustering with an overall accuracy of 92.7% and a silhouette coefficient of 0.91. Cross-validation by three art historians shows 94.8% consistency with expert classifications. These results substantially exceed those of conventional AI techniques (e.g., ResNet50, with 84.2% accuracy on the same dataset). The work thus provides a robust technical tool for large-scale digital analysis of visual arts, supporting applications in digital archiving, stylistic evolution research, and cultural heritage preservation.
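The clustering and validation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the K-means procedure was optimized, so the code below is only an assumption-laden illustration: it uses plain Lloyd iterations with a deterministic farthest-point initialization (an illustrative choice, not the authors' method), and it scores the result with the standard silhouette coefficient. The two-dimensional feature vectors are hypothetical stand-ins for quantified morphological features, and all function names are invented for this example.

```python
import math

def dist(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def init_centroids(points, k):
    # Assumption: deterministic farthest-point seeding, chosen only so the
    # example is reproducible; the paper's "optimized" variant is unspecified.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = mean(members)
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for m, other in clusters.items() if m != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical (curvature, symmetry index) pairs forming three loose groups.
points = [
    (0.10, 0.90), (0.12, 0.88), (0.11, 0.91),
    (0.80, 0.20), (0.82, 0.18), (0.79, 0.22),
    (0.50, 0.55), (0.52, 0.53), (0.49, 0.56),
]
labels, centroids = kmeans(points, k=3)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

On this toy data the three groups are well separated, so the silhouette coefficient comes out close to 1. In practice the features would likely need scaling before clustering, since Euclidean distance is sensitive to differing feature ranges.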
License
Copyright (c) 2026 Huafei Ma, Tatiya Theppituck

This work is licensed under a Creative Commons Attribution 4.0 International License.
With the licence CC-BY, authors retain the copyright, allowing anyone to download, reuse, re-print, modify, distribute, and/or copy their contribution. The work must be properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.