OBJECT DETECTION IN PHOTOGRAPHY USING DEEP LEARNING
DOI:
https://doi.org/10.29121/shodhkosh.v6.i4s.2025.6835
Keywords:
Object Detection, Deep Learning Photography, YOLO/Faster R-CNN, Image Annotation, Detection Architecture
Abstract [English]
Object detection in photography has advanced rapidly with deep learning, changing how visual content is captured, organized, and interpreted. This paper examines current detection systems and how they apply to the photographic workflow. It begins with classical approaches such as HOG features, Haar cascades, and SVM-based classifiers, and contrasts their limitations with the progress made by CNN-based frameworks. The progression from R-CNN to Faster R-CNN is discussed, with attention to how each step improved region-proposal efficiency and representational richness. The single-shot detectors examined are YOLO, SSD, and RetinaNet; their high-speed inference makes them suitable for real-time and mobile photography. The study also reviews datasets relevant to photography, including COCO, Open Images, and expert-curated collections, together with annotation formats and augmentation strategies that account for the artistic variability, lighting, and composition challenges common to both professional and amateur photography. A new architecture is proposed that pairs modern backbones (ResNet, EfficientNet, and Swin Transformer) with flexible detection heads. Loss functions covering robust localization, classification refinement, and IoU variants are combined to optimize performance across varied photographic scenes. Applications show strong results in automated tagging and image organization, real-time detection on DSLR and mobile systems, and intelligent assistance for creative work, where subject awareness helps improve composition.
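The abstract does not say which IoU variants the loss combines. As an illustration only, the sketch below assumes Generalized IoU (GIoU), one widely used variant, and computes it for axis-aligned boxes given as (x1, y1, x2, y2). The function names `area`, `iou`, `giou`, and `giou_loss` are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2); assumed convention: x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; an inverted (empty) box counts as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def _intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (zero if they are disjoint)."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def iou(a: Box, b: Box) -> float:
    """Plain intersection over union, in [0, 1]."""
    inter = _intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def giou(a: Box, b: Box) -> float:
    """Generalized IoU, in (-1, 1].

    Subtracts from IoU the fraction of the smallest enclosing box that is
    not covered by the union, so disjoint boxes still get a useful signal.
    """
    inter = _intersection(a, b)
    union = area(a) + area(b) - inter
    if union <= 0:
        return 0.0
    hull = area((min(a[0], b[0]), min(a[1], b[1]),
                 max(a[2], b[2]), max(a[3], b[3])))
    return inter / union - (hull - union) / hull


def giou_loss(pred: Box, target: Box) -> float:
    """Localization loss: 0 for a perfect match, larger for worse boxes."""
    return 1.0 - giou(pred, target)


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print("IoU:", iou(pred, target))
    print("GIoU:", giou(pred, target))
    print("GIoU loss:", giou_loss(pred, target))
```

Plain IoU is zero for every pair of non-overlapping boxes, so it cannot tell a near miss from a distant one. The enclosing-box term in GIoU keeps the loss informative in that case, which is one reason such variants are used for localization.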
License
Copyright (c) 2025 Saniya Khurana, Mr. Akash Kumar Bhagat, Dr. Rajesh Uttam Kanthe, Dipali Kapil Mundada, Dr. Tanmoy Parida, Dr. S.Prayla Shyry, Kumar Ambar Pandey

This work is licensed under a Creative Commons Attribution 4.0 International License.
With the licence CC-BY, authors retain the copyright, allowing anyone to download, reuse, re-print, modify, distribute, and/or copy their contribution. The work must be properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.























