OBJECT DETECTION IN PHOTOGRAPHY USING DEEP LEARNING

Authors

  • Saniya Khurana, Centre of Research Impact and Outcome, Chitkara University, Rajpura-140417, Punjab, India
  • Mr. Akash Kumar Bhagat, Assistant Professor, Department of Computer Science and IT, Arka Jain University, Jamshedpur, Jharkhand, India
  • Dr. Rajesh Uttam Kanthe, Director, Bharati Vidyapeeth (Deemed to be University) Institute of Management, Kolhapur-416003, India
  • Dipali Kapil Mundada, Department of Engineering, Science and Humanities, Vishwakarma Institute of Technology, Pune, Maharashtra 411037, India
  • Dr. Tanmoy Parida, Associate Professor, Department of Computer Science and Engineering, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India
  • Dr. S. Prayla Shyry, Professor, Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
  • Kumar Ambar Pandey, Assistant Professor, School of Journalism and Mass Communication, Noida International University, 203201, India

DOI:

https://doi.org/10.29121/shodhkosh.v6.i4s.2025.6835

Keywords:

Object Detection, Deep Learning Photography, YOLO/Faster R-CNN, Image Annotation, Detection Architecture

Abstract [English]

Object detection in photography has advanced rapidly with deep learning, changing how visual content is captured, organized, and interpreted. This paper examines current detection systems in detail and how they apply to the photographic workflow. It begins with classical approaches such as HOG, Haar cascades, and SVM-based classifiers, and contrasts their limitations with the advances brought by CNN-based frameworks. The progression from R-CNN to Faster R-CNN is discussed, with emphasis on the resulting gains in region-proposal efficiency and representational richness. The single-shot detectors examined are YOLO, SSD, and RetinaNet, whose high-speed inference makes them suitable for real-time and mobile photography. The study also examines photography-relevant datasets, including COCO, Open Images, and expert-curated collections, together with annotation formats and augmentation strategies that account for the artistic variability, lighting, and composition challenges common to both professional and amateur photography. A new architecture is proposed that combines modern backbones (ResNet, EfficientNet, and Swin Transformer) with flexible detection heads. Loss functions covering robust localization, classification refinement, and IoU variants are combined to optimize performance across varied photographic scenes. Applications show strong results in automated tagging and image organization, real-time detection on both DSLR and mobile systems, and intelligent assistance for creative work, where subject awareness is used to improve composition.
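
The abstract refers to IoU variants in the localization loss without naming them. As an illustration only, the following minimal Python sketch computes plain IoU and one widely used variant, Generalized IoU (GIoU), for axis-aligned boxes. The choice of GIoU, the (x1, y1, x2, y2) box convention, and the function names are assumptions made for this sketch and are not taken from the paper.

```python
from typing import Tuple

# Assumed box convention for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def area(b: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (zero if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union, in [0, 1]."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def giou(a: Box, b: Box) -> float:
    """Generalized IoU, in (-1, 1]: IoU minus the share of the smallest
    enclosing box that is not covered by the union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    hull: Box = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    hull_area = area(hull)
    if hull_area <= 0 or union <= 0:
        return 0.0
    return inter / union - (hull_area - union) / hull_area


def giou_loss(pred: Box, target: Box) -> float:
    """A localization loss of the form 1 - GIoU (0 for a perfect match)."""
    return 1.0 - giou(pred, target)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU still changes with how far apart the boxes are, which is one reason IoU variants are used as regression losses. For example, `giou_loss((0, 0, 2, 2), (0, 0, 2, 2))` is 0, while two disjoint boxes give a loss greater than 1.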

Published

2025-12-25

How to Cite

Khurana, S., Bhagat, A. K., Kanthe, R., Mundada, D. K., Parida, T., Shyry, S., & Pandey, K. A. (2025). OBJECT DETECTION IN PHOTOGRAPHY USING DEEP LEARNING. ShodhKosh: Journal of Visual and Performing Arts, 6(4s), 432–441. https://doi.org/10.29121/shodhkosh.v6.i4s.2025.6835