DEEP LEARNING APPROACHES TO EMOTION RECOGNITION IN PHOTOGRAPHIC IMAGES

Authors

  • Xma R. Pote, Assistant Professor, Department of Electrical Engineering, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, India
  • Dr. Mahaveerakannan R, Professor, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, India
  • Dr. Priscilla Joy, Assistant Professor, Division of CSE, Karunya Institute of Technology and Sciences, Coimbatore – 641035, India
  • Sheeba Santhosh, Assistant Professor Grade 1, Department of ECE, Panimalar Engineering College, Chennai, Tamil Nadu, India
  • Dr. Narina Thakur, Assistant Professor, Computer Science & Software Engineering, University of Stirling, RAK Campus, United Arab Emirates
  • M. Vignesh, Assistant Professor, Department of Artificial Intelligence and Data Science, Karpagam Institute of Technology, Coimbatore, Tamil Nadu, India

DOI:

https://doi.org/10.29121/shodhkosh.v6.i5s.2025.6974

Keywords:

Photo Emotion Recognition, Affective Computing, CNN, Vision Transformer, Feature Fusion, Macro-F1, Calibration, Explainable AI

Abstract [English]

Photo Emotion Recognition (PER) aims to infer the emotion expressed or evoked by an image from visual cues such as color harmony, composition, object-scene semantics, and, when present, human expressions. Unlike face-centric affect analysis, PER must handle emotions that often arise from situational semantics and aesthetics rather than explicit facial expression, which increases ambiguity, label subjectivity, and class overlap. PER benchmarks are further characterized by class imbalance and noisy annotations stemming from differences in human perception. This paper presents a comprehensive analytical study of PER together with a proposed hybrid deep learning model that combines convolutional and transformer representations to capture both low-level aesthetic features and global semantic context. The architecture pairs a CNN branch, attuned to local texture and color cues, with a transformer branch for long-range relational reasoning, followed by gated feature fusion and a balanced classification head. A robust training pipeline is built from class-balanced focal loss, label smoothing, and emotion-preserving augmentation, which avoids distortions likely to alter affective meaning. Evaluation covers macro-F1, per-class sensitivity, confusion behavior among neighbouring emotions, calibration, and cross-domain robustness. Extensive ablation experiments show that the fusion and imbalance-robust loss choices consistently improve macro-F1 and reduce common confusions (e.g., fear vs. surprise, sadness vs. contentment/neutral). Finally, an explainability analysis using gradient-based localization examines whether predictions align with emotionally salient regions. The paper concludes with deployment guidance (latency, model size, and quantization) and a discussion of the ethical implications of modelling subjective affect.
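The abstract describes gated feature fusion of the CNN and transformer branches. A minimal NumPy sketch of one common gating form is shown below, assuming a sigmoid gate computed from the concatenated branch features; the paper's exact gating function, dimensions, and parameterization are not specified here, so `Wg` and `bg` are hypothetical illustration parameters:

```python
import numpy as np

def gated_fusion(cnn_feat, vit_feat, Wg, bg):
    """Fuse two feature vectors with a learned sigmoid gate.

    g = sigmoid(Wg @ [cnn; vit] + bg);  fused = g * cnn + (1 - g) * vit,
    so each fused element is a convex combination of the two branches.
    """
    z = np.concatenate([cnn_feat, vit_feat])          # joint representation
    g = 1.0 / (1.0 + np.exp(-(Wg @ z + bg)))          # per-dimension gate in (0, 1)
    return g * cnn_feat + (1.0 - g) * vit_feat

# Toy example with random 8-d branch features and small random gate weights.
rng = np.random.default_rng(0)
d = 8
cnn = rng.normal(size=d)
vit = rng.normal(size=d)
Wg = rng.normal(scale=0.1, size=(d, 2 * d))
bg = np.zeros(d)
fused = gated_fusion(cnn, vit, Wg, bg)
```

Because the gate lies in (0, 1), every fused coordinate stays between the corresponding CNN and transformer values, which keeps the fusion interpretable as a soft per-dimension choice between branches.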
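The training pipeline combines class-balanced focal loss with label smoothing. A self-contained NumPy sketch is given below, assuming the "effective number of samples" class weighting of Cui et al. (2019); the paper's actual weighting scheme, hyperparameters (`beta`, `gamma`, smoothing strength), and class counts are assumptions for illustration:

```python
import numpy as np

def class_balanced_focal_loss(logits, labels, counts,
                              beta=0.999, gamma=2.0, eps_smooth=0.1):
    """Class-balanced focal loss with label smoothing (NumPy sketch)."""
    n_cls = logits.shape[1]
    # Effective-number class weights: rare classes get larger weight.
    w = (1.0 - beta) / (1.0 - np.power(beta, counts))
    w = w / w.sum() * n_cls                       # normalize to mean 1
    # Numerically stable softmax probabilities.
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    # Smoothed one-hot targets: (1 - eps) on the true class, eps spread uniformly.
    t = np.full_like(p, eps_smooth / n_cls)
    t[np.arange(len(labels)), labels] += 1.0 - eps_smooth
    # Focal modulation down-weights already-confident predictions.
    focal = np.power(1.0 - p, gamma)
    loss = -(w * t * focal * np.log(p + 1e-12)).sum(axis=1)
    return loss.mean()

# Toy batch: 2 samples, 3 emotion classes with imbalanced training counts.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3]])
labels = np.array([0, 1])
counts = np.array([500, 50, 5])
loss = class_balanced_focal_loss(logits, labels, counts)
```

Setting `gamma=0` recovers class-balanced, label-smoothed cross-entropy; increasing `gamma` shrinks the contribution of well-classified samples, which is the intended focusing effect.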

References

Ali, A., Oyana, C., and Salum, O. (2024). Domestic Cats Facial Expression Recognition Using CNN. International Journal of Engineering and Advanced Technology, 13, 45–52. https://doi.org/10.35940/ijeat.E4484.13050624

Bhattacharjee, S., et al. (2021). Cluster Analysis of Cell Nuclei for Prostate Cancer Diagnosis. Diagnostics, 12, 15. https://doi.org/10.3390/diagnostics12010015

Corujo, L. A., et al. (2021). Emotion Recognition in Horses With CNNs. Future Internet, 13, 250. https://doi.org/10.3390/fi13100250

Dalvi, C., et al. (2021). A Survey of AI-Based Facial Emotion Recognition. IEEE Access, 9, 165806–165840. https://doi.org/10.1109/ACCESS.2021.3131733

Feighelstein, M., et al. (2022). Automated Recognition of Pain in Cats. Scientific Reports, 12, 9575. https://doi.org/10.1038/s41598-022-13348-1

Guo, R. (2023). Pre-trained Multi-Modal Transformer for Pet Emotion Detection. In Proceedings of SciTePress. https://doi.org/10.5220/0011961500003612

He, K., et al. (2016). Deep Residual Learning for Image Recognition. In Proceedings of IEEE CVPR, Las Vegas, USA (pp. 770–778). https://doi.org/10.1109/CVPR.2016.90

Huang, J., Xu, X., and Zhang, T. (2017). Emotion Classification Using Deep Neural Networks and Emotional Patches. In Proceedings of the IEEE BIBM, Kansas City, USA. https://doi.org/10.1109/BIBM.2017.8217786

Kujala, M. V., et al. (2020). Time-Resolved Classification of Dog Brain Signals. Scientific Reports, 10, 19846. https://doi.org/10.1038/s41598-020-76806-8

Laganà, F., et al. (2024). Detect Carcinomas using Tomographic Impedance. Engineering, 5, 1594–1614. https://doi.org/10.3390/eng5030084

Le Jeune, F., et al. (2008). Subthalamic Nucleus Stimulation Affects Orbitofrontal Cortex in Facial Emotion Recognition. Brain, 131, 1599–1608. https://doi.org/10.1093/brain/awn084

Li, S., and Deng, W. (2022). Deep Facial Expression Recognition: A Survey. IEEE Transactions on Affective Computing, 13, 1195–1215. https://doi.org/10.1109/TAFFC.2020.2981446

Liu, H., et al. (2021). A Perspective on Pet Emotion Monitoring Using Millimeter Wave Radar. In Proceedings of ISAPE, Zhuhai, China. https://doi.org/10.1109/ISAPE54070.2021.9753337

O’Shea, K. (2015). An Introduction to Convolutional Neural Networks. arXiv preprint arXiv:1511.08458.

Sinnott, R. O., et al. (2021). Run or Pat: Using Deep Learning to Classify the Species Type and Emotion of Pets. In Proceedings of the IEEE CSDE, Brisbane, Australia. https://doi.org/10.1109/CSDE53843.2021.9718465

Sumon, R. I., et al. (2023). Densely Convolutional Spatial Attention Network for Nuclei Segmentation. Frontiers in Oncology, 13, 1009681. https://doi.org/10.3389/fonc.2023.1009681

Sumon, R. I., et al. (2023). Enhanced Nuclei Segmentation Using Triple-Encoder Architecture. In Proceedings of IEEE UEMCON, New York, USA.

Sumon, R. I., et al. (2024). Exploring DL and ML for Histopathological Image Classification. In Proceedings of ICECET, Sydney, Australia.

Tanwar, V. (2024). CNN-Based Classification for Dog Emotions. In Proceedings of ICOSEC, India (pp. 964–969). https://doi.org/10.1109/ICOSEC61587.2024.10722523

Tokuhisa, R., Inui, K., and Matsumoto, Y. (2008). Emotion Classification Using Massive Examples Extracted from the Web. In Proceedings of COLING, Manchester, UK. https://doi.org/10.3115/1599081.1599192

Wu, Z. (2024). Recognition and Analysis of Pet Facial Expression Using DenseNet. In Proceedings of SciTePress. https://doi.org/10.5220/0012800000003885

Published

2025-12-28

How to Cite

Pote, X. R., Mahaveerakannan R, Joy, P., Santhosh, S., Thakur, N., & M. Vignesh. (2025). DEEP LEARNING APPROACHES TO EMOTION RECOGNITION IN PHOTOGRAPHIC IMAGES. ShodhKosh: Journal of Visual and Performing Arts, 6(5s), 580–590. https://doi.org/10.29121/shodhkosh.v6.i5s.2025.6974