Article | Open Access

CLIP-ASN: A Multi-Model Deep Learning Approach to Recognize Dog Breeds

Asif Nawaz¹, Rana Saud Shoukat², Mohammad Shehab¹, Khalil El Hindi³, Zohair Ahmed⁴
College of Information Technology, Amman Arab University, Amman, 11953, Jordan
University Institute of Information Technology, PMAS-Arid Agriculture University Rawalpindi, Rawalpindi, 46000, Pakistan
Department of Computer Science, College of Computer & Information Sciences, King Saud University, Riyadh, 11543, Saudi Arabia
School of Computer Science and Engineering, Central South University, Changsha, 410083, China

Abstract

The kingdom Animalia encompasses multicellular, eukaryotic organisms known as animals. Approximately 1.5 million species of living animals have been identified, and dogs alone comprise over 195 distinct breeds. Each breed possesses unique characteristics that can be challenging to distinguish. Various computer-based methods, including machine learning, deep learning, transfer learning, and robotics, are employed to identify dog breeds, focusing mainly on image or voice data. Voice-based techniques often face challenges such as noise, distortion, and changes in frequency or pitch, which can impair the model's performance. Conversely, image-based methods may fail on blurred images, which can result from poor camera quality or photos taken from a distance. This research presents a hybrid model that combines voice and image data for dog breed identification. The proposed method, Contrastive Language-Image Pre-Training-Audio Stacked Network (CLIP-ASN), improves robustness by compensating when one data type is compromised by noise or poor quality. By integrating diverse data types, the model can identify unique breed characteristics more effectively than methods that rely on a single data type. The key steps of the proposed model are data collection, image feature extraction based on Contrastive Language-Image Pre-Training (CLIP), voice feature extraction based on an audio stacked network, co-attention-based classification, and federated learning-based training and distribution. In the experimental evaluation, the proposed model achieves an accuracy of 89.75%, which is considerably better than the existing benchmark methods.
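The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: fusing image and voice features with a simplified cross-attention, and combining client models in a federated setting. Everything here is an assumption made for illustration: the function names (`co_attention_fuse`, `federated_average`), the shared embedding dimension for both modalities, the mean pooling, and the use of sample-weighted averaging (the common FedAvg scheme) are not taken from the paper.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys):
    # For each query vector, return a softmax-weighted average of the key vectors
    # (scaled dot-product attention without learned projections).
    dim = len(keys[0])
    scale = math.sqrt(dim)
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        out.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)])
    return out

def mean_pool(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def co_attention_fuse(image_feats, audio_feats):
    # Hypothetical fusion: each modality attends to the other, the results are
    # pooled and concatenated. Assumes both modalities share one embedding size.
    image_ctx = attend(image_feats, audio_feats)  # audio context per image token
    audio_ctx = attend(audio_feats, image_feats)  # image context per audio token
    return mean_pool(image_ctx) + mean_pool(audio_ctx)

def federated_average(client_weights, client_sizes):
    # Sample-weighted average of client parameter vectors (FedAvg-style, assumed).
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]

# Toy inputs standing in for image-encoder and audio-encoder outputs.
fused = co_attention_fuse([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]])
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a real system the toy lists would be replaced by encoder outputs and the fused vector would feed a breed classifier; the sketch only shows how one modality can supply context for the other and how per-client parameters could be aggregated.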

Computers, Materials & Continua
Pages 4777-4793

Cite this article:
Nawaz A, Shoukat RS, Shehab M, et al. CLIP-ASN: A Multi-Model Deep Learning Approach to Recognize Dog Breeds. Computers, Materials & Continua, 2025, 85(3): 4777-4793. https://doi.org/10.32604/cmc.2025.064088


Received: 05 February 2025
Accepted: 05 June 2025
Published: 23 October 2025
© The Author 2024.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.