Volume 27, Issue 5

New Benchmark for Household Garbage Image Recognition

Zhize Wu, Huanyi Li, Xiaofeng Wang, Zijun Wu, Le Zou, Lixiang Xu, Ming Tan
School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China
School of Energy Materials and Chemical Engineering, Hefei University, Hefei 230601, China

Abstract

Household garbage images typically exhibit complex backgrounds, variable illumination, diverse viewing angles, and changeable shapes, all of which make garbage image classification difficult. Because they can discover problem-specific features, deep learning methods, and especially convolutional neural networks (CNNs), have been widely and successfully used for image representation learning. However, stable, publicly available household garbage datasets are scarce, which seriously limits research and application. In addition, the state of the art in garbage image classification is not entirely clear. To address these problems, we built a new open benchmark dataset for household garbage image classification by simulating different lighting conditions, backgrounds, angles, and shapes. The dataset, named 30 Classes of Household Garbage Images (HGI-30), contains 18 000 images across 30 household garbage classes. The publicly available HGI-30 dataset allows researchers to develop accurate and robust methods for household garbage recognition. We also conducted experiments and performance analyses of state-of-the-art deep CNN methods on HGI-30, which serve as baseline results for this benchmark.
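
Baseline results on a multi-class benchmark of this kind are commonly reported as overall accuracy alongside per-class accuracy. The following is a minimal sketch of that computation, not the evaluation code used in the study: the abstract does not state which metrics were reported, and the function name `per_class_accuracy` and the three class names are hypothetical placeholders rather than HGI-30 labels.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return (overall_accuracy, {label: accuracy}) for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one sample")
    # Number of ground-truth samples per class.
    totals = Counter(y_true)
    # Number of correctly predicted samples per class.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    per_class = {label: hits[label] / n for label, n in totals.items()}
    overall = sum(hits.values()) / len(y_true)
    return overall, per_class


# Toy labels with hypothetical class names (assumption: not taken from HGI-30).
y_true = ["battery", "battery", "bottle", "bottle", "peel", "peel"]
y_pred = ["battery", "bottle", "bottle", "bottle", "peel", "battery"]
overall, per_class = per_class_accuracy(y_true, y_pred)
```

Reporting per-class accuracy next to the overall figure helps show whether a model's errors concentrate in a few visually similar classes, which a single aggregate number can hide.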

Keywords: benchmark, household garbage, image classification, deep convolutional neural networks
Received: 07 June 2021; Revised: 20 August 2021; Accepted: 18 September 2021; Published: 17 March 2022; Issue date: October 2022

Copyright

© The author(s) 2022.

Acknowledgements

This paper was supported by the National Natural Science Foundation of China (Nos. 12001523, 11971046, 12131003, and 11871081), the Scientific Research Project of Beijing Municipal Education Commission (No. KM201910005012), and Beijing Natural Science Foundation Project (No. Z200002).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
