New Benchmark for Household Garbage Image Recognition

Zhize Wu; Huanyi Li; Xiaofeng Wang; Zijun Wu; Le Zou; Lixiang Xu; Ming Tan

Tsinghua Science and Technology 2022, 27(5): 793-803 https://doi.org/10.26599/TST.2021.9010072

Open Access | Issue | Published: 17 March 2022

School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China

School of Energy Materials and Chemical Engineering, Hefei University, Hefei 230601, China

Keywords: image classification, benchmark, household garbage, deep convolutional neural networks

Cite this article:

Wu Z, Li H, Wang X, et al. New Benchmark for Household Garbage Image Recognition. Tsinghua Science and Technology, 2022, 27(5): 793-803. https://doi.org/10.26599/TST.2021.9010072

Abstract

Household garbage images typically exhibit complex backgrounds, variable illumination, diverse viewing angles, and changeable shapes, which make garbage image classification difficult. Owing to their ability to discover problem-specific features, deep learning methods, and especially convolutional neural networks (CNNs), have been widely and successfully used for image representation learning. However, stable, publicly available household garbage datasets are scarce, which seriously limits research and application. Moreover, the state of the art in garbage image classification is not entirely clear. To address these problems, we built a new open benchmark dataset for household garbage image classification by simulating different lighting conditions, backgrounds, angles, and shapes. The dataset, named 30 Classes of Household Garbage Images (HGI-30), contains 18 000 images across 30 household garbage classes. The publicly available HGI-30 dataset allows researchers to develop accurate and robust methods for household garbage recognition. We also conducted experiments and performance analyses of state-of-the-art deep CNN methods on HGI-30, which serve as baseline results on this benchmark.
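As a rough illustration of how a benchmark of this size might be partitioned for baseline experiments, the following minimal sketch performs a per-class (stratified) train/test split. It is not the authors' protocol: the file naming scheme, the even 600-images-per-class balance, the 80/20 ratio, and the helper name `stratified_split` are all assumptions made for the example; only the totals (18 000 images, 30 classes) come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (path, label) pairs so every class keeps the same test fraction."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # seeded so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        n_test = round(len(paths) * test_fraction)
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test


# Hypothetical file listing: 30 classes x 600 images each.
# The per-class balance and the path pattern are assumptions, not from the paper.
samples = [
    (f"class_{c:02d}/img_{i:04d}.jpg", c)
    for c in range(30)
    for i in range(600)
]
train, test = stratified_split(samples)
print(len(train), len(test))
```

Under these assumptions the split yields 14 400 training and 3 600 test images, with 120 test images per class. Splitting per class rather than over the pooled list keeps class proportions identical in both subsets, which matters when per-class accuracy is reported.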

About this article

Publication history

Received: 07 June 2021

Revised: 20 August 2021

Accepted: 18 September 2021

Published: 17 March 2022

Issue date: October 2022

Acknowledgements

This paper was supported by the National Natural Science Foundation of China (Nos. 12001523, 11971046, 12131003, and 11871081), the Scientific Research Project of Beijing Municipal Education Commission (No. KM201910005012), and Beijing Natural Science Foundation Project (No. Z200002).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).