[1]
G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, Densely connected convolutional networks, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 2261-2269.
[2]
T. Lin, P. Goyal, R. B. Girshick, K. He, and P. Dollár, Focal loss for dense object detection, in Proceedings of the International Conference on Computer Vision, Venice, Italy, 2017, pp. 2999-3007.
[3]
K. He, G. Gkioxari, P. Dollár, and R. B. Girshick, Mask R-CNN, in Proceedings of the International Conference on Computer Vision, Venice, Italy, 2017, pp. 2980-2988.
[4]
O. Ronneberger, P. Fischer, and T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 2015, pp. 234-241.
[5]
J. Lian, X. Zhou, F. Zhang, Z. Chen, X. Xie, and G. Sun, xDeepFM: Combining explicit and implicit feature interactions for recommender systems, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, London, UK, 2018, pp. 1754-1763.
[6]
S. Wang, L. He, B. Cao, C. Lu, P. S. Yu, and A. B. Ragin, Structural deep brain network mining, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Halifax, Canada, 2017, pp. 475-484.
[7]
H. Xu, Z. Yu, J. Yang, H. Xiong, and H. Zhu, Dynamic talent flow analysis with deep sequence prediction modeling, Transactions on Knowledge and Data Engineering, vol. 31, no. 10, pp. 1926-1939, 2019.
[8]
D. P. Kingma and M. Welling, Auto-encoding variational Bayes, in Proceedings of the International Conference on Learning Representations, Banff, Canada, 2014, pp. 34-42.
[9]
Y. Li and J. Ye, Learning adversarial networks for semi-supervised text classification via policy gradient, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, London, UK, 2018, pp. 1715-1723.
[10]
K. G. Dizaji, X. Wang, and H. Huang, Semi-supervised generative adversarial network for gene expression inference, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, London, UK, 2018, pp. 1435-1444.
[11]
T. Lin, A. Roy Chowdhury, and S. Maji, Bilinear CNN models for fine-grained visual recognition, in Proceedings of the International Conference on Computer Vision, Santiago, Chile, 2015, pp. 1449-1457.
[12]
H. Zheng, J. Fu, T. Mei, and J. Luo, Learning multi-attention convolutional neural network for fine-grained image recognition, in Proceedings of the International Conference on Computer Vision, Venice, Italy, 2017, pp. 5219-5227.
[13]
C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, The Caltech-UCSD Birds-200-2011 Dataset, Report, California Institute of Technology, CA, USA, 2011.
[14]
X. Zhang, H. Xiong, W. Zhou, W. Lin, and Q. Tian, Picking deep filter responses for fine-grained image recognition, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 1134-1142.
[15]
A. Khosla, N. Jayadevaprakash, B. Yao, and F.-F. Li, Novel dataset for fine-grained image categorization: Stanford dogs, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA, 2011, p. 1.
[16]
J. Krause, H. Jin, J. Yang, and F.-F. Li, Fine-grained recognition without part annotations, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015, pp. 5546-5555.
[17]
Z.-H. Zhou, Abductive learning: Towards bridging machine learning and logical reasoning, Science China Information Sciences, vol. 62, no. 7, pp. 76101:1-76101:3, 2019.
[18]
M. Ilse, J. M. Tomczak, and M. Welling, Attention-based deep multiple instance learning, in Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 2018, pp. 2132-2141.
[19]
M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Póczos, R. R. Salakhutdinov, and A. J. Smola, Deep sets, in Proceedings of Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 3394-3404.
[20]
Y. Gao, O. Beijbom, N. Zhang, and T. Darrell, Compact bilinear pooling, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 317-326.
[21]
N. Zhang, J. Donahue, R. B. Girshick, and T. Darrell, Part-based R-CNNs for fine-grained category detection, in Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 2014, pp. 834-849.
[22]
F. Perronnin and D. Larlus, Fisher vectors meet neural networks: A hybrid classification architecture, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015, pp. 3743-3752.
[23]
J. Fu, H. Zheng, and T. Mei, Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 4476-4484.
[24]
P. H. O. Pinheiro and R. Collobert, From image-level to pixel-level labeling with convolutional networks, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015, pp. 1713-1721.
[25]
J. Feng and Z. Zhou, Deep MIML network, in Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 2017, pp. 1884-1890.
[26]
Y. Yang, Y. Wu, D. Zhan, Z. Liu, and Y. Jiang, Complex object classification: A multi-modal multi-instance multi-label deep network with optimal transport, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, London, UK, 2018, pp. 2594-2603.
[27]
K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio, Show, attend and tell: Neural image caption generation with visual attention, in Proceedings of the International Conference on Machine Learning, Lille, France, 2015, pp. 2048-2057.
[28]
H. Li, M. R. Min, Y. Ge, and A. Kadav, A context-aware attention network for interactive question answering, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Halifax, Canada, 2017, pp. 927-935.
[29]
N. Pappas and A. Popescu-Belis, Explaining the stars: Weighted multiple-instance learning for aspect-based sentiment analysis, in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 455-466.
[30]
K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 770-778.
[31]
X. Wang, Y. Yan, P. Tang, X. Bai, and W. Liu, Revisiting multiple instance neural networks, Pattern Recognition, vol. 74, pp. 15-24, 2018.
[32]
L. van der Maaten and G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research, vol. 9, no. 11, pp. 2579-2605, 2008.