Z. Yuan, Y. Lu, and Y. Xue, DroidDetector: Android malware characterization and detection using deep learning, Tsinghua Science and Technology, vol. 21, no. 1, pp. 114–123, 2016.
Y. Sun, Z. Dou, Y. Li, and S. Wang, Improving semantic part features for person re-identification with supervised non-local similarity, Tsinghua Science and Technology, vol. 25, no. 5, pp. 636–646, 2020.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. New York, NY, USA: Springer, 2009.
H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist., vol. 22, no. 3, pp. 400–407, 1951.
R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, in Proc. 26th Int. Conf. Neural Information Processing Systems, Lake Tahoe, NV, USA, 2013, pp. 315–323.
A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 1646–1654.
O. Shamir and T. Zhang, Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes, in Proc. 30th Int. Conf. Machine Learning, Atlanta, GA, USA, 2013, pp. 71–79.
L. Xiao and T. Zhang, A proximal stochastic gradient method with progressive variance reduction, SIAM J. Optim., vol. 24, no. 4, pp. 2057–2075, 2014.
S. Shalev-Shwartz and T. Zhang, Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, in Proc. 31st Int. Conf. Machine Learning, Beijing, China, 2014, pp. 64–72.
A. Defazio, A simple practical accelerated method for finite sums, in Proc. 30th Int. Conf. Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 676–684.
H. Lin, J. Mairal, and Z. Harchaoui, Catalyst acceleration for first-order convex optimization: From theory to practice, J. Mach. Learn. Res., vol. 18, no. 1, pp. 7854–7907, 2017.
Z. Allen-Zhu, Katyusha: The first direct acceleration of stochastic gradient methods, J. Mach. Learn. Res., vol. 18, no. 1, pp. 8194–8244, 2017.
K. Zhou, F. Shang, and J. Cheng, A simple stochastic variance reduced algorithm with fast convergence rates, in Proc. 35th Int. Conf. Machine Learning, Stockholm, Sweden, 2018, pp. 5980–5989.
Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course. New York, NY, USA: Springer, 2004.
A. Chambolle and Ch. Dossal, On the convergence of the iterates of “FISTA”, J. Optim. Theory Appl., vol. 166, no. 3, pp. 968–982, 2015.
J. Liu, L. Xu, S. Shen, and Q. Ling, An accelerated variance reducing stochastic method with Douglas–Rachford splitting, Mach. Learn., vol. 108, no. 5, pp. 859–878, 2019.
P. Patrinos, L. Stella, and A. Bemporad, Douglas–Rachford splitting: Complexity estimates and accelerated variants, in Proc. 53rd IEEE Conf. Decision and Control, Los Angeles, CA, USA, 2014, pp. 4234–4239.
C. Lemaréchal and C. Sagastizábal, Practical aspects of the Moreau–Yosida regularization: Theoretical preliminaries, SIAM J. Optim., vol. 7, no. 2, pp. 367–385, 1997.
T. Hofmann, A. Lucchi, S. Lacoste-Julien, and B. McWilliams, Variance reduced stochastic gradient descent with neighbors, in Proc. 28th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 2305–2313.
H. Luo, X. Bai, G. Lim, and J. Peng, New global algorithms for quadratic programming with a few negative eigenvalues based on alternative direction method and convex relaxation, Math. Program. Comput., vol. 11, no. 1, pp. 119–171, 2019.
H. Luo, X. Ding, J. Peng, R. Jiang, and D. Li, Complexity results and effective algorithms for worst-case linear optimization under uncertainties, INFORMS J. Comput., vol. 33, no. 1, pp. 180–197, 2021.