Feature Representations Using the Reflected Rectified Linear Unit (RReLU) Activation

Chaity Banerjee; Tathagata Mukherjee; Eduardo Pasiliao Jr.

doi:10.26599/BDMA.2019.9020024

Open Access

Chaity Banerjee, Tathagata Mukherjee, Eduardo Pasiliao Jr.

∙ Department of Industrial & Systems Engineering, University of Central Florida, Orlando, FL 32816-2368, USA.

∙ Department of Computer Science, University of Alabama in Huntsville, Huntsville, AL 35806, USA.

∙ Air Force Research Labs, United States Air Force, Eglin Air Force Base, Shalimar, FL 32579, USA.

Abstract

Deep Neural Networks (DNNs) have become the tool of choice for machine learning practitioners today. One important aspect of designing a neural network is the choice of the activation function to be used at the neurons of the different layers. In this work, we introduce a four-output activation function called the Reflected Rectified Linear Unit (RReLU) activation which considers both a feature and its negation during computation. Our activation function is "sparse", in that only two of the four possible outputs are active at a given time. We test our activation function on the standard MNIST and CIFAR-10 datasets, which are classification problems, as well as on a novel Computational Fluid Dynamics (CFD) dataset which is posed as a regression problem. On the baseline network for the MNIST dataset, having two hidden layers, our activation function improves the validation accuracy from 0.09 to 0.97 compared to the well-known ReLU activation. For the CIFAR-10 dataset, we use a deep baseline network that achieves 0.78 validation accuracy with 20 epochs but overfits the data. Using the RReLU activation, we can achieve the same accuracy without overfitting the data. For the CFD dataset, we show that the RReLU activation can reduce the number of epochs from 100 (using ReLU) to 10 while obtaining the same levels of performance.
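The abstract describes the activation only in words, so the following is a minimal sketch of one way a four-output, two-active activation could look, not the authors' definition. It assumes the four outputs are ReLU applied to a feature and to its negation, together with the sign-reflected copies of both; the function names `relu`, `four_output_reflected_relu`, and `count_active` are hypothetical and chosen for illustration.

```python
from typing import Tuple


def relu(x: float) -> float:
    """Standard rectified linear unit: max(0, x)."""
    return x if x > 0.0 else 0.0


def four_output_reflected_relu(x: float) -> Tuple[float, float, float, float]:
    """Illustrative four-output activation (an assumption, not the paper's formula).

    Outputs: ReLU(x), -ReLU(x), ReLU(-x), -ReLU(-x).
    The first pair responds to the feature, the second pair to its negation.
    """
    pos = relu(x)    # nonzero only when x > 0
    neg = relu(-x)   # nonzero only when x < 0
    return (pos, -pos, neg, -neg)


def count_active(outputs: Tuple[float, ...]) -> int:
    """Number of nonzero outputs."""
    return sum(1 for v in outputs if v != 0.0)


if __name__ == "__main__":
    for value in (2.0, -3.0):
        outputs = four_output_reflected_relu(value)
        print(value, count_active(outputs))
```

Under this assumed form, any nonzero input activates exactly two of the four outputs, which matches the sparsity property stated in the abstract; an input of exactly zero activates none.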

Keywords

deep learning; feature space approximations; multi-output activations; Rectified Linear Unit (ReLU)

Big Data Mining and Analytics

Volume 3, Issue 2, June 2020

Pages 102-120

DOI: 10.26599/BDMA.2019.9020024

Cite this article:

Banerjee C, Mukherjee T, Pasiliao Jr. E. Feature Representations Using the Reflected Rectified Linear Unit (RReLU) Activation. Big Data Mining and Analytics, 2020, 3(2): 102-120. https://doi.org/10.26599/BDMA.2019.9020024

Received: 25 November 2019

Accepted: 13 December 2019

Published: 27 February 2020

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).