What is it about?

Activation functions play a significant role in how deep learning architectures process incoming data to produce the most appropriate output. Activation functions (AFs) are designed with considerations such as avoiding local minima and improving training efficiency, and AFs proposed in the literature frequently take negative weights and vanishing gradients into account. Recently, a number of non-monotonic AFs have increasingly replaced earlier approaches for improving convolutional neural network (CNN) performance. In this study, two novel non-linear, non-monotonic activation functions, αSechSig and αTanhSig, are proposed to overcome these existing problems. The negative part of αSechSig and αTanhSig is non-monotonic and approaches zero as the input becomes more negative, allowing the negative part to retain its sparsity while still providing negative activation values and non-zero derivative values. In experimental evaluations, the αSechSig and αTanhSig activation functions were tested on the MNIST, KMNIST, SVHN_Cropped, STL-10, and CIFAR-10 datasets. They obtained better results than the non-monotonic Swish, Logish, Mish, and Smish AFs and the monotonic ReLU, SinLU, and LReLU AFs known in the literature. Moreover, the best accuracy scores for αSechSig and αTanhSig were obtained on MNIST, at 0.9959 and 0.9956, respectively.
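The closed-form definitions of αSechSig and αTanhSig are given in the full paper rather than in this summary. As a hedged, illustrative sketch of the property described above (a bounded, non-monotonic negative part that decays toward zero while keeping a non-zero derivative), the snippet below evaluates Swish (x·σ(x)), one of the baseline non-monotonic AFs named here; the function names and the sample inputs are illustrative choices, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    # Logistic function sigma(x) = 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    # Swish, a non-monotonic baseline AF cited above: f(x) = x * sigma(x).
    # Its negative part dips below zero, then decays toward zero as x -> -inf,
    # while its derivative stays non-zero for negative inputs -- the same
    # qualitative behaviour this summary attributes to alphaSechSig/alphaTanhSig.
    return x * sigmoid(x)

x = np.linspace(-6.0, 6.0, 7)          # sample points: -6, -4, ..., 6
print(np.round(swish(x), 4))
# Values for x < 0 are small but non-zero, illustrating a sparse yet trainable
# negative region, unlike ReLU, which is exactly zero there.
```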


Read the Original

This page is a summary of: αSechSig and αTanhSig: two novel non-monotonic activation functions, Soft Computing, October 2023, Springer Science + Business Media. DOI: 10.1007/s00500-023-09279-2.
