Using Gradient Descent to An Optimization Algorithm that uses the Optimal Value of Parameters (Coefficients) for a Differentiable Function
Abstract
Deep neural networks (DNNs) are widely used, but their many parameters require extensive training. Complex optimizers with multiple hyperparameters can speed up network training and improve generalisation, yet tuning those hyperparameters is generally a matter of trial and error. In this study, we visually assess the distinct contributions of individual training samples to a parameter update. We propose adaptive stochastic gradient descent (aSGD), a variant of batch stochastic gradient descent for neural networks that use ReLU activations in the hidden layers. Unlike earlier methods, it uses the mean effective gradient as the true gradient for parameter updates. Experiments on MNIST show that aSGD speeds up DNN optimization and improves accuracy without introducing additional hyperparameters. Experiments on synthetic datasets show that it can locate redundant nodes, which helps model compression.
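To make the "mean effective gradient" idea concrete, here is a minimal sketch of one possible reading of it for a single ReLU unit: with ReLU hidden activations, only samples whose pre-activation is positive pass a nonzero gradient, and the abstract describes averaging over those effective contributions rather than over the whole batch. All names and implementation details below are assumptions for illustration, not the authors' code.

```python
import numpy as np

def mean_effective_gradient(X, y, w):
    """Gradient of squared error for y_hat = relu(X @ w), averaged
    over the samples that actually contribute (hypothetical sketch)."""
    z = X @ w                      # pre-activations, shape (n,)
    active = z > 0                 # samples where ReLU passes gradient
    y_hat = np.maximum(z, 0.0)
    err = y_hat - y                # per-sample error
    # Per-sample gradient w.r.t. w is err_i * x_i when active, else 0.
    grad_sum = X[active].T @ err[active]
    n_eff = max(active.sum(), 1)   # number of effective samples
    return grad_sum / n_eff        # mean over effective samples only

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
w = rng.normal(size=4)
y = rng.normal(size=32)
g = mean_effective_gradient(X, y, w)
print(g.shape)
```

Dividing by the number of effective samples, rather than the batch size, keeps the update magnitude comparable even when only a few samples activate the unit; this is the interpretation sketched here, under the stated assumptions.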
Article Details
How to Cite
Abdulazeez, F. A. ., Ismail, A. S. ., & S. Abdulaziz, R. . (2023). Using Gradient Descent to An Optimization Algorithm that uses the Optimal Value of Parameters (Coefficients) for a Differentiable Function. International Journal of Communication Networks and Information Security (IJCNIS), 15(1), 24–36. https://doi.org/10.17762/ijcnis.v15i1.5718
Section
Research Articles
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.