Sampling-based gradient regularization for capturing long-term dependencies in recurrent neural networks

Abstract

The vanishing (and exploding) gradients effect is a common problem for recurrent neural networks that use backpropagation to compute derivatives. We construct an analytical framework to estimate the contribution of each training example to the norm of the long-term components of the target function's gradient, and use it to keep the norm of these gradients within a suitable range. Using this subroutine, we construct mini-batches for stochastic gradient descent (SGD) training that lead to high performance and accuracy of the trained network, even for very complex tasks. To evaluate the framework experimentally, we use synthetic benchmarks designed to test an RNN's ability to capture long-term dependencies. Our network can detect links between events in a temporal sequence at ranges of 100 time steps and longer.
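The abstract describes estimating each training example's contribution to the gradient norm and using that estimate to assemble SGD mini-batches whose gradient norm stays in a suitable range. The following is a minimal PyTorch sketch of that general idea, not the authors' exact procedure: per-example gradient norms of a toy RNN on a long-range task are computed by ordinary backpropagation through time, and a mini-batch is assembled greedily so that a simple upper-bound estimate of the batch gradient norm stays inside an assumed range. The thresholds NORM_LOW and NORM_HIGH, the toy task, and the greedy selection heuristic are all illustrative assumptions.

# Minimal sketch (assumptions noted above), not the paper's exact sampling procedure.
import torch
import torch.nn as nn

torch.manual_seed(0)

SEQ_LEN, INPUT_DIM, HIDDEN_DIM, N_CANDIDATES = 50, 4, 32, 64
NORM_LOW, NORM_HIGH = 0.5, 5.0   # assumed "suitable range" for the batch gradient norm

rnn = nn.RNN(INPUT_DIM, HIDDEN_DIM, batch_first=True)
readout = nn.Linear(HIDDEN_DIM, 1)
params = list(rnn.parameters()) + list(readout.parameters())
loss_fn = nn.MSELoss()

# Toy long-range task: the target is the first input feature at t = 0,
# but it must be read out from the final hidden state (dependency of length SEQ_LEN).
x = torch.randn(N_CANDIDATES, SEQ_LEN, INPUT_DIM)
y = x[:, 0, :1].clone()

def example_grad_norm(xi, yi):
    """Gradient norm contributed by a single training example (via BPTT)."""
    for p in params:
        p.grad = None
    h, _ = rnn(xi.unsqueeze(0))
    pred = readout(h[:, -1])
    loss_fn(pred, yi.unsqueeze(0)).backward()
    return torch.sqrt(sum((p.grad ** 2).sum() for p in params)).item()

# Score every candidate example, then greedily add examples while the accumulated
# estimate (an upper bound on the batch gradient norm, by the triangle inequality)
# stays below NORM_HIGH; stop once it has reached at least NORM_LOW.
scores = sorted((example_grad_norm(x[i], y[i]), i) for i in range(N_CANDIDATES))

batch_idx, norm_estimate = [], 0.0
for g, i in scores:
    if norm_estimate + g > NORM_HIGH:
        break
    batch_idx.append(i)
    norm_estimate += g
    if norm_estimate >= NORM_LOW and len(batch_idx) >= 8:
        break

print(f"selected {len(batch_idx)} examples, estimated batch gradient norm <= {norm_estimate:.3f}")

In this sketch the per-example norms are computed one backward pass at a time, which is simple but slow; the selection rule merely illustrates how a norm estimate can gate which examples enter a mini-batch, which is the role the paper assigns to its sampling subroutine.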

Citation (APA)

Chernodub, A., & Nowicki, D. (2016). Sampling-based gradient regularization for capturing long-term dependencies in recurrent neural networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9948 LNCS, pp. 90–97). Springer Verlag. https://doi.org/10.1007/978-3-319-46672-9_11
