Improving Stochastic Gradient Descent Initializing with Data Summarization

Abstract

Linear Regression (LR) is the prototypical statistical model and can be applied to a wide range of predictive problems. Ordinary Least Squares (OLS) is the standard technique for estimating the parameters of the LR model. However, this computation can be slow and resource-hungry for large, high-dimensional data sets because of its heavy matrix operations. More importantly, OLS may be impractical for large data sets because the entire data set must be loaded into main memory, which often exceeds RAM capacity. These limitations emphasize the need for optimization techniques to compute LR. Two state-of-the-art algorithms used to compute LR are Stochastic Gradient Descent (SGD) and Data Summarization (DS) combined with a matrix factorization. A few decades ago, DS was the main technique to accelerate data mining computations, followed by SGD. Nowadays, SGD has become the workhorse behind most ML algorithms and deep neural networks. Merging both techniques, we propose to initialize SGD with a solution computed via DS on the initial batch of points, leaving the SGD computation on the remaining points “as is”. An experimental evaluation with several data sets shows that our improved SGD algorithm reaches higher-quality solutions (lower MSE, higher R²) and converges faster (fewer iterations, reduced data usage, less computation time). We believe our simple SGD change can benefit many more ML models beyond LR.
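The initialization idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the summary or the factorization, so everything below is an assumption for illustration: the summary is taken to be the sufficient statistics Z'Z and Z'y of the first batch (with Z the data plus an intercept column), the solve uses NumPy's least-squares routine rather than any particular factorization, and the function names, learning rate, batch size, and synthetic data are all hypothetical. It is not the authors' implementation.

```python
import numpy as np

def add_intercept(X):
    # Prepend a column of ones so the first weight acts as the intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def summarize(X, y):
    # Assumed form of the data summary: sufficient statistics Z'Z and Z'y.
    Z = add_intercept(X)
    return Z.T @ Z, Z.T @ y

def solve_from_summary(Q, b):
    # Least-squares solve of Q w = b; tolerates a rank-deficient summary.
    return np.linalg.lstsq(Q, b, rcond=None)[0]

def sgd_one_pass(X, y, w, lr=0.01, batch_size=32):
    # Plain mini-batch SGD on squared error, one pass over the data.
    Z = add_intercept(X)
    for start in range(0, Z.shape[0], batch_size):
        Zb = Z[start:start + batch_size]
        yb = y[start:start + batch_size]
        grad = 2.0 * Zb.T @ (Zb @ w - yb) / Zb.shape[0]
        w = w - lr * grad
    return w

def mse(X, y, w):
    return float(np.mean((add_intercept(X) @ w - y) ** 2))

# Synthetic data (hypothetical; only for demonstration).
rng = np.random.default_rng(0)
n, d, first = 2000, 5, 200
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d + 1)
y = true_w[0] + X @ true_w[1:] + 0.1 * rng.normal(size=n)

# Initialize from the summary of the first batch, then run SGD on the rest.
Q, b = summarize(X[:first], y[:first])
w_init = solve_from_summary(Q, b)
w_warm = sgd_one_pass(X[first:], y[first:], w_init.copy())

# Baseline: the same SGD pass starting from zeros.
w_cold = sgd_one_pass(X[first:], y[first:], np.zeros(d + 1))

print("warm-start MSE:", mse(X, y, w_warm))
print("cold-start MSE:", mse(X, y, w_cold))
```

In this sketch the SGD loop is identical for both runs; only the starting point differs, which mirrors the abstract's description of leaving the SGD computation on the remaining points unchanged.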

Cite

Varghese, R., & Ordonez, C. (2023). Improving Stochastic Gradient Descent Initializing with Data Summarization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14148 LNCS, pp. 212–223). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-39831-5_20
