Structure-inducing pre-training

Citations of this article: 9
Mendeley readers: 34

This article is free to access.

Abstract

Language model pre-training and the derived general-purpose methods have reshaped machine learning research. However, there remains considerable uncertainty regarding why pre-training improves the performance of downstream tasks. This challenge is pronounced when using language model pre-training in domains outside of natural language. Here we investigate this problem by analysing how pre-training methods impose relational structure in induced per-sample latent spaces; that is, what constraints pre-training methods impose on the distance or geometry between the pre-trained embeddings of samples. A comprehensive review of pre-training methods reveals that this question remains open, despite theoretical analyses showing the importance of understanding this form of induced structure. Based on this review, we introduce a pre-training framework that enables a granular and comprehensive understanding of how relational structure can be induced. We present a theoretical analysis of the framework from first principles and establish a connection between the relational inductive bias of pre-training and fine-tuning performance. Empirical studies spanning three data modalities and ten fine-tuning tasks confirm the theoretical analyses, inform the design of novel pre-training methods and establish consistent improvements over a compelling suite of methods.
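
To make concrete what it means to impose relational structure on a per-sample latent space, the sketch below shows one minimal way such a constraint can be written as a margin-based contrastive penalty over pairs of samples assumed to be related or unrelated (for example, edges and non-edges of a pre-training graph). The function name, pairing scheme and exact loss form here are illustrative assumptions, not the paper's specific objective.

```python
import torch
import torch.nn.functional as F

def relational_structure_loss(embeddings, pos_pairs, neg_pairs, margin=1.0):
    """Hypothetical contrastive penalty that pulls embeddings of linked
    (positive) sample pairs together and pushes unlinked (negative) pairs
    at least `margin` apart, thereby constraining distances in the
    per-sample latent space."""
    # Distances between embeddings of samples assumed to be related
    # (e.g. connected by an edge in a pre-training graph).
    pos_dist = F.pairwise_distance(embeddings[pos_pairs[:, 0]],
                                   embeddings[pos_pairs[:, 1]])
    # Distances between embeddings of samples assumed to be unrelated.
    neg_dist = F.pairwise_distance(embeddings[neg_pairs[:, 0]],
                                   embeddings[neg_pairs[:, 1]])
    # Related pairs should be close; unrelated pairs at least `margin` apart.
    return pos_dist.pow(2).mean() + F.relu(margin - neg_dist).pow(2).mean()

# Illustrative usage with random embeddings and index pairs.
emb = torch.randn(8, 64)               # 8 samples in a 64-dimensional latent space
pos = torch.tensor([[0, 1], [2, 3]])   # pairs assumed related
neg = torch.tensor([[0, 4], [2, 5]])   # pairs assumed unrelated
loss = relational_structure_loss(emb, pos, neg)
```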

Citation (APA)

McDermott, M. B. A., Yap, B., Szolovits, P., & Zitnik, M. (2023). Structure-inducing pre-training. Nature Machine Intelligence, 5(6), 612–621. https://doi.org/10.1038/s42256-023-00647-z

Readers' Seniority

PhD / Post grad / Masters / Doc: 9 (43%)
Researcher: 7 (33%)
Professor / Associate Prof.: 4 (19%)
Lecturer / Post doc: 1 (5%)

Readers' Discipline

Computer Science: 13 (68%)
Biochemistry, Genetics and Molecular Bi...: 3 (16%)
Mathematics: 2 (11%)
Materials Science: 1 (5%)

Article Metrics

News Mentions: 1
