SNPInterForest: A new method for detecting epistatic interactions

Abstract

Background: Multiple genetic factors and their interactive effects are thought to contribute to complex diseases. Detecting such interactive effects, i.e., epistatic interactions, remains a significant challenge in large-scale association studies.

Results: We have developed a new method, named SNPInterForest, for identifying epistatic interactions by extending an ensemble learning technique called random forest. Random forest is a predictive method that has been proposed for discovering the single-nucleotide polymorphisms (SNPs) that are most predictive of disease status in association studies. However, it is less sensitive to SNPs with little marginal effect, and it does not natively provide information on the interaction patterns of susceptibility SNPs. We extended the random forest framework to overcome these limitations by (i) modifying the construction of the random forest and (ii) implementing a procedure for extracting interaction patterns from the constructed random forest. The performance of the proposed method was evaluated on simulated data under a wide spectrum of disease models. SNPInterForest identified pure epistatic interactions with high precision and was still able to identify multiple interactions concurrently in the presence of genetic heterogeneity. It was also applied to real GWAS data on rheumatoid arthritis from the Wellcome Trust Case Control Consortium (WTCCC), and novel potential interactions were reported.

Conclusions: SNPInterForest, offering an efficient means to detect epistatic interactions without statistical analyses, is promising for practical use as a way to reveal the epistatic interactions involved in common complex diseases.

© 2011 Yoshida and Koike; licensee BioMed Central Ltd.
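
The abstract notes that random forest is less sensitive to SNPs with little marginal effect. As a rough illustration of why a "pure" epistatic interaction is hard to find one SNP at a time, the sketch below computes marginal disease risks from a two-SNP penetrance table. The table, the minor allele frequency of 0.5, and all names are assumptions made for this example; they are not taken from the paper's simulation models.

```python
# Genotype frequencies under Hardy-Weinberg equilibrium with an assumed
# minor allele frequency of 0.5 (chosen only to keep the arithmetic simple).
GENOTYPE_FREQ = {0: 0.25, 1: 0.5, 2: 0.25}

# Hypothetical penetrance table: P(disease | genotype at SNP A, genotype at SNP B).
# Risk is raised only when exactly one of the two SNPs is heterozygous.
PENETRANCE = {
    (0, 0): 0.0, (0, 1): 0.1, (0, 2): 0.0,
    (1, 0): 0.1, (1, 1): 0.0, (1, 2): 0.1,
    (2, 0): 0.0, (2, 1): 0.1, (2, 2): 0.0,
}


def marginal_penetrance(snp_index):
    """Disease risk per genotype of one SNP, averaged over the other SNP."""
    risk = {}
    for genotype in GENOTYPE_FREQ:
        total = 0.0
        for other, freq in GENOTYPE_FREQ.items():
            pair = (genotype, other) if snp_index == 0 else (other, genotype)
            total += freq * PENETRANCE[pair]
        risk[genotype] = total
    return risk


marginal_a = marginal_penetrance(0)
marginal_b = marginal_penetrance(1)

# Spread of risk across the joint genotypes: large, although each SNP's
# marginal risks are all equal.
joint_spread = max(PENETRANCE.values()) - min(PENETRANCE.values())

print("Marginal risk by genotype, SNP A:", marginal_a)
print("Marginal risk by genotype, SNP B:", marginal_b)
print("Risk spread across joint genotypes:", joint_spread)
```

In this toy table every genotype of either SNP carries the same marginal risk, so a single-SNP test or a split on one SNP alone sees no signal, while the joint genotypes differ clearly in risk.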

Figures

  • Figure 1 Performance for models of epistatic interactions with weak marginal effects. Performance of SNPInterForest compared with those of BOOST and SNPHarvester for models of epistatic interactions with weak marginal effects: (a) additive model, (b) multiplicative model, and (c) threshold model.
  • Figure 2 Distribution of the importance score compared with the original RF. Distribution of the importance score which is computed by permutation testing for a simple model of pure epistatic interactions. The upper panel shows the results from the original random forest, and the lower panel shows the result from SNPInterForest. The SNPs associated with disease are represented by red boxes, and the other SNPs are represented by black boxes.
  • Table 1 Comparison of performances of different methods on simple models of pure epistatic interactions
  • Table 2 Comparison of performances of different methods on hybrid models
  • Table 3 Comparison of performances of different methods on heterogeneous models
  • Table 4 Running time of different methods on WTCCC RA data
  • Table 5 Interactions identified by SNPInterForest in WTCCC RA data
  • Table 6 Gene information related to the SNPs identified in WTCCC RA data
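
The Figure 2 caption refers to an importance score computed by permutation testing. The sketch below shows the general idea of permutation importance: shuffle one SNP column and measure how much prediction accuracy drops. It is a generic illustration, not the paper's procedure. The fixed `predict` rule stands in for a trained forest, and the uniformly sampled genotypes, sample size, and seed are assumptions made for this example.

```python
import random


def predict(row):
    # Hypothetical fixed classifier standing in for a trained forest:
    # predicts disease when exactly one of SNP 0 and SNP 1 is heterozygous.
    return 1 if (row[0] == 1) != (row[1] == 1) else 0


def accuracy(rows, labels):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, column, rng):
    """Accuracy drop after shuffling the values of one SNP column."""
    baseline = accuracy(rows, labels)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - accuracy(permuted, labels)


rng = random.Random(0)
# Three SNPs per sample with genotypes coded 0/1/2; SNP 2 is pure noise.
rows = [[rng.choice([0, 1, 2]) for _ in range(3)] for _ in range(600)]
# Labels follow the interaction rule exactly (a noise-free toy data set).
labels = [predict(r) for r in rows]

importances = [permutation_importance(rows, labels, c, rng) for c in range(3)]
print(importances)
```

Shuffling either interacting SNP breaks the pairing and lowers accuracy, whereas shuffling the noise SNP leaves predictions unchanged, so its importance is zero.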

Cite

Yoshida, M., & Koike, A. (2011). SNPInterForest: A new method for detecting epistatic interactions. BMC Bioinformatics, 12. https://doi.org/10.1186/1471-2105-12-469
