Data processing and classification analysis of proteomic changes: A case study of oil pollution in the mussel, Mytilus edulis

This article is free to access.

Abstract

Background: Proteomics may help to detect subtle pollution-related changes, such as responses to mixture pollution at low concentrations, where clear signs of toxicity are absent. The challenges associated with the analysis of large-scale multivariate proteomic datasets have been widely discussed in medical research and biomarker discovery. This concept has been introduced to ecotoxicology only recently, so data processing and classification analysis need to be refined before they can be readily applied in biomarker discovery and monitoring studies.

Results: Data sets obtained from a case study of oil pollution in the Blue mussel were investigated for differential protein expression by retentate chromatography-mass spectrometry and decision tree classification. Different tissues and different settings were used to evaluate the discriminatory power of the classifiers. It was found that, due to the intrinsic variability of the data sets, reliable classification of unknown samples could only be achieved on a broad statistical basis (n > 60), with the observed expression changes being highly statistically significant and of sufficient amplitude. The application of stringent criteria to guard against overfitting of the models eventually allowed satisfactory classification for only one of the investigated data sets and settings.

Conclusion: Machine learning techniques provide a promising approach to process and extract informative expression signatures from high-dimensional mass-spectrometry data. Even though characterisation of the proteins forming the expression signatures would be ideal, knowledge of the specific proteins is not mandatory for effective class discrimination. This may constitute a new biomarker approach in ecotoxicology, where working with organisms that do not have sequenced genomes renders protein identification by database searching problematic. However, data processing has to be critically evaluated and statistical constraints have to be considered before supervised classification algorithms are employed.

© 2006 Monsinjon et al; licensee BioMed Central Ltd.

Figures

  • Table 1: Differentially expressed peptides/proteins from gills (A) and digestive gland (DG, B) of Blue mussels obtained from controls (C) and 21d exposures to 0.5 mg/L crude oil (oil) and 0.5 mg/L crude oil spiked with a mixture of APs and PAHs (0.1 mg/L; sO); mean intensities (arbitrary values) and SD of controls (Gills, n = 51; DG, n = 74), oil (Gills, n = 66; DG, n = 71) and sO (Gills, n = 55; DG, n = 69). Expression changes relative to the controls are significantly different at a level of p < 0.001 (one-way ANOVA for C vs. oil vs. sO) at least for one of the exposures. Peaks are listed in decreasing order of their respective p values from top to bottom.
  • Table 2: Prediction success of the classifiers for gills (A) and digestive gland (B) from the Blue mussel for pairwise comparison of controls (C) with either of the two different exposures to i) 0.5 mg/L crude oil (oil) and ii) 0.5 mg/L crude oil spiked with a mixture of APs and PAHs (0.1 mg/L; sO) for 21 days, as validated with an independent sample set not used in model construction. Sample sizes for the learning and testing sets, the number of samples attributed to each class, and the individual and overall classification success in % of the testing set are presented.
  • Table 3: Peak constituents of the optimal classification models for gills (A) and digestive gland (B) of Blue mussels exposed to 0.5 mg/L crude oil (oil) and 0.5 mg/L crude oil spiked with a mixture of APs and PAHs (0.1 mg/L; sO) for 21 days. The score equals the discriminatory weight of the variable within the classifier. A pre-selection of highly significant peaks (p < 0.001) has been carried out prior to model construction.
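
The workflow summarised in the abstract and table captions (pre-selection of strongly differing peaks, model construction on a learning set, validation on an independent testing set) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the intensities are synthetic, the function names (`anova_f`, `fit_stump`, `predict`, `simulate`) are hypothetical, peaks are ranked by the one-way ANOVA F statistic instead of being filtered at p < 0.001, and a single-threshold tree (a decision stump) stands in for the decision tree classifier of the study, whose software and settings are not given here.

```python
import random


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of peak intensities."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def fit_stump(values, labels):
    """Depth-one tree on a single peak: best threshold and the class above it."""
    candidates = sorted(set(values))
    best_acc, best_thr, best_above = -1.0, None, None
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2.0
        for above in (0, 1):
            hits = sum((above if v > thr else 1 - above) == y
                       for v, y in zip(values, labels))
            acc = hits / len(values)
            if acc > best_acc:
                best_acc, best_thr, best_above = acc, thr, above
    return best_thr, best_above


def predict(value, thr, above):
    """Assign a class (0 = control, 1 = exposed) from one peak intensity."""
    return above if value > thr else 1 - above


def simulate(rng, n_per_class, n_peaks=5, informative=2):
    """Synthetic intensities (assumption): one peak is raised in exposed samples."""
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(100.0, 10.0) for _ in range(n_peaks)]
            if label == 1:
                row[informative] += 50.0
            rows.append(row)
            labels.append(label)
    return rows, labels


rng = random.Random(0)
learn_rows, learn_labels = simulate(rng, 40)  # learning set
test_rows, test_labels = simulate(rng, 20)    # independent testing set

# Rank peaks on the learning set only, so the testing set stays untouched.
f_scores = []
for p in range(5):
    groups = [[r[p] for r, y in zip(learn_rows, learn_labels) if y == c]
              for c in (0, 1)]
    f_scores.append(anova_f(groups))
best_peak = max(range(5), key=lambda p: f_scores[p])

# Build the classifier on the learning set, then score it on the testing set.
thr, above = fit_stump([r[best_peak] for r in learn_rows], learn_labels)
test_hits = sum(predict(r[best_peak], thr, above) == y
                for r, y in zip(test_rows, test_labels))
test_accuracy = test_hits / len(test_rows)
print(best_peak, round(test_accuracy, 2))
```

Ranking the peaks and fitting the threshold on the learning set alone is a design choice of this sketch: it keeps the testing set independent, which is the kind of guard against overfitting that the abstract refers to.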

Cite

Monsinjon, T., Andersen, O. K., Leboulenger, F., & Knigge, T. (2006). Data processing and classification analysis of proteomic changes: A case study of oil pollution in the mussel, Mytilus edulis. Proteome Science, 4. https://doi.org/10.1186/1477-5956-4-17
