We propose a novel mixture probability model for the probability distribution function (PDF) of microarray signals, which comprises a noise component and a signal component. The noise term, due to non-specific mRNA hybridization, is given by a lognormal distribution, and the true signal, from specific mRNA hybridization, is described by the generalized Pareto-gamma (GPG) function. The model, applied to expression data of 251 human breast cancer tumors on the Affymetrix microarray platform, yields accurate fits for all tumor samples. We observe that (i) highly aggressive cancers have, in general, broader right tails in the GPG than less aggressive cancers; and (ii) the exponent parameter value of the GPG distribution is not constant and correlates strongly with ∼4000 expressed genes and several "gold standard" clinical risk factors. These results cannot be obtained from so-called "scale-free network" models. We conclude that an accurate parameterization of the scale-dependent GPG function could provide robust prognostic benefits for cancer patients. © Springer-Verlag Berlin Heidelberg 2006.
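The two-component structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form of the GPG, so the signal term here is a Pareto type II (Lomax) density used purely as a stand-in; the mixing weight, the function names, and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Lognormal density with log-scale mean mu and log-scale sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lomax_pdf(x, alpha, scale):
    """Pareto type II (Lomax) density.

    Assumption: a heavy-tailed placeholder only, NOT the paper's GPG function.
    """
    if x < 0:
        return 0.0
    return (alpha / scale) * (1.0 + x / scale) ** (-(alpha + 1.0))


def mixture_pdf(x, weight, noise_pdf, signal_pdf):
    """Two-component mixture: weight * noise(x) + (1 - weight) * signal(x)."""
    return weight * noise_pdf(x) + (1.0 - weight) * signal_pdf(x)


def noise(x):
    # Hypothetical noise parameters, chosen only for illustration.
    return lognormal_pdf(x, mu=4.0, sigma=0.5)


def signal(x):
    # Hypothetical tail parameters, chosen only for illustration.
    return lomax_pdf(x, alpha=1.5, scale=200.0)


# Mixture density at a signal intensity of 100 with a hypothetical weight of 0.4.
density = mixture_pdf(100.0, 0.4, noise, signal)
```

In this stand-in, a smaller `alpha` gives a heavier right tail, which is the kind of tail behavior the abstract contrasts between tumor groups. Substituting the actual GPG density for `lomax_pdf` would require the definition given in the full paper.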
CITATION STYLE
Chua, A. L. S., Ivshina, A. V., & Kuznetsov, V. A. (2006). Pareto-gamma statistic reveals global rescaling in transcriptomes of low and high aggressive breast cancer phenotypes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4146 LNBI, pp. 49–59). Springer Verlag. https://doi.org/10.1007/11818564_7