Re: Descriptive statistics

From: Saverio Alberti (alberti@alpha400.cmns.mnegri.it)
Date: Sat May 25 2002 - 10:06:43 EST


A fresh look at FC data analysis is welcome. I must say I enjoyed the
comments.

However, one should not forget to read what others have done over the
past few decades. A long discussion on KS statistics went on in the past;
maybe the participants would like to comment.

Only two points. First, mathematical modeling of "difficult" distributions
(a few dimly expressing cells and a majority of non-expressing ones)
can be effectively performed (1, 2). Whether or not these and similar
papers have been well understood and/or put into practice is a different
issue.

Furthermore, performing statistics on populations of expressing cells
(positives) that are clearly separated from non-expressing cells
(negatives) is very different from doing so when the two are merged in a
continuous distribution (3, 4). Obviously, the word 'positive' in the
first case isn't arbitrary at all, and '% positive' may be a rather
concise and meaningful way to describe the sample in a yes-or-no
situation, e.g. gene transfection. In the second case, the capacity to go
back to the first case, whether mathematically (1, 2) or electronically
(3, 4), is a distinct advantage.


Saverio


1. Lampariello, F. Evaluation of the number of positive cells from
   flow cytometric immunoassays by mathematical modeling of cellular
   autofluorescence. Cytometry 15: 294-301, 1994.
2. Lampariello, F. and Aiello, A. Complete mathematical modeling
   method for the analysis of immunofluorescence distributions composed
   of negative and weakly positive cells. Cytometry 32: 241-254, 1998.
3. Alberti, S., Parks, D. R., and Herzenberg, L. A. A single laser
   method for subtraction of cell autofluorescence in flow cytometry.
   Cytometry 8: 114-119, 1987.
4. Alberti, S., Bucci, C., Fornaro, M., Robotti, A., and Stella, M.
   Immunofluorescence analysis in flow cytometry: better selection of
   antibody-labeled cells after fluorescence overcompensation in the red
   channel. J. Histochem. Cytochem. 39: 701-706, 1991.



Saverio Alberti
Head, Lab. of Experimental Oncology
Department of Cell Biology and Oncology
Consorzio Mario Negri Sud
66030 Santa Maria Imbaro (Chieti), Italy
Phone: (39-0872) 570.293
FAX: (39-0872) 570.412
E-mail: alberti@negrisud.it


On Thu, 23 May 2002, Sergey Dzekunov wrote:

> I am a new member of the discussion group and would like to address a few
> questions of general nature.
>
> 1) I have noticed in previous messages that, when talking about
> descriptive statistics of FC data, some people refer to the
> Kolmogorov-Smirnov test as the tool to determine which distribution
> better describes a data set. I am not sure about specific
> implementations of this test, but I am afraid that it applies to
> continuous distribution functions, whereas binned distributions should
> be tested with the Chi-square test for the same purpose.
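
As a rough illustration of the binned comparison described in point 1, here is a minimal sketch of a chi-square statistic for two histograms that share the same bins but may hold different numbers of events. The function name and the example counts are made up for illustration. The degrees of freedom are taken as the number of non-empty bins minus one, which is the usual choice for a homogeneity test but is an assumption here; the sketch stops at the statistic and leaves the p-value to a chi-square table or a statistics library.

```python
from math import sqrt


def chi_square_two_histograms(r, s):
    """Chi-square statistic for two binned data sets with possibly unequal totals.

    r, s: per-bin event counts measured over identical bin edges.
    Returns (statistic, degrees_of_freedom).
    """
    if len(r) != len(s):
        raise ValueError("histograms must share the same bins")
    total_r, total_s = sum(r), sum(s)
    if total_r == 0 or total_s == 0:
        raise ValueError("each histogram needs at least one event")

    # Scale factors that compensate for unequal numbers of events.
    scale_r = sqrt(total_s / total_r)
    scale_s = sqrt(total_r / total_s)

    statistic = 0.0
    used_bins = 0
    for r_i, s_i in zip(r, s):
        if r_i + s_i == 0:
            continue  # bins empty in both samples carry no information
        used_bins += 1
        statistic += (scale_r * r_i - scale_s * s_i) ** 2 / (r_i + s_i)

    # Assumption: one constraint, so df = non-empty bins - 1.
    return statistic, used_bins - 1


# Hypothetical counts per fluorescence channel bin (illustrative only).
control = [120, 340, 310, 150, 60, 20]
stained = [90, 280, 330, 190, 80, 30]
stat, dof = chi_square_two_histograms(control, stained)
print(f"chi-square = {stat:.2f} on {dof} degrees of freedom")
```

A large statistic relative to the degrees of freedom suggests the two binned distributions differ; how large is "large" depends on the significance level chosen.
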
>
> 2) Another common task is comparison of distribution means. To
> generalize on the first comment, I'd like to share with the group the
> following logic, which I was glad to find in this famous book:
> "Numerical Recipes in C: The Art of Scientific Computing," second
> edition, Cambridge University Press, 1988-1992, ISBN 0521431085.
> Here is the scheme:
> Q: Do two samples have different means?
> A: Prior to comparing the averages, one should do the following:
>	Step 1: Run Chi-square test to see if the two distributions are
>		different. "No" -- go to step 2. "Yes" -- go to step 5.
>	Step 2: F-test to find if the two data sets have the same variances.
>		"Yes" -- go to step 3. "No" -- go to step 4.
>	Step 3: t-test to see if two samples have the same means (or
>		significantly different ones). Done.
>	Step 4: Use the unequal-variance t-test, but be careful with
>		distributions that are substantially different in shape. With
>		this in mind, consider it done.
>	Step 5: This situation is very likely to be identical to the "peanuts
>		and oranges" one. Since the distributions are different, any
>		conclusions about differences in their means may be quite
>		speculative.
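
The five-step scheme above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the yes/no outcomes of steps 1 and 2 are passed in as booleans (deciding them requires a significance level that the scheme does not specify), and the code computes test statistics and degrees of freedom only, not p-values. The unequal-variance test is written with the Welch-Satterthwaite degrees of freedom, which is one common form of it.

```python
from math import sqrt
from statistics import mean, variance


def f_ratio(a, b):
    """Step 2: ratio of the larger sample variance to the smaller (>= 1)."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def student_t(a, b):
    """Step 3: pooled-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Step 4: unequal-variance t statistic with Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    qa, qb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(qa + qb)
    dof = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return t, dof


def choose_mean_test(distributions_differ, variances_equal):
    """Map the answers from steps 1 and 2 to the step that follows."""
    if distributions_differ:
        return "step 5: comparing means may be speculative"
    if variances_equal:
        return "step 3: Student's t-test"
    return "step 4: unequal-variance t-test"


# Hypothetical per-cell measurements (illustrative only).
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 3, 4, 5, 6]
print(choose_mean_test(distributions_differ=False, variances_equal=True))
print("F ratio:", f_ratio(sample_a, sample_b))
print("Student t, df:", student_t(sample_a, sample_b))
print("Welch t, df:", welch_t(sample_a, sample_b))
```

Keeping the decision logic separate from the statistics makes it easy to plug in whatever chi-square and F-test thresholds one prefers.
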
>
> 3) This part refers to the procedure of determining "percent positive"
> cells, which I think everyone working with flow cytometry has to deal
> with at least sometimes. Although there is little if any statistical
> power in this parameter, it is widely used in validation and comparison
> of assays and is almost universal in reporting results of gene
> transfection/expression.
> The "percent positive" is determined as the percentage of cells above a
> threshold arbitrarily established on a control distribution. Such
> thresholding cannot distinguish between cells that have increased their
> fluorescence by one percent and those that have increased it ten-fold,
> as long as both have crossed the threshold.
> Nevertheless, given the popularity and simplicity of this parameter, it
> would be worthwhile to standardize it in some fashion. I have put
> together a simple algorithm that computes a single number for "percent
> positive" that is immune to influence from the user. However, before
> splitting hairs with others and posting this algorithm, I would like to
> know if someone has already tried to solve the same problem.
> What does the discussion group think about it? Has any FC society ever
> posted guidelines on certain algorithms, and specifically on
> thresholding?
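
To make the thresholding limitation in point 3 concrete, here is a minimal sketch. It is not the algorithm mentioned in the message, which was not posted; it only shows the conventional approach. The threshold is placed at the 99th percentile of the control, which is an arbitrary choice made for this example, and all names and values are hypothetical.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of the data
    at or below it (q is an integer between 1 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def percent_positive(sample, threshold):
    """Percentage of events whose fluorescence exceeds the threshold."""
    above = sum(1 for x in sample if x > threshold)
    return 100.0 * above / len(sample)


# Hypothetical control: 100 events with fluorescence 1..100 (arbitrary units).
control = list(range(1, 101))

# Assumption: threshold set at the 99th percentile of the control.
threshold = percentile(control, 99)

# Two hypothetical samples: in both, 20 of 100 cells cross the threshold,
# but by very different margins (1% above it versus ten-fold above it).
slightly_brighter = [50] * 80 + [threshold * 1.01] * 20
much_brighter = [50] * 80 + [threshold * 10] * 20

print("threshold:", threshold)
print("slightly brighter:", percent_positive(slightly_brighter, threshold))
print("much brighter:", percent_positive(much_brighter, threshold))
```

Both samples report the same percent positive even though their positive cells differ ten-fold in brightness, which is exactly the loss of information described above.
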
>
> I highly regard flow cytometry as an "intuitive" and "artistic"
> technique that has as much power as a researcher is capable of making
> use of -- I must say this explicitly out of respect for the talented
> scientists who think far and wide when looking at the data. However, as
> a biophysicist I can't help but try to bring more sense into the
> numbers, as long as those are such a big part of our scientific
> language :)
> I sincerely wish that articles like this one appeared more often:
> Durand R.E. Calibration of Flow Cytometer Detection Systems. Methods in
> Cell Biology, Acad. Press 1990, Vol. 33, p. 647
>
> Sincerely,
> Sergey M. Dzekunov
> MaxCyte, Inc. Rockville, MD
>
>
>


