Mean ratios; complete with a bottom line!

From: Mario Roederer (Roederer@drmr.com)
Date: Tue Mar 27 2001 - 15:36:54 EST

Next message: JOHN MARIO GONZALEZ _ PROFESOR DPT. MICROBIOLOGIA-.: "Hematogones"
Previous message: Howard Shapiro: "Re: Mean ratios; complete with a bottom line!"
In reply to: Howard Shapiro: "Re: Ratio or Mean"
Next in thread: Howard Shapiro: "Re: Mean ratios; complete with a bottom line!"
Reply: Howard Shapiro: "Re: Mean ratios; complete with a bottom line!"
Maybe reply: MATTHEW ROSINSKI: "Mean ratios; complete with a bottom line!"

Howard:

You seemed a bit steamed in your EMail... so it is with trepidation
that I step into the fray.  Hopefully you won't be "mean" to me.

>...the software gives you the geometric
>mean because it is easy to compute, i.e., it doesn't require transformation
>of the data between log and linear space, while computing the real mean
>would.

Most software doesn't give you the geometric mean because it's "easy";
it does so because it tends to be the most useful statistic.  The
transformation itself is trivial, adds no real programming burden,
and sure doesn't slow down today's computers noticeably.  Any software
package capable of software compensation clearly has the capability
of computing arithmetic means...
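
As a minimal sketch of how small that transformation is (not part of
the original message): the Python below assumes a 4-decade log
amplifier spread over 1024 channels, and the function names and the
three example events are hypothetical.  The geometric mean averages in
channel (log) space and converts once; the arithmetic mean converts
every event first and then averages.

```python
CHANNELS = 1024   # assumed number of channels on the log scale
DECADES = 4       # assumed dynamic range of the log amplifier

def channel_to_scale(channel):
    """Convert a log-scale channel number to a linear 'scale' value."""
    return 10 ** (channel * DECADES / CHANNELS)

def geometric_mean_from_channels(channels):
    """Average in channel (log) space, then convert the result once."""
    mean_channel = sum(channels) / len(channels)
    return channel_to_scale(mean_channel)

def arithmetic_mean_from_channels(channels):
    """Convert every event to a scale value, then average those values."""
    values = [channel_to_scale(c) for c in channels]
    return sum(values) / len(values)

# Hypothetical events at scale values 100, 1,000 and 10,000.
events = [512, 768, 1024]
geo = geometric_mean_from_channels(events)
arith = arithmetic_mean_from_channels(events)
print(round(geo), round(arith))   # prints: 1000 3700
```

Either statistic costs one exponentiation per event at most, which is
negligible next to compensation.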

>TAKING THE RATIO OF TWO QUANTITIES ON A LOG SCALE IS SO STUPID THAT ANYONE
>INCLUDING SUCH A CALCULATION IN A PAPER SUBMITTED TO A JOURNAL SHOULD BE
>BANNED FOR A YEAR FROM SUBMITTING ANOTHER PAPER - but, lucky for so many
>people, most of the reviewers and editors, even those associated with some
>really toney journals, are blissfully unaware just how stupid it is.

It is important to clarify that what is stupid is ratioing the
channel values of a log scale--i.e., ratioing values that increase
linearly on a logarithmic fluorescence scale.  There is nothing
inherently wrong with ratioing the "scale" values (those that
increase exponentially).  For example, if one population has a median
fluorescence of 10,000 (4th decade, channel 1024) and another has a
median fluorescence of 1,000 (3rd decade, channel 768), then it is
appropriate to note that the first is 10 times as bright as the
second (not 1.33 times, the ratio of the channel numbers).
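
The arithmetic in this example can be checked with a short sketch
(added for illustration, not from the original message).  It assumes
the same 4-decade, 1024-channel log scale implied by the numbers
above; the helper name is hypothetical.

```python
def channel_to_scale(channel, channels=1024, decades=4):
    """Convert a log-scale channel number to a linear 'scale' value.

    Assumes `decades` decades spread evenly over `channels` channels.
    """
    return 10 ** (channel * decades / channels)

bright_channel = 1024   # median of the brighter population
dim_channel = 768       # median of the dimmer population

# Ratio of channel numbers: not meaningful on a log scale.
channel_ratio = bright_channel / dim_channel

# Ratio of scale values: the actual fold difference in brightness.
scale_ratio = channel_to_scale(bright_channel) / channel_to_scale(dim_channel)

print(f"{channel_ratio:.2f} {scale_ratio:.1f}")   # prints: 1.33 10.0
```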

We used ratios when it was appropriate: for example, when we were
measuring the fold-increase of beta-gal activity driven by a promoter
after stimulation (measured by FACS-Gal assay).  The pre-stimulation
condition still expressed considerable beta-gal; when stimulated, we
got 5x or 10x as much.  The ratio of the median fluorescences was
appropriate because we found that the RATIO of the post- to
pre-stimulation values was conserved across different cell lines,
although they had different basal expression levels (and therefore
different stimulated levels).  This was interesting
scientifically--says something about the log-responsiveness of
promoters and enhancers... but I digress.  Note that it is rarely
correct to ratio against the median or mean autofluorescence--rather,
as you point out, subtraction is superior for such a case.
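
A small sketch of the two operations contrasted here (added for
illustration; the event values are made up and the function names are
hypothetical).  Fold induction ratios the post- and pre-stimulation
medians; correcting for autofluorescence subtracts the unstained
median instead.  All values are linear scale values, not channel
numbers.

```python
from statistics import median

def fold_induction(pre_events, post_events):
    """Ratio of post- to pre-stimulation median fluorescence."""
    return median(post_events) / median(pre_events)

def specific_signal(stained_events, unstained_events):
    """Median fluorescence with the autofluorescence median subtracted."""
    return median(stained_events) - median(unstained_events)

# Made-up data: two cell lines with a tenfold difference in basal level.
line_a_pre, line_a_post = [80, 100, 120], [400, 500, 600]
line_b_pre, line_b_post = [800, 1000, 1200], [4000, 5000, 6000]

fold_a = fold_induction(line_a_pre, line_a_post)
fold_b = fold_induction(line_b_pre, line_b_post)
print(fold_a, fold_b)   # prints: 5.0 5.0

# Made-up unstained control for line A.
signal_a = specific_signal(line_a_post, [8, 10, 12])
print(signal_a)         # prints: 490
```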

>If you are actually trying to compare flow data with a bulk assay of some
>kind - for example, you have determined the total amount of fluorescent
>label in a solution of 100,000 cells, and you now want to calibrate the
>flow cytometric fluorescence histogram in terms of molecules of label per
>channel - you do need to use the arithmetic mean, as Alice Givan recently
>pointed out, and you therefore need linear data, while you usually have log
>data.

Actually, we found a very good correlation between the MEDIAN
fluorescence of a population of cultured cells expressing
b-galactosidase (measured by the FACS-Gal assay) and the total b-gal
content of the population by a biochemical assay (MUG).  Of course,
this was because the populations were relatively homogeneous
(clonal), with about 1-decade range in fluorescence--for
heterogeneous expressions, the median was not a very good correlate
of the biochemical activity.

This actually raises the most important point that everyone seems to
be dancing around but ignoring:  using any statistic is good as long
as you justify (to yourself, and to the reviewers) that it is
appropriate!  In other words, if your cell population is homogeneous,
then nearly anything will work.  If it's not homogeneous, then you
may have a lot of trouble with any single statistic.

While I agree that the arithmetic mean is probably going to be the
closest for heterogeneous populations, I disagree that it should be
used!  The fact that the population cannot be effectively described
by the median means that there is an interesting heterogeneity
underlying the expression--and therefore it becomes a mistake to
reduce the data to a single value.

After all, this is where the power of flow cytometry is:  in the
description of the DISTRIBUTION of expression.  The fact that people
continue to take pains to reduce our gloriously rich and detailed
data to a single number pains me to no end!  Much better would be to
calculate the 10th, 25th, 50th (median), 75th, and 90th percentiles
of a complex distribution:  at least now you have 5 parameters to the
distribution and therefore a much better chance of accurately
describing it (and possibly discovering underlying phenomena hidden
by using only a single value).
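
A hedged sketch of such a five-percentile summary (added for
illustration, not from the original message).  The data are a made-up
bimodal sample drawn with a fixed seed, the function name is
hypothetical, and `statistics.quantiles` needs Python 3.8 or later.

```python
import math
import random
from statistics import mean, median, quantiles

def percentile_summary(values, percentiles=(10, 25, 50, 75, 90)):
    """Return {percentile: value}, interpolating linearly between events."""
    # With n=100, cuts[k - 1] is the k-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in percentiles}

rng = random.Random(0)   # fixed seed so the made-up data are reproducible

# Hypothetical heterogeneous sample (linear scale values): 600 dim events
# centered near 100 and 400 bright events centered near 10,000.
dim = [rng.lognormvariate(math.log(100), 0.3) for _ in range(600)]
bright = [rng.lognormvariate(math.log(10_000), 0.3) for _ in range(400)]
events = dim + bright

summary = percentile_summary(events)
for p, value in summary.items():
    print(f"{p}th percentile: {value:.0f}")

# The single-number summaries, for comparison.
print(f"median: {median(events):.0f}  arithmetic mean: {mean(events):.0f}")
```

In this made-up sample the 10th through 50th percentiles sit in the
dim population and the 75th and 90th in the bright one, so the five
numbers reveal the two populations, whereas the arithmetic mean lands
between them, where almost no events lie.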


Here's my bottom line for log distributions:

"If the median is an inadequate description of the distribution, then
it is inappropriate to reduce the distribution to a single value by
any algorithm."

In such a case, using the arithmetic mean, geometric mean, Mario's
mean, Howard's mean, or even God's mean (should that actually differ
from Howard's) won't be any better and is only throwing mud onto a
beautiful painting of data.

mr

(PS: Mario's never mean.)

This archive was generated by hypermail 2b29 : Wed Apr 03 2002 - 11:57:31 EST