Re[2]: Kolmogorov-Smirnov

From: cabrera.ej@pg.com
Date: Wed Nov 26 1997 - 15:02:00 EST


Ken
I could not agree more.  The K-S statistic as generated in CellQuest (it 
probably uses channels to get degrees of freedom) is far too sensitive to 
yield meaningful P values.  I do not know how to get the degrees of freedom 
from the CV.  I hope that someone reading these messages does know and 
can educate us!

Ed


______________________________ Reply Separator _________________________________
Subject: Re: Kolmogorov-Smirnov
Author:  (INTERNET)AULTK@mail.mmc.org at external
Date:    11/25/97 10:02 AM


As someone who made extensive use of the K-S test in a past life, I would 
like to point out one important issue that is not dealt with in 
statistics textbooks, and that I think was not explicitly mentioned in Ted 
Young's excellent paper that introduced K-S to the flow community.

The K-S test is typically used to compare two frequency distributions 
non-parametrically.  The number of degrees of freedom that one uses to 
calculate a p value from a K-S statistic is based upon the number of 
bins in the frequency distribution histogram.  For flow cytometry data 
one is tempted to use the number of channels, i.e. 256 or 1024 etc. as 
the degrees of freedom.  Doing this will make any two histograms that 
are not identical come out as highly significantly different.  In other 
words, using channels as degrees of freedom makes the K-S test far too 
sensitive to trivial differences in the histograms.
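
To illustrate the sensitivity described above, here is a minimal Python 
sketch. It is an assumption, not what CellQuest does: it uses the standard 
asymptotic Kolmogorov series for the p value, and `n_eff` is a hypothetical 
name for whatever count is taken as the effective number of independent 
observations. The point is only that, for a fixed K-S statistic D, the p 
value collapses as that count grows.

```python
import math

def ks_p_value(d, n_eff, terms=100):
    """Asymptotic p value for a K-S statistic d (Kolmogorov series).

    n_eff is a hypothetical parameter: the effective number of
    independent observations assumed to be behind the comparison.
    """
    lam = d * math.sqrt(n_eff)
    if lam < 0.2:
        # The series converges slowly here; the p value is 1 to within ~1e-12.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return max(0.0, min(1.0, 2.0 * total))

# Same small difference (D = 0.02), increasingly large assumed counts.
for n_eff in (100, 1000, 10000, 100000):
    print(n_eff, ks_p_value(0.02, n_eff))
```

With D held at 0.02, the printed p value falls from essentially 1 toward 
zero as `n_eff` increases, so the choice of that count decides the outcome.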
   In fact our flow histograms have far fewer degrees of freedom than
they have channels.  The correct value is based upon the CV of your 
histogram, i.e. how many distinct distinguishable histograms can you fit 
into the number of channels that you have?  For most of our data we have 
no way (that I know of) to estimate the correct number of degrees of 
freedom.
   For this reason I have always used the K-S statistic as a measure of
the difference between two histograms, but have not used it to calculate 
a p value.  I am sure there are others listening to this discussion who 
are better qualified to discuss how one calculates degrees of freedom 
for a flow histogram - I for one would be interested in such a 
discussion.
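
Using the K-S statistic purely as a measure of difference, as described 
above, can be sketched as follows. This is a minimal example under stated 
assumptions: each histogram is a list of event counts per channel, both 
lists have the same number of channels, and the function name 
`ks_statistic` is made up for illustration. No p value is computed.

```python
from itertools import accumulate

def ks_statistic(hist_a, hist_b):
    """Largest absolute gap between the two cumulative distributions."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of channels")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each histogram must contain at least one event")
    # Normalise the running sums so each curve ends at 1.0.
    cdf_a = [c / total_a for c in accumulate(hist_a)]
    cdf_b = [c / total_b for c in accumulate(hist_b)]
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))

# Toy 4-channel histograms (invented counts, not real flow data).
print(ks_statistic([1, 1, 1, 1], [0, 2, 2, 0]))
```

The result lies between 0 (identical cumulative curves) and 1 (no overlap), 
and it does not depend on any choice of degrees of freedom.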

Ken Ault
aultk@mail.mmc.org


