I will make an attempt here to sort out the can of worms Alice G. called contour maps. Hopefully we won't get any noggin clogs along the way.

Firstly, we must return to the smoothed-data discussion. Assuming that one is using a 64x64 or 128x128 grid to generate the 2D histogram to be contoured, it is my experience that unless one has a large number of events in the histogram (several hundred thousand), the contour lines are very messy. The noise in the system and the relative sparseness of the data tax most contour-line generation algorithms. So, in order to have readable contour maps, one needs either to collect very large data sets or to smooth small ones. On unsmoothed small data sets, other graphic presentations will probably be more informative than contour maps.

Secondly, there is the question of how to choose the interval between contour levels. Three methods are currently in common use in the flow community.

The most common can be referred to as Linear, where the interval between contour lines is a fixed density or number of events. Choosing this interval is the can of worms that Alice refers to - make the interval too small and the contour map turns into a big black smudge; make it too large and significant features can disappear. I have not seen a good algorithm for automatically generating this interval.

Another method of choosing contour intervals can be called Logarithmic. The user specifies an interval, say 50%, and the algorithm finds the highest peak, puts the first contour at 50% of that level, the next at 25%, and so forth until the last contour is at the one-event level. This is an automatic process, and graphs generated with this method are consistent in showing variation in the data that occurs at low frequency, but poor at showing moderate- to high-frequency features.

The last method can be called Probability. This method was developed by Wayne Moore in the early 1980s.
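To make the Linear and Logarithmic schemes concrete, here is a minimal sketch in Python/NumPy. It assumes the data have already been binned into a 2D histogram array; the function names and parameters are mine for illustration, not from any particular flow package.

```python
import numpy as np

def linear_levels(hist, interval):
    """Linear scheme: a fixed interval (in events per bin) between
    contour lines, from `interval` up to the peak bin height."""
    return list(np.arange(interval, hist.max(), interval))

def logarithmic_levels(hist, fraction=0.5):
    """Logarithmic scheme: first contour at `fraction` of the highest
    peak, each subsequent contour at `fraction` of the previous one,
    stopping at the one-event level."""
    level = hist.max() * fraction
    levels = []
    while level >= 1.0:          # stop once below one event per bin
        levels.append(level)
        level *= fraction
    return sorted(levels)        # ascending, as plotting libraries expect
```

For example, with a peak bin of 1000 events, `logarithmic_levels(hist, 0.5)` gives contours at 500, 250, 125, ... down to just under 2, which illustrates why this scheme emphasizes low-frequency structure: most of the contour lines sit near the bottom of the distribution.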
Based on its use here for over 15 years, I can confidently say it provides an automatic way of contouring immunologic flow data that shows all significant moderate- and high-level features. Combined with outlying dots, it also allows viewing low-level features. Commercially this method is available in both CellQuest (BD) and FlowJo (Treestar).

Briefly, here's how it works. The user specifies a percentage, which, as in the Logarithmic method, determines the number of contour levels - 10% results in nine, 5% results in 19, etc. The algorithm picks the contour levels so that the specified percentage of events falls between each level. Note that if there are separate event populations (as there usually are), then the number of events between corresponding contour levels on each population will add up to the specified percentage. Mathematically, this means that any event chosen at random has an equal chance of appearing between any two contour levels - hence the name Probability contours.

Using probability contours with outlying dots on smoothed data removes almost all the uncertainty that Alice expressed for visualizing immunologic flow data. For data that has populations with narrow, sharp peaks, such as chromosomes, Linear levels work much better.

Lastly, what is important is not the contours per se, but the method of choosing the contour line levels and the smoothness of the underlying distribution. Having chosen the levels, other renderings of the data, such as the pseudo-color plots on probability levels in FlowJo, can be just as informative as contour lines. Nothing, however, will substitute for the experience and insight of the researcher analyzing the data.

-Marty Bigos
Stanford Shared FACS Facility
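The probability scheme described above can be sketched as follows: sort the bin heights from highest to lowest, accumulate the event fraction, and pick as a contour level the bin height at which each multiple of the specified percentage is reached. This is a minimal illustration in Python/NumPy under my own assumptions about the interface, not the actual CellQuest or FlowJo implementation.

```python
import numpy as np

def probability_levels(hist, percent=10.0):
    """Probability scheme: choose contour levels so that roughly
    `percent` of all events falls between each adjacent pair of
    levels (10% -> nine levels, 5% -> nineteen, etc.)."""
    counts = np.sort(hist.ravel())[::-1]        # bin heights, highest first
    cum = np.cumsum(counts) / counts.sum()      # cumulative event fraction
    fracs = np.arange(percent, 100.0, percent) / 100.0
    levels = []
    for f in fracs:
        # first position where the enclosed fraction reaches f;
        # the bin height there is the contour level for that fraction
        idx = np.searchsorted(cum, f)
        levels.append(counts[min(idx, len(counts) - 1)])
    return sorted(set(levels))                  # ascending, duplicates dropped
```

Because the levels are tied to event fractions rather than bin heights, a randomly chosen event is equally likely to land between any two adjacent contours, which is the property that gives the method its name.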
This archive was generated by hypermail 2b29 : Wed Apr 03 2002 - 11:50:11 EST