Re: Last words from Mario (we hope) on data display

From: Howard Shapiro (hms@shapirolab.com)
Date: Sat Oct 04 1997 - 22:16:13 EST


>I will propose
>to Cytometry to write a perspective on the topic of FACS data display.

>I challenge any of those of you who
>champion dot plots (or even color dot plots) to join my effort and write a
>"counterpoint" analysis to provide a balancing viewpoint.

A forum with various viewpoints, well documented and illustrated, would
probably be very helpful to the readership.

>
>Calman writes about dot plots:
>
>>   Furthermore it stresses the single cell nature of the
>>   data- each dot is a cell.
>
>No, no, no, no, no, no, no!  This is precisely the problem!  In dot plots, each
>dot can be one cell, two cells, three cells, or a thousand cells.  You can never
>know which.

I think what he probably meant was that a dot plot does allow you to spot
numbers of occurrences below your lowest threshold value for contouring,
including single occurrences; it is true that if there isn't a dot at a
point on a dot plot there weren't any cells observed with the corresponding
data values.
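
As a rough illustration of both points (a dot may stand for any number of
cells, but an empty spot means zero cells), here is a minimal Python sketch.
The function name, the bin width and the sample events are hypothetical,
chosen only for illustration; real display software may bin differently.

```python
from collections import Counter


def dot_plot_pixels(events, bin_width=1.0):
    """Count how many events land on each display pixel (bin).

    `events` is a list of (x, y) measurements; `bin_width` is an assumed
    pixel size in the same units as the data.
    """
    return Counter(
        (int(x // bin_width), int(y // bin_width)) for x, y in events
    )


# Hypothetical events: three fall on the same pixel, one stands alone.
events = [(1.2, 3.4), (1.7, 3.9), (1.1, 3.0), (5.5, 0.2)]
counts = dot_plot_pixels(events)

# A dot plot draws one dot per occupied pixel and discards the count.
dots = set(counts)
```

Here `dots` holds two pixels, one backed by three events and one by a single
event; the drawn dots alone cannot tell them apart, while any pixel missing
from `dots` is known to hold no events.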


>I agree with Alice that the precise way of contouring can affect how the more
>frequent populations appear.  This is why the choice of contouring algorithms is
>so important!  One of the most robust (in that it is objective, not allowing
>"user-defined" contouring levels, etc.) is probability contouring.  This method
>of contouring has been adopted by SAS Institute for use in their bivariate
>displays--in addition, it is offered by several FACS data analysis packages.
>
>This method of contouring generates displays that are independent of the number
>of events collected -- something that no other display can do.

By "probability contouring" do you mean normalization, so the contour lines
represent percentile values rather than absolute numbers of cells?  This is
a very sensible method of displaying things, and facilitates comparison of
samples of unequal sizes.  In order to deal with rare events, however, you
still have to have dots or their equivalent added to the contour plot.
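
To make the normalization reading concrete, here is a minimal Python sketch
under that assumption: contour levels are chosen so that the bins at or above
each level hold a stated fraction of all events. This is only one possible
interpretation of "probability contouring", not necessarily the algorithm
used by SAS or by any FACS package; the function names and the tiny
histograms are hypothetical.

```python
def percentile_levels(hist, fractions=(0.5, 0.9)):
    """For each fraction, return the bin-count threshold such that the bins
    at or above that threshold hold at least that fraction of all events."""
    counts = sorted((c for row in hist for c in row if c > 0), reverse=True)
    total = sum(counts)
    levels = {}
    for frac in fractions:
        running = 0
        for c in counts:
            running += c
            if running >= frac * total:
                levels[frac] = c
                break
    return levels


def bins_inside(hist, level):
    """Return the set of (row, column) bins whose count reaches `level`."""
    return {
        (i, j)
        for i, row in enumerate(hist)
        for j, c in enumerate(row)
        if c >= level
    }


# Hypothetical 3x3 histogram, and the same distribution with 10x the events.
small = [[0, 1, 0],
         [2, 10, 3],
         [0, 4, 0]]
large = [[10 * c for c in row] for row in small]

small_levels = percentile_levels(small)
large_levels = percentile_levels(large)
```

The absolute thresholds differ between the two samples, but the bins enclosed
by the 90% contour are the same, which is what makes samples of unequal size
comparable. The bin holding a single event in `small` falls outside that
contour, so it would vanish from the display unless drawn as a dot.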

>Thus, using Dot
>plots or color dot plots or user-defined thresholding, I can make a variety of
>conclusions about the same sample depending solely on how many events I choose
>to collect (or display)!
>
>Jim Houston is 100% correct that the precise method of data display is critical.
>I urge reviewers and editors to demand that this information be included in all
>FACS data displays.

Note, for example, that bivariate chromosome contour plots are generally
made with higher thresholds than plots of immunofluorescence...and it
wouldn't be a bad idea to include a scale or to indicate which contour lines
represent which percentiles.


>
>Once again, this brings us to the fundamental point of data display:  to convey
>information accurately to the reader.  I highly recommend a book by Edward
>Tufte, "The Visual Display of Quantitative Information," about this topic
>(especially to programmers developing analysis packages).  This fabulous book
>shows how misleading different styles of graphs can be, and discusses some of
>the underlying principles of data display--principles largely ignored by
>developers of FACS data analysis programs.


>There was some discussion about art vs. science.  Do not mistake artistry for
>disinformation!  Of course there is art in science, and in the presentation of
>scientific data.  If not, we would only see tables of numbers that would be
>incomprehensible--we are, after all, only human.

Tufte actually has three books out; each is a work of art as well as a work
of science. In the Chapter on Data Analysis in the 3rd Edition of Practical
Flow Cytometry, I suggest that single parameter distributions be represented
using Tufte's minimalist version of the "Box and Whiskers" plot, which shows
the position of the median, 25th and 75th, and 5th and 95th percentile
values (or, alternatively, the full range of the data instead of 5th and
95th %).  This could readily be extended to a two-dimensional version, but
might be better represented by different colored (or differently shaded)
areas than by contour lines.
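
The five values such a plot needs can be computed in a few lines. The
following Python sketch uses the standard library's statistics.quantiles;
the function name, the choice of the "inclusive" interpolation method and
the sample data are assumptions made for illustration only.

```python
import statistics


def box_whisker_summary(values):
    """Return the 5th, 25th, 50th, 75th and 95th percentiles of `values`.

    Cutting the data into 20 equal-probability groups gives cut points at
    every 5th percentile; index k holds the (5 * (k + 1))th percentile.
    """
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {
        "p5": cuts[0],
        "p25": cuts[4],
        "median": cuts[9],
        "p75": cuts[14],
        "p95": cuts[18],
    }


# Hypothetical single-parameter data: channel numbers 0 through 100.
summary = box_whisker_summary(list(range(101)))
```

For the full-range variant, min(values) and max(values) would replace the
5th and 95th percentile entries.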



-Howard


