Bad Flow Data & reviewing -- What can we do?

From: Roederer, Mario (VRC) (MarioR@mail.nih.gov)
Date: Tue Oct 16 2001 - 12:00:05 EST


This topic strikes a nerve with many of us.  Indeed, ISAC did at one point
have the decent notion to have a committee on "data presentation standards"
or something like that.  I remember seeing something at Montpellier--a
pamphlet on presentation, I think.  Since then, I haven't heard about the
progress of this committee.  I made a number of suggestions on the
committee's effort, as it was a reasonable start, but don't know if that had
any effect.  Indeed, even this pamphlet had a number of mistaken notions,
showing how ingrained things can get even within the community.

For example, there was the suggestion that we should always put numbers on
the Y axis of a univariate histogram ("# of cells").  In reality, these
numbers are meaningless--they depend on the resolution with which the data
is binned, which can vary from program to program and instrument to
instrument.  The reasoning was that the only way to compare histograms was
to have these numbers to ensure that the data was interpreted properly.
However, this is a misconception--in reality, the peak height in a histogram
is rarely meaningful; it is the peak area which carries meaning.  What is
necessary in a histogram presentation is to identify how many cells were
collected (and displayed in the histogram), and, if any peak in the
histogram is cut off, to identify what fraction of the vertical scale is
shown.  I.e., the only thing worth putting on the Y axis label is "% max",
where "max" is the maximum peak height.  Admittedly, many of my papers have
the meaningless numbers on the axis...  but I'm still learning...
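
To make the "% of max" convention concrete, here is a quick sketch (Python
with numpy/matplotlib; the simulated data and the 256-bin resolution are
stand-ins of my own, not anything prescribed):

    import numpy as np
    import matplotlib.pyplot as plt

    # Simulated fluorescence values standing in for one FACS channel.
    events = np.random.lognormal(mean=3.0, sigma=0.6, size=20000)

    # The bin count (256 here) is arbitrary -- which is exactly why raw
    # counts on the Y axis are not comparable from program to program.
    counts, edges = np.histogram(events, bins=256)

    # Normalize so the tallest peak is 100: the "% of max" convention.
    pct_of_max = 100.0 * counts / counts.max()

    plt.step(edges[:-1], pct_of_max, where="post")
    plt.xlabel("Fluorescence (arbitrary units)")
    plt.ylabel("% of max")
    plt.title("n = %d events displayed" % events.size)  # always say how many
    plt.show()

Note that the normalized curve is identical in shape whatever the binning;
only the raw counts change.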

I am sure that even this little discussion may set off a minor
firestorm--and that's probably good: it will be educational, which is the
main point of this list!  (By the way, remember that contour plots are also
histograms (2D histograms), and they have no numbers on the "Z" axis
corresponding to event frequency.  Why should univariate histograms have
them?)

Jim Houston asks about the needed information for histograms or dot
plots--always, the minimum information is the number of events displayed.
(And yes, I am guilty of not always putting that information in my own
publications.)  I still strongly advocate against dot plots; there are much
more informative displays available.

But the point of this email is not to address the specific defects in data
presentation, nor even to start to lay them out.  That, in fact, would be
better done in a book.

Both Jim and Robert Zucker bring up the lack of the Community's involvement
in peer review.  It is worth noting that JAMA requires every paper to be
reviewed by a statistician, outside of the normal review.  Why not have the
same thing for every flow paper?  It seems that the major publications
should require an expert to review papers containing FACS
presentations/analyses for appropriateness.  But it won't happen: if we
can't even police our own Journals to ensure appropriate data presentation,
then what makes anyone think we have the competence to do so for other
Journals?

Some years ago, a few of us bandied around an idea of "post-publication"
review of articles that would be placed online.  The concept was as follows:
each major journal would be assigned to one or two expert reviewers.  Each
issue would be examined for articles that had flow cytometry in them, and
then the reviewer would go over the paper with a predefined list of
criteria.  The review would explicitly avoid any judgment about the paper's
conclusions; it would only address whether the flow cytometric analyses were
properly presented and interpreted, and then note what additional
information is required, what possible artifacts need to be eliminated, etc.
The review process would be fundamentally based on a checklist (e.g., "was
cell viability assessed?", "what staining controls were performed?", "is the
data properly compensated?", "did the authors note how many events were
displayed?", "are the statistical intreprations of low event counts
appropriate?" etc. etc.... I could envision a 100-item list).  There would
be "sub-lists" for different types of flow, like "cell cycle",
"immunophenotyping", "intracellular detection", and "it's obvious I dropped
my samples off at my local core facility, didn't tell them what was in each
tube, forgot my controls anyway, had them generate a few graphs for me, and
then xeroxed them until the dots I didn't like went away, so don't blame me
because I can't understand the difference between a contour plot and a
photomultiplier tube."  The reviews would be posted on-line.
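
To give a flavor of how mechanical such a review could be, here is a toy
sketch (Python; the sub-list items are invented purely for illustration):

    # The base checklist; these items are taken from the examples above.
    BASE_CHECKLIST = [
        "Was cell viability assessed?",
        "What staining controls were performed?",
        "Is the data properly compensated?",
        "Did the authors note how many events were displayed?",
        "Are the statistical interpretations of low event counts appropriate?",
    ]

    # Sub-lists keyed by assay type; these two items are made up, just
    # to show the shape of the thing.
    SUB_LISTS = {
        "cell cycle": ["Was doublet discrimination performed?"],
        "immunophenotyping": ["Were the gating strategies described?"],
    }

    def checklist_for(assay_types):
        """Assemble the applicable checklist for a paper's flow content."""
        items = list(BASE_CHECKLIST)
        for assay in assay_types:
            items.extend(SUB_LISTS.get(assay, []))
        return items

    for question in checklist_for(["cell cycle"]):
        print(question)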

The idea of the online post-publication review is that the general
scientific community, when reviewing an article, could turn to the web site
and quickly see if there are major problems with the technology that they
might otherwise miss because of its subtleties.  Since the criteria would
all be published online as well, the goal would be that authors would start
turning to this site before publication in order to better present data,
rather than seeing criticisms of their papers show up afterwards.  Authors
might be allowed to appeal aspects of a review that they feel are
inappropriate, thereby providing an ongoing evolution of the evaluation
process.  There might even be a manuscript pre-review service where authors
could ensure appropriateness before submitting for review.

What would this require?  No more than one or two dozen FACS-savvy people
to volunteer for this public service. Anyone with a modicum of experience in
flow would be excellent for this; in fact, it's probably better to recruit
younger (less jaundiced) people for the process. In reality, the review
process would be very rapid, since these are not detailed reviews aimed at
the science of the paper, but only at the data presentation.  I was so hot
on this idea (now 2 years old) that I even registered a domain for its use
(http://www.sciwatch.org)--a registration I renew in the hopes that
something might actually come of it.

In my idealistic vision, eventually journals would turn to the Flow
community to do this as a standard of practice rather than have it go on
post-publication.  Journals might even adopt the standard data presentation
requirements.  People might actually publish FACS data that we can believe.

But maybe we need to start at home first.  I'd like to suggest that
Cytometry and Communications in Clinical Cytometry both make an editorial
decision to require all published papers to come up to some minimum
acceptable standard.  If these journals make the commitment, then perhaps
there will be enough motivation for a document outlining these procedures to
be put together.  However much sense it might make, I do not suggest that this be
done by a committee under the auspices of ISAC, since that effort has
essentially failed, principally through inaction.  Rather, I think the
Editorial Boards should empower a group to put such a document together. If
such an effort works, it can serve as a model for other journals to adopt.

mr


