Analyzing Data from Ordered Categories

Bibliographic Details
Published in: The New England Journal of Medicine, 1984-08, Vol. 311 (7), p. 442-448
Main authors: Moses, Lincoln E; Emerson, John D; Hosseini, Hussein
Format: Article
Language: English
Online access: Full text
Description
Abstract: Clinical investigations often involve data in the form of ordered categories (e.g., "worse," "unchanged," "improved," "much improved"). Comparison of two groups when the data are of this kind should not be done by the chi-square test, which wastes information and is insensitive in this context. The Wilcoxon-Mann-Whitney test provides a proper analysis. Alternatively, scores may be assigned to the categories in order, and the t-test applied. We demonstrate both approaches here.

Sometimes data in ordered categories are reduced to a two-by-two table by the collapsing of the high categories into one category and the low categories into another. This practice is inefficient; moreover, it entails avoidable subjectivity in the choice of the cutting point that defines the two super-categories. The Wilcoxon-Mann-Whitney procedure (or the t-test with use of ordered scores) is preferable.

A survey of research articles in Volume 306 of the New England Journal of Medicine shows many instances of ordered-category data (about 20 per cent of the articles had such data) and no instance of analysis by the preferred methods presented here. We suggest that investigators who are unfamiliar with these methods should seek the assistance of a professional statistician when they must deal with such data. (N Engl J Med 1984; 311: 442–8.)

Opening of the article: The clinical investigator must at times rely on quantitative information that is intrinsically imprecise. Clinical response ("worse," "unchanged," "improved") is such a variable; the response can be qualitatively ordered, but it often cannot be expressed on a precise numerical scale. Input variables of the same sort arise, too; stage of disease (I, II, III, IV) is an example. Effective ways to analyze such information are not widely known. Sometimes inefficient methods of analysis are applied; this is equivalent to ignoring part of the data. This article points out ways to go wrong, but its primary emphasis is on showing methods . . .
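The two analyses the abstract recommends can be sketched in a few lines of Python. This is an illustration only, not the authors' worked example: the counts for `treated` and `control` are invented, the function names are hypothetical, the equally spaced scores 1 to 4 are an assumption, and the Wilcoxon-Mann-Whitney p-value uses the tie-corrected normal approximation without a continuity correction.

```python
import math

# Ordered response categories, lowest to highest.
CATEGORIES = ["worse", "unchanged", "improved", "much improved"]

# Invented counts per category (NOT data from the article).
treated = [3, 7, 10, 5]
control = [8, 10, 5, 2]


def wmw_from_counts(counts_a, counts_b):
    """Wilcoxon-Mann-Whitney test from counts in ordered categories.

    Every observation in a category receives that category's midrank.
    Returns (U for group A, z, two-sided p) from the tie-corrected
    normal approximation.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    seen = 0  # observations in all lower categories
    for a, b in zip(counts_a, counts_b):
        t = a + b  # size of the tied block formed by this category
        if t == 0:
            continue
        midrank = seen + (t + 1) / 2
        rank_sum_a += a * midrank
        tie_term += t**3 - t
        seen += t
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_a, z, p_two_sided


def t_from_scores(counts_a, counts_b, scores=(1, 2, 3, 4)):
    """Pooled-variance two-sample t statistic on assigned category scores.

    Returns (t, degrees of freedom). Equally spaced scores are an
    assumption; other ordered scores can be passed in.
    """
    def summarize(counts):
        n = sum(counts)
        mean = sum(c * s for c, s in zip(counts, scores)) / n
        ss = sum(c * (s - mean) ** 2 for c, s in zip(counts, scores))
        return n, mean, ss

    n_a, mean_a, ss_a = summarize(counts_a)
    n_b, mean_b, ss_b = summarize(counts_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


u_a, z, p = wmw_from_counts(treated, control)
t_stat, df = t_from_scores(treated, control)
print(f"WMW: U = {u_a:.1f}, z = {z:.2f}, two-sided p = {p:.3f}")
print(f"Scores t-test: t = {t_stat:.2f} on {df} degrees of freedom")
```

Both functions use all four categories and their order, which is the point the abstract makes against the chi-square test and against collapsing to a two-by-two table. For small samples or a definitive analysis, an exact or library-based procedure would be a more appropriate choice than this approximation.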
ISSN: 0028-4793, 1533-4406
DOI: 10.1056/NEJM198408163110705