
Comparative reliability of categorical and analogue rating scales in the assessment of psychiatric symptomatology

Published online by Cambridge University Press:  09 July 2009

Marina Remington, P. J. Tyrer*, J. Newson-Smith and D. V. Cicchetti

Affiliation: Department of Psychiatry, University of Southampton, and the Veterans Administration Hospital, West Haven, Connecticut, USA

*Address for correspondence: Dr P. J. Tyrer, Department of Psychiatry, Royal South Hants Hospital, Graham Road, Southampton.

Synopsis

The reliability of 26 items from the ninth edition of the Present State Examination (PSE) was assessed using both the conventional categorical scales and separately constructed analogue scales. Reliability was also calculated when the analogue responses were rescaled down to 2, 3 and 4 categories. The levels of inter-rater agreement obtained were comparable to those achieved in previous studies of PSE reliability, although, as expected, agreement on audiotapes was greater than that for independent interviews performed on the same day. These levels were not significantly affected by any of the changes in scale format, but there were apparent differences in reliability depending on the statistics used. In selecting or constructing a psychiatric rating scale, the question of reliability should not influence the choice of a categorical or continuous scale, or the number of scored points in the scale.
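The synopsis's point that apparent reliability depends on the statistic chosen, and on how an analogue response is rescaled into categories, can be illustrated with a small worked example. The sketch below (illustrative toy ratings, not the study's data) implements Cohen's unweighted kappa and a linearly weighted variant for two raters, then recomputes agreement after collapsing a 4-point scale to 2 points:

```python
from collections import Counter

def cohens_kappa(r1, r2, k, weighted=False):
    """Chance-corrected agreement between two raters.

    r1, r2: lists of integer categories in range(k).
    weighted=False gives unweighted kappa (all disagreements equal);
    weighted=True uses linear weights |i - j|, so near-misses count less.
    """
    n = len(r1)
    # disagreement weight for a pair of categories
    w = (lambda i, j: abs(i - j)) if weighted else (lambda i, j: i != j)
    # observed mean disagreement
    obs = sum(w(a, b) for a, b in zip(r1, r2)) / n
    # chance-expected disagreement from each rater's marginal frequencies
    p1, p2 = Counter(r1), Counter(r2)
    exp = sum(w(i, j) * p1[i] * p2[j]
              for i in range(k) for j in range(k)) / n ** 2
    return 1 - obs / exp

# Toy data: two raters scoring 10 subjects on a 4-point scale
a = [0, 1, 2, 3, 3, 2, 1, 0, 2, 3]
b = [0, 2, 2, 3, 2, 2, 1, 1, 2, 3]

print(round(cohens_kappa(a, b, 4), 3))                 # 0.589 unweighted
print(round(cohens_kappa(a, b, 4, weighted=True), 3))  # 0.727 weighted

# Rescale the same ratings down to 2 categories (0-1 -> 0, 2-3 -> 1)
a2, b2 = [x // 2 for x in a], [x // 2 for x in b]
print(round(cohens_kappa(a2, b2, 2), 3))               # 0.783
```

The same pair of raters yields three different agreement coefficients on the same underlying judgments, depending only on the weighting scheme and the number of scale points, which is the kind of statistic-dependent variation the study reports.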

Research Article

Copyright © Cambridge University Press 1979

