Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment.

Robert B.M. Landewe; Kay Geert A Hermann; Desiree M.F.M Van Der Heijde; Xenophon Baraliakos; Anne-Grethe Jurik; Robert G. Lambert; Mikkel Østergaard; Martin Rudwaleit; David C Salonen; Jürgen Braun

86 Citations (Scopus)

Abstract

Magnetic resonance imaging (MRI) of the sacroiliac (SI) joints and the spine is increasingly important in the assessment of inflammatory activity and structural damage in clinical trials with patients with ankylosing spondylitis (AS). We investigated inter-reader reliability and sensitivity to change of several scoring systems to assess disease activity and change in disease activity in patients with AS. Twenty sets of consecutive MRI, derived from a randomized clinical trial comparing an active drug with placebo and selected on the basis of the presence of activity at baseline, were presented electronically to 7 experienced readers from different countries (Europe, Canada). Readers scored the MRI by 3 different methods including: a global score (grading activity per SI joint); a more comprehensive global score (grading activity per SI joint per quadrant); and a detailed scoring system [Spondyloarthritis Research Consortium of Canada (SPARCC) scoring system], which scores 6 images, divided into quadrants, with additional scores for 'depth' and 'intensity.' A fourth and a fifth scoring system were constructed afterwards. The fourth method included the SPARCC score minus the additional scores for 'depth' and 'intensity,' and the fifth method included the SPARCC slice with the maximum score. Inter-reader reliability was investigated by calculating intraclass correlation coefficients (ICC) for all readers together and for all possible reader pairs. Sensitivity to change was investigated by calculating standardized response means (SRM) on change scores that were made positive. Overall inter-reader ICC per method were between 0.47 and 0.58 for scoring status, and between 0.40 and 0.53 for scoring change. ICC per possible reader pairs showed much more fluctuation per method, with lowest observed values close to zero (no agreement) and highest observed values over 0.80 (excellent agreement). 
In general, agreement on status scores was somewhat better than agreement on change scores, and agreement with the comprehensive SPARCC scoring system was somewhat better than with the more condensed systems. Sensitivity to change differed per reader but was, in general, somewhat better for the comprehensive SPARCC system. This experiment, conducted under 'real-life,' far-from-optimal conditions, demonstrates the feasibility of scoring exercises for method comparison, provides evidence for the reliability and sensitivity to change of scoring systems to be used in assessing activity of SI joints in clinical trials, and sets the conditions for further validation research in this field.
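
The two statistics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is an assumption here. Reading "change scores that were made positive" as baseline minus follow-up is also an assumption, and the function names are hypothetical, not taken from the paper.

```python
from itertools import combinations
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1), an assumed form: two-way random effects, absolute agreement.

    `scores` is a list of rows: one row per MRI set, one column per reader.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]

    # Two-way ANOVA sums of squares: MRI sets (rows), readers (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def pairwise_icc(scores):
    """ICC for every possible reader pair, keyed by (reader_a, reader_b) index."""
    k = len(scores[0])
    return {
        (a, b): icc_2_1([[row[a], row[b]] for row in scores])
        for a, b in combinations(range(k), 2)
    }


def srm(baseline, follow_up):
    """Standardized response mean: mean change divided by the SD of change.

    Change is taken as baseline minus follow-up (an assumption), so that a
    decrease in activity counts as a positive change.
    """
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


if __name__ == "__main__":
    # Made-up illustrative scores (4 MRI sets x 3 readers), not study data.
    demo = [[4, 5, 3], [10, 9, 12], [2, 2, 1], [7, 9, 8]]
    print(icc_2_1(demo))
    print(pairwise_icc(demo))
    print(srm([4, 10, 2, 7], [2, 6, 1, 3]))
```

With the 7 readers described above, `pairwise_icc` would return 21 reader pairs, which is where a spread from near zero to above 0.80 can show up even when the overall ICC is moderate.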

Original language	English
Journal	Journal of Rheumatology
Volume	32
Issue number	10
Pages (from-to)	2050-2055
ISSN	0315-162X
Publication status	Published - 2005

Access to Document

http://www.ncbi.nlm.nih.gov//entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16206369

Cite this

Landewe, R. B. M., Hermann, K. G. A., Van Der Heijde, D. M. F. M., Baraliakos, X., Jurik, A-G., Lambert, R. G., Østergaard, M., Rudwaleit, M., Salonen, D. C., & Braun, J. (2005). Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment. Journal of Rheumatology, 32(10), 2050-2055. http://www.ncbi.nlm.nih.gov//entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16206369

Landewe, RBM, Hermann, KGA, Van Der Heijde, DMFM, Baraliakos, X, Jurik, A-G, Lambert, RG, Østergaard, M, Rudwaleit, M, Salonen, DC & Braun, J 2005, 'Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment.', Journal of Rheumatology, vol. 32, no. 10, pp. 2050-2055. <http://www.ncbi.nlm.nih.gov//entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16206369>

@article{9fb1104fbab24976b26b20449a531b98,

title = "Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment.",

author = "Landewe, {Robert B.M.} and Hermann, {Kay Geert A} and {Van Der Heijde}, {Desiree M.F.M} and Xenophon Baraliakos and Anne-Grethe Jurik and Lambert, {Robert G.} and Mikkel {\O}stergaard and Martin Rudwaleit and Salonen, {David C} and J{\"u}rgen Braun",

year = "2005",

language = "English",

volume = "32",

pages = "2050--2055",

journal = "Journal of Rheumatology",

issn = "0315-162X",

publisher = "Journal of Rheumatology Publishing Co. Ltd.",

number = "10",

}

TY - JOUR

T1 - Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment.

AU - Landewe, Robert B.M.

AU - Hermann, Kay Geert A

AU - Van Der Heijde, Desiree M.F.M

AU - Baraliakos, Xenophon

AU - Jurik, Anne-Grethe

AU - Lambert, Robert G.

AU - Østergaard, Mikkel

AU - Rudwaleit, Martin

AU - Salonen, David C

AU - Braun, Jürgen

PY - 2005

Y1 - 2005

M3 - Journal article

SN - 0315-162X

VL - 32

SP - 2050

EP - 2055

JO - Journal of Rheumatology

JF - Journal of Rheumatology

IS - 10

ER -
