Valvular heart disease
Development of a Consensus Document to Improve Multireader Concordance and Accuracy of Aortic Regurgitation Severity Grading by Echocardiography Versus Cardiac Magnetic Resonance Imaging

https://doi.org/10.1016/j.amjcard.2012.04.052

Current guidelines recommend a multiparametric echocardiographic assessment of aortic regurgitation (AR). However, the absence of a hierarchical weighting of discordant parameters could cause interobserver variability. In the present study, we sought to define and improve the interobserver variability of AR assessment. Seventeen level 3 readers graded 20 randomly selected patients with AR. The readers also provided a usefulness score for each parameter, depending on its influence on their decision on the AR severity grade. A consensus strategy was subsequently formulated and validated against cardiac magnetic resonance imaging in a separate group of 80 patients. The readers were updated with the consensus document and recalibrated using the same cases. Agreement was statistically assessed using Randolph's free-marginal multirater kappa. At baseline, no uniform approach was used to combine the individual parameters, contributing to the interobserver variability (overall kappa 0.5). A consensus strategy to categorize AR severity was developed in which the left ventricular volume took precedence over the other parameters and was used to differentiate chronic severe AR from less severe categories. Recalibration of the readers using this consensus strategy improved concordance (kappa increased to 0.7). The new strategy also improved the accuracy relative to cardiac magnetic resonance imaging, as evidenced by full agreement on severe AR between the consensus document-based grading and AR severity defined by cardiac magnetic resonance imaging in the separate validation group of 80 patients. In conclusion, grading of chronic AR using a multiparametric approach has suboptimal consistency between readers, and a left ventricular volume-based consensus document improved concordance and accuracy.
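For readers unfamiliar with the agreement statistic named above, the following is a minimal sketch of Randolph's free-marginal multirater kappa, computed as (P_o - 1/k) / (1 - 1/k), where P_o is the mean proportion of agreeing rater pairs per case and k is the number of categories. The function names, the three-category scale (mild, moderate, severe), and the ratings are assumptions made for illustration only; they are not the study's data, and the number of severity categories used in the study is not stated here. The second helper illustrates the kind of per-case check behind a statement such as "agreement of >80% of readers".

```python
from collections import Counter

def free_marginal_kappa(ratings, categories):
    """Randolph's free-marginal multirater kappa.

    ratings: list of cases, each a list of category labels (one per reader).
    categories: all categories readers could have chosen (hypothetical here).
    """
    k = len(categories)
    pair_agreement = 0.0
    for case in ratings:
        n = len(case)  # number of readers who graded this case
        counts = Counter(case)
        # Proportion of reader pairs that chose the same category.
        pair_agreement += sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    p_observed = pair_agreement / len(ratings)
    p_expected = 1.0 / k  # free-marginal assumption: all categories equally likely by chance
    return (p_observed - p_expected) / (1.0 - p_expected)

def modal_agreement(case):
    """Fraction of readers who chose the most common grade for one case."""
    return Counter(case).most_common(1)[0][1] / len(case)

# Illustrative, made-up ratings: 2 cases graded by 4 readers each.
CATEGORIES = ["mild", "moderate", "severe"]  # assumed scale
example = [
    ["severe", "severe", "severe", "severe"],
    ["mild", "mild", "moderate", "moderate"],
]
kappa = free_marginal_kappa(example, CATEGORIES)
fractions = [modal_agreement(case) for case in example]
```

Because chance agreement is fixed at 1/k rather than estimated from the observed marginal distribution, the statistic does not penalize readers when one severity grade dominates the sample.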

Methods

The study was divided into 4 phases. In the calibration phase, we performed a baseline assessment of interobserver agreement and majority accuracy against a reference standard. The consensus phase involved formulation of a consensus document to standardize the grading of AR. In the validation phase, we checked the accuracy of the consensus document-based AR grading against the cardiac MRI findings. Finally, the recalibration phase involved updating all readers with the consensus document and a

Results

The clinical and echocardiographic characteristics of the studied patients are listed in Table 1. The mean ejection fraction was within the normal range, but the ventricles were generally enlarged. The baseline concordance among the readers was suboptimal, with an average kappa of 0.5 and the lowest kappa (0.4) for moderate AR. Agreement of >80% of readers was observed for only 13 of the 20 patients (Figure 1). Logistic regression analysis did not show a statistically significant association

Discussion

The findings of the present study showed suboptimal multireader concordance in grading the severity of AR, which appeared to be attributable to the lack of a hierarchy among the key parameters recommended for grading AR severity. Our study has demonstrated that multireader concordance and accuracy can be improved by a simple algorithmic approach (Figure 5) derived from an LV volume-based consensus document. The lack of preferential weighting of the key parameters in which the

A complete list of the AR Concordance Investigators can be found in the Appendix.