Observer variation in FDG PET-CT for staging of non-small-cell lung carcinoma
Purpose: Error and variation in reporting remain among the weakest features of clinical imaging, despite enormous technological advances in nuclear medicine and radiology. The aim of this study was to evaluate agreement amongst experienced readers in staging non-small-cell lung cancer (NSCLC) with PET-CT. Methods: 18F-FDG PET-CT scans from 100 consecutive patients were reviewed independently by three experienced readers, with two readers reviewing each scan series a second time. Individual mediastinal lymph node stations were assessed as benign/inflammatory, equivocal or malignant, and AJCC N and M stages were also assigned. Agreement was measured with kappa (κ) for ratings with two categories and weighted kappa (κw) for ratings with three or more categories, and kappa values were interpreted according to the Landis-Koch benchmarks. Results: Both intra- and interobserver agreement for N and M staging were high. For M staging there was almost perfect intra- and interobserver agreement (κ=0.90–0.93). For N staging, agreement was either almost perfect or substantial (intraobserver κw=0.79–0.91; interobserver κw=0.75–0.81). Importantly, there was almost perfect agreement for N0/1 vs N2/3 disease (κ=0.80–0.97). Agreement for inferior and superior mediastinal nodes (stations 1, 2, 3, 7, 8, 9) was either almost perfect or substantial (κw=0.71–0.88), but lower for hilar nodes (station 10; κw=0.56–0.71). Interobserver variability was greatest for aortopulmonary nodes (stations 5, 6; κw=0.48–0.55). Conclusion: Amongst experienced reporters in a single centre, there was a very high level of agreement for both mediastinal nodal stage and detection of distant metastases with PET-CT. This supports the use of PET-CT as a robust imaging modality for staging NSCLC.
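The agreement statistics named in the Methods can be illustrated with a minimal sketch. It computes Cohen's kappa for two readers, optionally weighted, and maps the result to a Landis-Koch label. Several things here are assumptions rather than details from the study: the abstract does not state the weighting scheme, so linear weights are used as a default; the function names `weighted_kappa` and `landis_koch` are hypothetical; and the ratings are invented toy data, not study results.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Cohen's kappa for two raters over ordered categories.

    weights=None gives unweighted kappa; "linear" or "quadratic" gives
    weighted kappa. The scheme used in the study is not stated, so the
    default here is an assumption.
    """
    k = len(categories)
    n = len(ratings_a)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: rows are reader A, columns are reader B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs_dis = sum(disagreement(i, j) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(disagreement(i, j) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    if exp_dis == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - obs_dis / exp_dis


def landis_koch(kappa):
    """Landis-Koch label; the handling of exact boundaries is a choice made here."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy ratings for one nodal station from two hypothetical readers.
scale = ["benign", "equivocal", "malignant"]
reader_1 = ["benign", "equivocal", "malignant", "benign"]
reader_2 = ["benign", "malignant", "malignant", "benign"]
kw = weighted_kappa(reader_1, reader_2, scale)
print(round(kw, 2), landis_koch(kw))
```

With weights, a benign versus malignant disagreement counts more heavily than an equivocal versus malignant one, which is why weighted kappa suits the three-level node ratings while plain kappa suits binary comparisons such as N0/1 vs N2/3.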