Brain volumetric software is increasingly suggested for clinical routine. The present study quantifies the agreement across different software applications. Ten cases with hippocampal atrophy and ten gender- and age-adjusted healthy controls without hippocampal atrophy (median age: 70 years, 25-75% range: 64-77 years; and median age: 74 years, 25-75% range: 66-78 years, respectively) were retrospectively selected from a previously published cohort of Alzheimer's dementia patients and normal aging controls. Hippocampal volumes were computed from 3 Tesla T1-MPRAGE sequences with FreeSurfer (FS), Statistical-Parametric-Mapping (SPM; Neuromorphometrics and Hammers atlases), Geodesic-Information-Flows (GIF), Similarity-and-Truth-Estimation-for-Propagated-Segmentations (STEPS), and Quantib™. Medial temporal lobe atrophy (MTA) scores were rated manually. The volumetric measures of each individual were compared against the mean of all applications using intraclass correlation coefficients (ICC) and Bland-Altman plots. When hippocampal volumes were categorized into quartiles, agreement with the mean of all methods was moderate to low. ICCs varied considerably between applications (left hippocampus (LH): from 0.42 (STEPS) to 0.88 (FS); right hippocampus (RH): from 0.36 (Quantib™) to 0.86 (FS)). Mean differences between individual methods and the mean of all methods [mm³] were considerable (LH: FS -209; SPM-Neuromorphometrics -820; SPM-Hammers -1474; Quantib™ -680; GIF 891; STEPS 2218; RH: FS -232; SPM-Neuromorphometrics -745; SPM-Hammers -1547; Quantib™ -723; GIF 982; STEPS 2188). In this sample of clinically relevant size, with a wide spread of data ranging from normal aging to severe atrophy, hippocampal volumes derived by well-accepted applications differed quantitatively. Interchangeable use is therefore not recommended.
Keywords: atrophy; brain; hippocampus; magnetic resonance imaging; software.
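The comparison of each application against the mean of all applications can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function name `bias_vs_method_mean`, the method labels, and the toy volumes are hypothetical, and the 1.96 × SD limits of agreement are the conventional Bland-Altman choice, assumed here rather than taken from the study. The ICC computation is not covered.

```python
import numpy as np


def bias_vs_method_mean(volumes):
    """Bland-Altman style summary of each method against the per-subject
    mean of all methods.

    volumes: dict mapping a method name to a sequence of volumes in mm^3,
             one value per subject, with subjects in the same order.
    Returns a dict mapping each method name to its bias (mean difference)
    and the conventional 95% limits of agreement (bias +/- 1.96 * SD).
    """
    names = list(volumes)
    # Rows are methods, columns are subjects.
    data = np.array([volumes[name] for name in names], dtype=float)
    # Reference value per subject: mean across all methods.
    reference = data.mean(axis=0)

    summary = {}
    for name, row in zip(names, data):
        diff = row - reference
        bias = float(diff.mean())
        sd = float(diff.std(ddof=1))  # sample standard deviation
        summary[name] = {
            "bias": bias,
            "loa_low": bias - 1.96 * sd,
            "loa_high": bias + 1.96 * sd,
        }
    return summary


if __name__ == "__main__":
    # Made-up toy volumes (mm^3) for three hypothetical methods and four
    # subjects; these are NOT data from the study.
    toy = {
        "method_a": [3000, 3200, 2800, 3500],
        "method_b": [3300, 3500, 3100, 3800],
        "method_c": [2700, 2900, 2500, 3200],
    }
    for method, stats in bias_vs_method_mean(toy).items():
        print(method, stats)
```

A negative bias means that a method tends to report smaller volumes than the average of all methods, and a positive bias means larger volumes. This matches the sign convention of the mean differences reported in the abstract.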