Computer assisted radiology and surgery | Volume 14, Issue 8, P985-991, August 2007

Reliable Evaluation of Performance Level for Computer-Aided Diagnostic Scheme

      Rationale and Objectives

      Computer-aided diagnostic (CAD) schemes have been developed for assisting radiologists in the detection of various lesions in medical images. The reliable evaluation of CAD schemes is an important task in the field of CAD research.

      Materials and Methods

      Many approaches have been proposed in the past for evaluating the performance of various CAD schemes. However, some important issues in the evaluation of CAD schemes have not been systematically analyzed. The first is the analysis and comparison of the various evaluation methods in terms of certain characteristics. The second is the analysis of the pitfalls that arise from the incorrect use of evaluation methods, together with effective approaches to reducing the bias and variance caused by these pitfalls. We address the first issue in detail in this article by conducting Monte Carlo simulation experiments, and we discuss the second issue in the Discussion section.
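      The abstract does not describe the simulation design, so the following is only a minimal sketch of how a Monte Carlo comparison of evaluation methods might look, not the authors' experiment. It assumes a one-dimensional, two-class Gaussian problem, a nearest-mean classifier, and accuracy as the performance measure; all function names and parameter values are hypothetical. For each simulated training set it compares the resubstitution and leave-one-out estimates with the exact population accuracy of the trained classifier, and reports the bias and spread of each estimator.

```python
import random
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()  # class 0 ~ N(0, 1), class 1 ~ N(separation, 1) (assumed)


def fit(class0, class1):
    """Nearest-mean classifier (an assumed stand-in): store the class means."""
    return mean(class0), mean(class1)


def predict(model, x):
    m0, m1 = model
    return 0 if abs(x - m0) <= abs(x - m1) else 1


def true_accuracy(model, separation):
    """Exact population accuracy of a fitted model, assuming equal priors."""
    m0, m1 = model
    t = (m0 + m1) / 2.0                       # decision threshold
    below_0 = STD_NORMAL.cdf(t)               # P(x < t | class 0)
    below_1 = STD_NORMAL.cdf(t - separation)  # P(x < t | class 1)
    if m0 <= m1:  # values below the threshold are labelled class 0
        return 0.5 * below_0 + 0.5 * (1.0 - below_1)
    return 0.5 * (1.0 - below_0) + 0.5 * below_1


def resubstitution(class0, class1):
    """Train and test on the same cases."""
    model = fit(class0, class1)
    hits = sum(predict(model, x) == 0 for x in class0)
    hits += sum(predict(model, x) == 1 for x in class1)
    return hits / (len(class0) + len(class1))


def leave_one_out(class0, class1):
    """Test each case with a model trained on all the other cases."""
    hits = 0
    for i, x in enumerate(class0):
        hits += predict(fit(class0[:i] + class0[i + 1:], class1), x) == 0
    for i, x in enumerate(class1):
        hits += predict(fit(class0, class1[:i] + class1[i + 1:]), x) == 1
    return hits / (len(class0) + len(class1))


def simulate(trials=3000, n_per_class=5, separation=1.0, seed=0):
    """Return bias and spread of each estimator; n_per_class must be >= 2."""
    rng = random.Random(seed)
    errors = {"resubstitution": [], "leave_one_out": []}
    for _ in range(trials):
        class0 = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
        class1 = [rng.gauss(separation, 1.0) for _ in range(n_per_class)]
        truth = true_accuracy(fit(class0, class1), separation)
        errors["resubstitution"].append(resubstitution(class0, class1) - truth)
        errors["leave_one_out"].append(leave_one_out(class0, class1) - truth)
    return {name: {"bias": mean(e), "std": pstdev(e)} for name, e in errors.items()}


if __name__ == "__main__":
    for name, stats in simulate().items():
        print(f"{name:>15}: bias={stats['bias']:+.3f}  std={stats['std']:.3f}")
```

      Under these assumptions, resubstitution would be expected to show an optimistic bias because each case helps to place the decision threshold that is then used to classify it, whereas leave-one-out removes that dependence at the cost of refitting the classifier once per case.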


      Results

      No single evaluation method is universally superior to the others; different situations of CAD applications require different evaluation methods, as recommended in this article. Bias and variance in the estimated performance levels caused by various pitfalls can be reduced considerably by the correct use of good evaluation methods.


      Conclusion

      This article would be useful to CAD researchers for selecting appropriate evaluation methods and for improving the reliability of the estimated performance of their CAD schemes.


