Rationale and Objectives
The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI)
data set is the largest publicly available computed tomography (CT) reference data set
of lung nodules. This article presents a comprehensive analysis of the data set and a
uniform data model, with the aim of helping researchers gain an in-depth understanding
of the data set and use it efficiently in their lung cancer–related investigations.
Materials and Methods
A uniform data model was designed to represent and organize the various types of
information contained in the different source data files. A software tool was developed
to process and analyze the database. The tool 1) automatically aligns the nodule outlines
marked manually by radiologists with the corresponding CT images and displays them
graphically; 2) extracts the diagnostic nodule characteristics annotated by
radiologists; 3) calculates a variety of nodule image features from the nodule
outlines, such as diameter, volume, and degree of roundness; 4) integrates all the
extracted nodule information into the uniform data model and stores it in a common,
easy-to-access data format; and 5) analyzes and summarizes the distributions of various
nodule features in several different categories. Using this tool, all 1018 CT scans in
the data set were processed and their statistical distributions analyzed.
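As an illustration of steps 3 and 4 above, the following is a minimal sketch of how outline-based features might be computed and held in a single record. It is not the authors' implementation: the abstract does not give the feature definitions or the data model, so the class names (NoduleOutline, NoduleRecord), the shoelace area, the equivalent-circle diameter, the 4πA/P² roundness measure, and the slice-stacking volume estimate are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in millimetres; assumed convention


@dataclass
class NoduleOutline:
    """One reader's outline of a nodule on one CT slice (hypothetical type)."""
    reader_id: int
    slice_z_mm: float
    points_mm: List[Point]


@dataclass
class NoduleRecord:
    """Hypothetical unified record: outlines plus reader ratings."""
    nodule_id: str
    outlines: List[NoduleOutline] = field(default_factory=list)
    malignancy_scores: List[int] = field(default_factory=list)


def polygon_area(points: List[Point]) -> float:
    """Area of a closed polygon via the shoelace formula (mm^2)."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Total edge length of a closed polygon (mm)."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def equivalent_diameter(area_mm2: float) -> float:
    """Diameter of the circle with the same area (assumed diameter measure)."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)


def roundness(area_mm2: float, perimeter_mm: float) -> float:
    """4*pi*A / P^2: equals 1 for a circle, smaller for irregular shapes."""
    return 4.0 * math.pi * area_mm2 / perimeter_mm ** 2


def approximate_volume(outlines: List[NoduleOutline],
                       slice_spacing_mm: float) -> float:
    """Sum of per-slice areas times slice spacing, for one reader (mm^3)."""
    return sum(polygon_area(o.points_mm) for o in outlines) * slice_spacing_mm
```

Under these assumptions, a 10 mm × 10 mm square outline has an area of 100 mm², a perimeter of 40 mm, and a roundness of π/4; two such slices at 2 mm spacing give an estimated volume of 400 mm³.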
Results
The information contained in the source data files, which use different formats, was
extracted and integrated into a new, uniform data model. Based on this data model, the
statistical distributions of the nodules were summarized in terms of geometric features
and diagnostic characteristics. In the LIDC/IDRI data set, 2655 nodules ≥3 mm,
5875 nodules <3 mm, and 7411 non-nodules were identified. Among the 2655 nodules ≥3 mm,
1) 775, 488, 481, and 911 were marked by one, two, three, and four radiologists,
respectively; 2) most (85.7%) had a diameter <10.0 mm, with a mean value of 6.72 mm;
and 3) 10.87%, 31.4%, 38.8%, 16.4%, and 2.6% were assessed with a malignancy score of
1, 2, 3, 4, and 5, respectively.
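The reader-agreement counts above can be checked and converted to shares with a few lines of arithmetic. The sketch below uses only the counts reported in this abstract; the variable and function names are illustrative.

```python
# Nodules >=3 mm grouped by how many radiologists marked them,
# as reported in the Results.
nodules_by_reader_count = {1: 775, 2: 488, 3: 481, 4: 911}

# The four groups should add up to the reported total of 2655.
total_nodules_ge_3mm = sum(nodules_by_reader_count.values())
assert total_nodules_ge_3mm == 2655


def percentage_shares(counts):
    """Convert a mapping of category -> count into category -> percent."""
    total = sum(counts.values())
    return {key: 100.0 * value / total for key, value in counts.items()}


shares = percentage_shares(nodules_by_reader_count)
for readers, share in sorted(shares.items()):
    print(f"marked by {readers} radiologist(s): {share:.1f}%")
```

The four groups sum to the reported 2655, and the nodules marked by all four radiologists (911) account for roughly a third of that total.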
Conclusions
This study demonstrates that the proposed software tool can help potential users gain
an in-depth understanding of the LIDC/IDRI data set and is therefore likely to benefit
their future investigations. The analysis results also show the diversity of the
distributions of nodule characteristics, making them useful as a reference resource
for assessing the performance of new and existing nodule detection and/or segmentation
schemes.
Article info
Publication history
Published online: January 16, 2015
Accepted: December 6, 2014
Received: September 15, 2014
Copyright
© 2015 AUR. Published by Elsevier Inc. All rights reserved.