Introduction to TCIApathfinder

Pamela Russell

2019-04-16

Installation

From CRAN:

install.packages("TCIApathfinder")

From GitHub:

# install.packages("devtools")
devtools::install_github("pamelarussell/TCIApathfinder")

Authentication

An API key is required to access data from TCIA. To obtain and correctly store your API key:

  1. Request a key from TCIA by following the instructions TCIA provides for API access.

  2. Create a text file in your home directory (~/) called .Renviron.

  3. Add the following line to .Renviron, replacing the placeholder with your key. Make sure the file ends with an empty line; otherwise, R will silently fail to load it.

    TCIA_API_KEY=xxx-xxx-xxx-xxx
    
  4. Restart R. .Renviron is only processed at the beginning of an R session.
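After restarting, it can help to confirm that R actually picked up the key. A minimal check using base R's `Sys.getenv()`, which reports only whether the variable is set and never prints the key itself:

```r
# Look up the variable that .Renviron is expected to define.
key <- Sys.getenv("TCIA_API_KEY")

if (nzchar(key)) {
  message("TCIA_API_KEY is set (", nchar(key), " characters).")
} else {
  message("TCIA_API_KEY is not set; check ~/.Renviron and restart R.")
}
```

Calling `readRenviron("~/.Renviron")` reloads the file in the current session, which is an alternative to restarting.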

Usage

Load the package:
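Assuming the package was installed as shown under Installation, attaching it is a single call:

```r
# Attach TCIApathfinder so its functions can be called without the
# TCIApathfinder:: prefix.
library(TCIApathfinder)
```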

Get the names of all TCIA collections:
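A minimal sketch of this query, assuming the package exposes `get_collection_names()` and that the result is a list with a `collection_names` element (both names are assumptions, not taken from the text here):

```r
library(TCIApathfinder)

# Assumed function and element names. Requires a valid TCIA_API_KEY
# and network access.
collections <- get_collection_names()

# Show only the first few collection names.
head(collections$collection_names)
```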

## [1] "4D-Lung"            "ACRIN-FLT-Breast"   "APOLLO"            
## [4] "Anti-PD-1_Lung"     "Anti-PD-1_MELANOMA" "BREAST-DIAGNOSIS"

Get the names of all imaging modalities:
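The modality lookup probably follows the same pattern. In this sketch, `get_modality_names()` and the `modalities` element are assumed names:

```r
library(TCIApathfinder)

# Assumed function and element names. With no arguments, modalities
# across all collections are requested.
modalities <- get_modality_names()

head(modalities$modalities)
```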

## [1] "CR"     "CT"     "CTPT"   "DX"     "FUSION" "MG"

Note: a collection or body part can be specified to narrow down results.

Get the names of all body parts studied:
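For body parts, a comparable call might look like the following, where `get_body_part_names()` and the `body_parts` element are assumptions:

```r
library(TCIApathfinder)

# Assumed function and element names. Values come straight from DICOM
# headers, so expect inconsistent spelling and stray whitespace.
body_parts <- get_body_part_names()

head(body_parts$body_parts)
```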

## [1] "ABDOMEN"         "BD CT ABD WO_W " "BLADDER"         "BRAIN"          
## [5] "BREAST"          "CAROTID"

Note: a collection or modality can be specified to narrow down results.

Get information for all patients in a collection
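The table below lists patients in TCGA-BRCA, so the query was presumably restricted to that collection. A sketch under that assumption, with `get_patient_info()`, its `collection` argument, and the `patients` element all being assumed names:

```r
library(TCIApathfinder)

# Assumed function, argument, and element names.
# "TCGA-BRCA" is the collection shown in the output.
patients_tcga_brca <- get_patient_info(collection = "TCGA-BRCA")

# The patient table is expected to be a data frame; show the first rows.
head(patients_tcga_brca$patients)
```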

##     patient_id patient_name patient_dob patient_sex patient_ethnic_group
## 1 TCGA-AR-A1AQ TCGA-AR-A1AQ          NA           F                   NA
## 2 TCGA-AR-A24S TCGA-AR-A24S          NA           F                   NA
## 3 TCGA-AR-A1AX TCGA-AR-A1AX          NA           F                   NA
## 4 TCGA-AR-A24M TCGA-AR-A24M          NA           F                   NA
## 5 TCGA-AR-A24R TCGA-AR-A24R          NA           F                   NA
## 6 TCGA-AR-A24U TCGA-AR-A24U          NA           F                   NA
##   collection
## 1  TCGA-BRCA
## 2  TCGA-BRCA
## 3  TCGA-BRCA
## 4  TCGA-BRCA
## 5  TCGA-BRCA
## 6  TCGA-BRCA

Note: if no collection is passed, patients for all collections are returned.

Get all image series based on criteria
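The study instance UID in the table below matches the first study listed for patient TCGA-AR-A1AQ later in this document, which suggests the series were filtered by patient ID. A sketch on that basis, with `get_series_info()`, its `patient_id` argument, and the `series` element being assumptions:

```r
library(TCIApathfinder)

# Assumed function, argument, and element names.
# The patient ID is inferred from the output, not stated in the text.
series <- get_series_info(patient_id = "TCGA-AR-A1AQ")

# One row per image series; show the first few.
head(series$series)
```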

##   patient_id collection
## 1         NA  TCGA-BRCA
## 2         NA  TCGA-BRCA
## 3         NA  TCGA-BRCA
## 4         NA  TCGA-BRCA
## 5         NA  TCGA-BRCA
## 6         NA  TCGA-BRCA
##                                                 study_instance_uid
## 1 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 2 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 3 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 4 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 5 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 6 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
##                                                series_instance_uid
## 1 1.3.6.1.4.1.14519.5.2.1.3344.4002.142000486987125226950494153345
## 2 1.3.6.1.4.1.14519.5.2.1.3344.4002.176672261446738229459423756538
## 3 1.3.6.1.4.1.14519.5.2.1.3344.4002.211084519843030234592826223931
## 4 1.3.6.1.4.1.14519.5.2.1.3344.4002.240461194127099406985978695670
## 5 1.3.6.1.4.1.14519.5.2.1.3344.4002.268424555374802928499999399479
## 6 1.3.6.1.4.1.14519.5.2.1.3344.4002.270335870121494755687802920012
##   modality   protocol_name series_date series_description
## 1       MR VIBRANT BREAST/  2001-11-21      LT SAG FSE T2
## 2       MR VIBRANT BREAST/  2001-11-21            VIBRANT
## 3       MR VIBRANT BREAST/  2001-11-21              ax t1
## 4       MR VIBRANT BREAST/  2001-11-21          ASSET CAL
## 5       MR VIBRANT BREAST/  2001-11-21      RT SAG FSE T1
## 6       MR VIBRANT BREAST/  2001-11-21       BRAVA--1 MIN
##   body_part_examined series_number annotations_flag       manufacturer
## 1             BREAST      4.000000               NA GE MEDICAL SYSTEMS
## 2             BREAST      8.000000               NA GE MEDICAL SYSTEMS
## 3             BREAST      2.000000               NA GE MEDICAL SYSTEMS
## 4             BREAST      7.000000               NA GE MEDICAL SYSTEMS
## 5             BREAST      5.000000               NA GE MEDICAL SYSTEMS
## 6             BREAST      9.000000               NA GE MEDICAL SYSTEMS
##   manufacturer_model_name software_versions image_count
## 1            SIGNA EXCITE                11          36
## 2            SIGNA EXCITE                11         464
## 3            SIGNA EXCITE                11          40
## 4            SIGNA EXCITE                11          64
## 5            SIGNA EXCITE                11          40
## 6            SIGNA EXCITE                11         136

Note: other ways to narrow down results include

Get detailed information on all imaging studies for a patient
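The text after the table refers to `studies$patient_studies`, so the result was stored in `studies` and has a `patient_studies` element. The function name `get_patient_studies()` and its `patient_id` argument in this sketch are assumptions:

```r
library(TCIApathfinder)

# Assumed function and argument names. The element name
# patient_studies is the one referenced in the text.
studies <- get_patient_studies(patient_id = "TCGA-AR-A1AQ")

# One row per imaging study for this patient.
studies$patient_studies
```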

##     patient_id patient_name patient_dob patient_age patient_sex
## 1 TCGA-AR-A1AQ TCGA-AR-A1AQ          NA        049Y           F
## 2 TCGA-AR-A1AQ TCGA-AR-A1AQ          NA        050Y           F
##   patient_ethnic_group admitting_diagnoses_description collection study_id
## 1                   NA                              NA  TCGA-BRCA       NA
## 2                   NA                              NA  TCGA-BRCA       NA
##                                                 study_instance_uid
## 1 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 2 1.3.6.1.4.1.14519.5.2.1.3344.4002.100294194044853718189419781050
##   study_date     study_description series_count
## 1 2001-11-21         *MRI - BREAST           11
## 2 2003-05-07 MRI BREAST, BILATERAL           12

The variables in studies$patient_studies correspond to the fields of a PatientStudy object as described in the API documentation.

Note: other ways to narrow down results include a collection or a study instance UID.

Get all imaging studies for a collection
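A sketch of the collection-wide query, assuming a function named `get_studies_in_collection()` with a `collection` argument and a `studies` element in the result:

```r
library(TCIApathfinder)

# Assumed function, argument, and element names.
studies_tcga_brca <- get_studies_in_collection(collection = "TCGA-BRCA")

# Each row pairs a patient ID with a study instance UID.
head(studies_tcga_brca$studies)
```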

##   Collection    PatientID
## 1  TCGA-BRCA TCGA-AO-A129
## 2  TCGA-BRCA TCGA-AR-A24S
## 3  TCGA-BRCA TCGA-AR-A1AQ
## 4  TCGA-BRCA TCGA-AR-A24S
## 5  TCGA-BRCA TCGA-E2-A107
## 6  TCGA-BRCA TCGA-E2-A108
##                                                   StudyInstanceUID
## 1 1.3.6.1.4.1.14519.5.2.1.9203.4002.285233690698334761371173535710
## 2 1.3.6.1.4.1.14519.5.2.1.3344.4002.249354313922816279857767035077
## 3 1.3.6.1.4.1.14519.5.2.1.3344.4002.307747749278929226311301198628
## 4 1.3.6.1.4.1.14519.5.2.1.3344.4002.291701067965044082231683003194
## 5 1.3.6.1.4.1.14519.5.2.1.3023.4002.187532659972726751563533268137
## 6 1.3.6.1.4.1.14519.5.2.1.3023.4002.189852030878882704246542774709

Note: a patient ID can be provided to further narrow down results.

Get individual DICOM image IDs for an image series
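The series that produced the IDs below is not stated here. As an illustration only, this sketch uses the first series instance UID from the series table earlier in this document; `get_sop_instance_uids()`, its argument, and the `sop_instance_uids` element are assumed names:

```r
library(TCIApathfinder)

# Illustrative series UID copied from the series table above; it is not
# necessarily the series behind the output shown in this section.
series_uid <- "1.3.6.1.4.1.14519.5.2.1.3344.4002.142000486987125226950494153345"

# Assumed function, argument, and element names.
sop_uids <- get_sop_instance_uids(series_instance_uid = series_uid)

# Each SOP instance UID identifies one DICOM image in the series.
head(sop_uids$sop_instance_uids)
```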

## [1] "1.3.6.1.4.1.14519.5.2.1.3344.4002.103833384819234677052128156490"
## [2] "1.3.6.1.4.1.14519.5.2.1.3344.4002.107594813336758156477053115154"
## [3] "1.3.6.1.4.1.14519.5.2.1.3344.4002.108961012724040858986707256483"
## [4] "1.3.6.1.4.1.14519.5.2.1.3344.4002.113224119964450170072494597907"
## [5] "1.3.6.1.4.1.14519.5.2.1.3344.4002.114239357229984807449733158209"
## [6] "1.3.6.1.4.1.14519.5.2.1.3344.4002.114874592963584770107488944633"

Download a single DICOM image
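Downloading one image needs both a series instance UID and a SOP instance UID. The sketch below is a guess at the shape of that call: `save_single_image()`, its argument names, and `get_sop_instance_uids()` are assumptions, and the series UID is an illustrative value from the series table earlier in this document:

```r
library(TCIApathfinder)

# Illustrative series UID from the series table; not necessarily the one
# used to produce the path shown below.
series_uid <- "1.3.6.1.4.1.14519.5.2.1.3344.4002.142000486987125226950494153345"

# Assumed function and element names: look up the images in the series.
sop_uids <- get_sop_instance_uids(series_instance_uid = series_uid)

# Assumed function and argument names: save the first image to a
# temporary directory so nothing is written to the working directory.
im <- save_single_image(
  series_instance_uid = series_uid,
  sop_instance_uid = sop_uids$sop_instance_uids[1],
  out_dir = tempdir()
)

# Inspect the result rather than relying on a particular element name.
str(im)
```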

## [1] "/var/folders/r1/tyhthr1569v58n3rs3hf2j0r0000gp/T//RtmptomfLB/file4003427341c1/150.dcm"

Note: a file name can be provided to override the original file name.

Download an image series and extract it

## [1] "/private/var/folders/r1/tyhthr1569v58n3rs3hf2j0r0000gp/T/RtmptomfLB/file4003570d346d"

Download, save and extract an image series, optionally to a temporary location

## Downloading Series
## Unzipping Series
## [1] "/private/var/folders/r1/tyhthr1569v58n3rs3hf2j0r0000gp/T/RtmptomfLB/file4003519dd295"

Additional functions

See package documentation for further details:
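With the package installed, base R's `help()` lists every documented function, which is the quickest way to check exact function names and arguments:

```r
# Open the package index: all exported functions with one-line descriptions.
help(package = "TCIApathfinder")
```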