Giorgio Corti et al
version). Instead, the FUSION panel was designed by selecting the oncogenic kinases most frequently involved in fusions and the most frequently rearranged partners, identified on the basis of the available literature and The Cancer Genome Atlas database. Custom probes were designed to capture exons and introns of upstream (5′) and downstream (3′) partners. The panel also allowed enrichment of hot-spot mutations previously associated with resistance to EGFR blockade in CRC, the entire promoter of EGFR ligands, and, finally, all coding exons of genes known to be involved in CRC tumorigenesis (PTEN, TP53, APC, CTNNB1). In total, the selected target regions encompassed 918 kb (see Supplemental Table 2 in the online version). Upon
Figure 1 Customized Workflow for DNA Extraction and Next Generation Sequencing Library Preparation. A, Different Sample Types are Typically Available, Including FFPE, Fresh Tissue, or ctDNA. B, Based on the Tissue of Origin, Colorectal Cancer DNA Presents Different Features. C, Definite Protocols for DNA Extraction are Used on the Basis of DNA Characteristics. D, Library Prep Steps Have Been Optimized According to DNA Features, to Obtain DNA Fragments With Proper Length and to Include Sequencing Adapters. E, the DNA Enrichment Step is the Same Regardless of the Type of Sample: DNA Regions of Interest Hybridize With Target-specific Biotinylated Probes and are Then Captured by Streptavidin-Coated Beads
[Figure 1 panel labels. A, Sample types: FFPE (FFPE specimens), fresh tissue, ctDNA (including urine). B, DNA features: fragmented, low quality (FFPE); intact, high quality (fresh tissue); highly fragmented, good quality (ctDNA). C, DNA extraction: QIAamp DNA FFPE Tissue Kit (Qiagen) for FFPE; ReliaPrep gDNA Tissue Miniprep System (Promega) for fresh tissue; for ctDNA, QIAamp Circulating Nucleic Acid Kit (plasma and CSF) or a centrifugation-based method (urine). D, Library preparation: cutting and A-tailing; adapter ligation.]
Abbreviations: CSF = cerebrospinal fluid; ctDNA = circulating tumor DNA; FFPE = formalin-fixed paraffin-embedded; PBMC = peripheral blood mononuclear cell.
IDEA NGS Workflow
quality assessment, final libraries were then sequenced using Illumina MiSeq or NextSeq500 sequencers (Illumina Inc).
Bioinformatic Analysis of NGS Data
All bioinformatic tools were run with default parameters unless otherwise specified. In the quality control (QC) module and “Mapping” phase, raw reads generated by the sequencer were aligned to the reference genome with the BWA-MEM algorithm [38] (version 0.7.13-r1126); polymerase chain reaction (PCR) duplicates were marked using MarkDuplicates from the Picard tools suite [39] (v. 2.0.1). SAMtools [40] (v. 1.3.1) was used for reading, writing, and viewing files in SAM/BAM/CRAM format. In the dedicated module, the circular binary segmentation algorithm, as implemented in the DNAcopy R package [41], was used to cluster all gene copy-number alterations (CNA). Pindel [42] (v. 0.2.5b6) was used for local read realignment in the insertion/deletion (INDEL) module. Blat [43] (v. 35) was used for fine remapping of the reads in the FUSION module, with tileSize = 11 and stepSize = 5. We required each reported fusion breakpoint to be supported by at least 10 reads, and each fusion partner to have at least 15 mapped bases on its respective end of the read.
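The two reporting thresholds for the FUSION module can be illustrated with a minimal sketch. It is not the authors' implementation: the `SplitRead` record, its field names, and the `filter_breakpoints` function are hypothetical. The sketch also assumes one possible reading of the rule, namely that a read counts toward a breakpoint only if both partners have at least 15 mapped bases on that read.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_SUPPORTING_READS = 10  # from the text: at least 10 reads per breakpoint
MIN_MAPPED_BASES = 15      # from the text: at least 15 mapped bases per partner


@dataclass(frozen=True)
class SplitRead:
    """Hypothetical record for one read spanning a candidate breakpoint."""
    breakpoint: str        # illustrative label, e.g. "GENE_A|GENE_B"
    bases_upstream: int    # bases mapped to the upstream (5') partner
    bases_downstream: int  # bases mapped to the downstream (3') partner


def filter_breakpoints(reads):
    """Return {breakpoint: supporting read count} for breakpoints passing both thresholds."""
    support = defaultdict(int)
    for read in reads:
        # Assumption: a read supports a breakpoint only if BOTH partners
        # are anchored by at least MIN_MAPPED_BASES on that read.
        if (read.bases_upstream >= MIN_MAPPED_BASES
                and read.bases_downstream >= MIN_MAPPED_BASES):
            support[read.breakpoint] += 1
    # Keep only breakpoints with enough supporting reads.
    return {bp: n for bp, n in support.items() if n >= MIN_SUPPORTING_READS}
```

Under this reading, a read with only a few bases on one side of the junction is discarded before counting, so it cannot push a weakly supported breakpoint over the 10-read threshold.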
To carry out analyses for multiple patients at the same time, the bioinformatic workflow leverages a high-performance computing cluster composed of 5 nodes running the SLURM workload manager. The cluster allows jobs to be spread across nodes, which significantly speeds up the analysis, and allows sequencing data, genome references, aligner indexes, annotations, genomic databases, and analysis tools to be stored in a central location to ensure reproducibility. All custom scripts are available at https://bitbucket.org/irccit/idea.
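As an illustration of how per-patient jobs might be spread across such a cluster, the following sketch builds one `sbatch` submission per sample. It is an assumption-laden example, not the published workflow: the `run_pipeline.sh` script name, the `idea_` job-name prefix, and the `logs` directory are hypothetical. Only standard `sbatch` options are used (`--job-name`, `--output`, `--nodes`, `--wrap`). The commands are constructed but not executed.

```python
import shlex


def build_sbatch_command(sample_id, pipeline_cmd, log_dir="logs"):
    """Build the argument list for submitting one sample as a SLURM job."""
    return [
        "sbatch",
        f"--job-name=idea_{sample_id}",            # hypothetical naming scheme
        f"--output={log_dir}/{sample_id}.%j.out",  # %j expands to the SLURM job ID
        "--nodes=1",                               # one node per sample
        f"--wrap={pipeline_cmd}",                  # wrap a shell command as the job script
    ]


def build_jobs(sample_ids, pipeline_script="run_pipeline.sh"):
    """Return one sbatch command per sample; the script name is a placeholder."""
    return [
        build_sbatch_command(s, f"{pipeline_script} {shlex.quote(s)}")
        for s in sample_ids
    ]


if __name__ == "__main__":
    # Print the commands for inspection instead of submitting them.
    for cmd in build_jobs(["patient_01", "patient_02"]):
        print(shlex.join(cmd))
```

Submitting each sample as an independent job leaves the scheduler free to place jobs on whichever node is available, while shared storage holds the references and indexes every job reads.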