CIDR

General Sequencing Information

 

Next-generation sequencing technologies give investigators a wealth of study design options. We currently offer production-scale whole-genome, whole-exome and custom targeted sequencing services that use LIMS tracking, robotic automation, strict QC standards and automated primary, secondary and tertiary analyses.

 

 


Sequencing Release Formats

QC Metrics

The QC report is a per-sample report containing more than 100 QC metrics: sequencing completeness and coverage/depth information (such as mean coverage and % of targeted bases covered at 20X), quality measures against a high-density SNP array (heterozygote sensitivity; homozygote and heterozygote concordance with the SNP array), variant call quality measures (% of variants in dbSNP and Ti/Tv ratio) and a number of laboratory-oriented measures (% duplicates, library size, insert size, etc.). Cross-sample contamination is estimated using VerifyBamID. We use the QC report in-house to monitor QC metrics in real time and to evaluate data quickly.
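As a rough illustration of two of the metrics named above, the following Python sketch computes a Ti/Tv ratio from a list of single-nucleotide variants and the percentage of targeted bases covered at a minimum depth. The function names and input shapes are hypothetical and chosen for illustration only; this is not the CIDRSeqSuite implementation, which is not described here.

```python
# Illustrative sketch only: function names and inputs are assumptions,
# not part of CIDRSeqSuite.

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)
# substitutions; every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snvs):
    """Return the transition/transversion ratio for (ref, alt) base pairs."""
    ti = 0
    tv = 0
    for ref, alt in snvs:
        if (ref.upper(), alt.upper()) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    # With no transversions the ratio is undefined.
    return ti / tv if tv else float("nan")


def pct_bases_at_depth(depths, min_depth=20):
    """Return the percentage of per-base depths that are >= min_depth."""
    depths = list(depths)
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


if __name__ == "__main__":
    print(titv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
    print(pct_bases_at_depth([10, 20, 30, 5]))
```

In practice the per-base depths would come from a coverage tool run over the target intervals, and the variant list from the sample's VCF; both are simplified to plain Python lists here.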

 

Annotation

ANNOVAR is incorporated into our CIDRSeqSuite pipeline and is used to annotate variants from VCF files.  A merged per-sample ANNOVAR report includes information from databases such as RefGene, UCSC, Ensembl, dbSNP 131 and 132, OMIM and the NHGRI GWAS Catalog.  Global and population-specific non-reference allele frequencies are provided from Complete Genomics, the 1000 Genomes Project and the NHLBI Exome Sequencing Project (ESP), along with prediction scores from SIFT, PolyPhen and dbNSFP.

 

Variant Calls

CIDR can provide single-sample and/or multi-sample VCF files.  SNVs and small indels are called using the GATK 3+ HaplotypeCaller joint-calling gVCF workflow across all samples sequenced for a project. Depending on the size of the project, variant filtering is performed using either GATK's Variant Quality Score Recalibration (VQSR) protocol or "hard" cut-offs.
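To make the idea of "hard" cut-offs concrete, here is a minimal Python sketch that flags a SNP whose annotations fall outside fixed thresholds. The thresholds shown are the generic starting points published in GATK's hard-filtering guidance for SNPs; they are an assumption for illustration and are not necessarily the cut-offs CIDR applies. The function name and the dictionary input are likewise hypothetical.

```python
# Illustrative sketch only. Thresholds follow GATK's generic hard-filtering
# suggestions for SNPs; the cut-offs actually used for a given project may differ.
SNP_HARD_FILTERS = {
    "QD": lambda v: v < 2.0,               # quality by depth too low
    "FS": lambda v: v > 60.0,              # strand bias (Fisher) too high
    "MQ": lambda v: v < 40.0,              # mapping quality too low
    "MQRankSum": lambda v: v < -12.5,      # alt reads map worse than ref reads
    "ReadPosRankSum": lambda v: v < -8.0,  # alt alleles cluster near read ends
    "SOR": lambda v: v > 3.0,              # strand odds ratio too high
}


def failed_filters(info):
    """Return the names of annotations in `info` that fail a hard cut-off.

    `info` maps annotation names to numeric values, as parsed from a VCF
    INFO field. Annotations missing from `info` are skipped, not failed.
    """
    return [
        name
        for name, fails in SNP_HARD_FILTERS.items()
        if name in info and fails(info[name])
    ]


if __name__ == "__main__":
    print(failed_filters({"QD": 1.5, "FS": 10.0, "MQ": 60.0}))
```

Unlike VQSR, which fits a model across many variants and therefore needs a sufficiently large call set, fixed thresholds like these can be applied to a project of any size.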

 

Raw Data

The rawest form of data released is the BAM or CRAM file. We prefer to release lossless CRAM files generated using SAMtools 1.3.2.


In addition, we release the genotypes obtained from a high density SNP array.


Although we continue to explore new analysis software and algorithms, we process sequence data with an established analysis pipeline, CIDRSeqSuite. The pipeline is updated frequently to improve the quality of alignment and variant calling, and pipeline details are included with each release.


Please inquire if you have specific questions related to sequencing release formats.

 

 


