The Advanced Technology Genomics Core (ATGC) offers several sequencing services, including:
- Illumina next-generation sequencing
- Sanger-based DNA sequencing
- Sanger-based gene resequencing
- Single cell analysis
Learn more about each of these services below.
The ATGC provides a complete next-generation sequencing service. Investigators provide the facility with genomic DNA, total RNA or ChIP DNA (depending on the requested application), and the facility provides complete sample processing.
NGS services include:
1. Project consultation and budget planning with a facility representative and an MDACC faculty bioinformatician.
2. Library Preparation: An NGS library is made up of random fragments that represent the entire sample. It is created by shearing DNA into 150-400 base fragments. These fragments are ligated to specific adapters. Library fragments of the appropriate size are then selected (size is application dependent) and isolated. Following a sample cleanup step, the resultant library is quantified by qPCR and checked for quality using the Agilent TapeStation. The ATGC has automated library preparation for most applications using the Eppendorf epMotion 5075 and Agilent Bravo liquid handlers.
3. Cluster Generation: Library fragments are bound to a flow cell by hybridizing the fragments to a lawn of oligonucleotides complementary to the adapter sequences. Bound fragments are clonally amplified by bridge amplification to create millions of individual dense clusters of clones. Cluster generation occurs in a closed environment on the Illumina cBot instrument or, for the NovaSeq6000, NextSeq500 and iSeq 100, on the sequencer itself.
4. Illumina Sequencing: Sequencing on the flow cell employs Illumina’s well-established sequencing-by-synthesis chemistry. This chemistry uses reversible terminator nucleotides, each possessing a chemically blocked 3’ hydroxyl group, detected with two-channel (NovaSeq6000, NextSeq500) or four-channel (HiSeq4000 and MiSeq) imaging. To begin sequencing, primers are hybridized to the single-stranded, covalently bound templates on the flow cell. Fluorescently labeled nucleotides are then flowed across the flow cell. During chain extension the nucleotides compete for incorporation into the growing DNA chain. A single complementary nucleotide is incorporated into each DNA strand, terminating the chain and resulting in the simultaneous one-base extension of millions of DNA clusters. The incorporated nucleotides are excited by a laser and emit their characteristic fluorescence (or lack of fluorescence). This fluorescence is detected and recorded in an imaging step. Following base detection, the fluorescent dye is cleaved and the 3’ hydroxyl block is chemically reversed, allowing chain extension to continue. This cycle is repeated 36 to 300 times, generating a series of images.
5. Data Analysis: The raw image data is base-called before sequence analysis begins. Sequences are de-multiplexed, aligned to a reference genome and transferred to an institutional server, where the sequence data is accessed by MDACC bioinformaticians. Data analysis is performed in collaboration with faculty from the Department of Bioinformatics.
The ATGC provides free NGS project consultation.
Contact Erika Thompson (email@example.com) to make an appointment.
|Flow Cell Type and Cycle #||Number of Lanes||Run Options||Approx. PF Clusters (M)||Estimated Output (Gb)*||MDACC Price**|
||1||1X100, 2X50, 26X91X8||650-800||65-80||$3,500|
|S-Prime-100 Xp||2||1X100, 2X50, 26X91X8||325-400||32-40||$1,937|
|S1-100||1||1X100, 2X50, 26X91
|S1-100 Xp||2||1X100, 2X50, 26X91
|S2-100||1||1X100, 2X50, 26X91
|S2-100 Xp||2||1X100, 2X50, 26X91
|S4 200 Xp||4||2X100||2000-2500||400-500||$7,267|
*The ATGC does not guarantee output for investigator-prepared libraries.
**Please see the Price List for current External or GCC pricing.
HiSeq4000 - The Illumina HiSeq4000 is a mid-throughput sequencer that runs two eight-lane flow cells. The flow cells can be run independently or in parallel. Each flow cell has a maximum output of 750 Gb per run, or 93.75 Gb per 150 bp paired-end lane.
|Read Length||Estimated Output per Lane||Price per Lane*|
|50bp Single Read||13.1 Gb||$1,276|
|75bp Paired End||40.63 Gb||$2,130|
|100bp Paired End||54.1 Gb||$2,444|
*Please see the Price List for current External or GCC pricing.
NextSeq500 - The Illumina NextSeq 500 System is the only desktop next-generation sequencing (NGS) system capable of sequencing a 30X human genome in a single run. Two flow cell formats and multiple reagent configurations enable data output from 20–120 Gb in a single run, providing flexibility across a broad range of applications. It has a simple workflow and quick run times that enable fast sequencing of exomes, transcriptomes, and whole genomes. The NextSeq 500 sequencer generates up to 400 million clusters passing filter (up to 120 Gb) in the High Output configuration and up to 130 million clusters passing filter (up to 40 Gb) in the Mid Output configuration.
|NextSeq500 Sequencing-per run||Estimated Output per Run||Approx. SE Reads Per Run||MD Anderson Price*|
|75SR (High Output)||25-30 Gb||400 million||$1,823|
|75PE (High Output)||50-60 Gb||400 million||$3,236|
|150PE (High Output)||100-120 Gb||400 million||$4,855|
|75PE (Mid Output)||16-19 Gb||130 million||$1,438|
|150PE (Mid Output)||35-39 Gb||130 million||$2,104|
*Please see the Price List for current External or GCC pricing.
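The NextSeq500 output estimates can be approximated with simple arithmetic: clusters passing filter multiplied by the number of bases read per cluster. The following is a minimal Python sketch, assuming the cluster counts quoted for the two configurations (400 million for High Output, 130 million for Mid Output). The function name is illustrative and is not part of any Illumina or ATGC software.

```python
def estimated_output_gb(clusters_millions, read_length, paired_end=False):
    """Rough run yield in gigabases: clusters x bases read per cluster."""
    reads_per_cluster = 2 if paired_end else 1
    # millions of clusters x bases = megabases; divide by 1000 for gigabases
    return clusters_millions * read_length * reads_per_cluster / 1000


print(estimated_output_gb(400, 150, paired_end=True))  # High Output, 150PE
print(estimated_output_gb(400, 75))                     # High Output, 75SR
print(estimated_output_gb(130, 150, paired_end=True))  # Mid Output, 150PE
```

These calls give 120, 30 and 39 Gb, the upper ends of the ranges listed in the table. Actual yield depends on library quality and loading.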
MiSeq - The Illumina MiSeq is a low-output sequencer capable of generating up to 15 Gb of data per instrument run. It has the longest read length in the Illumina line-up, generating up to 600 bases per 300bp paired-end run. A variety of flow cells and read lengths provide flexibility on this single-sample platform.
|Flow Cell Type||Maximum Output||Approx. SE Reads Per Run||MD Anderson Price*|
|MiSeq150 V3||3.8 Gb||20-25 million||$1,108|
|MiSeq600 V3||15 Gb||20-25 million||$1,804|
|MiSeq300 V2||4.5-5 Gb||10-15 million||$1,260|
|MiSeq500 V2||7.5-8 Gb||10-15 million||$1,445|
|MiSeq300Nano||300 Mb||up to 1 million||$469|
*Please see the Price List for current External or GCC pricing.
Transcriptome analysis may be quantitative (gene expression analysis) and/or qualitative (transcript discovery, splice variant identification, coding SNP validation, gene fusions). The ATGC offers several options for transcriptome analysis. The choice of sample preparation method is based on the sample quality, quantity and the investigator’s experimental objective.
|Application||FFPE Compatible||Strand Specific||Application Notes|
|RNA Exome (RNA Access)||Yes||Yes||Human mRNA-Seq for FFPE samples. Generates cDNA from total RNA, then captures the exome regions. This protocol is optimized for sequencing RNA from degraded or FFPE samples and samples with limited starting material. RNA Access enables the discovery of novel features such as alternative splicing, fusion transcripts and coding splice variants, as well as quantitative gene expression analysis. Capture-based method covering 98.3% of the RefSeq exome. Human only.|
|Stranded mRNA-Seq||No||Yes||Uses oligo dT based capture for Poly A enrichment followed by cDNA synthesis using random and oligo dT priming. Sequences generated map to coding regions of the genome. Applications: Gene expression quantitation, fusions, splice variants. Poly A transcripts only.|
|Stranded Total RNA-Seq||Yes||Yes||rRNA depletion is performed (no Poly A enrichment), followed by cDNA synthesis using oligo-d(T) and random hexamers. This method allows the sequencing of mRNA and non-polyadenylated RNA including histone mRNAs, precursors for Cajal body related small RNAs, and lncRNAs. Sequences map to exons and intergenic regions. Applications: Gene expression, lncRNA. More complete transcriptome with Poly A and non-Poly A transcripts.|
|Low Input mRNA-Seq||mRNA-Seq for good quality samples with <100ng total RNA|
|Low Input Total RNA-Seq||Yes||Yes||Total RNA-Seq for samples with <100ng total RNA|
|Small RNA-Seq including miRNA-Seq||Yes|| ||Quantification of miRNA expression. Protocol integrates unique molecular barcodes (UMIs) into the reverse transcription reaction, enabling unbiased and accurate miRNome-wide quantification of mature miRNAs. UMIs require 75SR sequencing.|
|TCR a/b Profiling||No||No||TCR alpha and beta targeted sequencing|
|RNA Capture||Custom application|
|RIP-Seq||No||No||Investigator provides immunoprecipitated RNA|
Note: Strand specificity preserves strand information. It can be used to identify antisense transcripts, determine the transcribed strand of non-coding RNAs and may help to demarcate the boundaries of overlapping genes.
|Application||FFPE Compatible||Application Notes|
|Agilent Exome V7||Yes||SureSelect Human All Exon v7 is a comprehensive exome designed using the latest versions of RefSeq (99.3% coverage), GENCODE (99.6% coverage), CCDS (99.6% coverage) and UCSC Known Genes (99.6% coverage). Design Size: 48.2 Mb|
|Agilent Clinical Research Exome||Yes||The SureSelect clinical research exome V2 is a comprehensive medical exome with overall exonic coverage, enhanced coverage of genes associated with disease and increased coverage of HGMD, OMIM, ClinVar, and ACMG targets. The associated gene list includes gene names and evidence of their disease relevance. Design Size: 67.3 Mb|
|T200.1 Panel||Yes||263 gene solid tumor panel. Covers all exons.|
|ChIP-Seq||NA||Used to identify transcription factor (protein) binding sites in genomes and specific cell types. The investigator performs chromatin IP and provides antibody-captured DNA to the ATGC. Library preparation requires a minimum of 10ng of immunoprecipitated DNA. Maximum size 500bp. Enrichment should be 10-fold or greater. Requires 'input' DNA for analysis comparison.|
|Targeted Capture||Yes||Custom Application. Selectively enriches for and sequences investigator defined regions of interest. The ATGC provides custom targeted capture using hybridization-based capture probes and amplicon based enrichment methods (Illumina, Haloplex and Qiagen designs). The facility stocks the T200.1 panel.|
|Whole Genome-Seq||No||Human, Mouse, Rat, Yeast, Monkey, Viral, Bacterial and other genomes. For applications in cancer research, the ATGC provides sequencing of matched tumor and normal samples.|
The ATGC provides budget planning, technology consultations and project planning with an NGS specialist and an MD Anderson faculty bioinformatician. We strongly recommend that first time NGS service users and investigators with large-scale projects schedule a meeting before initiating a project. To schedule a consultation meeting please contact Erika Thompson firstname.lastname@example.org.
All samples should be accompanied by a completed sample submission form. Sample submission requirements (minimum quantity and recommended sequence length) vary based on the service and sample type.
Bioinformatics Faculty Collaborators
- Xiaoping Su, PhD.
Submit forms to
The ATGC provides DNA sequencing from single-stranded or double-stranded DNA, from purified plasmids, PCR products, and BACs. Sequencing is performed primarily on ABI 3730XL and 3730 DNA sequencers using BigDye terminator cycle sequencing chemistry. The facility quantifies all samples, performs sequencing reactions, cleanup and capillary electrophoresis, analyzes the data and provides sequence as text files and chromatograms.
View our service pricing schedule for more information about DNA sequencing pricing.
Turnaround Time: 24-48 hours (weekdays)
Longer turnaround times may occur when our sample volume is very high, when special conditions are requested and when we experience instrument problems.
Guidelines for DNA submission
Sample submission requirements
- Plasmids: concentration must be 100 ng/µl. Submit 10 µl per reaction.
- Custom primers: concentration must be 1 pmol/µl. Submit 10 µl per reaction.
- Tubes: the DNA and primer must be submitted in 0.5-ml Eppendorf tubes with the sample name written on the sides and tops of the tubes.
- BAC DNA: submit at 500 ng/µl, with primers at 25 pmol/µl.
- PCR products: submit at a minimum concentration of 20 ng/µl for products less than 1 kb and at 30 ng/µl for products 1 kb or greater. Submit 10 µl per reaction.
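Bringing a sample to the required concentration is a standard C1 x V1 = C2 x V2 dilution. The sketch below is a minimal Python illustration, not a facility tool. The function name and the 250 ng/µl stock concentration are hypothetical. Only the 100 ng/µl plasmid target and the 10 µl volume come from the requirements above.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target; concentrate it first")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical plasmid prep measured at 250 ng/µl, diluted to 100 ng/µl in 10 µl
stock_ul, water_ul = dilution_volumes(250, 100, 10)
print(f"{stock_ul:.1f} µl DNA + {water_ul:.1f} µl water")
```

For this hypothetical stock the result is 4.0 µl of DNA plus 6.0 µl of water. The calculation is only as good as the concentration measurement that goes into it.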
Quantitation of DNA
All DNA submitted to the facility needs to be accurately quantified. We recommend visual determination of DNA quality and quantity on an agarose gel using a quantitative DNA ladder. Alternatively, DNA concentration can be determined fluorometrically. DNA concentrations determined using a spectrophotometer are often artificially high due to the presence of RNA, proteins, bacterial genomic DNA and other contaminants.
Note: Low DNA concentration is the most common cause of poor quality sequence and failed reactions. Too much DNA can, however, be as bad as too little. The presence of too much template results in top-heavy data (strong peaks at the beginning which fade rapidly), pull-up peaks (non-specific peaks that appear below the correct peak) and loss of peak resolution. In addition, it shortens the life of our capillaries.
Custom sequencing primer design guidelines
- Thermal cycling conditions: The denaturing step heats the reaction mix to 96°C, the annealing step cools it to 55°C, and the polymerization step extends at 60°C. Primers must have annealing temperatures of 56°C or higher.
- Primers should be at least 20 to 25 bases long. GC content should be 50% or more.
- Primers should be designed with a tightly binding 3' end.
- When designing a primer, do not pick a region that is closer than 50 bases to the region of interest.
- Primers for PCR reactions tend to work fine for automated sequencing.
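The length, GC content and temperature guidelines above can be pre-checked before ordering a primer. The following is a minimal Python sketch under stated assumptions. The facility does not say how annealing temperature should be estimated, so the Wallace rule (2°C per A/T, 4°C per G/C) is used here only as a rough stand-in. The function names and the example sequence are hypothetical.

```python
def clean(seq):
    """Strip the spaces used in primer listings and uppercase the sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Rough melting temperature (deg C): 2*(A+T) + 4*(G+C). Approximation only."""
    s = clean(seq)
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def check_primer(seq, min_len=20, min_gc=0.5, min_tm=56):
    """Return a list of guideline violations; an empty list means none were found."""
    s = clean(seq)
    issues = []
    if len(s) < min_len:
        issues.append(f"length {len(s)} is below {min_len}")
    if gc_fraction(s) < min_gc:
        issues.append(f"GC content {gc_fraction(s):.0%} is below {min_gc:.0%}")
    if wallace_tm(s) < min_tm:
        issues.append(f"estimated Tm {wallace_tm(s)} C is below {min_tm} C")
    return issues


# Hypothetical 20-mer (60% GC), not a real primer
print(check_primer("GCGC ATAT GCGC ATAT GCGC"))
```

The example prints an empty list. A check like this does not assess 3' end binding or distance from the region of interest, which still need to be reviewed by hand.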
Sequencing primers provided by ATGC
The facility currently provides the following primers free of charge:
- T3: CCT CAC TAA AGG GAA CAA AAG C
- T7: TAA TAC GAC TCA CTA TAG GGC GA
- T3-0: ATT AAC CCT CAC TAA AGG GA
- T7-0: TAA TAC GAC TCA CTA TAG GG
- M13(-21): GTA AAA CGA CGG CCA G
- M13rev: TCA CAC AGG AAA CAG CTA TGA C
- Sp6: GAT TTA GGT GAC ACT ATA G
- Bluescript KS: TCG AGG TCG ACG GTA TC
- Bluescript SK: CGC TCT AGA ACT AGT GGA TC
- BGH Rev: TAG AAG GCA CAG TCG AGG
- PCMV For: CGC AAA TGG GCG GTA GGC GTG
- pGEX 3': CCG GGA GCT GCA TGT GTC AGA GG
- pGEX 5': GGG CTG GCA AGC CAC GTT TGG TG
- T7 Term: GCT AGT TAT TGC TCA GCG G
The facility will repeat a sample at no cost to the investigator if there is an instrument failure, or if the quality of sequence is compromised due to an error in the facility. If the investigator requests that a sample be repeated, the same DNA and primer will be used to repeat the sequencing reaction. If the reaction fails a second time, the investigator will be charged for the repeated reaction. If an incorrect primer has been requested for sequencing and as a result no sequence data is produced by the sequencing reaction, the cost will be borne by the investigator.
The facility will assist investigators in designing primers for difficult regions. Please contact Erika Thompson if you require this service.
Frequently asked questions
View answers to frequently asked questions such as: Why did my template not sequence? Why should I resuspend in water? Does the host strain matter? Does it matter which media I use to grow my cells? Why did my sequence stop short?
*Note: The text sequence provides you with unedited, raw sequence data. Please view the chromatograms to correct minor errors made by the base-calling software. The Sequencing and Microarray Facility provides a server for the rapid dispersal of data to the principal investigators. The results remain on the server for 30 days, after which files will automatically be deleted. Investigators should copy all data to their drives.
When using AppleTalk, the server will only allow 20 users to log on at a time. The server has been set up to allow 15 minutes per user log on to download files. If your workstation logs on automatically at start up you will be bumped off the system after 15 minutes.
(If you do not have a folder online, please contact the ATGC and we will have a folder created for you.)
New Directions to Access Your Online Sequencing Results
PC users:
- Click "Start"
- Click "Run"
- Type in “\\mymdafiles\seq”
- Click "OK"
- A dialog box will appear. Type in your "username" and "password".
Mac users:
- Click on "Go" (located at top of screen)
- Click on "Connect to Server"
- Type in “smb://mymdafiles/seq”
Both PC and Mac users should log in with the same username and password they use to access their MD Anderson account through Entourage and/or Outlook.
Viewing chromatograms (free downloadable software)
Chromas Software: http://technelysium.com.au/wp/chromas
4peaks Software: http://downloads.nucleobytes.com/4peaks
Sanger-based gene resequencing enables gene mutation detection by evaluating an entire gene or individual SNPs in a single experiment. The ATGC performs this assay using custom designed primers. View the genes available for this service.
View the service pricing schedule for more information about gene resequencing pricing.
- Customization available
- New assay design free of charge
- Gold standard for next-generation sequencing validation
Quantification is performed on a Qubit fluorometer. Sequencing is performed on a 3730XL DNA Analyzer (Thermo) and comparative analysis uses SeqScape software (Thermo).
- Quantification and normalization of samples
- PCR amplification and purification
- Sanger sequencing in both directions (where possible)
- Comparative alignment to the reference sequence
- Re-analysis (if needed)
- Report of all mutations found and all sequences generated sent to customer
Send the completed submission form to D.J. Doss or submit a hard copy with the samples. Samples may be submitted in 1.5 ml or 0.5 ml microcentrifuge tubes with the name clearly written on the top of the tube. The amount of gDNA needed depends on the gene requested; please contact D.J. Doss for more information.
Scheduling sample drop-off
Contact D.J. Doss to schedule sample drop-off and to determine the amount of gDNA needed.
For new assay design please contact D.J. Doss or Erika Thompson.
Submitting samples for 10X single cell processing
Project Consultation - We strongly recommend a consultation meeting before starting your first 10X Genomics project. Consultations are free for investigators utilizing the ATGC's single cell service. To request a meeting please contact David Pollock (email@example.com) or Erika Thompson (firstname.lastname@example.org).
Scheduling Your Experiment - The single cell service is by appointment only. You must have an appointment prior to submitting samples. Submission appointments should be made a minimum of one week in advance of your anticipated submission date. This prevents scheduling conflicts and ensures the appropriate reagents are available and at the correct temperatures for immediate use (having samples sit while we bring reagents to temperature may negatively impact data). We will make every effort to accommodate appointments made 24-48 hours before submission, but we cannot guarantee availability.
To schedule a submission appointment, please contact David Pollock (email@example.com).
Sample Submission – To submit samples please complete a 10X Genomics Single Cell Service Request Form and email the completed form (with an active account number and the appropriate signature) to David Pollock (see email above). External investigators must also submit an external billing form with PO# and PI signature authorization.
Single Cell Applications
3' scRNAseq Gene expression profiling
- Gene expression profiling.
- Capture up to 10,000 cells per well of partitioning chip
Workflow - Single cells, RT priming beads carrying a specific barcode and unique molecular identifier (UMI), and lysis and RT reagents are partitioned into oil droplets, where the beads are dissolved and the cells are lysed so that reverse transcription can take place. The droplets are then broken and the pooled single-stranded, barcoded cDNA is amplified and fragmented for library preparation. During the library preparation process, appropriate sequencing primer sites and adapters are added so that the final product contains the 10X Barcode, UMI, the appropriately sized cDNA insert and Illumina adapters with P5 and P7 primer sequences for sequencing on an Illumina sequencer (typically the NovaSeq6000 or NextSeq500).
5’ scRNAseq gene expression with immune profiling
- Gene expression profiling with the added ability to immune profile T cell TCR and B cell Ig for both human and mouse samples.
- Capture up to 10,000 cells per well of partitioning chip.
Workflow - The workflow for 5’ gene expression is similar to that of 3’ scRNAseq with the additional ability to enrich for the V(D)J segments expressed in T cells or B cells after cDNA amplification is completed. This provides the capability of running gene expression profiling, TCR profiling, and Ig profiling from the same cell suspension.
scATACseq
- Determination of single cell open chromatin regions for understanding epigenetic and regulatory variation across the genome.
- Capture up to 10,000 nuclei per well of partitioning chip.
Workflow - The single cell ATAC assay is used to assess the accessibility of chromatin at the single cell level. In the first step of this workflow, single nuclei are isolated from the single cell suspensions. Next, the nuclei undergo a transposase enzymatic reaction in which accessible DNA regions are fragmented and tagged with sequencing adaptors. The transposed nuclei are then partitioned into GEMs, where each nucleus is individually barcoded and made ready for library preparation.
scCNV DNAseq
- Detection of single cell CNV events and rare clones.
- Capture up to 5,000 cells per well of partitioning chip.
Workflow - The first step in the workflow is to partition individual cells in a hydrogel matrix to generate Cell Beads in a microfluidic chip. The Cell Beads are treated to lyse the captured cell and denature the genomic DNA (gDNA). On a second microfluidic chip, GemCode Technology samples a pool of ~750,000 10x Barcodes to separately index the gDNA of each individual cell. It does so by partitioning Cell Beads into nanoliter-scale Gel Beads-in-emulsion (GEMs), where all fragments share a common 10x Barcode. Libraries are generated and sequenced and 10x Barcodes are used to associate individual reads back to the individual partitions, and thereby, to each individual cell.
For more details regarding the applications mentioned above, please visit the 10X Genomics website: www.10xgenomics.com
Sample Type - Samples should be submitted as single cell suspensions in 1.5 mL microfuge tubes. Cell dissociation from tissue and removal of significant cell debris via filtration (e.g. 40 µm Flowmi tip cell strainer) should be performed by the investigator prior to submission.
Formalin-fixed cells are not suitable for use in any 10X Genomics protocol.
Cell enrichment - Enrichment by flow cytometry, magnetic bead positive or negative cell selection, etc. should be performed by the submitting investigator prior to sample submission.
Viability - Optimal viability is >90%; however, viabilities of 70-90% are acceptable. Viable cell suspensions can be cryopreserved (viability >90% prior to cryopreservation) and then submitted at a later time. Please note that the viability of previously cryopreserved samples can vary greatly. Low viability may negatively impact the efficiency of cell capture as well as the sequencing data.
Buffer - Single cell suspensions for 3’ or 5’ scRNAseq services can be submitted in either 1X PBS with ≤ 0.04% BSA (the ideal buffer, recommended by 10X Genomics) or, for more sensitive cells, culture media. Please note that additives in media can affect capture efficiency. At a minimum, EDTA/EGTA should be left out of the media since it will inhibit downstream enzymatic steps. If bringing samples in PBS, cells should have been freshly harvested (minimal transit time before submission); otherwise, cells in media may be the better choice, depending on how long the cells have been outside of the system or on the sensitivity of your cell suspension in a non-media environment. Fetal bovine serum is permitted in either PBS or media but should not exceed 10% (5% or less preferred).
For scATACseq and scCNV DNAseq, cell suspensions should be submitted in 1X PBS. We do not recommend media for either application.
Free Nucleic Acids - If samples are suspected of containing contaminating free nucleic acids (from dead, dying, or lysed cells), we recommend that you remove them from the cell suspension by low-speed centrifugation (300-500 x g for 5 minutes at 4 degrees C) of the cells along with 1-2 washes with fresh PBS/media. The presence of free nucleic acids negatively impacts the efficiency of cell capture as well as the sequencing data.
3’ or 5’ scRNAseq - The optimal cell suspension concentration is 700-1200 cells/uL (700,000-1,200,000 cells/mL) in 100-200 uL of PBS/media. We can, however, run lower cell quantities (we have successfully run fewer than 10,000 cells). For low cell quantities the volume of cells required is a maximum of 10-30 uL.
scATACseq - The cell input for this application is 100,000-1,000,000 cells in 400-500 uL of PBS. Lower cell inputs are permitted in the range of 2,000-40,000 in a volume of 100-200 uL of PBS.
scCNV DNAseq - The concentration requirements are generally higher for this application and the working volume lower. The optimal concentration is 1000-4000 cells/uL. A higher concentration may be required to obtain the maximum (5000) cell count for this application.
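The concentration and volume requirements above can be sanity-checked with a few lines of arithmetic before scheduling a submission. This minimal Python sketch assumes only the optimal ranges quoted above. The dictionary keys, function names and example counts are illustrative and are not part of any 10X Genomics or ATGC software.

```python
# Optimal concentration ranges in cells/uL, as quoted above
OPTIMAL_CELLS_PER_UL = {
    "scRNAseq": (700, 1200),
    "scCNV DNAseq": (1000, 4000),
}


def cells_per_ml(cells_per_ul):
    """Convert cells/uL to cells/mL."""
    return cells_per_ul * 1000


def total_cells(cells_per_ul, volume_ul):
    """Total cells in a suspension of the given concentration and volume."""
    return cells_per_ul * volume_ul


def in_optimal_range(application, cells_per_ul):
    low, high = OPTIMAL_CELLS_PER_UL[application]
    return low <= cells_per_ul <= high


# Hypothetical count: 900 cells/uL in 150 uL for a 3' scRNAseq submission
print(cells_per_ml(900))                  # 900000 cells/mL
print(total_cells(900, 150))              # 135000 cells in the tube
print(in_optimal_range("scRNAseq", 900))  # True
```

scATACseq is left out of the dictionary because its requirement is stated as a total cell input rather than a concentration.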
10X Genomics has minimum reads/cell recommendations for each application. These are very broad guidelines and may not meet the needs of your specific project. We strongly recommend that you consult your biostatistical collaborator to determine the number of cells and reads per cell required to meet your specific experimental objectives.
The table below outlines the ATGC’s general recommendations for the number of read pairs/cell for each application. The table is meant only as a guide for sequencing coverage.
ATGC read recommendations
|sc 3' RNAseq V3||50,000 read pairs/cell|
|sc 5' RNAseq gene expression||50,000 read pairs/cell|
|sc 5' RNAseq VDJ enrichment||10,000 read pairs/cell|
|scDNAseq CNV||800,000 read pairs/cell|
|scATACseq||50,000 read pairs/cell|
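The recommendations above translate into a sequencing budget by multiplying read pairs per cell by the number of cells captured. The sketch below is a minimal Python illustration. The per-cell figures are copied from the table. The 400 million read pairs per run is an assumption taken from the approximate NextSeq500 High Output read count listed earlier on this page, and the function names are hypothetical.

```python
import math

# Read pairs per cell, from the ATGC recommendations table
READ_PAIRS_PER_CELL = {
    "sc 3' RNAseq V3": 50_000,
    "sc 5' RNAseq gene expression": 50_000,
    "sc 5' RNAseq VDJ enrichment": 10_000,
    "scDNAseq CNV": 800_000,
    "scATACseq": 50_000,
}


def required_read_pairs(application, n_cells):
    """Total read pairs needed for n_cells at the recommended depth."""
    return READ_PAIRS_PER_CELL[application] * n_cells


def runs_needed(read_pairs, read_pairs_per_run):
    """Whole runs needed, rounding up."""
    return math.ceil(read_pairs / read_pairs_per_run)


needed = required_read_pairs("sc 3' RNAseq V3", 10_000)
print(needed)                            # 500000000
print(runs_needed(needed, 400_000_000))  # 2 (assumed 400 M read pairs per run)
```

This is a coverage guide only. Pooling several samples on a larger flow cell changes the arithmetic, and the cell number and depth should still be confirmed with a biostatistical collaborator.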
Single cell sequencing is typically performed on the Illumina NovaSeq6000 or NextSeq500 sequencers. Both sequencers allow flexibility by providing various flow cell format options to meet your sequencing requirements. The choice of sequencer and flow cell format for single cell experiments depends on the number of samples submitted and the target cell captures requested for each sample. For high sample volume or complex sample library pools, we will prescreen for appropriate sample sequencing distribution using the iSeq100 prior to sequencing on higher density flow cells.
The ATGC only provides raw data generated from our service. Base Call files (bcl) are converted to fastq files using specific software from 10X Genomics. Modified fastq files are used for QC metrics (html files) and demultiplexing of the sequencing data using 10X Genomics Application Pipeline software. Analysis files are created using 10X Genomics Pipeline software which can be viewed using freeware analysis software which is available for download from 10X Genomics.
Please see single cell service and sequencing pricing on the ATGC Price List.