The Sequencing and Microarray Facility (SMF) offers several sequencing services, including:
- Illumina next-generation sequencing
- Sanger-based DNA sequencing
- Sanger-based gene resequencing
- Single cell mRNA-Seq
Learn more about each of these services below.
The SMF provides a complete next-generation sequencing (NGS) service. Investigators provide the facility with genomic DNA, total RNA or ChIP DNA (depending on the requested application), and the facility provides complete sample processing.
NGS services include:
1. Project consultation and budget planning with a facility representative and an MDACC faculty bioinformatician.
2. Library Preparation: An NGS library is made up of random fragments that represent the entire sample. It is created by shearing DNA into 150-400 base fragments. These fragments are ligated to specific adapters. Library fragments of the appropriate size are then selected (size is application dependent) and isolated. Following a sample cleanup step, the resultant library is quantified by qPCR and checked for quality using the Agilent TapeStation. The SMF has automated library preparation for most applications using the Eppendorf EPMotion 5075 and Agilent Bravo Liquid Handlers.
|Flow Cell Type||Estimated Output||Approx. Pass Filter Clusters||Lane Splitting||MD Anderson Price*|
|S1-300 (150bp PE)||400-500 Gb||*1.3-1.6 billion||No||$11,842/flow cell|
|S1-300Xp (150bp PE)||200-250 Gb||*0.65-0.8 billion||2 lanes||$6,474/lane|
|S1-200 (100bp PE)||260-333 Gb||*1.3-1.6 billion||No||$10,357/flow cell|
|S1-200Xp (100bp PE)||130-166 Gb||*0.65-0.8 billion||2 lanes||$5,641/lane|
|S1-100 (50bp PE)||130-167 Gb||*1.3-1.6 billion||No||$7,703/flow cell|
|S1-100Xp (50bp PE)||130-167 Gb||*0.65-0.8 billion||2 lanes||$4,315/lane|
|S2-300 (150bp PE)||1000-1200 Gb||3.3-4.0 billion||No||$19,447/flow cell|
|S2-300Xp (150bp PE)||500-600 Gb||1.65-2.0 billion||2 lanes||$10,187/lane|
|S2-200 (100bp PE)||660-800 Gb||3.3-4.0 billion||No||$16,792/flow cell|
|S2-200Xp (100bp PE)||330-400 Gb||1.65-2.0 billion||2 lanes||$8,859/lane|
|S2-100 (50bp PE)||300-400 Gb||3.3-4.0 billion||No||$11,966/flow cell|
|S2-100Xp (50bp PE)||150-200 Gb||1.65-2.0 billion||2 lanes||$6,446/lane|
|S4-300 (150bp PE)||2400-3000 Gb||8.0-10.0 billion||No||$36,630/flow cell|
|S4-300Xp (150bp PE)||600-750 Gb||2.0-2.5 billion||4 lanes||$9,603/lane|
|S4-200 (100bp PE)||1600-2000 Gb||8.0-10.0 billion||No||$31,798/flow cell|
|S4-200Xp (100bp PE)||400-500 Gb||2.0-2.5 billion||4 lanes||$8,410/lane|
*Please see the Price List for current External or GCC pricing.
HiSeq4000: The Illumina HiSeq4000 is a mid-throughput sequencer that uses two eight-lane flow cells. The flow cells can be run independently or in parallel. Each flow cell has a maximum output of 750 Gb per run, or 93.75 Gb per 150 bp paired-end lane.
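The per-lane figure follows from dividing the flow cell output by its eight lanes. A minimal Python sketch of that arithmetic, and of how many lanes a project might need, is shown here. The function names and the 200 Gb project size are hypothetical, and the calculation assumes every lane reaches the stated maximum, which real runs may not.

```python
import math

# Figures taken from the HiSeq4000 description above.
FLOW_CELL_OUTPUT_GB = 750.0   # maximum output per flow cell per run
LANES_PER_FLOW_CELL = 8

def output_per_lane_gb(flow_cell_gb=FLOW_CELL_OUTPUT_GB, lanes=LANES_PER_FLOW_CELL):
    """Maximum output of a single lane, assuming output is split evenly."""
    return flow_cell_gb / lanes

def lanes_needed(target_gb, lane_gb=None):
    """Whole lanes required to reach target_gb (assumes maximum lane output)."""
    if lane_gb is None:
        lane_gb = output_per_lane_gb()
    return math.ceil(target_gb / lane_gb)

print(output_per_lane_gb())   # 93.75
print(lanes_needed(200))      # 3 (hypothetical 200 Gb project)
```

Because actual yields vary, a planning estimate like this is best treated as a lower bound on the number of lanes.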
|NextSeq500 Sequencing-per run||Estimated Output per Run||Approx. SE Reads Per Run||MD Anderson Price*|
|75SR||25-30 Gb||400 million||$1,747|
|75PE||50-60 Gb||400 million||$3,154|
|150PE||100-120 Gb||400 million||$4,736|
|75PE||16-19 Gb||130 million||$1,389|
|150PE||35-39 Gb||130 million||$2,096|
*Please see the Price List for current External or GCC pricing.
MiSeq: The Illumina MiSeq is a low-output sequencer capable of generating up to 15 Gb of data per instrument run. It has the longest read length in the Illumina line-up, generating up to 600 bases per 300 bp paired-end run. A variety of flow cells and read lengths provide flexibility on this single-sample platform.
|Flow Cell Type||Maximum Output||Approx. SE Reads Per Run||MD Anderson Price*|
|MiSeq150 V3||3.8 Gb||20-25 million||$1,096|
|MiSeq600 V3||15 Gb||20-25 million||$1,829|
|MiSeq300 V2||4.5-5 Gb||10-15 million||$1,284|
|MiSeq500 V2||7.5-8 Gb||10-15 million||$1,423|
|MiSeq300Nano||300 Mb||up to 1 million||$445|
*Please see the Price List for current External or GCC pricing.
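The output figures in the tables above are roughly the number of reads multiplied by the bases sequenced per read (doubled for paired-end runs). The following Python sketch reproduces two of the table values under that assumption. The function name is hypothetical, and the estimate ignores quality filtering and run-to-run variation.

```python
def estimated_output_gb(reads_millions, read_length_bp, paired_end):
    """Approximate run output in Gb from read count and read length.

    reads_millions: single-end reads (clusters) per run, in millions
    read_length_bp: length of each read in bases
    paired_end:     True if each cluster is read from both ends
    """
    bases_per_cluster = read_length_bp * (2 if paired_end else 1)
    # millions of reads x bases per cluster = megabases; divide by 1000 for Gb
    return reads_millions * bases_per_cluster / 1000

# MiSeq600 V3: up to 25 million reads, 300 bp paired end
print(estimated_output_gb(25, 300, paired_end=True))    # 15.0

# NextSeq500 75SR: about 400 million reads, 75 bp single read
print(estimated_output_gb(400, 75, paired_end=False))   # 30.0
```

Both results match the upper end of the ranges listed in the tables, which suggests the published outputs assume the maximum read count.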
Transcriptome analysis may be quantitative (gene expression analysis) and/or qualitative (transcript discovery, splice variant identification, coding SNP validation, gene fusions, etc.). The SMF offers several options for transcriptome analysis. The choice of sample preparation method is based on the sample quality, quantity and the investigator’s experimental objective.
|Application||FFPE Compatible||Strand Specific||Application Notes|
|RNA Exome (RNA Access)||Yes||Yes||Human mRNA-Seq for FFPE samples. Generates cDNA from total RNA, then captures the exome regions. This protocol is optimized for sequencing RNA from degraded or FFPE samples and samples with limited starting material. RNA Access enables the discovery of novel features such as alternative splicing, fusion transcripts and coding variants. Applications: fusions, splice variants, quantitative gene expression analysis. Capture-based method covering 98.3% of the RefSeq exome. Human only.|
|Stranded mRNA-Seq||No||Yes||Uses oligo dT-based capture for Poly A enrichment followed by cDNA synthesis using random and oligo dT priming. Sequences generated map to coding regions of the genome. Applications: Gene expression quantitation, fusions, splice variants. Poly A transcripts only.|
|Stranded Total RNA-Seq||Yes||Yes||rRNA depletion is performed (no Poly A enrichment), followed by cDNA synthesis using oligo-d(T) and random hexamers. This method allows the sequencing of mRNA and non-polyadenylated RNA including histone mRNAs, precursors for Cajal body related small RNAs, and lncRNAs. Sequences map to exons and intergenic regions. Applications: Gene expression, lncRNA. Provides a more complete transcriptome with Poly A and non-Poly A transcripts.|
|Low Input mRNA-Seq||mRNA-Seq for good quality samples with <100 ng total RNA|
|Low Input Total RNA-Seq||Yes||Yes||Total RNA-Seq for samples with <100ng total RNA|
|Small RNA-Seq including miRNA-Seq||Yes|| ||Quantification of miRNA expression. The protocol integrates unique molecular barcodes (UMIs) into the reverse transcription reaction, enabling unbiased and accurate miRNome-wide quantification of mature miRNAs. UMIs require 75SR sequencing.|
|TCR a/b Profiling||No||No||TCR alpha and beta targeted sequencing|
|RNA Capture||Custom application|
|RIP-Seq||No||No||Investigator provides immunoprecipitated RNA|
Note: Strand-specific protocols preserve strand information. Strand specificity can be used to identify antisense transcripts, determine the transcribed strand of non-coding RNAs, and may help to demarcate the boundaries of overlapping genes.
|Application||FFPE Compatible||Application Notes|
|Agilent Exome V7||Yes||SureSelect Human All Exon v7, is a comprehensive exome, designed using the latest versions of RefSeq (99.3% coverage), GENCODE (99.6% coverage), CCDS (99.6% coverage) and UCSC Known Genes (99.6% coverage). Design Size: 48.2 Mb|
|Agilent Clinical Research Exome||Yes||The SureSelect clinical research exome V2 is a comprehensive medical exome with overall exonic coverage, enhanced coverage of genes associated with disease and increased coverage of HGMD, OMIM, ClinVar, and ACMG targets. The associated gene list includes gene names and evidence of their disease relevance. Design Size: 67.3 Mb|
|T200.1 Panel||Yes||263 gene solid tumor panel. Covers all exons.|
|ChIP-Seq||NA||Used to identify transcription factor (protein) binding sites in genomes and specific cell types. The investigator performs chromatin IP and provides antibody-captured DNA to the SMF. Library preparation requires a minimum of 10 ng of immunoprecipitated DNA. Maximum size 500 bp. Enrichment should be 10-fold or greater. Requires 'input' DNA for analysis comparison.|
|Targeted Capture||Yes||Custom Application. Selectively enriches for and sequences investigator defined regions of interest. The SMF provides custom targeted capture using hybridization-based capture probes and amplicon based enrichment methods (Illumina, Haloplex and Qiagen designs). The facility stocks the T200.1 panel.|
|Whole Genome-Seq||No||Human, Mouse, Rat, Yeast, Monkey, Viral, Bacterial and other genomes. For applications in cancer research, the SMF provides sequencing of matched tumor and normal samples.|
The SMF provides budget planning, technology consultations and project planning with an NGS specialist and an MD Anderson faculty bioinformatician. We strongly recommend that first-time NGS service users and investigators with large-scale projects schedule a meeting before initiating a project. To schedule a consultation meeting, please contact Erika Thompson at firstname.lastname@example.org.
All samples should be accompanied by a completed sample submission form. Sample submission requirements (minimum quantity and recommended sequence length) vary based on the service and sample type.
Bioinformatics Faculty Collaborators
- Xiaoping Su, PhD.
Submit forms to
The SMF provides DNA sequencing from single-stranded or double-stranded DNA, including purified plasmids, PCR products, and BACs. Sequencing is performed primarily on ABI 3730XL and 3730 DNA sequencers using BigDye terminator cycle sequencing chemistry. The facility quantifies all samples, performs sequencing reactions, cleanup and capillary electrophoresis, analyzes the data and provides sequence as text files and chromatograms.
View our service pricing schedule for more information about DNA sequencing pricing.
Turnaround Time: 24-48 hours (weekdays)
Longer turnaround times may occur when our sample volume is very high, when special conditions are requested and when we experience instrument problems.
Guidelines for DNA submission
Sample submission requirements
- Plasmid DNA: concentration must be 100 ng/µl. Submit 10 µl per reaction.
- Custom primers: must be at 1 pmol/µl. Submit 10 µl per reaction.
- BAC DNA: submit at 500 ng/µl, with primers at 25 pmol/µl.
- PCR products: submit at a minimum concentration of 20 ng/µl for products less than 1 kb, and at 30 ng/µl for products 1 kb or greater. Submit 10 µl per reaction.
- Tubes: the DNA and primer must be submitted in 0.5-ml Eppendorf tubes with the sample name written on the sides and tops of the tubes.
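Bringing a stock to one of the concentrations above is a C1V1 = C2V2 dilution. The following Python sketch shows one way to work out the volumes for a 10 µl submission. The function and dictionary names are hypothetical, and treating the PCR product minimums as dilution targets is an assumption, since the text only gives them as lower limits.

```python
REACTION_VOLUME_UL = 10.0  # volume requested per reaction in the text

# Concentrations from the submission requirements (ng/µl).
# The PCR values are minimums in the text; using them as targets is an assumption.
TARGET_NG_PER_UL = {
    "plasmid": 100.0,
    "bac": 500.0,
    "pcr_under_1kb": 20.0,
    "pcr_1kb_or_more": 30.0,
}

def dilution_volumes(stock_ng_per_ul, sample_type, final_volume_ul=REACTION_VOLUME_UL):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration."""
    target = TARGET_NG_PER_UL[sample_type]
    if stock_ng_per_ul < target:
        raise ValueError("stock is more dilute than the required concentration")
    stock_ul = target * final_volume_ul / stock_ng_per_ul  # C1*V1 = C2*V2
    return stock_ul, final_volume_ul - stock_ul

# A hypothetical plasmid prep measured at 250 ng/µl:
print(dilution_volumes(250.0, "plasmid"))   # (4.0, 6.0)
```

A stock below the required concentration raises an error rather than returning a negative volume, since such a sample would need to be concentrated instead of diluted.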
Quantitation of DNA
All DNA submitted to the facility needs to be accurately quantified. We recommend visual determination of DNA quality and quantity on an agarose gel using a quantitative DNA ladder. Alternatively, DNA concentration can be determined fluorometrically. DNA concentrations determined using a spectrophotometer are often artificially high due to the presence of RNA, proteins, bacterial genomic DNA and other contaminants.
Note: Low DNA concentration is the most common cause of poor quality sequence and failed reactions. Too much DNA can, however, be as bad as too little. The presence of too much template results in top-heavy data (strong peaks at the beginning which fade rapidly), pull-up peaks (non-specific peaks that appear below the correct peak) and loss of peak resolution. In addition, it shortens the life of our capillaries.
Custom sequencing primer design guidelines
- Thermal cycling conditions: The denaturing step heats the reaction mix to 96°C, the annealing step cools the reaction mix to 55°C, and the polymerization step extends at 60°C. Primers must have annealing temperatures of 56°C or higher.
- Primers should be at least 20 to 25 bases long. GC content should be 50% or more.
- Primers should be designed with a tightly binding 3' end.
- When designing a primer, do not pick a region that is closer than 50 bases to the region of interest.
- Primers for PCR reactions tend to work fine for automated sequencing.
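The length and GC content guidelines above are easy to check programmatically before ordering a primer. This is a minimal Python sketch using the thresholds from the list (20 bases, 50% GC). The function names and the example sequences are made up for illustration. Annealing temperature is not estimated, because the text does not say which formula the facility uses.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer (spaces are ignored)."""
    seq = primer.replace(" ", "").upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def check_primer(primer, min_length=20, min_gc=0.50):
    """Return a list of guideline violations; an empty list means none found."""
    seq = primer.replace(" ", "").upper()
    problems = []
    if len(seq) < min_length:
        problems.append(f"length {len(seq)} is below {min_length} bases")
    if gc_fraction(seq) < min_gc:
        problems.append(f"GC content {gc_fraction(seq):.0%} is below {min_gc:.0%}")
    return problems

# Hypothetical sequences, for illustration only:
print(check_primer("GCG CAT ATG CGC ATA TGC GC"))   # []
print(check_primer("ATATAT"))
```

The check covers only the two numeric guidelines. The strength of 3' end binding and the distance from the region of interest still need to be judged against the template.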
Sequencing primers provided by SMF
The facility currently provides the following primers free of charge:
- T3: CCT CAC TAA AGG GAA CAA AAG C
- T7: TAA TAC GAC TCA CTA TAG GGC GA
- T3-0: ATT AAC CCT CAC TAA AGG GA
- T7-0: TAA TAC GAC TCA CTA TAG GG
- M13(-21): GTA AAA CGA CGG CCA G
- M13rev: TCA CAC AGG AAA CAG CTA TGA C
- Sp6: GAT TTA GGT GAC ACT ATA G
- Bluescript KS: TCG AGG TCG ACG GTA TC
- Bluescript SK: CGC TCT AGA ACT AGT GGA TC
- BGH Rev: TAG AAG GCA CAG TCG AGG
- PCMV For: CGC AAA TGG GCG GTA GGC GTG
- pGEX 3': CCG GGA GCT GCA TGT GTC AGA GG
- pGEX 5': GGG CTG GCA AGC CAC GTT TGG TG
- T7 Term: GCT AGT TAT TGC TCA GCG G
The facility will repeat a sample at no cost to the investigator if there is an instrument failure, or if the quality of sequence is compromised due to an error in the facility. If the investigator requests that a sample be repeated, the same DNA and primer will be used to repeat the sequencing reaction. If the reaction fails a second time, the investigator will be charged for the repeated reaction. If an incorrect primer has been requested for sequencing and as a result no sequence data is produced by the sequencing reaction, the cost will be borne by the investigator.
The facility will assist investigators in designing primers for difficult regions. Please contact Erika Thompson if you require this service.
Frequently asked questions
View answers to frequently asked questions such as: Why did my template not sequence? Why should I resuspend in water? Does the host strain matter? Does it matter which media I use to grow my cells? Why did my sequence stop short?
*Note: The text sequence provides you with unedited, raw sequence data. Please view the chromatograms to correct minor errors made by the base-calling software. The Sequencing and Microarray Facility provides a server for the rapid dispersal of data to the principal investigators. The results remain on the server for 30 days, after which files will automatically be deleted. Investigators should copy all data to their drives.
When using AppleTalk, the server will only allow 20 users to log on at a time. The server has been set up to allow 15 minutes per user log on to download files. If your workstation logs on automatically at start up you will be bumped off the system after 15 minutes.
(If you do not have a folder online, please contact the SMF and we will have a folder created for you.)
New Directions to Access Your Online Sequencing Results
PC users:
- Click "Start"
- Click "Run"
- Type in "\\mymdafiles\seq"
- Click "OK"
- A dialog box will appear. Type in your "username" and "password".
Mac users:
- Click on "Go" (located at top of screen)
- Click on "Connect to Server"
- Type in "smb://mymdafiles/seq"
Both PC and Mac users should log in with the same username and password they use to access their MD Anderson account through Entourage and/or Outlook.
Viewing chromatograms (free downloadable software)
Chromas Software: http://technelysium.com.au/wp/chromas
4peaks Software: http://downloads.nucleobytes.com/4peaks
Sanger-based gene resequencing enables gene mutation detection by evaluating an entire gene or individual SNPs in a single experiment. The SMF performs this assay using custom designed primers. View the genes available for this service.
View the service pricing schedule for more information about gene resequencing pricing.
- Customization available
- New assay design free of charge
- Gold standard for next-generation sequencing validation
Quantification is performed on a Qubit fluorometer. Sequencing is performed on a 3730XL DNA Analyzer (Thermo) and comparative analysis uses SeqScape software (Thermo).
- Quantification and normalization of samples
- PCR amplification and purification
- Sanger sequencing in both directions (where possible)
- Comparative alignment to the reference sequence
- Re-analysis (if needed)
- Report of all mutations found and all sequences generated sent to customer
Send the completed submission form to D.J. Doss or submit a hard copy with the samples. Samples may be submitted in 1.5-ml or 0.5-ml microcentrifuge tubes with the name clearly written on the top of the tube. The amount of gDNA needed depends on the gene requested; please contact D.J. Doss for more information.
Scheduling sample drop-off
Contact D.J. Doss to schedule sample drop-off and to determine the amount of gDNA needed.
For new assay design please contact D.J. Doss or Erika Thompson.
The 10X Genomics Chromium system for Single Cell mRNA Sequencing is now available in the SMF.
- Capture 100 to 10,000 single cells in minutes
- High single cell capture efficiency: Up to 65%
- Low doublet rates
- Fast turnaround times: Cell capture to sequencing-ready cDNA library in as little as 2 days
- High number of transcripts detected per cell
- Cost Effective: Lower cost of processing compared to other available systems
- Libraries are compatible with the Illumina NextSeq 500 sequencer which is currently available in the SMF
Sample concentration and viability of the cell suspension are evaluated using automated cell counting (Countess II FL Auto counter) and manual (hemacytometer) counting. Samples are normalized for input onto the Chromium A Chip, which creates the GEMs (gel beads in emulsion). Potential target capture from a single sample can range from 100 single cells to over 10,000 single cells. Single cells, RT priming beads carrying a specific barcode and unique molecular identifier (UMI), and lysis and RT reagents are partitioned into oil droplets, where the beads are dissolved and the cells are lysed so that reverse transcription can take place. The droplets are then broken, and the pooled single-stranded, barcoded cDNA is amplified and fragmented for library preparation. During library preparation, the appropriate sequencing primer sites and adapters are added, so that the final product contains the 10X Barcode, the UMI, the appropriately sized cDNA insert, the Illumina adapter, and the Illumina P5 and P7 primer sequences for sequencing on the NextSeq 500 sequencer.
Cell suspensions are required for submission for this service. We are not able to process fresh frozen or formalin-fixed paraffin embedded tissue. Cell dissociation from tissue is required and cell debris must be removed from the culture prior to sample submission.
Cell enrichment by flow cytometry, or by magnetic bead positive or negative selection, should be performed prior to submitting samples to the facility.
Viable cell suspensions can be cryopreserved and then submitted at a later time. Please note that the viability of these types of samples can vary greatly. Low viability will have an impact on final cell capture amounts as well as sequencing data. Viability should be at least 70%, and ideally above 90%.
Cell suspensions should be submitted in culture media or PBS. If bringing samples in PBS, they should have been freshly harvested (minimal transit time before submission); otherwise, cells in media are best. Cell media shouldn’t have more than 10% fetal bovine serum.
The ideal cell suspension concentration is 700-1,200 cells/µL (700,000-1,200,000 cells/mL) in at least 1 mL of media or PBS. Lower cell numbers may be run. For samples below 10,000 cells/mL, it may be harder to obtain accurate cell count and viability measurements during the quality check.
For initial projects, 10X Genomics suggests an input of about 1,700 cells. The Chromium instrument has a cell capture efficiency of up to 65%, so an input of 1,700 cells gives an estimated target capture of roughly 1,000 cells.
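The relationship between input cells and captured cells can be sketched in a few lines of Python. The sketch assumes a fixed 65% capture efficiency, which the facility describes as an upper bound, so real captures may be lower. The function names are hypothetical.

```python
import math

CAPTURE_EFFICIENCY = 0.65  # stated maximum for the Chromium instrument

def expected_capture(input_cells, efficiency=CAPTURE_EFFICIENCY):
    """Estimated number of cells captured from a given input."""
    return round(input_cells * efficiency)

def cells_to_load(target_cells, efficiency=CAPTURE_EFFICIENCY):
    """Input cells needed to capture target_cells, rounded up."""
    return math.ceil(target_cells / efficiency)

def volume_ul_for_cells(n_cells, cells_per_ul):
    """Suspension volume (µL) containing n_cells at a given concentration."""
    return n_cells / cells_per_ul

print(expected_capture(1700))            # 1105
print(cells_to_load(1000))               # 1539
print(volume_ul_for_cells(1700, 1000))   # 1.7 (at 1,000 cells/µL)
```

At exactly 65% efficiency, 1,700 input cells would give about 1,100 captured cells, so the 1,000-cell figure quoted above leaves a small margin for lower efficiency.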
Submitting samples for 10X processing
It's appreciated if you could provide us a week’s notice of your intended submission (a minimum 48 hours’ notice), so we can be sure there are no scheduling conflicts and that we have sufficient reagents/consumables for your project.
Prior to submitting samples, please complete a 10X Genomics Single Cell Service Request Form and send an electronic copy to email@example.com. Please be sure that your request form includes an active account number and the signature of the financial administrator on the account. External customers must also provide an external billing form and PI signature in addition to the service request form.
10X Genomics recommends a minimum of 50,000 reads/cell. Optimal is 75,000+ reads per cell. Single cell mRNAseq runs can be performed on the Illumina NextSeq 500.
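Sequencing depth per cell translates directly into the number of reads, and therefore runs, a project needs. The Python sketch below uses the 50,000 and 75,000 reads/cell figures above and assumes about 400 million reads per NextSeq 500 run, the figure listed for three of the configurations in the NextSeq table. Function names are hypothetical, and index reads, cell recovery and run-to-run variation are ignored.

```python
import math

READS_PER_RUN = 400_000_000   # assumed, from the NextSeq500 table
MIN_READS_PER_CELL = 50_000   # recommended minimum
OPTIMAL_READS_PER_CELL = 75_000

def reads_required(n_cells, reads_per_cell=MIN_READS_PER_CELL):
    """Total reads needed to sequence n_cells at the given depth."""
    return n_cells * reads_per_cell

def runs_needed(n_cells, reads_per_cell=MIN_READS_PER_CELL, reads_per_run=READS_PER_RUN):
    """Whole sequencing runs required for n_cells."""
    return math.ceil(reads_required(n_cells, reads_per_cell) / reads_per_run)

def cells_per_run(reads_per_cell=MIN_READS_PER_CELL, reads_per_run=READS_PER_RUN):
    """Largest number of cells one run supports at the given depth."""
    return reads_per_run // reads_per_cell

print(reads_required(1000))                    # 50000000
print(cells_per_run())                         # 8000
print(cells_per_run(OPTIMAL_READS_PER_CELL))   # 5333
print(runs_needed(10_000))                     # 2
```

Under these assumptions a single run covers about 8,000 cells at the minimum depth, so a 10,000-cell capture would need a second run or a lower depth per cell.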
The SMF only provides raw data generated from our service. Base Call files (bcl) are converted to fastq files using specific software from 10X Genomics. Modified fastq files are used for QC metrics (html files) and demultiplexing of the sequencing data using Cell Ranger software. cLoupe files are created in Cell Ranger and can be viewed using Loupe Cell Browser, which is freeware that can be downloaded from 10X Genomics’ website (please be aware of minimum computer requirements prior to downloading the software).
Please see the Capture and Library Preparation as well as the Sequencing-NextSeq 500 pricing on the SMF Price List.