The Sequencing and Microarray Facility (SMF) offers several sequencing services, including:
- Illumina next-generation sequencing
- Sanger-based DNA sequencing
- Sanger-based gene resequencing
- Single cell mRNA-Seq
Learn more about each of these services below.
The Sequencing and Microarray Facility (SMF) offers massively parallel next generation sequencing services on Illumina HiSeq2000, HiSeq4000, NextSeq500 and MiSeq sequencers. The HiSeq4000 is Illumina's newest and most advanced sequencing platform. It operates on Illumina's well established reversible terminator-based sequencing-by-synthesis chemistry and generates more than 1.5 terabases of sequence per instrument run (150 nucleotide paired-end with 2 flow cells). The HiSeq2000 sequencer generates up to 500 gigabases of sequence per instrument run (100 nucleotide paired-end). The NextSeq500 sequencer generates up to 120 gigabases (150 nucleotide paired-end) in the high output mode and 39 gigabases (150 nucleotide paired-end) in the mid-output mode.
The SMF provides a complete next generation sequencing service. Investigators provide the facility with genomic DNA, total RNA or ChIP DNA (depending on the requested application) and the facility provides complete sample processing.
The technology and workflow
The NGS workflow can be divided into five parts: project consultation, library preparation, cluster generation, Illumina HiSeq sequencing and data analysis.
The SMF provides a comprehensive NGS service. Supported applications include:
Whole genome sequencing of Human, Mouse, Rat, Yeast, Monkey, Viral, Bacterial and other genomes. For applications in cancer research, the SMF provides sequencing of matched tumor and normal samples.
Transcriptome Analysis may be quantitative (gene expression analysis) and/or qualitative (transcript discovery, splice variant identification, coding SNP validation). The SMF offers several options for transcriptome analysis. The choice of sample preparation method is based on the investigator's experimental objective and should be decided in conjunction with the bioinformatician.
Stranded mRNA-Seq uses oligo dT-based capture for Poly A enrichment followed by cDNA synthesis using random and oligo dT priming. Sequences generated map to coding regions of the genome.
Stranded Total RNA-Seq. Here rRNA depletion is performed (no Poly A enrichment) followed by cDNA synthesis utilizing oligo dT and random hexamers. This method allows the sequencing of mRNA and non-polyadenylated RNA including histone mRNAs, precursors for Cajal body-related small RNAs, and lncRNAs. Sequences map to exons and intergenic regions.
Note: Strand specificity - Preserves strand information. Strand specificity can be used to identify antisense transcripts, determine the transcribed strand of non-coding RNAs and may help to demarcate the boundaries of overlapping genes.
RNA Access (Human mRNA-Seq for FFPE samples) generates cDNA from total RNA then captures the transcriptome coding regions. This protocol is optimized for sequencing RNA from degraded or FFPE samples and samples with limited starting material. RNA-Access enables the discovery of novel features such as alternative splicing, fusion transcripts and coding variants. It preserves strand information.
TCR Profiling - The SMF performs Human T-cell receptor (TCR) repertoire analysis using TCR a/b NGS. We can generate data for both TCR-alpha and TCR-beta chain diversity by using a 5' RACE-like approach to capture complete V(D)J variable regions of TCR transcripts from total RNA.
Small RNA-Seq is used to profile and identify changes in small RNA expression and to identify novel microRNAs.
The SMF offers low input protocols for both total RNA-Seq and mRNA-Seq.
ChIP-Seq is used to identify transcription factor (protein) binding sites in genomes and specific cell types. The investigator performs chromatin IP and provides antibody captured DNA to the SMF. Both ChIP sample and mock or IgG control are required.
Exome resequencing. The human genome comprises approximately 3 billion base pairs, of which only 1.2%-1.6% is coding. Exome resequencing selectively enriches for and sequences the coding regions. The SMF provides exome capture using solution-based capture methods. The SMF offers Agilent and NimbleGen exome captures.
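To put the coding fraction in concrete terms, the short calculation below converts the percentages quoted above into megabases. It is only a back-of-the-envelope sketch: the genome size and fractions come from the paragraph above, and the function name `coding_bases` is a hypothetical helper, not part of any SMF tool.

```python
GENOME_BP = 3_000_000_000  # approximate human genome size quoted above


def coding_bases(genome_bp, coding_fraction):
    """Return the number of coding bases for a given coding fraction."""
    return genome_bp * coding_fraction


low = coding_bases(GENOME_BP, 0.012)   # 1.2% coding
high = coding_bases(GENOME_BP, 0.016)  # 1.6% coding
print(f"Coding sequence: {low / 1e6:.0f}-{high / 1e6:.0f} Mb")
```

Under these figures the exome target is roughly 36 to 48 Mb, which is why exome capture needs far less sequence per sample than whole genome sequencing.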
Targeted Resequencing selectively enriches for and sequences investigator-defined regions of interest. The SMF provides custom targeted capture using hybridization-based capture probes and amplicon-based enrichment methods (Illumina, Haloplex and Qiagen designs). The facility stocks the T200.1 panel.
The SMF provides budget planning, technology consultations and project planning with an NGS specialist and an MD Anderson faculty bioinformatician. We strongly recommend that first-time NGS service users and investigators with large-scale projects schedule a meeting before initiating a project. To schedule a consultation meeting please contact Erika Thompson firstname.lastname@example.org.
All samples should be accompanied by a completed sample submission form. Sample submission requirements (minimum quantity and recommended sequence length) vary based on the service and sample type.
Bioinformatics Faculty Collaborators
- ChIP-Seq: Shoudan Liang, PhD.
- All other services: Xiaoping Su, PhD.
Submit forms to
Illumina Next Generation Sequencing Prices (In Practical Terms)
The SMF provides DNA Sequencing from single stranded or double stranded DNA, from purified plasmids, PCR products, and BACs. Sequencing is performed primarily on ABI 3730XL and 3730 DNA sequencers using Big Dye terminator cycle sequencing chemistry. The facility quantifies all samples, performs sequencing reactions, cleanup and capillary electrophoresis, analyzes the data and provides sequence as text files and chromatograms.
View our service pricing schedule for more information about DNA sequencing pricing.
Turnaround Time: 24-48 hours (weekdays)
Longer turnaround times may occur when our sample volume is very high, when special conditions are requested and when we experience instrument problems.
Guidelines for DNA submission
Sample submission requirements
- Plasmids: concentration must be 100 ng/µl. Submit 10 µl per reaction.
- Custom primers: must be at 1 pmol/µl. Submit 10 µl per reaction.
- Tubes: the DNA and primer must be submitted in 0.5-ml Eppendorf tubes with the sample name written on the sides and tops of the tubes.
- BAC DNA: should be submitted at 500 ng/µl, with primers at 25 pmol/µl.
- PCR products: products less than 1 kb should be submitted at a minimum concentration of 20 ng/µl. Products 1 kb or greater should be submitted at 30 ng/µl. Submit 10 µl per reaction.
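The template requirements above can be collected into a small lookup, which may be handy when preparing many samples at once. The following is a minimal sketch and not an SMF tool: the function names are hypothetical, and only the concentrations and the 10 µl volume are taken from the requirements above.

```python
# Template concentrations (ng/µl) copied from the requirements above.
FIXED_CONCENTRATION_NG_PER_UL = {"plasmid": 100, "bac": 500}
VOLUME_PER_REACTION_UL = 10


def required_concentration(template_type, length_bp=None):
    """Return the template concentration (ng/µl) the facility asks for."""
    kind = template_type.strip().lower()
    if kind == "pcr":
        if length_bp is None:
            raise ValueError("PCR products need a length in base pairs")
        # Products under 1 kb: 20 ng/µl minimum; 1 kb or greater: 30 ng/µl.
        return 20 if length_bp < 1000 else 30
    if kind in FIXED_CONCENTRATION_NG_PER_UL:
        return FIXED_CONCENTRATION_NG_PER_UL[kind]
    raise ValueError(f"unknown template type: {template_type!r}")


def dna_per_reaction_ng(template_type, length_bp=None):
    """Total DNA (ng) contained in the 10 µl submitted for one reaction."""
    return required_concentration(template_type, length_bp) * VOLUME_PER_REACTION_UL


for name, length in [("plasmid", None), ("BAC", None), ("PCR", 600), ("PCR", 1500)]:
    print(name, length, required_concentration(name, length), "ng/µl")
```

For example, `dna_per_reaction_ng("plasmid")` works out to 1000 ng, that is, 10 µl at 100 ng/µl.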
Quantitation of DNA
All DNA submitted to the facility needs to be accurately quantified. We recommend visual determination of DNA quality and quantity on an agarose gel using a quantitative DNA ladder. Alternatively DNA concentration can be determined fluorometrically. DNA concentrations determined using a spectrophotometer are often artificially high due to the presence of RNA, proteins, bacterial genomic DNA and other contaminants.
Note: Low DNA concentration is the most common cause of poor quality sequence and failed reactions. Too much DNA can, however, be as bad as too little. The presence of too much template results in top-heavy data (strong peaks at the beginning which fade rapidly), pull-up peaks (non-specific peaks that appear below the correct peak) and loss of peak resolution. In addition, it shortens the life of our capillaries.
Custom sequencing primer design guidelines
- Thermal cycling conditions: Denaturing step heats the reaction mix to 96°C. The annealing step cools the reaction mix to 55°C, and the polymerization step extends at 60°C. Primers must have annealing temperatures of 56°C or higher.
- Primers should be at least 20 to 25 nucleotides long. GC content should be 50% or more.
- Primers should be designed with a tightly binding 3' end.
- When designing a primer, do not pick a region that is closer than 50 bases to the region of interest.
- Primers for PCR reactions tend to work fine for automated sequencing.
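The length and GC-content guidelines above are easy to check before ordering a primer. The sketch below is illustrative only: `check_primer` and `gc_fraction` are hypothetical helper names, the example sequence is made up, and the check does not estimate annealing temperature or 3' end binding, which still need to be evaluated separately.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in an upper-case sequence without spaces."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def check_primer(primer, min_length=20, min_gc=0.50):
    """Return a list of guideline violations; an empty list means none found."""
    seq = primer.replace(" ", "").upper()
    if not seq:
        return ["empty sequence"]
    problems = []
    if set(seq) - set("ACGT"):
        problems.append("contains characters other than A, C, G, T")
    if len(seq) < min_length:
        problems.append(f"length {len(seq)} is below {min_length}")
    if gc_fraction(seq) < min_gc:
        problems.append(f"GC content {gc_fraction(seq):.0%} is below {min_gc:.0%}")
    return problems


# Hypothetical 20-mer used purely as an example input.
print(check_primer("GCT AGC CTG ACC GTA TGC AG"))
```

The function accepts sequences written in triplets with spaces, as in the primer list on this page, and reports each guideline it finds violated.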
Sequencing primers provided by SMF
The facility currently provides the following primers free of charge:
- T3: CCT CAC TAA AGG GAA CAA AAG C
- T7: TAA TAC GAC TCA CTA TAG GGC GA
- T3-0: ATT AAC CCT CAC TAA AGG GA
- T7-0: TAA TAC GAC TCA CTA TAG GG
- M13(-21): GTA AAA CGA CGG CCA G
- M13rev: TCA CAC AGG AAA CAG CTA TGA C
- Sp6: GAT TTA GGT GAC ACT ATA G
- Bluescript KS: TCG AGG TCG ACG GTA TC
- Bluescript SK: CGC TCT AGA ACT AGT GGA TC
- BGH Rev: TAG AAG GCA CAG TCG AGG
- PCMV For: CGC AAA TGG GCG GTA GGC GTG
- pGEX 3': CCG GGA GCT GCA TGT GTC AGA GG
- pGEX 5': GGG CTG GCA AGC CAC GTT TGG TG
- T7 Term: GCT AGT TAT TGC TCA GCG G
The facility will repeat a sample at no cost to the investigator if there is an instrument failure, or if the quality of sequence is compromised due to an error in the facility. If the investigator requests that a sample be repeated, the same DNA and primer will be used to repeat the sequencing reaction. If the reaction fails a second time, the investigator will be charged for the repeated reaction. If an incorrect primer has been requested for sequencing and as a result no sequence data is produced by the sequencing reaction, the cost will be borne by the investigator.
The facility will assist investigators in designing primers to difficult regions. Please contact Erika Thompson if you require this service.
Frequently asked questions
View answers to frequently asked questions such as: Why did my template not sequence? Why should I resuspend in water? Does the host strain matter? Does it matter which media I use to grow my cells? Why did my sequence stop short?
Note: The text sequence provides you with unedited, raw sequence data. Please view the chromatograms to correct minor errors made by the base-calling software. The Sequencing and Microarray Facility provides a server for the rapid dispersal of data to the principal investigators. The results remain on the server for 30 days, after which files will automatically be deleted. Investigators should copy all data to their drives.
When using AppleTalk, the server will only allow 20 users to log on at a time. The server has been set up to allow 15 minutes per user log on to download files. If your workstation logs on automatically at start up you will be bumped off the system after 15 minutes.
(If you do not have a folder online, please contact the SMF and we will have a folder created for you.)
New Directions to Access Your Online Sequencing Results
PC users
- Click "Start"
- Click "Run"
- Type in “\\mymdafiles\seq”
- Click "OK"
- A dialog box will appear. Type in your "username" and "password".
Mac users
- Click on "Go" (located at top of screen)
- Click on "Connect to Server"
- Type in “smb://mymdafiles/seq”
Both PC and Mac users will use the same username and password that you use to access your MD Anderson account through Entourage and/or Outlook.
Viewing chromatograms (free downloadable software)
Chromas Software: http://technelysium.com.au/wp/chromas
4peaks Software: http://downloads.nucleobytes.com/4peaks
Sanger-based gene resequencing enables gene mutation detection by evaluating an entire gene or individual SNPs in a single experiment. The SMF performs this assay using custom designed primers. View the genes available for this service.
View the service pricing schedule for more information about gene resequencing pricing.
- Customization available
- New assay design free of charge
- Gold standard for next-generation sequencing validation
Quantification is performed on a Qubit fluorometer. Sequencing is performed on a 3730XL DNA Analyzer (Thermo) and comparative analysis uses SeqScape software (Thermo).
- Quantification and normalization of samples
- PCR amplification and purification
- Sanger sequencing in both directions (where possible)
- Comparative alignment to the reference sequence
- Re-analysis (if needed)
- Report of all mutations found and all sequences generated sent to customer
Send the completed submission form to D.J. Doss or submit a hard copy with the samples. Samples may be submitted in 1.5-ml or 0.5-ml microcentrifuge tubes with the name clearly written on the top of the tube. The amount of gDNA needed depends on the gene requested; please contact D.J. Doss for more information.
Scheduling sample drop-off
Contact D.J. Doss to schedule sample drop-off and to determine the amount of gDNA needed.
For new assay design please contact D.J. Doss or Erika Thompson.
The 10X Genomics Chromium system for Single Cell mRNA Sequencing is now available in the SMF.
- Capture 100 to 10,000 single cells in minutes
- High single cell capture efficiency: Up to 65%
- Low doublet rates
- Fast turnaround times: Cell capture to sequencing-ready cDNA library in as little as 2 days
- High number of transcripts detected per cell
- Cost Effective: Lower cost of processing compared to other available systems
- Libraries are compatible with the Illumina NextSeq 500 sequencer which is currently available in the SMF
Sample concentration and viability of the cell suspension are evaluated using automated cell counting (Countess II FL Auto counter) and manual (hemacytometer) counting. Samples are normalized for input onto the Chromium A Chip, which creates the GEM (gel bead in emulsion) droplets. The target capture for a single sample can range from 100 to over 10,000 single cells. Single cells, RT priming beads carrying a specific barcode and a unique molecular identifier (UMI), and lysis and RT reagents are partitioned into oil droplets, where the beads dissolve and the cells are lysed so that reverse transcription can take place. The droplets are then broken and the pooled single-stranded, barcoded cDNA is amplified and fragmented for library preparation. During library preparation, sequencing primer sites and adapters are added so that the final product contains the 10X Barcode, the UMI, an appropriately sized cDNA insert, the Illumina adapter and the Illumina P5 and P7 primer sequences for sequencing on the NextSeq 500 sequencer.
Cell suspensions are required for submission for this service. We are not able to process fresh frozen or formalin-fixed paraffin embedded tissue. Cell dissociation from tissue is required and cell debris must be removed from the culture prior to sample submission.
Cell enrichment by flow cytometry or by magnetic bead positive or negative selection should be performed prior to submitting samples to the facility.
Viable cell suspensions can be cryopreserved and then submitted at a later time. Please note that the viability of these types of samples can vary greatly. Low viability will have an impact on final cell capture amounts as well as sequencing data. Viability should be at least 70-90% with an ideal viability above 90%.
Cell suspensions should be submitted in culture media or PBS. If bringing samples in PBS, they should have been freshly harvested (minimal transit time before submission); otherwise, cells in media are best. Cell media shouldn’t have more than 10% fetal bovine serum.
The ideal cell suspension concentration is between 700 and 1,200 cells/µL (700,000-1,200,000 cells/mL) in at least 1 mL of media or PBS. Lower amounts of cells may be run. For samples below 10,000 cells/mL, it may be harder to obtain accurate cell count and viability measurements during the quality check.
For initial projects, 10X Genomics suggests running about 1700 cells input. The Chromium instrument has a cell capture efficiency of 65%, so if you input 1700 cells your return will have an estimated target cell capture of 1000 cells.
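The relationship between cells loaded and cells captured can be sketched as a simple proportion. The snippet below is an estimate under the assumption that the 65% capture efficiency quoted above applies (elsewhere on this page it is given as an upper bound); the function names are hypothetical.

```python
import math

CAPTURE_EFFICIENCY = 0.65  # figure quoted above; treat as a best case


def expected_captured(input_cells, efficiency=CAPTURE_EFFICIENCY):
    """Estimated number of cells captured for a given number loaded."""
    return input_cells * efficiency


def cells_to_load(target_cells, efficiency=CAPTURE_EFFICIENCY):
    """Cells to load (rounded up) to reach a target capture count."""
    return math.ceil(target_cells / efficiency)


print(round(expected_captured(1700)))  # cells captured from 1700 loaded
print(cells_to_load(1000))             # cells to load for a 1000-cell target
```

Loading 1700 cells gives an estimate of about 1100 captured cells, in line with the roughly 1000-cell target mentioned above. Actual recovery depends on sample viability and counting accuracy.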
Submitting samples for 10X processing
Please give us a week’s notice of your intended submission (48 hours at minimum), so we can be sure there are no scheduling conflicts and that we have sufficient reagents and consumables for your project.
Prior to submitting samples, please complete the 10X Genomics Single Cell Service Request Form and send an electronic copy to email@example.com. Please be sure that your request form includes an active account number and a signature from the financial administrator on the account. External customers must also provide an external billing form and a PI signature.
10X Genomics recommends a minimum of 50,000 reads per cell. Optimal is 75,000+ reads per cell. Single cell mRNA-Seq runs can be performed on the Illumina NextSeq 500.
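The per-cell read recommendation translates directly into a total read budget for a sample. This is a rough planning sketch only: `reads_required` is a hypothetical helper, the 1,000-cell figure is an example input, and actual run planning should be confirmed with the facility.

```python
MIN_READS_PER_CELL = 50_000      # recommended minimum quoted above
OPTIMAL_READS_PER_CELL = 75_000  # lower end of the optimal range quoted above


def reads_required(captured_cells, reads_per_cell=MIN_READS_PER_CELL):
    """Total reads needed to reach a given depth for every captured cell."""
    return captured_cells * reads_per_cell


cells = 1_000  # example target capture
print(f"{reads_required(cells):,} reads at the recommended minimum")
print(f"{reads_required(cells, OPTIMAL_READS_PER_CELL):,} reads at the optimal depth")
```

For 1,000 captured cells this comes to 50 million reads at the minimum depth and 75 million at the optimal depth.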
The SMF only provides raw data generated from our service. Base Call files (bcl) are converted to fastq files using specific software from 10X Genomics. Modified fastq files are used for QC metrics (html files) and demultiplexing of the sequencing data using Cell Ranger software. cLoupe files are created in Cell Ranger and can be viewed using Loupe Cell Browser, which is freeware that can be downloaded from 10X Genomics’ website (please be aware of minimum computer requirements prior to downloading the software).
Please see the Capture and Library Preparation as well as the Sequencing-NextSeq 500 pricing on the SMF Price List.