
Biostatistics Services

Most of the projects carried out by Science Park investigators result in data that require statistical analysis. Statistical expertise is available throughout the project cycle, including study design, data collection and analysis. Incorporation of statistical insight early in the project cycle results in more efficient realization of the research aims of Science Park investigators.

Study Design

Experimental design addresses the question of how many animals are required to detect an effect such as that of genotype or treatment. This type of analysis is known in statistical parlance as a "power calculation," since the object is to estimate how many subjects are required to "power" the experiment, i.e., to detect a difference with high probability (the power) at a fixed significance level (alpha, the threshold against which the p value is compared, usually taken to be 0.05).

Power calculations are required at many stages of a research project, for example when writing up the project as a grant proposal or when applying to the Animal Care and Use Committee for approval of the experiments. Ideally, power calculations are based on pilot studies that give some idea of the size of the effect to be expected. Often, however, such pilot studies are not feasible because of limited time and resources. In that case, power calculations are based on the minimum effect size that the investigator judges to be "scientifically interesting."
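As an illustration of the calculation described above, the following is a minimal sketch, not the procedure used by the biostatistics staff. It assumes a two-group comparison of means, a two-sided test, and the normal approximation, which slightly understates the number given by an exact t-based calculation. The function name animals_per_group and the standardized effect size (difference in means divided by the common standard deviation) are choices made for this example.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals needed per group to compare two group means.

    effect_size is the standardized difference (difference in means divided
    by the common standard deviation). It is an assumption supplied by the
    investigator, e.g. from a pilot study or from the minimum effect judged
    scientifically interesting.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value for alpha
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return ceil(n)  # round up to a whole animal


if __name__ == "__main__":
    # Smaller effects require many more animals per group.
    for d in (0.5, 1.0, 1.5):
        print(f"effect size {d}: {animals_per_group(d)} animals per group")
```

Because the required number grows with the inverse square of the effect size, halving the effect that must be detected roughly quadruples the number of animals.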

Data Collection

Data for the majority of projects are recorded using standard spreadsheet software, most often Microsoft Excel. To avoid misunderstanding, it is important that investigators, their staff and the biostatistics support staff agree on the meaning of the data that are recorded. A further concern is avoiding the errors that can arise when the biostatistics support staff must manipulate the data because they were entered in non-numeric or other formats. It is always preferable to enter data in numeric format and to provide keys to assist interpretation.
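The sketch below shows one way numeric entry plus a key might look in practice. The column names, codes and values are hypothetical, invented only for this example. Each cell in the sheet is a number, and a separate key records what each code means.

```python
import csv
import io

# Hypothetical key: what each numeric code in the spreadsheet means.
KEY = {
    "genotype": {0: "wild type", 1: "knockout"},
    "diet": {0: "control", 1: "high fat"},
}

# Hypothetical spreadsheet export (CSV); every cell is numeric.
SHEET = """animal_id,genotype,diet,tumor_count
101,0,0,2
102,1,0,7
103,1,1,9
"""


def decode_rows(text, key):
    """Translate coded columns through the key; other columns become ints."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for column, value in raw.items():
            code = int(value)  # raises ValueError if a cell is not numeric
            # Coded columns are looked up in the key (KeyError on unknown codes).
            row[column] = key[column][code] if column in key else code
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in decode_rows(SHEET, KEY):
        print(row)
```

A non-numeric cell or an unknown code raises an error immediately, so data entry problems surface before analysis rather than during it.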


Data Analysis

Biostatistics support personnel make use of well-known statistical software such as Stata, R and SAS. A typical analysis examines differences in tumor incidence by genotype, type of treatment, diet or other factors. Among the many endpoints that can be analyzed are (1) the difference in the number of tumors per animal at the final time point, (2) the ratio of incidence rates of tumor formation based on all the time points, and (3) the time to first tumor. Other standard techniques, e.g., analysis of variance, regression, and chi-square analysis, are also used.
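To make endpoint (2) concrete, here is a minimal sketch of an incidence rate ratio with an approximate confidence interval. It assumes tumor counts follow a Poisson model and uses the standard large-sample interval on the log scale. The function name and the counts in the usage example are hypothetical, and an analysis in Stata, R or SAS would typically use a fitted regression model instead.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def incidence_rate_ratio(events_a, time_a, events_b, time_b, level=0.95):
    """Return (ratio, lower, upper) comparing group A with group B.

    events_* are tumor counts and time_* are total animal-time at risk
    (e.g. animal-weeks). The interval is a large-sample approximation.
    """
    if min(events_a, events_b) <= 0 or min(time_a, time_b) <= 0:
        raise ValueError("events and time at risk must be positive")
    rate_a = events_a / time_a  # tumors per unit of animal-time, group A
    rate_b = events_b / time_b  # tumors per unit of animal-time, group B
    ratio = rate_a / rate_b
    se_log = sqrt(1 / events_a + 1 / events_b)  # standard error of log(ratio)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(ratio) - z * se_log)
    upper = exp(log(ratio) + z * se_log)
    return ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical data: 30 tumors vs. 15 tumors, each over 200 animal-weeks.
    ratio, lower, upper = incidence_rate_ratio(30, 200.0, 15, 200.0)
    print(f"rate ratio {ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests a difference in incidence rates at the chosen level.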


Biostatistics support staff conduct periodic workshops aimed at educating Science Park staff on specific issues related to data interpretation. Topics include introductory biostatistics, experimental design and power calculation.


Li Zhang, Ph.D.
Assistant Professor
Phone: (713) 563-4298

Kevin Lin
Research Statistical Analyst
Phone: (512) 237-9379

© 2014 The University of Texas MD Anderson Cancer Center