GP3: GENEPIX POST-PROCESSING SCRIPT FOR AUTOMATED ANALYSIS OF RAW TOXICOGENOMIC MICROARRAY DATA.

E Dere1, M R Fielden1,3,4, R G Halgren1,2 and T R Zacharewski1,3,4. 1Department of Biochemistry & Molecular Biology, 2Genomics Technology Support Facility, 3Institute for Environmental Toxicology, 4National Food Safety & Toxicology Center, Michigan State University, East Lansing, MI, USA

Toxicogenomic studies generate vast amounts of data that require processing prior to clustering and other analyses. We describe a script to automate the post-processing of raw microarray data captured with GenePix, a widely used commercial microarray image analysis application. The script, written in Perl, filters outlying signal intensities, performs background corrections and normalizes signal intensities between the Cy3 and Cy5 channels. Signal intensities are filtered to ensure that they exceed a defined detection threshold while also falling below the saturation level of the microarray scanner. After filtering, valid signals undergo local background correction prior to normalization. Systematic and experimental biases between the two fluor-labeled cDNA populations being compared in a two-color fluorescence-based cDNA microarray assay can result in inaccurate quantitation of relative differences in gene expression. To minimize this effect, signal intensities for each fluor are adjusted to normalize the gene expression distribution in log2 space. By doing so, signal intensities across the entire microarray can be expressed as a percentage, thereby minimizing systematic biases in fluor incorporation and intensity characteristics as well as differences in RNA quality and sample handling. The script can operate in batch mode to process data in a high-throughput manner, and its default parameters can be overridden through a command-line interface. Filtered and normalized values are appended to the GenePix results file. In addition, averages and standard deviations of replicate spots are calculated and provided in a summary file. This script therefore automates the high-throughput post-processing of raw toxicogenomic microarray data generated by GenePix and minimizes the human error introduced by repetitive manual adjustments.
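The processing steps described above (filtering, local background correction, log2 normalization and replicate summaries) can be illustrated with a minimal sketch. This is not the authors' Perl script: it is written in Python for illustration, and the threshold values, the dictionary key names, the function names and the choice of per-channel median centering as the normalization step are all assumptions, since the abstract does not specify them.

```python
import math
from statistics import mean, median, stdev

# Hypothetical defaults; the abstract does not give the script's actual values.
MIN_SIGNAL = 200      # assumed detection threshold
SATURATION = 65535    # ceiling of a 16-bit scanner (assumed)


def process_spots(spots, min_signal=MIN_SIGNAL, saturation=SATURATION):
    """Filter, background-correct and normalize two-channel spots.

    Each spot is a dict with keys gene, f635, b635, f532, b532
    (foreground and local background for Cy5 and Cy3; the key names
    are illustrative). Returns (gene, normalized log2 ratio) pairs
    for the spots that pass the filter.
    """
    kept = []
    for s in spots:
        # Keep only spots whose raw signals are detectable and unsaturated.
        if not all(min_signal < v < saturation for v in (s["f635"], s["f532"])):
            continue
        # Local background correction: subtract each spot's own background.
        cy5 = s["f635"] - s["b635"]
        cy3 = s["f532"] - s["b532"]
        if cy5 <= 0 or cy3 <= 0:
            continue  # log2 is undefined for non-positive values
        kept.append((s["gene"], math.log2(cy5), math.log2(cy3)))
    if not kept:
        return []
    # Assumed normalization: center each channel on its own median in log2 space.
    med5 = median(k[1] for k in kept)
    med3 = median(k[2] for k in kept)
    return [(g, (c5 - med5) - (c3 - med3)) for g, c5, c3 in kept]


def summarize_replicates(ratios):
    """Mean and sample standard deviation of log2 ratios per gene.

    The standard deviation is None when a gene has a single valid spot.
    """
    by_gene = {}
    for gene, ratio in ratios:
        by_gene.setdefault(gene, []).append(ratio)
    return {
        g: (mean(v), stdev(v) if len(v) > 1 else None)
        for g, v in by_gene.items()
    }
```

In this sketch, passing different `min_signal` or `saturation` arguments plays the role of overriding the default parameters, and calling `process_spots` once per results file would correspond to batch mode.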