|
Research Project
My work at the Zacharewski lab involved several major projects.
My first project began as the creation of a set of complex queries based upon
normalized microarray feature signal intensities and statistical data values,
and the production of corresponding user interfaces for each of these
queries. These queries were to be unique in terms of the results they produced,
allowing users to perform cross-tissue/species comparisons and retrieve
quality-specific results. The corresponding interfaces were developed in
Java, using the Swing graphical libraries. To provide a common framework for
the interfaces, and to create a degree of standardization among them, I
developed a central Query Control Center (QCC) application. The QCC
application handles database connection, query selection, option
specification, and result output in multiple file formats, including data
export to Microsoft Excel spreadsheets. Users may choose from the list of
existing queries, or may write custom queries at runtime using SQL query
generation functionality. The application serves as a full-featured query
center, and is easily extensible so that future queries can be added.
Another project that I was involved in was a collaborative effort between Lyle
Burgoon, fellow co-op student Willis Lang, and myself. This project involved
analysis of a data set produced by studies performed by Jeremy Burt, and had
us attempting to find short sequence fragments ("words") that were being
over-expressed in the genomes of treated mice. To accomplish this task, we created
a design that involved database insertion of sequences for the entire mouse
genome, sequence-based word generation, word insertion, word count generation,
and statistical analysis of the word count information. Due to the sheer
volume of data produced by this effort, we were often forced to develop
innovative ways to manage the data feasibly and extrapolate the necessary results.
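The word generation and word count steps described above can be illustrated with a minimal sketch. This is not the code we wrote for the project: the class name `WordCounter`, the method `countWords`, and the choice of overlapping fixed-length words held in an in-memory map are all assumptions made for illustration (the actual design stored sequences, words, and counts in a database).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not the project's actual code.
final class WordCounter {

    /**
     * Counts every overlapping word of length wordLength in the sequence.
     * Assumption: words are fixed-length substrings taken at every position.
     */
    static Map<String, Integer> countWords(String sequence, int wordLength) {
        if (wordLength <= 0) {
            throw new IllegalArgumentException("wordLength must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        String seq = sequence.toUpperCase();
        // Slide a window of wordLength bases along the sequence.
        for (int i = 0; i + wordLength <= seq.length(); i++) {
            String word = seq.substring(i, i + wordLength);
            counts.merge(word, 1, Integer::sum); // add 1 to the running count
        }
        return counts;
    }
}
```

Calling `WordCounter.countWords("ACGTACGTAC", 4)` yields one count per distinct four-base word. For a whole genome the counts would not fit comfortably in memory in this form, which is the kind of data volume problem mentioned above.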
My final major project involved the creation of a data visualization
application intended to produce useful visual representations of data to
illustrate biologically significant relationships. The initial goal was to
provide the lab with a central application that could serve as the launch
point for any visualizations that might be developed in-house, as well as to
begin development of the first of these visualizations, intended to display
Gene Ontology numbers, gene count information, and probability values
as a 3D contour plot. Collaborating with Willis Lang on the
launching application for the Visualization Control Center (VCC), we developed
a system by which users may input Excel spreadsheet data (or simply a tab-
delimited text file) to the VCC to have the data visualized. Willis created a
framework to allow filtering of the data and the display of related annotation
information, functionality that would be common to any visualization we may
want to create in the future (and hence forms the application "core"). I
developed the first of the VCC visualizations, a multi-dimensional
visualization tool that allows for the creation of scatter plots, connected
line plots, and height plots, among others. Each plot type features a host of
display options, including coloring options, gradient selection (mapping
colors to data values to illustrate peaks and valleys), and a variety of
others. Three-dimensional plots may also be rotated and zoomed using the
mouse, and individual data points may be clicked to bring up specific
information regarding the data. Data sets of any type may be submitted, and
the plug-in handles string data using even axis spacing, and scales
numeric data based upon the extreme values represented in the data set. I
developed the multi-dimensional visualization tool using a combination of the
Java3D libraries and the Swing graphical libraries.
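The axis handling and gradient selection described above can be sketched roughly as follows. This is an illustrative reconstruction, not the plug-in's actual code: the class `PlotScaling`, its method names, the choice of a unit-length axis, and the two-color linear gradient are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the plug-in's actual code.
final class PlotScaling {

    /** Maps a numeric value onto a unit axis using the data set's extreme values. */
    static double normalize(double value, double min, double max) {
        if (max == min) {
            return 0.5; // all values identical: place them mid-axis
        }
        return (value - min) / (max - min);
    }

    /** Spaces distinct string labels evenly along a unit axis, in first-seen order. */
    static Map<String, Double> categoryPositions(List<String> labels) {
        Map<String, Double> positions = new LinkedHashMap<>();
        for (String label : labels) {
            positions.putIfAbsent(label, 0.0); // keep one entry per distinct label
        }
        int n = positions.size();
        int i = 0;
        for (Map.Entry<String, Double> entry : positions.entrySet()) {
            entry.setValue(n == 1 ? 0.5 : (double) i / (n - 1));
            i++;
        }
        return positions;
    }

    /** Blends two packed 0xRRGGBB colors linearly; t is clamped to [0, 1]. */
    static int gradient(int lowRgb, int highRgb, double t) {
        double c = Math.max(0.0, Math.min(1.0, t));
        int r = blend((lowRgb >> 16) & 0xFF, (highRgb >> 16) & 0xFF, c);
        int g = blend((lowRgb >> 8) & 0xFF, (highRgb >> 8) & 0xFF, c);
        int b = blend(lowRgb & 0xFF, highRgb & 0xFF, c);
        return (r << 16) | (g << 8) | b;
    }

    private static int blend(int low, int high, double t) {
        return (int) Math.round(low + t * (high - low));
    }
}
```

A data value would first be passed through `normalize` and the result handed to `gradient`, so that the smallest value receives the low color and the largest the high color. The packed integer can be turned into a drawable color with `new java.awt.Color(rgb)`.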
In terms of wet-lab experience, I had the opportunity to perform PCR
amplifications of cDNA, and used gel electrophoresis to verify the PCR
amplification results.
|