Welcome to the Zhou Lab homepage!

We are part of the Computational Biology and Bioinformatics program in the Department of Biological Sciences at the University of Southern California. We are also affiliated with the Department of Computer Science.

Current Research Topics:

(1) Method development

We develop computational and statistical methods for the integrative analysis of diverse genomic data sources, including transcriptomic data, epigenomic data, proteomic data, sequence data, and text data. Our data integration efforts include horizontal integration (combining datasets of the same type, such as multiple gene expression datasets) and vertical integration (combining datasets of different types, such as DNA methylation and expression data).
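
The difference between the two integration modes can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not one of our published methods: the function names, the per-dataset z-scoring step, and all values are hypothetical and chosen only to make the idea concrete.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize a list of numbers to mean 0 and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def horizontal_integrate(datasets):
    """Combine same-type datasets (hypothetical helper): z-score each gene
    within each dataset, then concatenate samples for genes found in all."""
    shared = set.intersection(*(set(d) for d in datasets))
    return {g: [z for d in datasets for z in zscore(d[g])] for g in sorted(shared)}

def vertical_integrate(expression, methylation):
    """Combine different data types measured on the same samples
    (hypothetical helper): pair the two measurements per sample."""
    return {g: list(zip(expression[g], methylation[g]))
            for g in sorted(set(expression) & set(methylation))}

# Toy, made-up values: gene -> per-sample measurements.
expr_a = {"GENE1": [1.0, 2.0, 3.0], "GENE2": [5.0, 5.0, 8.0]}
expr_b = {"GENE1": [10.0, 30.0], "GENE3": [0.5, 0.7]}
meth_a = {"GENE1": [0.9, 0.5, 0.1]}

combined = horizontal_integrate([expr_a, expr_b])  # same type, more samples
paired = vertical_integrate(expr_a, meth_a)        # same samples, two types
```

Standardizing within each dataset before pooling is one simple way to reduce platform-specific scale differences; real cross-platform integration generally requires more careful normalization.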

Develop methods for the integrative analysis of public gene expression repositories: The rapid accumulation of gene expression data (microarray data and RNA-seq data) translates into an urgent need for methods that can effectively integrate data generated by different platforms. Continuing our previous effort, we have been developing novel methods to integrate and analyze public gene expression repositories (GEO, ArrayExpress, and SRA).

Develop methods for the integrative analysis of multi-dimensional genomic data: Recent technology has made it possible to simultaneously perform multi-platform genomic profiling of biological samples, resulting in so-called multi-dimensional genomic data, defined as the genomic profiling of the same set of samples using multiple platforms (e.g., SNP, CNV, gene expression, DNA methylation). For example, The Cancer Genome Atlas (TCGA) project is generating multi-dimensional maps of the key genomic changes for ~2400 tumor samples. Such data provide unique opportunities to study the coordination between regulatory mechanisms on multiple levels, and the impact of this complexity on patient treatment and survival. As sequencing costs continue to fall, such data will accumulate rapidly. However, suitable analysis methods for multi-dimensional data are currently lacking: existing tools are designed for one-dimensional or at most two-dimensional genomic data. We are developing novel methods for analyzing multi-dimensional datasets for effective information extraction and hypothesis testing.

Develop algorithms for network-based data mining: Although many methods are available for the analysis of a single network, few algorithms exist for mining patterns across many massive networks. We have been developing a series of algorithms to mine frequent patterns across many biological networks. Utilizing these algorithms, we perform large-scale functional annotation and regulatory network reconstruction for yeast, mouse, and human.
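
To make the notion of a frequent pattern across many networks concrete, the sketch below handles the simplest possible case: finding edges that recur in at least a given number of networks. It is a simplified illustration only, not one of our algorithms; the function name, the `min_support` parameter, and the toy networks are assumptions made for the example.

```python
from collections import Counter

def frequent_edges(networks, min_support):
    """Return undirected edges that occur in at least `min_support` networks.

    `networks` is a list of edge lists; each edge is a (node, node) tuple.
    """
    counts = Counter()
    for edges in networks:
        # Use frozensets so (A, B) and (B, A) are the same undirected edge,
        # and a set so each edge is counted at most once per network.
        counts.update({frozenset(edge) for edge in edges})
    return {edge for edge, n in counts.items() if n >= min_support}

# Toy, made-up networks over genes A to E.
networks = [
    [("A", "B"), ("B", "C"), ("C", "D")],
    [("B", "A"), ("B", "C")],
    [("A", "B"), ("D", "E")],
]

recurrent = frequent_edges(networks, min_support=2)
```

Here the edges A-B and B-C meet the support threshold. Mining larger recurrent structures, such as dense subgraphs or paths, across massive networks is considerably harder than this edge-counting baseline.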

(2) Biological discovery

We have been applying the methods described above to study particular biological systems, including aging, cancer, development, and autoimmune diseases.

(3) Software development

We develop software for the integrative analysis of diverse datasets and data types, including:

Integrative Array Analyzer: This is the first software package to perform integrative analysis of cross-platform and cross-species microarray datasets.

Gene Aging Nexus: We are developing Gene Aging Nexus, a web-based data mining platform that is freely accessible to the biogerontological and geriatric research community for querying, analyzing, and visualizing aging-related genomic data sources, in particular microarray data.