Chengwei LEI, Ph.D.    Assistant Professor

Department of Computer & Electrical Engineering/Computer Science
California State University, Bakersfield

Research Interests

Google Scholar Link

My research interests lie in the broad area of bioinformatics, data mining, network topology analysis, and clustering problems. I am interested in developing data-analytical methods and tools to make complex biological data more understandable and useful.

 

Specifically, I am currently working in the following areas:

Network function prediction and pathway discovery

  • Cancer subtype discovery
        A robust definition of cancer subtypes can lead to better patient prognosis and more effective treatment plans. We are developing methods to re-analyze the gene expression data of several independent cancer patient cohorts on which the current subtypes were defined. (still working on this)
        My related publication:

  • Network-based classification
        We developed a novel computational method to analyze whole-genome DNA methylation data for endometrial tumors within the context of a human protein-protein interaction network, in order to identify subnetworks as potential epigenetic biomarkers for predicting tumor recurrence. (working together with my colleague Jamiul Jahid)
        My related publication: ACMBCB 2012

Identifying topological properties to characterize networks

  • Topological profile based network denoising
        We present a network topological profile based algorithm to remove spurious interactions and recover missing ones by computational predictions, and to increase the accuracy of complex prediction by reducing the impact of hub nodes.

    •     To obtain accurate topological profiles, we introduce two types of resistance into the simple random walk model, giving a new algorithm: Random walk with resistance (RWS). The resistances ensure that different starting nodes yield different topological profiles, and they effectively control the impact of hub nodes.
          My related publication: BIBM 2012, Bioinformatics

    •    Random walk with restart (RWR) is a widely used modification of the random walk. We are combining the ideas of restart and resistance to gain the advantages of both. (still working on this)
          My related publication:
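
To make the restart idea concrete, here is a minimal sketch of plain random walk with restart on a small undirected graph. It is only an illustration of the standard RWR iteration, not the RWS algorithm: the two resistance terms are not specified on this page and are not modeled. The function name, the toy graph, and the restart value 0.3 are assumptions chosen for the example.

```python
def random_walk_with_restart(neighbors, seed, restart=0.3, tol=1e-12, max_iter=10000):
    """Return the stationary visiting probabilities of a walker that starts at
    `seed`, moves to a uniformly chosen neighbor with probability 1 - restart,
    and jumps back to `seed` with probability `restart`.

    `neighbors` maps each node to a list of adjacent nodes; every node is
    assumed to have at least one neighbor.
    """
    nodes = list(neighbors)
    p = {v: 0.0 for v in nodes}
    p[seed] = 1.0
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # Probability mass at u is split evenly among its neighbors.
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        # The restart mass always returns to the seed node.
        nxt[seed] += restart
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: one hub with three neighbors, plus a leaf "d".
toy = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}
profile_a = random_walk_with_restart(toy, "a")
profile_d = random_walk_with_restart(toy, "d")
```

The resulting dictionary can be read as a topological profile of the seed node. Because of the restart term, profiles computed from different seeds differ, which is the property the resistance terms are also meant to ensure.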

Biological Sequence Analysis

  • Identify TF binding sites on DNA sequence
        We propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimization technique called Particle Swarm Optimization. We use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs. The experimental results show that our method is both more efficient and more accurate than several existing algorithms.
        My related publication: International Journal of Computational Biology and Drug Design, BIBM'08 workshops


        We further modify the standard PSO algorithm to handle discrete values, such as characters in DNA sequences. We use both consensus and position-specific weight matrix representations in our algorithm, which models gaps explicitly and finds gapped motifs without any detailed knowledge of gaps.
        My related publication: BioData Mining, EvoBio'10


  • Next-generation DNA sequencing data analysis
        To show that Nelf-b plays important roles in multiple aspects of transcriptional regulation in mammalian genomes, we processed the ChIP-Seq data and analyzed the peak information. The results show that genetic ablation of Nelf-b leads to deregulation of pol II pausing and defects in cell growth and survival.
        My related publication: Journal of Biological Chemistry


        How can peaks be detected when ChIP-Seq reads are mapped back to the genome? There are many interesting problems here. (still working on this)
        My related publication:

  • Cis-regulatory elements identification and analysis
        We propose a completely parameter-free and systematic method for constructing gene co-expression networks and predicting functional modules as well as cis-regulatory elements.
        My related publication: BMC Bioinformatics
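
As an illustration of the two motif representations mentioned under TF binding sites above, the following sketch builds a position-specific weight matrix from a few aligned sites, derives the consensus string from it, and scans a sequence for the best ungapped match. It does not implement the PSO search or the gap model; the function names, the pseudocount, the uniform background of 0.25, and the example sites are all assumptions made for this example.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    pwm = []
    for i in range(width):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        # Log-odds of each base at this position against a uniform background.
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def consensus(pwm):
    """Return the consensus string: the highest-scoring base in each column."""
    return "".join(max(column, key=column.get) for column in pwm)


def best_hit(pwm, sequence):
    """Slide the matrix along the sequence and return (start, score) of the best window."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical aligned binding sites, used only to exercise the functions.
sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pwm = build_pwm(sites)
motif = consensus(pwm)
hit = best_hit(pwm, "GGCGTATAATGCC")
```

The consensus string is a compact summary, while the weight matrix keeps the per-position variation, so a scoring function based on the matrix can still rank sites that differ from the consensus in one or two positions.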