Chengwei LEI, Ph.D.    Assistant Professor

Department of Computer & Electrical Engineering/Computer Science
California State University, Bakersfield

Research Interests

Google Scholar Link

"Incipient Stage" Research Ideas

My research interests lie in the broad areas of bioinformatics, data mining, network topology analysis, and clustering problems. I am interested in developing data-analytical methods and tools to make complex biological data more understandable and useful.

 

Specifically, I am currently conducting research in the following areas:

 

System Modeling, Data Analysis and Risk Evaluation for Power Grid System Failures

  • Probability Based Circuit Breaker Modeling and Risk Evaluation  
         Circuit breakers are widely used in power system protection to interrupt fault current. Previous research has generally treated the tripping time as a constant, which ignores the differences between individual circuit breakers. In this work, we propose a probability-based model to describe the tripping behavior of circuit breakers.
    •     We develop a new simulation model that uses probabilistic tools to describe the tripping characteristics realistically. A product failure rate is also included to reflect conditions that can arise in real-world applications of thermal-magnetic circuit breakers.
          My related publication: APEC 2017

    •    Circuit breakers can be manually or automatically reset after being tripped. Over time, some mechanisms inside a circuit breaker may age and fatigue, and some connections and contacts may become loose or misaligned. In this work, we propose a probability-based simulation modeling methodology for worn circuit breakers.
          My related publication: IEEE CYBER 2017 (still working on this)

  • Fuse modeling and fault study
         Fuses act as sacrificial protective devices against over-current faults in AC and DC electronic circuits, and are widely applied in modern power generation, transmission and distribution, and load service systems. To study the thermal energy involved in the fuse melting process, we conduct data analysis on the time/current curve provided by the manufacturer.
    •     We present a new modeling method that provides an optimal equation describing the relationship between time, different levels of fault current, and the thermal energy that melts the fuse.
          My related publication: PESGM 2016

    •     The melting of a fuse depends on current and time. In practice, however, ambient temperature is also a major factor, especially when fuses are exposed to seasonal temperature extremes. In this work, we will analyze the time/current curve while taking the temperature effect into account.
          My related publication: (still working on this) 
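
As a rough illustration of the probability-based circuit breaker idea described above, the following Python sketch draws tripping times from a distribution instead of using a single constant, and lets a fraction of breakers fail to trip. It is only a minimal sketch under stated assumptions: the lognormal spread, the function names (`simulate_trip_times`, `summarize`), and all parameter values are hypothetical and are not taken from the cited publications.

```python
import random


def simulate_trip_times(nominal_s, spread, failure_rate, n, seed=0):
    """Draw n trip outcomes for one fault level.

    Assumption (hypothetical): trip time is lognormal with median nominal_s.
    A value of None marks a breaker that failed to trip at all.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        if rng.random() < failure_rate:
            outcomes.append(None)  # product failure: no trip
        else:
            outcomes.append(nominal_s * rng.lognormvariate(0.0, spread))
    return outcomes


def summarize(outcomes):
    """Return the observed failure fraction and the median trip time."""
    tripped = sorted(t for t in outcomes if t is not None)
    return {
        "failure_fraction": 1 - len(tripped) / len(outcomes),
        "median_trip_s": tripped[len(tripped) // 2] if tripped else None,
    }


if __name__ == "__main__":
    runs = simulate_trip_times(nominal_s=0.05, spread=0.2,
                               failure_rate=0.01, n=10_000)
    print(summarize(runs))
```

Replacing the constant tripping time with a sampled one makes it possible to estimate risk measures, such as the fraction of faults not cleared within a given time, by repeated simulation.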

 

 

Identifying topological properties to characterize networks

  • Topological-profile-based network denoising
        We present an algorithm based on network topological profiles that removes spurious interactions and recovers missing ones through computational prediction, and that increases the accuracy of protein complex prediction by reducing the impact of hub nodes.

    •     To obtain accurate topological profiles, we introduce two types of resistance into the simple random walk model to develop a new algorithm: random walk with resistance (RWS). The resistances ensure that different starting nodes yield different topological profiles, and they effectively control the impact of hub nodes.
          My related publication: BIBM 2012, Bioinformatics

    •    Random walk with resistance (RWS) works well when its parameters are set properly, but choosing those parameters is difficult for the user. To address this problem, we developed a fully automated algorithm for predicting protein complexes.
          My related publication: Proteome Science
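
To make the notion of a per-node topological profile concrete, the sketch below computes the stationary visiting probabilities of a standard random walk with restart on a tiny graph. This is a swapped-in, well-known technique used for illustration only; it is not the RWS algorithm and does not include its two resistance terms. The function name, the `restart` value, and the example graph are assumptions.

```python
def random_walk_with_restart(adj, start, restart=0.3, iters=200):
    """Visiting probabilities of a random walk that restarts at `start`.

    adj maps each node to a list of its neighbors (undirected graph,
    every node is assumed to have at least one neighbor).
    """
    nodes = sorted(adj)
    p = {v: 0.0 for v in nodes}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # Spread the non-restarting mass evenly over the neighbors of u.
            share = (1 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[start] += restart  # restart mass returns to the start node
        p = nxt
    return p


if __name__ == "__main__":
    # A star graph: "h" is a hub connected to three leaves.
    star = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h"]}
    print(random_walk_with_restart(star, "a"))
```

On the star graph, the hub receives more probability than the starting leaf itself, which illustrates why hub nodes can dominate topological profiles unless their influence is controlled.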

 

 

Network function prediction and pathway discovery

  • Cancer subnetworks discovery
        A robust definition of cancer subnetworks can lead to better patient prognosis and more effective treatment plans. We develop methods to re-analyze the gene expression data of the independent cancer patient cohorts on which the current subnetworks were defined.
        My related publication: Genomics

  • Network-based classification
        We develop a novel computational method to analyze whole-genome DNA methylation data for endometrial tumors within the context of a human protein-protein interaction network, in order to identify subnetworks as potential epigenetic biomarkers for predicting tumor recurrence. (Joint work with my colleague Jamiul Jahid.)
        My related publication: ACMBCB 2012

 

 

Biological Sequence Analysis

  • Identifying TF binding sites in DNA sequences
        We propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimization technique called Particle Swarm Optimization (PSO). We use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs. The experimental results show that our method is both more efficient and more accurate than several existing algorithms.
        My related publication: International Journal of Computational Biology and Drug Design, BIBM'08 Workshops


        We further modify the standard PSO algorithm to handle discrete values, such as the characters in DNA sequences. Our algorithm uses both consensus and position-specific weight matrix representations; it models gaps explicitly and finds gapped motifs without any detailed prior knowledge of the gaps.
        My related publication: BioData Mining, EvoBio'10


  • Next-generation DNA sequencing data analysis
        To show that Nelf-b plays important roles in multiple aspects of transcriptional regulation in mammalian genomes, we processed the ChIP-Seq data and analyzed the peak information. The results show that genetic ablation of Nelf-b leads to deregulation of pol II pausing and defects in cell growth and survival.
        My related publication: Journal of Biological Chemistry


        How should peaks be detected when ChIP-Seq reads are mapped back to the genome? There are many interesting open problems here.
        My related publication: (still working on this) 

  • Cis-regulatory elements identification and analysis
        We propose a completely parameter-free and systematic method for constructing gene co-expression networks and predicting functional modules as well as cis-regulatory elements.
        My related publication: BMC Bioinformatics
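
The motif-finding items above mention consensus and position-specific weight matrix representations. As a small illustration of how the two relate, the following sketch builds a position frequency matrix (the frequency form that underlies a weight matrix) from a set of aligned, ungapped sites and reads a consensus string from it. The function names, the pseudocount, and the example sites are hypothetical, and the sketch does not include the PSO search or gap handling.

```python
from collections import Counter


def position_frequency_matrix(sites, alphabet="ACGT", pseudocount=1.0):
    """Per-position base frequencies for equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(alphabet)
    matrix = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        # The pseudocount keeps unseen bases from getting zero probability.
        matrix.append({b: (counts[b] + pseudocount) / total for b in alphabet})
    return matrix


def consensus(matrix):
    """Most frequent base at each position."""
    return "".join(max(col, key=col.get) for col in matrix)


if __name__ == "__main__":
    example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
    pfm = position_frequency_matrix(example_sites)
    print(consensus(pfm))
```

The consensus string keeps only the most frequent base per position, while the matrix retains how strongly each position is conserved.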