Chengwei LEI, Ph.D.    Associate Professor

Department of Computer and Electrical Engineering and Computer Science
California State University, Bakersfield

 

Data Science

 

Hidden Markov Models

 



 

Background about Markov Chain

 



An easy example for the weather model

Try writing a program that generates about three years (1000 days) of weather data from the model above, then fill in these tables:

Counting for All data
  Total Sunny Rainy
Days      
Percentage      

What is the weather after a sunny day
  Total Sunny Rainy
Days      
Percentage      

What is the weather before a sunny day
  Total Sunny Rainy
Days      
Percentage      
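
One way to approach the exercise above is sketched here in Python. The transition probabilities in TRANSITIONS are hypothetical placeholders, not the values from the course's model, and the helper names (simulate, count_after, count_before, report) are made up for this illustration. Replace the numbers with the ones from the model before filling in the tables.

```python
import random
from collections import Counter

# HYPOTHETICAL transition probabilities, chosen only for illustration.
# TRANSITIONS[today][tomorrow] = P(tomorrow | today)
TRANSITIONS = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}

def simulate(transitions, start, n_days, rng):
    """Walk the Markov chain for n_days, starting from `start`."""
    days = [start]
    while len(days) < n_days:
        row = transitions[days[-1]]
        states = list(row)
        weights = [row[s] for s in states]
        days.append(rng.choices(states, weights=weights)[0])
    return days

def count_all(days):
    """How many days of each kind there are."""
    return Counter(days)

def count_after(days, state):
    """Weather on the day AFTER each day whose weather is `state`."""
    return Counter(b for a, b in zip(days, days[1:]) if a == state)

def count_before(days, state):
    """Weather on the day BEFORE each day whose weather is `state`."""
    return Counter(a for a, b in zip(days, days[1:]) if b == state)

def report(title, counts):
    total = sum(counts.values())
    print(title, "(total:", total, "days)")
    for state, n in sorted(counts.items()):
        print(f"  {state}: {n} days, {100.0 * n / total:.1f}%")

weather = simulate(TRANSITIONS, "Sunny", 1000, random.Random(0))
report("Counting for all data", count_all(weather))
report("After a sunny day", count_after(weather, "Sunny"))
report("Before a sunny day", count_before(weather, "Sunny"))
```

The same functions work for the three-state model once a Cloudy row and column are added to TRANSITIONS.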





Let's look at a more complicated model

Here is a model for Sunny, Rainy, Cloudy weather

Counting for All data
  Total Sunny Rainy Cloudy
Days        
Percentage        

What is the weather after a sunny day
  Total Sunny Rainy Cloudy
Days        
Percentage        

What is the weather before a sunny day
  Total Sunny Rainy Cloudy
Days        
Percentage        





Based on the following weather data, can you build an estimated model?

Sunny_Rainy_Cloudy (recorded by numbers)
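
A common way to estimate a Markov chain from recorded data is to count how often each state follows each other state and normalise the counts row by row. The sketch below assumes the days are coded as the characters 0, 1 and 2; that coding, the toy record, and the function name estimate_transitions are assumptions for illustration, so check the actual data file for the coding it uses.

```python
from collections import Counter

def estimate_transitions(seq, states):
    """Estimate P(next | current) as count(current -> next) / count(current -> any)."""
    pair_counts = Counter(zip(seq, seq[1:]))
    matrix = {}
    for a in states:
        row_total = sum(pair_counts[(a, b)] for b in states)
        matrix[a] = {
            b: (pair_counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix

# Toy record with an ASSUMED coding (0 = Sunny, 1 = Rainy, 2 = Cloudy);
# it is not taken from the course data.
record = "0012210001120"
model = estimate_transitions(record, "012")
for state, row in model.items():
    print(state, {nxt: round(p, 2) for nxt, p in row.items()})
```

With enough data, the estimated rows should come close to the transition probabilities of the model that generated the record.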



 

Graduate Student Observation Example

 

Matrix Magic

 

Hidden Markov Model

 



The casino's box of dice:

  • 99% of the dice are fair and 1% are loaded (a loaded die shows a six 50% of the time).
  • Pick a die at random and roll it. What is the chance of getting a six?
  • If we get 3 sixes in a row, what is the chance that the die is loaded?
  • If we get 5 sixes in a row, what is the chance that the die is loaded?
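
These questions can be worked out with the law of total probability and Bayes' rule. The sketch below assumes that one die is picked at random and then rolled repeatedly (the same die for every roll); the function names are made up for this illustration. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

P_LOADED = Fraction(1, 100)    # 1% of the dice are loaded
P_FAIR = 1 - P_LOADED          # 99% are fair
SIX_FAIR = Fraction(1, 6)      # P(six | fair die)
SIX_LOADED = Fraction(1, 2)    # P(six | loaded die)

def prob_six():
    """P(six) for one roll of a randomly picked die (total probability)."""
    return P_FAIR * SIX_FAIR + P_LOADED * SIX_LOADED

def prob_loaded_given_sixes(n):
    """P(loaded | n sixes in a row), by Bayes' rule."""
    loaded = P_LOADED * SIX_LOADED ** n
    fair = P_FAIR * SIX_FAIR ** n
    return loaded / (loaded + fair)

print(prob_six())                  # 17/100
print(prob_loaded_given_sixes(3))  # 3/14
print(prob_loaded_given_sixes(5))  # 27/38
```

So a single six is only slightly more likely than with a fair die (17%), three sixes in a row make a loaded die about 21% likely, and five in a row make it about 71% likely.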

Probability Basics




!!!DISHONEST CASINO!!!

A casino has two dice. The casino player switches back and forth between the fair die and the loaded die once in a while.







The dice are unfair!!!!!!!!!!



Fair die
P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6

Loaded die
P(1) = P(2) = P(3) = P(4) = P(5) = 1/10
P(6) = 1/2

Transition probability (switching between fair and loaded die)
P(trans) = 0.05
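
To get a feel for the model, it can help to generate data from it. The sketch below reads P(trans) = 0.05 as the chance of switching dice after each roll, in either direction, and assumes the casino starts with the fair die; both readings are assumptions, as is the function name roll_casino.

```python
import random

FACES = list("123456")
EMISSIONS = {
    "F": [1 / 6] * 6,             # fair die
    "L": [1 / 10] * 5 + [1 / 2],  # loaded die: six comes up half the time
}
P_SWITCH = 0.05  # assumed to apply after every roll, in both directions

def roll_casino(n, rng, start="F"):
    """Return (rolls, states): n outcomes and the hidden die behind each one."""
    state = start  # ASSUMPTION: the first roll uses the fair die
    rolls, states = [], []
    for _ in range(n):
        states.append(state)
        rolls.append(rng.choices(FACES, weights=EMISSIONS[state])[0])
        if rng.random() < P_SWITCH:
            state = "L" if state == "F" else "F"
    return "".join(rolls), "".join(states)

rolls, states = roll_casino(60, random.Random(1))
print(rolls)
print(states)
```

Printing the two strings one above the other shows which stretches of rolls came from which die, which is exactly the information the player cannot see.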

More detailed information






Beat the DISHONEST CASINO with HMM !!!

How?





Consider the dice model above. We have a series of die-roll outcomes:

61226351241253662512

What is the probability that the first half was generated by the fair die and the second half by the loaded die?


Evaluation
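
A minimal sketch of one way to answer this question. The joint probability of the rolls and a given state path is the product of the start, transition and emission probabilities along the path. Dividing it by the total probability of the rolls (computed with the forward algorithm) gives the probability of that path given the rolls. The 50/50 choice of the first die is an assumption, since the model above does not state a starting distribution, and the function names are made up for this illustration.

```python
STATES = "FL"
EMIT = {
    "F": {r: 1 / 6 for r in "123456"},
    "L": {**{r: 1 / 10 for r in "12345"}, "6": 1 / 2},
}
TRANS = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.05, "L": 0.95}}
START = {"F": 0.5, "L": 0.5}  # ASSUMPTION: either die is equally likely at first

def joint_probability(rolls, path):
    """P(rolls, path): probability of seeing these rolls along this exact path."""
    p = START[path[0]] * EMIT[path[0]][rolls[0]]
    for prev, cur, r in zip(path, path[1:], rolls[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][r]
    return p

def forward_probability(rolls):
    """P(rolls): sum over all hidden paths, computed with the forward algorithm."""
    alpha = {s: START[s] * EMIT[s][rolls[0]] for s in STATES}
    for r in rolls[1:]:
        alpha = {
            s: EMIT[s][r] * sum(alpha[q] * TRANS[q][s] for q in STATES)
            for s in STATES
        }
    return sum(alpha.values())

rolls = "61226351241253662512"
path = "F" * 10 + "L" * 10
joint = joint_probability(rolls, path)
total = forward_probability(rolls)
print("P(rolls, path) =", joint)
print("P(rolls)       =", total)
print("P(path | rolls) =", joint / total)
```

For sequences much longer than this one, the products become too small for floating point, so the same computation is usually done with logarithms or with rescaling at each step.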














Consider the dice model above. We have a series of die-roll outcomes:

61226351241253662512......123643452416243

Try your best to guess which outcomes were generated by the fair die and which by the loaded die.


Decoding
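
The usual tool for this kind of guess is the Viterbi algorithm, which finds the single most probable sequence of hidden states. The sketch below works in log space to avoid underflow on long inputs. As before, the 50/50 starting distribution is an assumption, and the function name viterbi is made up for this illustration. The output uses F and L, matching the sample answer format.

```python
import math

STATES = "FL"
EMIT = {
    "F": {r: 1 / 6 for r in "123456"},
    "L": {**{r: 1 / 10 for r in "12345"}, "6": 1 / 2},
}
TRANS = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.05, "L": 0.95}}
START = {"F": 0.5, "L": 0.5}  # ASSUMPTION: not given in the model

def viterbi(rolls):
    """Return the most probable state path for `rolls` as a string of F and L."""
    score = {s: math.log(START[s]) + math.log(EMIT[s][rolls[0]]) for s in STATES}
    back = []  # back[t][s] = best previous state if roll t+1 is in state s
    for r in rolls[1:]:
        pointers, new_score = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda q: score[q] + math.log(TRANS[q][s]))
            pointers[s] = best
            new_score[s] = (score[best] + math.log(TRANS[best][s])
                            + math.log(EMIT[s][r]))
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

rolls = "61226351241253662512"
print(rolls)
print(viterbi(rolls))
```

To produce an answer file for a longer sequence, read the outcomes into a string, call viterbi on it, and write the returned string to a text file.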

Are you sure you can do it? Yes? Then try this one! (1000 outcomes)

You can evaluate your guess on Odin with the following command:
/home/fac/clei/checker/HMM/DiceChecker1000 WhateverYourAns.txt
Sample Answer (F stands for Fair, L stands for Loaded)










Our inside source was caught by the evil sheriff!!

We only know that the casino uses 2 dice; we have no other information!!

Given the following series of die-roll outcomes:

61226351241253662512......123643452416243

Try your best to guess which outcomes were generated by the first die and which by the second die.


Learning
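
When the parameters are unknown, a standard approach is the Baum-Welch algorithm (expectation-maximisation for HMMs): start from a random guess, compute how likely each state is at each roll, re-estimate the parameters from those expectations, and repeat. The sketch below is a plain-Python version with per-step scaling. The function names, the random initialisation, and the iteration count are choices made for this illustration. Note that which die ends up labelled F (first) and which S (second) is arbitrary, and different seeds may converge to different local optima.

```python
import math
import random

def normalised(row):
    """Scale a row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    return [v / total for v in row] if total > 0 else [1.0 / len(row)] * len(row)

def baum_welch(rolls, n_states=2, n_iter=30, seed=0):
    """Fit an HMM to `rolls` by EM; returns (start, trans, emit, gamma, logliks)."""
    rng = random.Random(seed)
    symbols = sorted(set(rolls))
    obs = [symbols.index(r) for r in rolls]
    T, N, M = len(obs), n_states, len(symbols)
    # Random, slightly asymmetric starting guess.
    start = normalised([1 + rng.random() for _ in range(N)])
    trans = [normalised([1 + rng.random() for _ in range(N)]) for _ in range(N)]
    emit = [normalised([1 + rng.random() for _ in range(M)]) for _ in range(N)]
    logliks = []
    for _ in range(n_iter):
        # E-step: forward pass, rescaled at every step to avoid underflow.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                row = [start[i] * emit[i][obs[0]] for i in range(N)]
            else:
                row = [emit[j][obs[t]]
                       * sum(alpha[t - 1][i] * trans[i][j] for i in range(N))
                       for j in range(N)]
            scale.append(sum(row))
            alpha.append([v / scale[t] for v in row])
        logliks.append(sum(math.log(c) for c in scale))  # log P(rolls | model)
        # E-step: backward pass, using the same scaling factors.
        beta = [[1.0] * N for _ in range(T)]
        for t in range(T - 2, -1, -1):
            beta[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                           for j in range(N)) / scale[t + 1] for i in range(N)]
        # gamma[t][i] = P(state i at roll t | rolls); xi = expected transitions.
        gamma = [normalised([alpha[t][i] * beta[t][i] for i in range(N)])
                 for t in range(T)]
        xi = [[0.0] * N for _ in range(N)]
        for t in range(T - 1):
            for i in range(N):
                for j in range(N):
                    xi[i][j] += (alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                                 * beta[t + 1][j] / scale[t + 1])
        # M-step: re-estimate the parameters from the expected counts.
        start = gamma[0][:]
        trans = [normalised(xi[i]) for i in range(N)]
        emit = [normalised([sum(g[i] for g, o in zip(gamma, obs) if o == k)
                            for k in range(M)]) for i in range(N)]
    return start, trans, emit, gamma, logliks

rolls = "61226351241253662512"  # only the visible start of the sequence
start, trans, emit, gamma, logliks = baum_welch(rolls)
# Label each roll with the more probable die (posterior decoding).
labels = "".join("F" if g[0] >= g[1] else "S" for g in gamma)
print(labels)
print("log-likelihood: first", logliks[0], "last", logliks[-1])
```

Twenty rolls are far too few to learn anything reliable; the sketch is meant to be run on the full sequence. In pure Python, 10000 outcomes may take a while per iteration, so a vectorised implementation may be preferable there.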

Try this! (10000 outcomes)

You can evaluate your guess on Odin with the following command:
/home/fac/clei/checker/HMM/DiceChecker10000 WhateverYourAns.txt
Sample Answer (F stands for First die, S stands for Second die)










Day and Night is a television series directed by Wang Wei and written by Zhiwen. The series tells the story of Guan Hongfeng, the former captain of the Changfeng Criminal Investigation Detachment, who solves many cases to get his brother Guan Hongyu exonerated.



Guan Hongfeng, a former police captain suffering from nyctophobia, returns to solving mysteries alongside the hot-tempered Captain Zhou Xun and rookie officer Zhou Shutong. However, he has a hidden agenda, which is to clear his identical twin brother Guan Hongyu's name from the alleged murder of an entire family.



Assume you are Captain Zhou Xun, with a Ph.D. in Computer Science. :)
Can you design an HMM to figure out their identities based on their behavior patterns? How?



My data for SR: SR1 SR2 SR3 SR4 SR5

My data for SRC: SRC1 SRC2 SRC3 SRC4 SRC5

136244352