next up previous contents
Next: Optimizations Up: Hidden Markov Models Previous: The Viterbi Algorithm   Contents

The Baum-Welch Training Method

So far, we have seen some really nice results about HMM's, but we still cannot do anything about the dice man, because we do not know what $\lambda$ is. Specifically, we need to figure out what $\pi$, $A$, and $B$ are before we can calculate any probabilities or state sequences. But how can we find these out? In short, we guess what they are. In actuality, we make a very educated guess about what they are.

First, since we don't know any of the values for the model parameters, we randomize all of the parameters. Thus, every sequence of states or transitions should be equally likely. Next, we calculate $\alpha$, $\beta$, and $\gamma$, and determine some new values. The first value we compute is the expected number of times we'll be in a given state $s_{i}$ at time $t=1$. This is the value of $\gamma_{1}(i)$, and will be the new value of $\pi_{i}$. Similarly, we can find new values for $A$ and $B$. The proof of why we can do this relies on some heavy duty statistics, specifically gradient descent and EM procedures, and is therefore beyond the scope of this paper. The interested reader should refer to [2]. What's imporant about the Baum-Welch Training Method is that we can feed in observation sequences, and as long as we present enough data, we will get an HMM that is pretty close to optimal for the given training sequences. Thus, if the training sequences are characteristic of the actual distribution of observation sequences in the system we're modelling, our HMM will be able to tell us what state the system is in with a high degree of certainty. It must be noted, however, that as with any gradient descent algorithm, it is not guaranteed that the HMM will end up in a state of globally minimal error. Instead, it settles into a local minimum, which hopefully is not too far from the global minimum.


next up previous contents
Next: Optimizations Up: Hidden Markov Models Previous: The Viterbi Algorithm   Contents
Ben Taitelbaum 2003-12-31