2.7 DERIVING THE PARAMETERS OF A MARKOV MODEL FROM SLIDING WINDOW COUNTS
The Markov model parameters are defined from the sliding window counts of 2-grams {N(i, j)} derived from a large sample x = (x0, x1, …, xn−1) of text as follows:
We assume the sample size n is large enough so that for 0 ≤ i < m and that π satisfies
To prove Equation (2.16), we start with Equations (2.13) to (2.15), writing
This book provides three sets of Markov source parameters:
- Smarkov1 and Smarkov2: These Markov source parameters were derived from a nonsliding window count of 67,320 2-grams in the alphabet {A, B, …, Z} appearing in Abraham Sinkov's book [Sinkov, 1968]. P(j/i) was derived using Equation (2.15) from Sinkov's 2-gram counts and written to Smarkov2; thereafter, π(i) was calculated to satisfy Equation (2.3) and written to Smarkov1.
- Gmarkov1 and Gmarkov2: These Markov source parameters were derived ...
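
The estimation outlined at the start of this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used to build the parameter files above: the function names are hypothetical, P(j/i) is taken to be the usual row-normalised 2-gram count N(i, j) / Σ_j N(i, j), and π is approximated by repeatedly applying π ← πP, which converges only when the chain is irreducible and aperiodic. Rows with no observed 2-grams are filled with a uniform distribution, which is also an assumption.

```python
def sliding_window_counts(text, alphabet):
    """Return the m x m matrix N(i, j) of sliding-window 2-gram counts.

    N(i, j) counts positions k with text[k] == alphabet[i] and
    text[k + 1] == alphabet[j], for 0 <= k < n - 1.
    """
    index = {ch: k for k, ch in enumerate(alphabet)}
    m = len(alphabet)
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(text, text[1:]):
        counts[index[a]][index[b]] += 1
    return counts


def transition_probabilities(counts):
    """Estimate P(j/i) by normalising each row of the count matrix.

    Assumption: a row with no observations becomes uniform so that
    every row is still a probability distribution.
    """
    m = len(counts)
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([1.0 / m] * m)
        else:
            matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, iterations=1000):
    """Approximate pi with pi = pi * P by power iteration.

    Assumption: the chain is irreducible and aperiodic, so the
    iteration converges from the uniform starting vector.
    """
    m = len(matrix)
    pi = [1.0 / m] * m
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(m)) for j in range(m)]
    return pi


# Toy sample over the two-letter alphabet {A, B}.
counts = sliding_window_counts("ABABBA", "AB")
P = transition_probabilities(counts)
pi = stationary_distribution(P)
```

For the toy sample the 2-grams are AB, BA, AB, BB, BA, so the count matrix is [[0, 2], [2, 1]]. With a real corpus over {A, B, …, Z} the same functions apply with a 26-letter alphabet string, provided the text has first been reduced to those letters.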