OK, so I was looking for some programming examples using Markov chains for my work. (This was for a new method of generating packet delays and loss patterns for the NIST Net network emulator, if you're curious.) I didn't really find anything that useful, but I came across one of those Markov chain text generators, Frog, by Kai Risku. Later, in an idle moment (not on government time, of course), I looked for something to do with it.
If you're not familiar with these things, they work in two passes. In the first pass, the analyzer looks at some corpus of text, the bigger the better. From this, it builds up a table of how likely each word is to follow a given sequence of words. In the case of Frog, the table records how likely a word X is to follow the two previous words A and B; in other words, it's a Markov chain with a history of two (a second-order chain). Then in the second pass, the generator uses this probability table to construct random text.
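For the curious, the two passes can be sketched in a few lines of Python. This is only my own minimal illustration of the idea, not Frog's actual code: the names `build_table` and `generate`, the whitespace-only word splitting, and the sample text are all assumptions made for the sketch.

```python
import random
from collections import defaultdict


def build_table(text):
    """First pass: map each pair of consecutive words (A, B) to the
    list of words X seen right after that pair in the corpus."""
    words = text.split()  # assumption: words are whatever whitespace separates
    table = defaultdict(list)
    for a, b, x in zip(words, words[1:], words[2:]):
        # Repeats are kept on purpose, so a follower that shows up
        # more often is proportionally more likely to be picked later.
        table[(a, b)].append(x)
    return table


def generate(table, length=30, seed=None):
    """Second pass: random walk through the table, two words of history."""
    rng = random.Random(seed)
    a, b = rng.choice(sorted(table))  # sorted so a fixed seed is repeatable
    out = [a, b]
    while len(out) < length:
        followers = table.get((a, b))
        if not followers:
            break  # this pair only occurred at the very end of the corpus
        x = rng.choice(followers)
        out.append(x)
        a, b = b, x  # slide the two-word window forward
    return " ".join(out)


if __name__ == "__main__":
    # Made-up sample text, just to have something to feed it.
    sample = "the cat sat on the mat and the cat ran to the mat"
    print(generate(build_table(sample), length=12))
```

Storing followers as a plain list with duplicates is the lazy way to get the probabilities: picking uniformly from the list is the same as picking each word in proportion to how often it followed that pair.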
To further explain, let's do a tiny example. Say the text I'm starting with is the following collection of sentences (deliberately made kind of repetitious):
OK, so my example was boring. What happens when you run this on bigger bodies of text? Applying it to something like a novel usually doesn't work so well - after a little bit, the text degenerates into incoherency. It seems to work best when there is some repetitive structure to the text to keep it from wandering too far, but not so much repetition that no interesting results are possible.
Well, the text I happened to have handy that meets this criterion is the UNIX fortune database. (Click here for the latest (9708) version.) So anyway,
The ibiblio site also has a fairly small French fortune database. Un"fortun"ately, it's a little too small, but what the heck. We frog it and:
For completeness' sake, here is what you get running frog on the "offensive" fortunes. Often, this seems to give the most sublimely absurd results of all. Be warned, though, that some offensive content may survive the transformation.
I guess after assaulting everyone else's text, I should do my own, too, so I applied it to my manga translations. The result reads like:
O Fortuna, velut Luna statu variabilis, semper crescis aut decrescis; vita detestabilis nunc obdurat et tunc curat ludo mentis aciem, egestatem, potestatem dissolvit ut glaciem. Sors immanis et inanis, rota tu volubilis, status malus, vana salus semper dissolubilis, obumbrata et velata michi quoque niteris; nunc per ludum dorsum nudum fero tui sceleris. Sors salutis et virtutis michi nunc contraria, est affectus et defectus semper in angaria. Hac in hora sine mora corde pulsum tangite; quod per sortem sternit fortem mecum omnes plangite!
Fortuna Imperatrix Mundi, Carmina Burana, att. Nicholas de Bracton of Leicester.
1. I am reliably informed that UNIX is still the trademark of somebody or other.
2. Unfortunately, the Fortuna song is too small a text to do much of anything with...
Comments, complaints, contributions? Please to be so kind to address them to email@example.com
Copyright 2000-2002 by Mahousu, but: May be freely redistributed by any means, for any purpose. I don't even care whether or not you acknowledge me. Have fun! [Portions, as marked, may be copyrighted by somebody else, with other restrictions. You should still have fun, though.]