OK, so I was looking for some programming examples using Markov chains for my work. (This was for a new method of generating packet delays and loss patterns for the NIST Net network emulator, if you're curious.) I didn't really find anything that useful, but I came across one of those Markov chain text generators, Frog, by Kai Risku. Later, in an idle moment (not on government time, of course), I looked for something to do with it.
If you're not familiar with these things, they work in two passes. In the first pass, the analyzer looks at some corpus of text, the bigger the better. From this, it builds up a table of how likely each word is to follow some given sequence of words. In the case of Frog, it builds a table of the likelihood that a word X follows the two previous words A and B, or in other words, a Markov chain with a history of two. Then in the second pass, the generator uses this probability table to construct random text, repeatedly picking the next word at random according to those likelihoods.
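For the curious, the two passes can be sketched in a few lines of Python. This is not Frog's actual code, just a minimal illustration of the same idea under my own assumptions: words are split on whitespace, the table stores raw counts rather than normalized probabilities, and the names build_table and generate are made up for this sketch.

```python
import random
from collections import Counter, defaultdict


def build_table(text):
    """First pass: count which word X follows each pair of words (A, B)."""
    words = text.split()
    table = defaultdict(Counter)
    for a, b, x in zip(words, words[1:], words[2:]):
        table[(a, b)][x] += 1
    return table


def generate(table, length=20, seed=None):
    """Second pass: random walk through the table, weighted by the counts."""
    rng = random.Random(seed)
    a, b = rng.choice(sorted(table))  # start from some pair seen in the corpus
    out = [a, b]
    while len(out) < length:
        followers = table.get((a, b))
        if not followers:  # this pair only occurred at the very end of the corpus
            break
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        x = rng.choices(candidates, weights=weights)[0]
        out.append(x)
        a, b = b, x  # slide the two-word history forward
    return " ".join(out)


if __name__ == "__main__":
    # A made-up toy corpus, only for demonstration.
    corpus = "the cat sat on the mat and the cat ate the fish and the dog sat on the rug"
    print(generate(build_table(corpus), length=15))
```

Keeping counts instead of probabilities is just a convenience: random.choices accepts relative weights, so nothing needs to be normalized. With a corpus this small the output mostly parrots the input, which is why bigger is better.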
To further explain, let's do a tiny example. Say the text I'm starting with is the following collection of sentences (deliberately made kind of repetitious):
OK, so my example was boring. What happens when you run this on bigger bodies of text? Applying it to something like a novel usually doesn't work so well - after a little bit, the text degenerates into incoherency. It seems to work best when there is some repetitive structure to the text to keep it from wandering too far, but not so much repetition that no interesting results are possible.
Well, the text I happened to have handy that meets this criterion is the UNIX[1] fortune database. (Click here for the latest (9708) version.) So anyway,
The ibiblio site also has a fairly small French fortune database. Un"fortun"ately, it's a little too small, but what the heck. We frog it and:
For completeness' sake, here is what you get running frog on the "offensive" fortunes. Often, this seems to give the most sublimely absurd results of all. Be warned, though, that some offensive content may survive the transformation.
I guess after assaulting everyone else's text, I should do my own, too, so I applied it to my manga translations. The result reads like:
The fortunes are automatically updated every time I update the webpage, so check back often. Or not...
O Fortuna, velut Luna
statu variabilis,
semper crescis aut decrescis;
vita detestabilis
nunc obdurat et tunc curat
ludo mentis aciem,
egestatem, potestatem
dissolvit ut glaciem.
Sors immanis et inanis,
rota tu volubilis,
status malus, vana salus
semper dissolubilis,
obumbrata et velata
michi quoque niteris;
nunc per ludum dorsum nudum
fero tui sceleris.
Sors salutis et virtutis
michi nunc contraria,
est affectus et defectus
semper in angaria.
Hac in hora sine mora
corde pulsum tangite;
quod per sortem sternit fortem
mecum omnes plangite!
Fortuna Imperatrix Mundi, Carmina Burana, att. Nicholas de
Bracton of Leicester.[2]
1. I am reliably informed that UNIX is still
the trademark of somebody or other.
2. Unfortunately, the Fortuna song is too
small a text to do much of anything with...
Comments, complaints, contributions? Please to be so kind to address them to mahousu@gmail.com