Frog Fortune

OK, so I was looking for some programming examples using Markov chains for my work. (This was for a new method of generating packet delays and loss patterns for the NIST Net network emulator, if you're curious.) I didn't really find anything that useful, but I came across one of those Markov chain text generators, Frog, by Kai Risku. Later, in an idle moment (not on government time, of course), I looked for something to do with it.

Totally unnecessary background

If you're not familiar with these things, they work in two passes. In the first pass, the analyzer looks at some corpus of text, the bigger the better. From this, it builds up a table of how likely each word is to follow a given sequence of words. In the case of Frog, the table gives the likelihood that a word X follows the two previous words A and B; in other words, it is a Markov chain with a history of two (a second-order chain). Then in the second pass, the generator uses this probability table to construct random text.

To further explain, let's do a tiny example. Say the text I'm starting with is the following collection of sentences (deliberately made kind of repetitious):

  1. The other day, I saw a horse.
  2. The other day, my friend became a monk.
  3. The other night, my friend couldn't sleep.
  4. The other night, I saw a bat.
So looking at this, we can construct a series of rules, such as:
  1. "The other" is followed by "day," 50% of the time and "night," 50% of the time.
  2. "other day," is followed by "I" 50% of the time and "my" 50% of the time.
  3. "I saw" is followed by "a" 100% of the time.
... and so on and so on. Then, given this set of rules, we can randomly construct new text, using "The other" as a starting point. If you do this, you'll get some (rather uninspiring) result like:
  1. The other night, my friend became a monk.
  2. The other night, my friend couldn't sleep.
  3. The other day, my friend became a monk.
  4. The other day, my friend couldn't sleep.
  5. The other day, I saw a horse.
  6. The other night, I saw a horse.
  7. The other day, I saw a bat.
... and so on into infinity. The point, though, is the resulting text tends to be close to grammatical, even if it is not always so sensible.
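
For the curious, the two passes on the tiny example above can be sketched in a few lines of Python. This is only an illustration, not Frog's actual code: the names build_table, generate, and END are made up here, and splitting on whitespace (so that punctuation stays glued to its word, as in "day,") is an assumption chosen to match the rules listed above.

```python
import random
from collections import Counter, defaultdict

# The four example sentences from the text above.
CORPUS = [
    "The other day, I saw a horse.",
    "The other day, my friend became a monk.",
    "The other night, my friend couldn't sleep.",
    "The other night, I saw a bat.",
]

END = None  # hypothetical end-of-sentence marker, not something Frog defines


def build_table(sentences):
    """First pass: count which word follows each pair of consecutive words."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        # Pair up (A, B, X); the last pair of each sentence is followed by END.
        for a, b, x in zip(words, words[1:], words[2:] + [END]):
            table[(a, b)][x] += 1
    return table


def generate(table, start=("The", "other"), rng=random):
    """Second pass: starting from a word pair, pick each next word at random,
    weighted by how often it followed that pair in the corpus."""
    a, b = start
    out = [a, b]
    while True:
        followers = table.get((a, b))
        if not followers:
            break  # pair never seen in the corpus; nothing more to say
        nxt = rng.choices(list(followers), weights=list(followers.values()))[0]
        if nxt is END:
            break
        out.append(nxt)
        a, b = b, nxt
    return " ".join(out)


if __name__ == "__main__":
    table = build_table(CORPUS)
    for _ in range(5):
        print(generate(table))
```

With a corpus this small, the table only allows eight distinct sentences (two choices each at "day,"/"night,", "I"/"my", and the ending), so the output repeats quickly; a bigger corpus is what makes the results interesting.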

The fics and fortunes

OK, so my example was boring. What happens when you run this on bigger bodies of text? Applying it to something like a novel usually doesn't work so well: after a little bit, the text degenerates into incoherence. It seems to work best when there is some repetitive structure to the text to keep it from wandering too far, but not so much repetition that no interesting results are possible.

Well, the text I happened to have handy that meets this criterion is the UNIX[1] fortune database. (Click here for the latest (9708) version.) So anyway,

The ibiblio site also has a fairly small French fortune database. Un"fortun"ately, it's a little too small, but what the heck. We frog it and:

For completeness' sake, here is what you get running frog on the "offensive" fortunes. Often, this seems to give the most sublimely absurd results of all. Be warned, though, that some offensive content may survive the transformation.

I guess after assaulting everyone else's text, I should do my own, too, so I applied it to my manga translations. The result reads like:

The fortunes are automatically updated every time I update the webpage, so check back often. Or not...

Omake

O Fortuna, velut Luna
    statu variabilis,
semper crescis aut decrescis;
    vita detestabilis
nunc obdurat et tunc curat
    ludo mentis aciem,
egestatem, potestatem
    dissolvit ut glaciem.

Sors immanis et inanis,
    rota tu volubilis,
status malus, vana salus
    semper dissolubilis,
obumbrata et velata
    michi quoque niteris;
nunc per ludum dorsum nudum
    fero tui sceleris.

Sors salutis et virtutis
    michi nunc contraria,
est affectus et defectus
    semper in angaria.
Hac in hora sine mora
    corde pulsum tangite;
quod per sortem sternit fortem
    mecum omnes plangite!
Fortuna Imperatrix Mundi, Carmina Burana, att. Nicholas de Bracton of Leicester.[2]

1. I am reliably informed that UNIX is still the trademark of somebody or other.

2. Unfortunately, the Fortuna song is too small a text to do much of anything with...


Comments, complaints, contributions? Please to be so kind to address them to mahousu@gmail.com




Copyright 2000-2002 by Mahousu, but: May be freely redistributed by any means, for any purpose. I don't even care whether or not you acknowledge me. Have fun! [Portions, as marked, may be copyrighted by somebody else, with other restrictions. You should still have fun, though.]