Markov’s revenge

Posted:

I’ve been getting about 5-10 comments per day from what are either deranged people or mindless automatons. I think the latter is the more likely suspect.

There is a certain irony in my blog being attacked by robots spewing nearly intelligent prose like:

If you’ve never heard Radiohead, you’re missing out.

or

You’re absolutely right — I have not attended the program, and what I’ve supposed about the community and its cohesiveness is pure speculation on my part.

While I can’t be sure of the exact technique used to generate the bodies of these comments, they seem to draw on a large body of human-written text, perhaps gathered from forum posts or Twitter. In any case, it looks like a random number of words is selected from this corpus and plopped into a message to taskboy.
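To make that guess concrete, here is a minimal Python sketch of what such a bot might be doing: pick a random starting point in a corpus and copy out a random-length run of consecutive words. This is pure speculation about the spammers' method. The function name `random_snippet` and its length limits are made up for illustration.

```python
import random

def random_snippet(corpus_words, rng, min_words=5, max_words=25):
    """Return a random-length run of consecutive words from the corpus.

    Hypothetical helper: the name and the default limits are assumptions.
    """
    n = len(corpus_words)
    # Clamp both limits so a short corpus cannot break randint().
    length = rng.randint(min(min_words, n), min(max_words, n))
    # Choose a start position that leaves room for `length` words.
    start = rng.randrange(n - length + 1)
    return " ".join(corpus_words[start:start + length])

if __name__ == "__main__":
    corpus = ("if you have never heard this band you are missing out "
              "and what I have supposed is pure speculation on my part").split()
    print(random_snippet(corpus, random.Random(), min_words=4, max_words=8))
```

Because the words are lifted in order from real sentences, short snippets read as almost sensible, which matches the comments above.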

The irony here is that I once used a slightly more sophisticated version of this technique to auto-generate journal posts to use.perl.org. The technique that generates nearly passable text is called Markov chains. A Markov chain takes a simple statistical approach: it records which words commonly follow other words, then picks each next word according to those observed frequencies. It’s sort of fun looking at the results the first time, but some people on use.perl.org got annoyed pretty fast by this.
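For the curious, a word-level Markov chain can be sketched in a few lines of Python. This is only an illustration of the general idea, not the implementation I used on use.perl.org. The function names `build_chain` and `generate` are hypothetical, and it looks back only one word.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words seen immediately after it.

    Followers are stored with repeats, so common followers are
    proportionally more likely to be picked later.
    """
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, max_words, rng):
    """Walk the chain from `start`, picking each next word at random."""
    word = start
    output = [word]
    # Stop at max_words or when the current word has no known follower.
    while len(output) < max_words and chain.get(word):
        word = rng.choice(chain[word])
        output.append(word)
    return " ".join(output)

if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the cat"
    chain = build_chain(sample)
    print(generate(chain, "the", 10, random.Random()))
```

Feed it a larger corpus and the output starts to look like the nearly intelligent prose quoted earlier. Keying the table on pairs of words instead of single words usually makes the text more coherent, at the cost of copying longer stretches of the source.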

In a future post, I will revisit the Markov chain implementation I used so that you too can annoy your friends with robot blather.