Posted by Jay Livingston
Can the mood of a piece of writing be graphed?
For his final project in Andrew Gelman’s course on statistical communication and graphics, Lucas Estevem created a “Text Sentiment Visiualizer.” Gelman discusses it on his blog, putting the Visualizer through its paces with the opening of Moby Dick.
Without reading too carefully, I thought that the picture – about equally positive and negative – seemed about right. Sure things ended badly, but Ishmael himself seemed like a fairly positive fellow. So I went to the Visualizer (here) and pasted in the text of one of my blogposts. It came out mostly negative. I tried another. Ditto. And another. The results were not surprising when I thought about what I write here, but they were sobering nevertheless. Gotta be more positive.
But how did the Visualizer know? What was its formula for sussing out the sentiment in a sentence? Could the Visualizer itself be a glum creature, tilted towards the dark side, seeing negativity where others might see neutrality? I tried other novel openings. Kafka’s Metamorphosis was entirely in the red, and Holden Caulfield looked to be at about 90%. But Augie March, not exactly a brooding or nasty type, scored about 75% negative. Joyce’s Ulysses came in at about 50-50.
To get a somewhat better idea of the scoring, I looked more closely at page one of The Great Gatsby. The Visualizer scored the third paragraph heavily negative – 17 out of 21 lines. But many of those lines had words that I thought would be scored as positive.
Feeling slightly more positive about my own negative scores, I tried Dr. Seuss. He too skewed negative.
I’m not sure what the moral of the story is except that, as I said in a recent post, content analysis is a bitch.
Gelman is more on the positive side about the Visualizer. It’s “far from perfect,” but it’s a step in the right direction – i.e., towards visual presentation – and we can play around with it, as I’ve done here, to see how it works and how it might be improved. Or as Gelman concludes, “Visualization. It’s not just about showing off. It’s a tool for discovering and learning about anomalies.”