Tuesday, February 12, 2008
Visualizing the English Language
On the heels of yesterday's image search post, here's another item connecting words and images, this one from researchers at MIT*. These guys have produced a "visualization of all the nouns in the English language arranged by semantic meaning**." I had thought the English language looked like a large, disturbing bunny, but apparently it looks like an enormous mosaic of tiny colored blobs. From their intro: "large-scale groupings correspond to broad categories such as plants or people." Which lets us discover interesting trends, like that plants are green. That green blob at the bottom, floating around like Australia? Plants.
Then there's this: "each tile [is] the average of 140 images. The average reveals the dominant visual characteristics of each word. For some, the average turns out to be a recognizable image; for others the average is a colored blob." I clicked on dozens of tiles, and the average image was always a colored blob. This strikes me as analogous to taking all the synonyms for the word "person," grinding them through an averaging algorithm, and claiming the average word for "person" is "aoviksv". Which is to say, some things don't make much sense, averaged.
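For the curious, here is a minimal sketch of why averaging tends to produce blobs. It is not the MIT researchers' code; the function name `average_images` and the toy 2x2 grayscale "images" are made up for illustration, and all it assumes is that "average" means a plain pixel-by-pixel mean.

```python
def average_images(images):
    """Pixel-wise mean of equally sized grayscale images (lists of rows)."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    # Running sum for every pixel position.
    total = [[0.0] * width for _ in range(height)]
    for img in images:
        for y in range(height):
            for x in range(width):
                total[y][x] += img[y][x]
    n = len(images)
    return [[value / n for value in row] for row in total]


# Two hypothetical 2x2 images: a bright diagonal, and its mirror image.
a = [[255, 0], [0, 255]]
b = [[0, 255], [255, 0]]

# Every pixel in the result is the same mid-gray: the structure cancels out.
blob = average_images([a, b])
```

Two images with perfectly clear structure average out to flat gray. With 140 photos that don't line up with each other, a colored blob is about what you'd expect.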
So this is pretty useless, even by my low standards of what constitutes utility. What it really appears to be is an eye-candy outcropping of a larger, more meaningful research effort--machine recognition of objects in images. And who knows, maybe some fancy algorithm can make better sense of "aoviksv" than our tiny little brains.
Let me insert my standard caveat to digs at academia: what the hell do I know. These guys represent MIT. Errata represents... New Jersey. If that.
* Via ReadWriteWeb, via Tech_Space.
** The source of their semantic meanings? WeirdNet, natch. Everybody love the WeirdNet.
Labels: images, MIT, visualization, wordnet
Wednesday, January 16, 2008
More Definitions
Wordie has displayed a definition for most words for the past few months, but it had been displaying only the most common one, in order to keep the focus on the fun stuff: citations and comments added by members. You can now see all the definitions available for a word, in case you want to save a trip to a proper dictionary or just want to see what other strange tricks WeirdNet has up its sleeve. I tried to keep it subtle, so you still see only the top-ranked one, but now with a "more" link just below it. Click and the rest appear.
I decided to leave out example sentences, thinking it might get in the way of people providing their own, but I'm happy to revisit that if people would like. Let me know.
Labels: features, weirdnet, wordnet
Friday, November 2, 2007
The Cupertino Effect
Ben Zimmer* has an interesting and amusing post in today's OUPblog about the Cupertino Effect: the tendency of spellcheckers, due to outdated dictionaries, bad algorithms, or a combination thereof, to insert or suggest nonsensical words.
The recent addition of WordNet definitions to Wordie (which I'll blog at greater length on Monday) was resulting in a version of this before I tweaked the algorithm. As someone famous once said (Barbie, I think), natural language processing is hard.
* update: I incorrectly called Ben "Bill Zimmer" when this was first posted. Not sure where that came from, sorry Ben!
Labels: ben zimmer, cupertino, OUP, wordnet




