04.08.08

ICWSM 2008, part 2

Posted in tech tagged at 11:36 pm by stacywong

I know it’s been over a week, but FWIW here’s a post about my ICWSM highlights.

Being a polyglot

One of the things I really enjoyed was research which spanned multiple languages. It’s fascinating because so much of understanding language is cultural and contextual, and the task of performing objective analysis across multiple languages is extremely difficult.

Nairan presented on word use in depression forums, contrasting English and Spanish. She found that depression-speak in both languages tended to revolve around the self: “I”, “me”, “mine”. However, the words used by English speakers had to do with recovery, such as “medication”, whereas the themes in Spanish forums had to do with the causes of their depression, e.g. “family”, “boyfriend”, “school”.

There was a poster on cross-lingual blog analysis* which used Wikipedia to link topics from one language to another. The results reflected cultural stereotypes. For example, Japanese blogs on whaling were laudatory and nationalistic, whereas American blogs were anti-whaling.
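
For the curious, here's a rough sketch of the general idea of linking topics across languages. This is not the poster's actual method, which I don't know in detail. The `INTERLANGUAGE_LINKS` table is a hypothetical toy stand-in for Wikipedia's cross-language article links, and both function names are made up for illustration.

```python
# Hypothetical toy stand-in for Wikipedia's cross-language article links:
# Japanese article title -> English article title.
INTERLANGUAGE_LINKS = {
    "捕鯨": "Whaling",
    "チョコレート": "Chocolate",
}


def link_topics(ja_topics, links=INTERLANGUAGE_LINKS):
    """Map each Japanese topic to its English counterpart, or None if unlinked."""
    return {topic: links.get(topic) for topic in ja_topics}


def shared_topics(ja_topics, en_topics, links=INTERLANGUAGE_LINKS):
    """Return the English topic labels that show up in both blog collections."""
    linked = {links[t] for t in ja_topics if t in links}
    return sorted(linked & set(en_topics))
```

Once topics are linked like this, posts about the same topic in each language can be compared side by side. Anything without an entry in the table simply comes back as `None`.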

The internet, it’s alive!

There was a great preso on Wikipedia and self-governance which really made me realize how all social media sites have grown so quickly and organically in the past few years.

There were also several other presentations on sentiment analysis of blogs, which were very cool. Papers are posted on the ICWSM blog. When I told her about this, Lisa reminded me of wefeelfine, which is a pretty way to see how the internet is feeling.

Pretty pictures

I’m not really a graphics person, but it was cool to see Marc Smith’s work on Picturing Usenet (jump to the analysis section).

I got to play around with E15:FB (a 3D visualization of Facebook). E15 is a web visualization tool developed at the MIT Media Lab. You can see some videos of E15:FB on Takashi’s site.

Scaling Innovation

Here’s a good transcript of Dave Sifry’s talk. Even though he’s preaching to the choir, I still enjoyed hearing his take on different engineering tradeoffs… and the gratuitous mention of 2-pizza teams. :)

If it wasn’t already apparent, I had an awesome time at ICWSM. Spontaneous conversations about privacy and politics, a crash course in NLP and mining the blogosphere… I definitely learned a lot and would love to go back again next year. +1 for it being in Seattle again.


* I can’t remember the context for this, but there was a step which required manual translation. I saw that “チョコ ウェハー” didn’t have a translation at all and I was like, “Oh, that’s a Kit Kat!” Hiroyuki explained that even though the translation was manual, they used dictionaries so that translations would be consistent… so no Kit Kat. I realized that it’s a pretty arbitrary choice, since it could just as easily have been translated to Time Out or Loacker biscuits. Translation is hard because sometimes you’re trying to reconcile your own experience with what you think others will understand, i.e. everyone else’s experience.
