04.19.08

Frame of reference is everything

Posted in funny at 4:56 pm by stacywong

stacy: “Did you see that techcrunch and popsugar were throwing a party in hollywood?”
andi: “I have no idea what you’re talking about, but it sounds delicious!”

04.14.08

The 10 plagues

Posted in ramble at 1:20 am by stacywong

Today we went shopping for toys for Passover dinner. Apparently some families like to throw plastic toys at each other when describing the 10 plagues. Archie McPhee was a huge success as we managed to get:

  • Finger puppet frogs
  • Plastic babies with mohawks
  • Plastic ants (but they’re supposed to be lice)
  • Miniature cows which we will splotch with red nail polish as they are diseased
  • Other miscellaneous beasts and insects

Given my current obsession with all things bacon, I was sorely tempted to buy bacon bandaids, but resisted as I still have a pack of cowboy ones.

04.08.08

Code cleanup

Posted in tech at 11:37 pm by stacywong

I caught up with Dave Sifry the day after his keynote and asked where code refactoring and cleanup came into the picture, since that’s something I’ve been wrestling with for a while. He took it from an optimization standpoint and said that it only makes sense to optimize the code that is most used. He also said that it’s hard to make a case for code cleanup unless it’s measurable, so you need to write your software with metrics in mind and then use those metrics to drive improvements.

True story. So now… how do I measure code maintainability? One of the things I see a lot is that if something works, we invoke the “if it ain’t broke don’t fix it” rule and never try to rethink it. What happens then is that surrounding code gets refactored and redesigned, so now you have the old stuff which is still functional, but it’s incongruous with the new stuff. Let this happen a few times, and all of a sudden you’ve got a huge mess: several codepaths which each employ a different pattern and behave ever-so-slightly differently. This really bites when you have to debug something that requires tracing through your code jungle. Yes, a debugger helps, but that is no excuse for unreadable code.

I believe in allocating time for code cleanup with every feature you write, but there’s a fine balance between cleanup work that is actually for the greater good, and cleaning up for the sake of cleaning up. How do you decide what is “good” cleanup work, and how do you measure its success?
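One crude way to put a number on the "measure maintainability" question, offered as a sketch rather than an answer: count the branching constructs in each function and watch whether that number goes up or down after a cleanup. The example below uses Python's standard `ast` module. The function name `branch_counts`, the choice of which node types count as a branch, and the sample source are all assumptions made for illustration; this is a rough stand-in for cyclomatic complexity, not a proper implementation of it, and nothing Dave suggested.

```python
import ast

# Node types treated as "a branch". This particular list is an assumption;
# real complexity tools count a somewhat different set.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)


def branch_counts(source):
    """Return {function_name: 1 + number of branching nodes inside it}.

    Nested functions are also counted toward their enclosing function.
    """
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            counts[node.name] = 1 + branches
    return counts


# Hypothetical sample code to score.
SAMPLE = """
def simple(x):
    return x

def tangled(x):
    if x > 0 and x < 10:
        for i in range(x):
            pass
    return x
"""

if __name__ == "__main__":
    print(branch_counts(SAMPLE))  # {'simple': 1, 'tangled': 4}
```

Tracking a score like this per function before and after a cleanup gives at least one measurable signal, though it says nothing about whether neighboring codepaths follow the same pattern, which is the harder problem.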

ICWSM 2008, part 2

Posted in tech at 11:36 pm by stacywong

I know it’s been over a week, but FWIW here’s a post about my ICWSM highlights.

Being a polyglot

One of the things I really enjoyed was research which spanned multiple languages. It’s fascinating because so much of understanding language is cultural and contextual, and the task of performing objective analysis across multiple languages is extremely difficult.

Nairan presented on word use in depression forums and contrasted English with Spanish. She found that depression-speak in both languages tended to revolve around the self: “I”, “me”, “mine”. However, the words used by English speakers had to do with recovery, such as “medication”, whereas the themes in Spanish forums had to do with the causes of their depression, e.g. “family”, “boyfriend”, “school”.

There was a poster on cross-lingual blog analysis* which used Wikipedia to link topics from one language to another. The results reflected cultural stereotypes: for example, Japanese blogs on whaling were laudatory and nationalistic, whereas American blogs were anti-whaling.

The internet, it’s alive!

There was a great preso on Wikipedia and self-governance which really made me realize how all social media sites have grown so quickly and organically in the past few years.

There were also several other presentations on sentiment analysis of blogs, which were very cool. Papers are posted on the ICWSM blog. When I told her about this, Lisa reminded me of wefeelfine which is a pretty way to see how the internet is feeling.

Pretty pictures

I’m not really a graphics person, but it was cool to see Marc Smith’s work on Picturing Usenet (jump to the analysis section).

I got to play around with E15:FB (a 3D visualization of Facebook). E15 is a web visualization tool developed at the MIT Media Lab. You can see some videos of E15:FB on Takashi’s site.

Scaling Innovation

Here’s a good transcript of Dave Sifry’s talk. Even though he’s preaching to the choir, I still enjoyed hearing his take on different engineering tradeoffs… and the gratuitous mention of 2-pizza teams. :)

If it wasn’t already apparent, I had an awesome time at ICWSM. Spontaneous conversations about privacy and politics, a crash course in NLP and mining the blogosphere… I definitely learned a lot and would love to go back again next year. +1 for it being in Seattle again.


* I can’t remember the context for this, but there was a step which required manual translation. I saw that “チョコ ウェハー” didn’t have a translation at all and I was like, “Oh, that’s a Kit Kat!” Hiroyuki explained that even though the translation was manual, they used dictionaries so that translations would be consistent… so no Kit Kat. I realized that it’s a pretty arbitrary choice, since it could just as easily have been translated to Time Out or Loacker biscuits. Translation is hard because sometimes you’re trying to reconcile your own experience with what you think others will understand; i.e. everyone else’s experience.