Author Archives: naaman

Radio Silence Over: Updates, Mahaya, TimeSpace, Moscow

Ayman has been on my case, and for a good reason this time. We kind of neglected you, good readers of our blog. It’s been a long and winding few years. We both fully intend to write more but for now, here’s a quick update from the Naaman half. And it’s exciting (at least for me).

The quick update, for those who don’t know, is that I have co-founded a company called Mahaya which aims to organize the world’s memories: make sense of the world’s stories and events as they are shared on social media. We are currently beta testing a new product called Seen, which automatically makes it fun and simple to see what happened anywhere. This week, The New York Times announced that Mahaya will be one of the three companies in the inaugural run of the TimeSpace program (whoever named it should receive the Pulitzer).

In related news, next week in Moscow, I will be giving a keynote at ECIR 2013, talking about how the work we have done in the last 8 or so years has informed the vision (and technology) for Mahaya. Here’s the motivation for the talk below. I will try to post the full notes after I give the talk (Ayman, keep me honest here).

Time for Events.

In the last 8 years, my work and research have focused on the ways in which social media reflects and interacts with “the real world”, by which I mean actual occurrences, atoms clashing, people performing acts that are tied to a specific location and, often, to a time.

2005 was the onset of location-based social media as we know it. Flickr got popular (and got acquired by Yahoo). In 2006, Flickr formally introduced geotagging by supporting geo-metadata and providing a map interface; they thus created an easy way for people to associate location data with content, at scale. Almost immediately, we had… lots of dots on a map! Surely, we thought, these dots can tell us more about the world than where photos were taken. Can they tell us *what* are the most interesting places/landmarks, instead?

Tag Maps was our attempt at Yahoo! Research Berkeley to do that. For any world region, any zoom level, we extracted (using fairly simple IR tools), the most salient and important topics for that area; we built an interactive prototype that exposed this information, a video of which you can find here (see if you can spot Yoda!). We realized (read more here) that one could extract a strong signal about the real world, about people, their geographic activities and interests from social media data.

Tag Maps / World Explorer Demo from Mor on Vimeo.
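
For the code-minded, here is a minimal sketch of the kind of "fairly simple IR" scoring that could surface salient tags for a map region. It assumes a TF-IDF-style score (how many photos in the region carry a tag, discounted by how many regions use that tag at all). That scoring choice, the function name `salient_tags`, and the toy data are illustrative assumptions, not the actual Tag Maps algorithm.

```python
import math
from collections import Counter

def salient_tags(photos_by_region, region, top_k=3):
    """Rank tags in `region` by a TF-IDF-style score (illustrative assumption)."""
    n_regions = len(photos_by_region)
    # "Document" frequency: in how many regions does each tag appear at all?
    region_freq = Counter()
    for photos in photos_by_region.values():
        region_freq.update({tag for photo in photos for tag in photo})
    # Term frequency: how many photos in this region carry each tag?
    tf = Counter(tag for photo in photos_by_region[region] for tag in set(photo))
    scores = {t: c * math.log(n_regions / region_freq[t]) for t, c in tf.items()}
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]

# Hypothetical toy data: each photo is a list of tags.
photos_by_region = {
    "paris": [["eiffeltower", "travel"], ["eiffeltower", "night"], ["louvre", "travel"]],
    "london": [["bigben", "travel"], ["londoneye", "night"]],
    "nyc": [["travel", "brooklynbridge"]],
}
top = salient_tags(photos_by_region, "paris", top_k=2)
```

In this toy example a tag like "travel" shows up everywhere, so it scores zero and never becomes a region's descriptor, while a tag concentrated in one region floats to the top.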

We then noticed a funny entry on the Paris Tag Map. It read “Les Blogs”; an explanation can be found here: a bunch of bloggers at a conference, posting Flickr photos until our algorithm thought this was the main descriptor for that area (and Paris). In other words, events started showing up on our map. That got us thinking: can we do a better job modeling, identifying and presenting the data that is specifically associated with events?

tagmaps paris

At SIGIR 2007 we showed that the answer is yes. With Tye and Nathan, we described a system that discovers real-world events from Flickr geotagged data, including hyper-local events such as BYOBW (an old favorite for me to show in talks, and an event I literally learned about from our results). The takeaway? Social media can reflect real-world events, via content created by a collective of mostly uncoordinated contributors.

After Tahrir Square these “discoveries” seem rather obvious, but that was not the case in 2007, before Facebook and Twitter gained any mainstream popularity, and well before the iPhone popularized media and location (iPhone 3G was released in July 2008).

In my talk, I am going to address the challenges in developing event technologies, show some of the solutions and technologies we developed in my research, complain that in 2013 that problem is not yet solved commercially (case in point: the link I had to use for BYOBW above), and give a demo of Mahaya’s recent product, Seen, where we start solving the “event problem”. I’ll also talk about social media as the next step in the evolution of information systems, and what it means for Information Retrieval.

Come and say hi if you are in Moscow next week!

Cheer Up! Some Holiday Hacking

With my star undergrads Ian and Abe, and backend support from Ziad, we put together this mashup for the holidays! We use the data from the Twitter streaming crawler we built (for our NSF-funded work) to get Instagram photos posted on Twitter that have the word Christmas in the tweet, and where the photo location is available on Twitter. We then add the Google Streetview of the photo location and, well, mash them all together.

Cheerbeat Screenshot

The result is an interesting juxtaposition (as one comment on my Facebook post captured well) of the “small instagram-style photos (typically close-up, indoors) against the backdrop of the (typically distant, outdoors) Google street views”. As such, the StreetView gives context to the Instagram photo and maybe provides the settings in which the activity in the photo is taking place, another dimension of understanding, often much stronger than the text of the tweet itself.

Cheerbeat Screenshot

The app is also an interesting (and mostly unintended) statement about privacy: I don’t know how these users would feel knowing their environment is exposed to all, and not just in a default bland zoomed-in map format.

Cheerbeat Screenshot

The Cheerbeat application (instacheer was our first name choice but, perhaps not amazingly, already taken by another Instagram Christmas mashup!) mostly runs as JavaScript in the browser. We continuously crawl Twitter data using the streaming API on our server. When the app loads, it grabs from our server a .json file with the latest 250 tweets that contain “Christmas” and “Instagr.am” and have non-empty geo coordinates. We then (in the browser) use the Google Streetview API to find which of these insta-tweets’ locations are available. The app then rotates through the tweets/photos showing the tweet, picture, location, time and Streetview of each.
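
The server-side filtering step could look roughly like the sketch below. It is an illustration, not our actual crawler code: the function name `latest_geo_instagram_tweets` is made up, and the simplified record fields (`text`, `coordinates`, and a numeric `timestamp`) are assumptions standing in for the much richer tweet objects the streaming API returns.

```python
import json

def latest_geo_instagram_tweets(tweets, keyword="christmas", limit=250):
    """Keep tweets that mention the keyword, link to Instagram, and carry coordinates."""
    matches = []
    for tweet in tweets:
        text = tweet.get("text", "").lower()
        coords = tweet.get("coordinates")  # assumed: None/empty when no location was shared
        if keyword in text and "instagr.am" in text and coords:
            matches.append(tweet)
    # Newest first, then cap at `limit` (250 in the post).
    matches.sort(key=lambda t: t["timestamp"], reverse=True)
    return matches[:limit]

# Hypothetical sample records; real tweets carry many more fields.
sample = [
    {"text": "Merry Christmas! http://instagr.am/p/abc", "coordinates": [-73.99, 40.73], "timestamp": 3},
    {"text": "Christmas tree http://instagr.am/p/def", "coordinates": None, "timestamp": 2},
    {"text": "Lunch http://instagr.am/p/ghi", "coordinates": [2.35, 48.85], "timestamp": 1},
]

# The JSON payload the browser app would fetch on load.
payload = json.dumps(latest_geo_instagram_tweets(sample))
```

Only the first sample record survives: the second has no coordinates and the third does not mention the keyword. The Street View availability check is left out here because it happens later, in the browser.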

As a side note, after all this filtering, surprisingly little data satisfied all these criteria, mostly (I suspect) because Twitter requires specific user authorization for location information to be posted in tweets. In other words, even though many (most?) Instagrams will have location data, a lot of those will not have their data available when posted on Twitter.

There are extra features coming for this app (e.g., choosing your own keywords), but more on that later.

Happy holidays and enjoy the beat!

 

Putting on a SMILe (Plus: Winners!)

They say academia is the art of becoming world-renowned without appearing to be self-promoting. Sometimes, however, you gotta make some noise. In our case, we (that’s my team and I, don’t blame Ayman) have recently launched a new lab, the Social Media Information Lab. We thought we’d like to get the word out, especially as we are looking for new PhD students (and maybe postdocs) to join our ranks.

As the CHI 2011 conference is the most popular conference that matches our research area, we decided to do something for it. It also helps that CHI has traditionally been a very playful gathering, with people allowing their badges to be decorated with a host of badges (formal and informal), stickers, puppets, and various other household items. Love the CHI academics. We decided to have a little game.

With our convenient lab name acronym, SMIL (perhaps not accidental), we zeroed in on a Smile theme pretty quickly. We picked four exceptionally smiley CHI luminaries as our SMILe ambassadors: Ben Shneiderman, Judy Olson, Elizabeth Churchill, and Ed Chi. The fantastically talented Funda Kivran-Swaine turned their regular smiley pictures into monochromatic images (Ed now carries his proudly on his Twitter profile), which we printed on some 1000 stickers using the wonderful-yet-pricey Zazzle service. Of course, the stickers included the URL of the SMIL website.

From left: Judy Olson, Ed Chi, Elizabeth Churchill, Ben Shneiderman

We devised a conference game, with very simple mechanics: collect all four heroes on your badge, post it on Flickr/Twitter (#chismil) and you have a chance to win a CHI-SMIL t-shirt. We also made it somewhat difficult: different team members (and friends) distributed different stickers, and Ed’s sticker was the rarest, with access to it tightly controlled by Funda alone.

Did it work? We think it did. First, people I didn’t know soon approached me begging for “A Judy Olson” (or some other sticker), and a rumor was started that there was a secret fifth member.

Second, the luminaries themselves were great sports, and seemed to enjoy the commotion and exchanges around the stickers. They each had a roll of their kind, except for Ed of course (access controlled to the end!).

In addition, people went to our website and commented on it to me (and perhaps to others).

And, finally, many people labored to collect all four stickers! (partial set of images). We put names in a hat, drew them out, and have five lucky winners. There you go, people. T-shirts are coming. You’re welcome.

Stay tuned for CHI 2012. Who knows what games will be played.

Using Sociology(!) to Explain Unfollows on Twitter

What gives, @ayman is no longer following me on Twitter!

Well, he still does, not least because he knows I will send roadkill to his office address if he stops. But surely, people stop following one another on Twitter all the time. Right? Right? Yes, right, as we show in our recent paper (caution, PDF), with my PhD students Funda Kivran-Swaine and Priya Govindan, to be published at CHI 2011.

Many studies, in academia and industry, in computer science and sociology (this one too), examine creation of new ties in social networks, but very few examine tie breaks and persistence. Why? One reason is that, in computer science, models of tie creation have immediate consequences for systems (e.g., recommending new contacts). Another reason is that tie breaks are rare, or hard to detect/define in many social networks, especially those networks studied by sociologists (when does Naaman’s tie with Ayman break? after 3 years of not communicating? 20?). Ron Burt’s work is an exception, but Ron is always an exception, isn’t he?

Enter Twitter, where we can witness a dynamic social system, and where ties are created and broken for all to observe. Op-por-tu-ni-ty! Can we shed some light on the tie break phenomenon in Twitter? How widespread is this phenomenon, and what are the factors that can help predict tie breaks?

We started with a random set of 715 Twitter users, and the 245,586 Twitter users that “follow” them at Time 1 (July 2009). We looked at these users and followers again after nine months (April 2010, Time 2). Did these follow edges still exist? How many dropped over that period? The image below captures one of our 715 users and the network around them at Time 1. Those users that stopped following our user at Time 2 (the “unfollowing” users) and their connections are marked in blue. Now it’s time to pause and see what you think the overall “unfollow” rate is in our data: 5%? 15%? 25%? 75%? OK, scroll down.
Unfollowing on Twitter.
Turns out, over nine months, 30% of the follow edges disappeared. On average, a single user lost about 39% of their followers over that period. How come it’s not 30%? Because the 39% is an average of averages; the gap is probably due to the fact that people with a large number of followers (of which there are fewer) lost a smaller portion of their followers, but still a large number. Do more followers mean relatively fewer unfollowers? I’ll come back to that in a second.
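
If the 30% versus 39% gap feels odd, a tiny example may help. The three seeds below are made-up numbers (not from our dataset), chosen only to show how a pooled rate over all edges can differ from the mean of per-user rates when large accounts lose a smaller share of followers.

```python
# Hypothetical seeds: (followers at Time 1, followers lost by Time 2).
seeds = [
    (10, 5),      # small account loses 50%
    (20, 8),      # small account loses 40%
    (1000, 250),  # large account loses 25%
]

# Pooled rate: all lost edges over all edges (the kind of number the 30% is).
pooled_rate = sum(lost for _, lost in seeds) / sum(total for total, _ in seeds)

# Average of averages: mean of each seed's own loss rate (the kind of number the 39% is).
per_seed_rates = [lost / total for total, lost in seeds]
mean_of_rates = sum(per_seed_rates) / len(per_seed_rates)

print(f"pooled: {pooled_rate:.1%}, mean of per-seed rates: {mean_of_rates:.1%}")
# prints "pooled: 25.5%, mean of per-seed rates: 38.3%"
```

The large account dominates the pooled rate because it contributes most of the edges, while each account counts equally in the mean of rates.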

For this work, we were mainly interested in looking at whether well-known sociological processes are in play on Twitter with respect to unfollowing activity. So we did our lit review, and discovered that strength of ties, embeddedness within networks, and power/status are some of the key related sociological concepts (the paper explains those in detail, of course). The question then was: can we look at the network structure alone, and based on these theories, see if there are network factors that are highly correlated with unfollows?

The details of the dataset are in the paper, but for now, just imagine that for each “follow” relationship, we had the complete network graph of both nodes. So if “@ayman following @informor” was one of the edges we looked at, we could get the entire network neighborhood of @informor, and @ayman. (This network data is presented to you courtesy of Kwak et al.). What properties of @informor’s network, and of the network around @informor and @ayman, correlate with a higher probability that @ayman would stop following me?

We calculated a bunch of variables, including for example, for each of our 715 initial users (let’s call them “seeds”):

  • The seed’s number of followers.
  • The seed’s clustering coefficient: how connected their followers are.
  • The seed’s reciprocity rate: what portion of the people following them, they follow back?
  • The seed’s follow-back rate: what portion of the people they follow, follow them back?
  • The seed’s follower-to-friends ratio.
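
To make these definitions concrete, here is a minimal sketch of how the simpler seed measures could be computed from two sets of user IDs. The function name `seed_metrics` and the toy data are illustrative assumptions, not the code used in the paper, and the clustering coefficient is left out because it needs the full follower graph.

```python
def seed_metrics(followers, friends):
    """Compute simple per-seed measures from two sets of user IDs.

    followers: users who follow the seed; friends: users the seed follows.
    """
    mutual = followers & friends
    return {
        "num_followers": len(followers),
        # Portion of the seed's followers whom the seed follows back.
        "reciprocity_rate": len(mutual) / len(followers) if followers else 0.0,
        # Portion of the seed's friends who follow the seed back.
        "follow_back_rate": len(mutual) / len(friends) if friends else 0.0,
        # Followers divided by friends; infinite if the seed follows nobody.
        "follower_to_friends_ratio": len(followers) / len(friends) if friends else float("inf"),
    }

# Hypothetical seed with 4 followers and 2 friends, one of them mutual.
metrics = seed_metrics(followers={"a", "b", "c", "d"}, friends={"a", "x"})
```

Note that the reciprocity rate and the follow-back rate share a numerator (the mutual connections) and differ only in the denominator.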

And for each seed and follower pair in our data, we computed aspects of their relationship:

  • How many connections they have in common (i.e., users the seed and follower both connect to)?
  • What is the difference in prestige between the two (in terms of number of followers)?
  • Does the seed reciprocate the connection to the follower?
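
The pair-level features can be sketched the same way. This is an illustration under stated assumptions: "connections in common" is simplified here to users that both the seed and the follower follow, and the names `pair_features`, `following`, and `num_followers`, as well as the toy users, are hypothetical.

```python
def pair_features(seed, follower, following, num_followers):
    """Features for one (seed, follower) edge.

    following: dict mapping each user to the set of users they follow.
    num_followers: dict mapping each user to their follower count.
    """
    seed_neighbors = following.get(seed, set())
    follower_neighbors = following.get(follower, set())
    return {
        # Users both follow (simplifying assumption), excluding the pair itself.
        "common_neighbors": len((seed_neighbors & follower_neighbors) - {seed, follower}),
        # Positive when the seed has more followers than the follower.
        "prestige_difference": num_followers[seed] - num_followers[follower],
        # True when the seed follows the follower back.
        "reciprocated": follower in seed_neighbors,
    }

# Hypothetical toy network.
following = {
    "seed_user": {"fan", "u1", "u2"},
    "fan": {"seed_user", "u1", "u2", "u3"},
}
num_followers = {"seed_user": 1500, "fan": 1200}
features = pair_features("seed_user", "fan", following, num_followers)
```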

So, which factors correlated most with unfollow activity? We ran quite a sophisticated analysis (multi-level logistic regression), but I’ll keep it simple here with a basic look at the factors that our analysis showed to contribute to the probability that a follower will unfollow a seed. For the more “scientific” study, check out the paper.

First, what did *NOT* have an impact: the number of followers a seed had at Time 1 had very limited impact on the probability of unfollows for that seed, and that impact was mitigated by other factors. A figure (limited to seeds who had fewer than 500 followers) demonstrates this.
num followers

So what played a major role? Reciprocity, for one, did. Do you follow someone that follows you? If you do, they are much less likely to unfollow you. Remember our 245,586 connections? Half of them were reciprocated (the seed also followed their follower). When the relationship was reciprocated, 16% of the followers unfollowed. When it wasn’t, a whopping 45% did. Before I throw a figure in, an important note about causality: we don’t know the causality. For example, pairs of users who are closer in real life (“strong ties”) are likely to have a reciprocated relationship and of course, their connection is not likely to break (because they are close). A deeper examination is needed to show whether the reciprocity act *alone* helps in maintaining the tie, although the analysis in the paper suggests that it contributes more than other factors that typically signify strong relationships.

reciprocated
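
A quick sanity check on these numbers: if about half of the edges were reciprocated, the 16% and 45% rates should blend into something close to the overall 30% drop reported earlier. The variable names are mine, and "half" is treated as exactly 0.5, which is an approximation.

```python
share_reciprocated = 0.5   # "half of them were reciprocated" (approximate)
rate_reciprocated = 0.16   # unfollow rate when the seed followed back
rate_one_way = 0.45        # unfollow rate when the seed did not

# Weighted average of the two group rates.
overall = (share_reciprocated * rate_reciprocated
           + (1 - share_reciprocated) * rate_one_way)
print(f"{overall:.1%}")
```

The blend comes out at roughly 30%, consistent with the overall share of follow edges that disappeared.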

We can even look at the user’s tendency to reciprocate follow relationships, and its effect on the percent of followers they lose:
reciprocity

Here’s one more thing to think about: a user’s follow-back rate was highly correlated with a lower ratio of unfollows, but the ratio of followers to followees wasn’t. The follow-back rate is the portion of the people a user follows that follow them back. For example, I may have 15 followers and 10 followees (people I follow) on Twitter. Out of the people I follow, 8 follow me back. So my follow-back rate is 80%, and my follower-to-followee ratio is 1.5. Both these metrics are potential measures of “importance” on Twitter, but the fact that only one (the follow-back rate) impacts the rate at which people stop following me hints that the follow-back rate might be a better measure of importance and success on Twitter. Makes sense, Ayman? What’s your follow-back rate?

Unfollowing on Twitter: followback rate.

What else? Embeddedness is the last thing I will touch on; you can read the paper for more (it’s only a 4 pager, don’t be too easy on yourself). And by embeddedness I do not mean the number of YouTube videos you post on your Twitter stream, but the sociological concept that captures the set of relationships that exist between the individuals in a relationship through third parties (i.e., common friends). More common friends? Your relationship is presumed to be stronger. It is not a surprise, then, that the larger the number of common neighbors two Twitter users have, the less likely one is to unfollow the other. From our data, this figure shows, for each level of common neighbors a “follow” relationship had, what percent of these follows became “unfollows”. For example, from all follow relationships that had no common neighbors at Time 1, 78% did not exist at Time 2; one common neighbor was enough to drop that number to 46% (and it keeps dropping; I stopped at 15 because you get the idea).

common neighbors
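
A figure like this boils down to a group-by: bucket the follow edges by their number of common neighbors and compute the share that disappeared in each bucket. The sketch below shows the idea with made-up edges; the function name `unfollow_rate_by_common_neighbors` and the data are hypothetical, and this is not the analysis code from the paper.

```python
from collections import defaultdict

def unfollow_rate_by_common_neighbors(edges, max_common=15):
    """edges: iterable of (common_neighbors_at_time_1, unfollowed_by_time_2) pairs."""
    totals = defaultdict(int)
    dropped = defaultdict(int)
    for common, unfollowed in edges:
        if common > max_common:
            continue  # the figure stops at 15 common neighbors
        totals[common] += 1
        dropped[common] += int(unfollowed)
    # Share of edges in each bucket that no longer existed at Time 2.
    return {c: dropped[c] / totals[c] for c in sorted(totals)}

# Hypothetical edges, for illustration only.
edges = [(0, True), (0, True), (0, False), (1, True), (1, False), (3, False), (40, True)]
rates = unfollow_rate_by_common_neighbors(edges)
```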

What didn’t we look at? Pretty much everything else! We relied on network structure alone to investigate these unfollows, as a first step. But there’s a lot more: how often do you tweet (or not)? How interesting are your posts? How similar are your topics to those of the people following you? We are now exploring all these factors and additional variables. Stay tuned.

[update: slideshare presentation here].

Who, What, When, Where: The Semantic Web is Alive and Well (and on Facebook)

I have killed the semantic web before (at least in my provocative title), but pointed out that the future of semantics is light-weight semantics created by programmers, users or individual companies. And here it comes: the future of the Semantic Web (and by that I also mean the Web, the life and the Universe) is now owned by Facebook.

A recent Yahoo! patent, dug up by SEO by the Sea, reminded me of the work I’ve been involved with at Yahoo!, driven by the vision of Marc Davis: being able to semantically connect the four most important dimensions of Web objects, Who, What, When and Where, directly to the user experience on Yahoo!. But while Yahoo! dragged its feet, Facebook is making real steps to becoming the true W4 platform for the Web. The identity (Who) war seems to have been won, at least for the time being; for most people, the real identity on the Web is the one they expose on Facebook. Controlling the Who has immediate consequences (e.g., a de-facto communication platform for people trying to reach contacts), but has also allowed Facebook to expand into the When (Events), What (Pages) and now Where (Places). And as I am doing the linking here, I notice the Facebook title for the Places page: interesting…

Facebook W4

In other words, the Facebook W4 network allows people to connect their experiences to well-defined concepts that “live” in the Facebook objectverse. This is one of Facebook’s greatest successes, and greatest leverage going forward.

Going forward means allowing other developers and companies to build on the Facebook W4 semantics. Yahoo! only partially succeeded in doing that with “Where”, using the Yahoo! Geo platform. Facebook now allows Websites and applications to connect via the Who (Facebook identity). Increasingly, Facebook will make its “What” and “When” more useful for other applications. The Places feature, cleverly, was already launched with integration of various companies (e.g., FourSquare) that can use the Facebook Places platform. There is no reason why this platform will not soon be open to (and used by) many other developers, giving Facebook ownership of Who and Where on the Web.

Going forward also means improving the capabilities of the Facebook platform in connecting and mashing the various entities. For example, to be able to record the fact that “this picture was taken in the event Elvis Perkins in Dearland at Governor’s Island with my friend Kathleen”. Seems like that may be coming! Many other applications are of course possible (e.g., “all the Statuses ever posted from this classroom”).

And where is Twitter? With the less specific “annotation” feature, and lagging behind in the Who space, Twitter is struggling in the objectverse, despite a strong geo-bent and a major push last year.

The Secret Life of (One) Professor: Two Years In

Matt Welsh of Harvard recently wrote on the Secret Lives of Professors, a post that stirred a lot of discussion and struck a chord with a somewhat less experienced professor (that would be me; two years on the job vs. Matt’s seven). I found myself nodding at many of Matt’s well-framed observations.

Matt’s main “surprises” and lessons that he offers to grad students in his post include:

Lots of time spent on funding requests. I have had a similar experience, because (like Matt) I enjoy working with, and leading, a large group of researchers. Of course, the batting averages are low for funding requests (Matt downplays his success rate but I bet it’s better than average). In my first two years, I submitted 3 NSF proposals, 2 of which were declined and one of which is outstanding (a good sign); I am currently working on two more. Each of these took significant effort, in one case at least (an estimated) two full months of my time. In addition, I submitted a number of smaller-scale proposals, most of them quick and easy to write, and was fortunate enough to get a Google Research Award (thanks again Goog!), and to be assigned as a faculty mentor to a superstar two-year postdoc Nick Diakopoulos. Together with some other odds and ends (thanks SC&I!) I feel pretty happy after two years regarding the group and resources I amassed; but the cost on my time is still substantial. On the bright side, as Sam Madden points out in the comments to Matt’s article, some of the grant proposal process is actually useful in helping me think about future work and research agendas, even if the specific proposal does not get funded.

The job is never done. Even as I write this, I could (and feel that I should!) be editing a paper, or looking at some data, or catching up on email, or working on one of two said proposals. Matt admits:

For years I would leave the office in the evening and sit down at my laptop to keep working as soon as I got home.

I can’t say my experience is far from that, although I still insist on taking good vacations. And a 2-year-old kid certainly makes for a compelling reason to stop working at any time.

Can’t get to “hack”. True enough, most of the interesting work is delegated to students, as Matt complains that he doesn’t find time to write code. However, that is partially the decision that Matt (and I) knowingly take when we decide to work with (and try to fund) a large group of students. Managing fewer or no students might allow more individual research work, which is certainly a path taken by some faculty that skip the funding requests and the resultant student meetings. However, I am no Ayman, do not miss writing code, and am happy to farm that out to students. I do enjoy thinking about the intellectual and research issues, and often get to do that with the students. I would like to have fewer meetings and less email, but unlike Matt I feel involved enough in the intellectual work, at least so far. Nevertheless, I can’t dive into it like the grad students who indeed “have it good”.

Working with students. Matt writes:

The main reason to be an academic is… to train the next generation.

I see it the same way (the intellectual pursuit is also up there, but it could be claimed that you can perform similar intellectual pursuits in other settings like research labs). The students are why I am in academia, and the advising is by far my favorite activity. From solving someone else’s problems (e.g. a student not sure how to approach X or Y) to, more substantially, showing students a path from first-year confusion to an experienced researcher that understands how to ask (and answer) research questions, and communicate them effectively. Well, I am clearly not quite there yet having just recently started doing it (and just started funding my first PhD student). But I am enjoying it already. Like Matt, for me it is not just working with the PhDs and Masters students; the undergrads play a big role. I started working with several star undergrads, some of whom had never SSH’ed into a server before, and most of whom had never seen how research is done. Their wide-eyed excitement is an energy source, an inspiration and a cause of constant enjoyment.

So, the bottom line?

It is certainly not for everybody. It remains to be seen if it is even for me.

I will buy that, Matt. At the end of the day, for me, it’s the students, and the freedom to carve my own path. This summer I am lucky enough to be working with my group at SC&I consisting of one postdoc, 2-3 PhD students, 3 Masters students, and 1-3 undergrads (at any given time). With teaching (more on this topic later) out of the way, I spend two full days a week with this gang talking about research, writing papers or grants, having other “good” meetings, or playing Rock Band on our Wii. It’s definitely one of the best work summers I have had, much like my summers at Yahoo! Research Berkeley where we had most of our fantastic interns join in on the fun.

Speaking of the defunct Y!RB, and regarding that path-carving freedom, I feel a lot less constrained in academia compared to industry research. I have had a fantastic experience at Yahoo!, and was lucky to have a great team at the Berkeley lab. However, to start my own project at Yahoo!, one that followed my own personal vision and involved multiple people, would have taken a lot of convincing (and would have needed to be ultimately tied to the corporate agenda). I know Ayman does not agree, so maybe this is just a false sense that I have, that moving a bunch of people towards a vision that I choose and craft is easier in academia. To do that with the students might be, as Matt put it, “the coin of the realm”.

Apple Does Migrations (Almost) Perfectly

Just got a new MacBook Pro. I’ve been on Mac for about 5 years now, and the number one most impressive feature to me is the migration. As someone lucky enough to be in a place with a fantastic IT department (yes, I know that’s unlikely, but our IT people are superstars) it means just dropping off my old Mac, and, voila! A few hours later I have all the setup I had before (down to the browser history items), reproduced on a lovely new machine.

Just a few things went wrong, most of which are Apple’s fault, and some of which are quite annoying.

First, the Mac didn’t recognize the iPhone. Luckily I was clever enough to think of checking for a Mac software update, and sure enough, the only update available was a fix to this bug. +1 point, Apple.

But it got worse once the iPhone was recognized. Soon enough I got this notice right here:
Annoying.

OK, a little scary, and totally wrong (not getting into DRM discussion here) but not so bad as a user experience: the dialog allowed me to continue and gave me options, I can live with that (but why didn’t the migration carry forward my authorization?). Anyway, I asked to authorize, only to get another prompt: Something like “sorry, you already have 5 authorized computers”. This time, I was offered no way out other than acknowledging that lovely, yet curious fact (which 5 machines had I authorized? Ayman certainly didn’t get my permission for any content!). I was too shocked to take a screen grab of that pesky dialog. Still, this wasn’t a big deal, because I knew what to do: de-authorize all my computers (the only one I knew I had authorized was not with me; I migrated from it, see, so I couldn’t just de-authorize it). But that’s wrong, Mr. Jobs. Why would a “normal” (i.e., not 6’8″) user know how to de-authorize their other computers? Instead, I would like to have seen this process:

1. “Hey, it seems like you already reached the maximum number of computers allowed to access your licensed content! Would you like to fix that?”

Options: But of course! / No, I’ll just curl up in the corner and cry

2. “Here are the details of your 5 authorized computers. Which one(s) would you like to de-authorize?”

Options: Select any number of computers to de-authorize.

3. Done!

Easy, Steve? -gazillion points, Apple!

Another thing that didn’t migrate properly was my Screensaver (although my desktop picture preferences were kept). I guess that’s because in Snow Leopard you need to use iPhoto albums to choose screensaver photos. But why would Desktop background work and screensaver break? Slightly bizarre.

The wifi was also a mild annoyance, forgetting all my preferences (but at least remembering the networks’ credentials for secure networks).

Finally (geek/grad student topic alert), I lost my LaTeX (MacTeX) installation in the migration to the new Mac. I mean, the files were still there but the migration broke a few symbolic links and tampered with the folder structure just enough to make my various LaTeX editors not find the MacTeX installation. MacTeX has a several-step solution, but you know me, I take my short cuts (just upgraded to MacTeX 2009), which fixed all these issues.

So, Apple could have made this really close to a perfect game, but allowed a couple of walks in there late in the innings, just to have Naaman complain. Well, what would I do without them.

Annotations (Twitter reads the Ayman and Naaman Show?)

Hey, Naaman’s back for my favorite type of activity: the “I told you so / I called it”. Twitter just announced Annotations, here are some technical details, and here’s the New York Times coverage:

Another new tool is called annotations. Already, individual posts show which app someone used to write the post and the date, time and (if users choose to make it public) location. With annotations, software developers will be able to add other material, which Twitter calls metadata, to Twitter posts.

This could significantly expand the amount of information a post includes, beyond its 140 characters, and could enhance the way Twitter is used.

Posts could include the name of the restaurant where a post was written and its star rating on Yelp, for instance. Then, someone could find Twitter posts about restaurants nearby with five stars. Or developers could add a way to make a payment and purchase, so retailers could sell items from within a post.

Twitter does not know what developers will decide to do with the tool, said Ryan Sarver, who manages the Twitter platform. “The underlying idea is think big, push yourself.”

Sounds very close to what I asked for. Of course, there are the Machine-tag skeptics but they just need a good moment alone with Aaron Cope, Clay Shirky and a machete. Free the information hierarchies!

Milgram to TagMaps like Lynch to Flickr Alpha Shapes

After we came up with Tag Maps at Yahoo! Research Berkeley, Morgan Ames (then one of our star interns) pointed out the surprising similarities to a study that was done 30 years earlier, by Stanley Milgram, the famous social psychologist. In his study, Milgram asked 30+ participants to list names of attractions in Paris. He then visualized these on a map, in a size according to the number of times each was mentioned. Here are the automatically-created, Flickr-based TagMap of Paris (based on geotagged photos taken in that area), and the same exact area as represented by Milgram’s visualization.

tagmaps paris

milgram paris

I have been showing both these images in my talks for a while now; can’t seem to get sick of them, even if my audience might just be…

I’ve also been talking for a while about how we can use the aggregate contributions on Flickr to mark boundaries of geographical objects, such as, say, neighborhoods, using all the photos tagged with a neighborhood name. Talk is cheap, but the smart people at Flickr not only figured out how to do it (with slightly different data than tags) but also released the data and the source code for anyone to use. Blame Aaron Cope and Rev Dan Catt.

Well, here’s the thing: turns out a famous scholar also beat Flickr to it, some 40 years ago. Kevin Lynch, in his groundbreaking essay/book The Image of the City, collected people’s descriptions and hand-drawn maps of three cities (Boston shown here, also Jersey City and downtown LA). In one study, he extracted the “maximum boundaries” for each neighborhood as drawn by all the interviewees, and plotted them on the map.

Here are the automatically-created, Flickr-based map of Boston Neighborhoods, visualized using the excellent Tom Taylor’s Neighborhood Boundaries, and Lynch’s maximum boundaries of neighborhoods in the same area.

Neighborhood Boundaries for Boston

Lynch's neighborhood boundaries

I have pre-ordered Milgram’s book of essays, to arrive in February. Might as well find out what’s there before we re-invent something else!

Social Media Definition (redefinition, that is)

Ayman, I know you’re sick of it by now, but I am revisiting a popular theme for this blog, “What is Social Media”. A definition of social media was attempted (by me) here, and I later added a note about a practical definition of social media in the context of teaching an interdisciplinary class on the topic.

So now, after teaching the first session of that class, let me try again. The following definition will try to broadly scope the topic as described in my Social Media class. But I also believe that this would make a good working definition of this widely-and-wildly-used phrase.

In this definition, I try to follow closely the original meaning of both “social” and “media”. Media is defined as:

the main means of mass communication (esp. television, radio, newspapers, and the Internet) regarded collectively. (Apple Dictionary)

And Social:

needing companionship and therefore best suited to living in communities: “we are social beings as well as individuals.”

These definitions are echoed in the following, although they did not directly dictate it. Social media, then, is any media that supports these two characteristics:

  • Posting of lasting content in public/semi public settings within an established service or system.
  • Visible and durable identity, published profile, and recognized contribution.

This definition would then include Facebook, Flickr, Delicious, MySpace, Yelp, Vimeo, Last.fm, Twitter, Dogster, YouTube and their many, many likes.

The definition purposely excludes old media services that allow for comments from users (no durable identity); Wikipedia (no “recognized contribution” that is easily associated with a user); or say, mobile-social applications (no posting of content => not a media!). The definition also does not include newsgroups and discussion forums (no published profile, no expectation of lasting content). And it does not include communication services like IM and email that are not public, not even semi-public in nature.

Does this make sense?