Social Media

If you were listening to NPR’s “All Things Considered” broadcast on January 18, you might have heard a brief report on research that reveals regional differences (“dialects”) in word usage, spelling, slang, and abbreviations in Twitter postings.  For example, Northern and Southern California use the spelling variants koo and coo to mean “cool.”

Finding regional differences in these written expressions is interesting in its own right, but I’ve just finished reading the paper describing this research and there’s a lot more going on here than simply counting and comparing expressions across different geographic regions.  The paper is an excellent example of what market researchers might do to analyze social media.

The study authors–Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing–are affiliated with the School of Computer Science at Carnegie Mellon University (Eisenstein, who was interviewed for the ATC broadcast, is a postdoctoral fellow).  They set out to develop a latent variable model to predict an author’s geographic location from the characteristics of text messages.  As they point out, their work is unique in that they use raw text data (although “tokenized”) as input to the modeling.  They develop and compare a few different models, including a “geographic topic model” that incorporates the interaction between base topics (such as sports) and an author’s geographic location, as well as additional latent variable models:  a “mixture of unigrams” (which assumes a single topic per document) and “supervised latent Dirichlet allocation.”  If you have not yet figured it out, the models, as described, use statistical machine learning methods.  That means that some of the terminology may be unfamiliar to market researchers, but the description of the algorithm for the geographic topic model resembles the hierarchical Bayesian methods using the Gibbs sampler that have come into fairly wide use in market research (especially for choice-based conjoint analysis).
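To make the modeling a little more concrete: the “mixture of unigrams” baseline assumes each document (here, each tweet) is generated by a single latent topic, and a toy version can be fit with a few lines of EM. The sketch below is purely illustrative, not the authors’ code: the four made-up “tweets,” the number of topics, and the smoothing constant are all invented, and the paper’s models are fit with Gibbs-sampling-style inference rather than EM.

```python
# Illustrative only: EM for a "mixture of unigrams," the simplest model
# the authors compare against. Each document is assumed to come from a
# single latent topic. All data here is made up.
import math
import random
from collections import Counter

docs = [
    "koo game tonight".split(),
    "coo show tonight".split(),
    "subway delays morning".split(),
    "subway crowded morning".split(),
]
vocab = sorted({w for d in docs for w in d})
K = 2  # number of latent topics (assumed)
random.seed(0)

# Initialize topic priors (pi) and per-topic word probabilities (phi)
pi = [1.0 / K] * K
phi = []
for _ in range(K):
    raw = {w: random.random() + 0.5 for w in vocab}
    z = sum(raw.values())
    phi.append({w: p / z for w, p in raw.items()})

for _ in range(50):
    # E-step: soft-assign each document to topics (log-space for stability)
    resp = []
    for d in docs:
        logp = [math.log(pi[k]) + sum(math.log(phi[k][w]) for w in d)
                for k in range(K)]
        m = max(logp)
        p = [math.exp(l - m) for l in logp]
        s = sum(p)
        resp.append([x / s for x in p])
    # M-step: re-estimate priors and word distributions (small smoothing)
    pi = [sum(r[k] for r in resp) / len(docs) for k in range(K)]
    for k in range(K):
        counts = Counter()
        for d, r in zip(docs, resp):
            for w in d:
                counts[w] += r[k]
        z = sum(counts[w] + 0.01 for w in vocab)
        phi[k] = {w: (counts[w] + 0.01) / z for w in vocab}

# Hard topic assignment per document
assign = [max(range(K), key=lambda k: r[k]) for r in resp]
print(assign)
```

The geographic topic model in the paper goes well beyond this, letting each region shift the word distributions of shared base topics, which is what lets it recover variants like koo and coo.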

This research is important for market research because it demonstrates a method for estimating characteristics of individual authors from the characteristics of their social media postings.  While we have not exhausted the potential of simpler methods (frequency and sentiment analyses, for example), this looks like the future of social media analysis for marketing.

Copyright 2011 by David G. Bakken.  All rights reserved.

The 20th Advanced Research Techniques Forum, an annual conference sponsored by the American Marketing Association, took place in San Francisco a couple of weeks ago (June 6-9).  For those of you not familiar with A/R/T, this conference brings academic researchers together with market research practitioners in a format that produces (nearly) equal representation from each of the two groups.  Half of the twenty presentation slots are reserved for “practitioner” papers (where the lead author is not an academic researcher) and half are held for papers from academics.  One of the academic slots is assigned to the winner of the annual Paul Green award for the best article published in the Journal of Marketing Research in the previous calendar year.  More papers than in the past are collaborations between academics and practitioners, and the choice of one or the other as lead author can affect the chances of getting on the program, given the limited number of slots.

The program is assembled by a committee composed of academics and practitioners (disclaimer–I’ve been on the committee a few times and was program chair for 2008).  In a typical year, the call for papers might yield around 70 submissions.  In addition to the presented papers, “poster” presentations are considered, and the program includes optional tutorials (extra cost) before and after the main conference sessions.

The A/R/T papers, especially those presented by academic researchers, can be dragged down by the weight of too much algebra.  Over the years, the “advanced” has more often referred to “models” than to “research techniques” in general, and this year was no exception.  Still, there were a few noteworthy presentations. (more…)

The winner of the advertising Super Bowl that took place on Sunday, February 7, that is.  This is not just my opinion.  Comments captured from the digital ether by Alterian SM2 give the Sunday night victory to Google’s “Parisian Love” spot, which ran at the end of the third quarter.  Alterian SM2 looked at three measures for each of the 44 advertisers who aired commercials during the 2010 Super Bowl:  total mentions, reach, and sentiment.  Google was the leader in mentions by a wide margin (almost 7,000 mentions, compared with 2,100 for the next highest ad–the Tim Tebow ad from Focus on the Family–and an average of just over 500 mentions for all advertisers).  Google also came out ahead on Alterian’s Social Engagement Index (SEI), which weights the conversations by the popularity of the source.  The SEI for Google’s spot was 1,703 (versus an average of 100 for all ads).  Finally, Alterian weighted the SEI by sentiment to create a second index.  Google came in second on this measure, behind Doritos (SSEIs of 673 for Google and 941 for Doritos, against an average SSEI of 100).  It’s probably worth noting that Doritos ran three different ads during the telecast, against Google’s one spot, and these results do not separate out specific commercials.
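Alterian’s exact formulas are not published here, but the general recipe described above (weight each ad’s conversations by the popularity of the source, then scale so the average across advertisers is 100) is easy to sketch. The mention counts and popularity weights below are invented for illustration; they are not Alterian’s data, and the real SEI presumably aggregates weights at the level of individual sources rather than one weight per advertiser.

```python
# Hypothetical sketch of a source-weighted engagement index in the
# spirit of the SEI. All counts and weights are invented.
mentions = {"Google": 7000, "Tebow": 2100, "Doritos": 1500}
popularity = {"Google": 1.8, "Tebow": 0.9, "Doritos": 1.2}  # assumed source weights

# Weight each advertiser's mention volume by source popularity
raw = {ad: mentions[ad] * popularity[ad] for ad in mentions}

# Scale so the average across advertisers is 100, as in Alterian's report
avg = sum(raw.values()) / len(raw)
index = {ad: 100 * raw[ad] / avg for ad in raw}
print(index)
```

The sentiment-weighted SSEI would apply one more multiplicative weight, this time based on the positive-versus-negative tone of the conversations, before re-indexing to an average of 100.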

Of course, not everyone who has expressed an opinion about the commercials aired during Super Bowl XLIV put Google’s ad at the top.  The spot was not, for example, among the “top 10” Super Bowl commercials listed at Fanhouse.  But in its way, Google’s ad may be the best example of what advertising is supposed to do.  Google’s dominant position in online search (and the revenues that search advertising generates) is under attack from Microsoft’s Bing, and Microsoft has been running ads showing how easy it is to use Bing to do things like find a dimly lit restaurant (apparently a plus for hungry vamps, if we take a recent ad literally). (more…)

The current issue of The Economist (January 30 -February 5 2010) features a 15-page special report on social networking.  Typically thorough, the report covers history, the differences between major players (Facebook, Twitter, and MySpace), benefits for small businesses, potential sources of profit for social networking sites, and some of the “peripheral” issues–such as the impact on office productivity and privacy concerns.  For any marketers who’ve been caught by surprise by the emergence of social media and social networking as marketing forces or been watching out of the corner of their eye, this special report might be especially informative. (more…)

Looking back over the last year in market research offers an opportunity to consider just which transformations, new ideas, industry trends, and emerging techniques might shape MR over the next few years.  Here’s a list of eight topics I’ve been following, with thoughts on the potential impact each might have on MR over the next two or three years. (more…)

…spontaneous complaints and compliments are to customer loyalty management.  Like these forms of customer experience feedback, tweets are unsystematic, unorganized, and representative of who knows what underlying sentiments in the broader universe of individual experiences. (more…)

Have you heard about the Facebook Gross National Happiness Index?  On Monday, October 12, the Times ran an article (by Noam Cohen) reporting some of the findings based on analysis of two years’ worth of Facebook status updates from 100 million users in the U.S.  The index was created by Adam D. I. Kramer, a doctoral candidate in social psychology at the University of Oregon, and is based on counts of positive and negative words in status updates.  According to the article, classification of words as positive or negative is based on the Linguistic Inquiry and Word Count dictionary.

Among the researchers’ conclusions:  we’re happier on Fridays than on Mondays; holidays also make Americans happy.  The premature death of a celebrity may make us sad.  According to a post by Mr. Kramer on the Facebook blog, the two “saddest” days–days with the highest numbers of negative words–were the days on which actor Heath Ledger and pop icon Michael Jackson died.  Mr. Kramer points out that, coincidentally, Mr. Ledger died on the day of the Asian stock market crash, which might have contributed to the degree of negativity.
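The mechanics of a word-count index like this are straightforward. The sketch below is a toy version only: the tiny word lists stand in for the LIWC dictionary (which contains thousands of categorized words), and the status updates are invented.

```python
# Toy positivity index in the spirit of the Facebook GNH index: count
# dictionary-matched positive and negative words in status updates.
# These small word lists are stand-ins for the LIWC dictionary.
import re

POSITIVE = {"happy", "great", "awesome", "love"}
NEGATIVE = {"sad", "awful", "terrible", "hate"}

def positivity(updates):
    """(positive - negative) word count, normalized by total word count."""
    words = re.findall(r"[a-z']+", " ".join(updates).lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(len(words), 1)

friday = ["so happy it's friday", "great weekend plans, love it"]
monday = ["monday again, sad", "awful commute"]
print(positivity(friday), positivity(monday))
```

On these made-up updates the Friday score comes out higher than the Monday score, which is the pattern the index reportedly shows at the scale of 100 million users.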

We’re going to see a lot more of this kind of thing as researchers delve into the rich trove of information generated by users of search engines and web-enabled social networking.  The happiness index, based as it is on simple frequency analysis of words, is the tip of the iceberg.  At the moment, “social media”–I’m not exactly sure what that label means–is getting incredible attention in the marketing and marketing research community.  The question that has yet to be posed, let alone answered, is, “What exactly do we learn from all this information?”