Interesting Articles

If you were listening to NPR’s “All Things Considered” broadcast on January 18, you might have heard a brief report on research that reveals regional differences (“dialects”) in word usage, spelling, slang, and abbreviations in Twitter postings. For example, Northern and Southern California use the spelling variants “koo” and “coo” to mean “cool.”

Finding regional differences in these written expressions is interesting in its own right, but I’ve just finished reading the paper describing this research and there’s a lot more going on here than simply counting and comparing expressions across different geographic regions.  The paper is an excellent example of what market researchers might do to analyze social media.

The study authors–Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing–are affiliated with the School of Computer Science at Carnegie Mellon University (Eisenstein, who was interviewed for the ATC broadcast, is a postdoctoral fellow). They set out to develop a latent variable model to predict an author’s geographic location from the characteristics of text messages. As they point out, their work is unique in that they use raw (although “tokenized”) text data as input to the modeling. They develop and compare several models, including a “geographic topic model” that incorporates the interaction between base topics (such as sports) and an author’s geographic location, as well as two additional latent variable models: a “mixture of unigrams” (which assumes a single topic per document) and “supervised latent Dirichlet allocation.” If you have not yet figured it out, the models, as described, use statistical machine learning methods. That means some of the terminology may be unfamiliar to market researchers, but the description of the algorithm for the geographic topic model resembles the hierarchical Bayesian methods using the Gibbs sampler that have come into fairly wide use in market research (especially for choice-based conjoint analysis).
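The paper’s full geographic topic model is involved, but the flavor of this kind of modeling comes through in the simplest baseline, the mixture of unigrams, where a collapsed Gibbs sampler repeatedly reassigns each document to a latent topic. Here is a minimal sketch in Python, using an invented toy corpus (not the authors’ code, data, or exact model):

```python
import random
from collections import Counter

# Toy corpus of tokenized "tweets" (invented data, not the paper's).
docs = [
    "koo show tonight bay area".split(),
    "coo beach surf la".split(),
    "game score win team".split(),
    "team play score game".split(),
    "koo bay bridge fog".split(),
    "surf coo sand la".split(),
]
K = 2                                   # number of latent topics
V = len({w for doc in docs for w in doc})
alpha, beta = 1.0, 0.1                  # Dirichlet smoothing priors

random.seed(0)
z = [random.randrange(K) for _ in docs]        # current topic of each document
topic_docs = Counter(z)                        # how many docs sit in each topic
topic_words = [Counter() for _ in range(K)]    # word counts per topic
for doc, t in zip(docs, z):
    topic_words[t].update(doc)

def resample(i):
    """One collapsed Gibbs step: remove doc i, score each topic, redraw."""
    doc, old = docs[i], z[i]
    topic_docs[old] -= 1
    topic_words[old].subtract(doc)
    weights = []
    for t in range(K):
        w = topic_docs[t] + alpha              # prior pull toward popular topics
        counts, total = topic_words[t].copy(), sum(topic_words[t].values())
        for word in doc:                       # predictive prob. of the doc's words
            w *= (counts[word] + beta) / (total + V * beta)
            counts[word] += 1
            total += 1
        weights.append(w)
    new = random.choices(range(K), weights=weights)[0]
    z[i] = new
    topic_docs[new] += 1
    topic_words[new].update(doc)

for _ in range(200):                           # Gibbs sweeps over the corpus
    for i in range(len(docs)):
        resample(i)
```

The geographic topic model adds a layer on top of this idea: an author’s latent region shifts the word distributions of each topic, which is what lets the fitted model run in reverse and predict location from text.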

This research is important for market research because it demonstrates a method for estimating characteristics of individual authors from the characteristics of their social media postings.  While we have not exhausted the potential of simpler methods (frequency and sentiment analyses, for example), this looks like the future of social media analysis for marketing.

Copyright 2011 by David G. Bakken.  All rights reserved.

The current issue of The Economist carries an article titled “Riders on a swarm.” The article describes the use of swarm intelligence–the collective behavior that results from the individual actions of many simple “agents”–inspired by the behavior of insects like ants and bees or flocks of birds. Although “agent-based simulation” is not mentioned by name (unlike in a column that appeared in a previous issue), these models have all of the relevant attributes of agent-based simulations, and you can find example models of collective insect and flocking-bird behavior in agent-based toolkits such as NetLogo.

As noted in the article, these models have found some business applications in logistics and problems like traffic control. Ant-based foraging models, for example, have been applied to solving routing problems for package delivery services. Route optimization, given a set of delivery locations, is a fixed problem with a large number of potential solutions that can probably be solved analytically (or by simple brute force) with enough computing power. Swarm models have the advantage that they can arrive at a good, often optimal, solution without needing to specify and solve a linear programming problem. By programming simple individual agents, such as artificial ants, with a simple set of rules for interacting with their environment and a set of goal-directed behaviors, the system can arrive at an optimal solution, even though no individual agent “solves” the problem.
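As a rough illustration (not any delivery company’s actual system), here is a minimal ant-colony-style heuristic for a five-stop routing problem with made-up coordinates. Each artificial ant builds a tour edge by edge, preferring short edges with strong “pheromone”; shorter tours deposit more pheromone, and evaporation keeps the colony from locking in too early. Because the problem is tiny, the true optimum is also brute-forced for comparison:

```python
import itertools
import math
import random

random.seed(42)

# Made-up coordinates for five hypothetical delivery stops.
stops = [(0, 0), (1, 5), (5, 2), (6, 6), (8, 3)]
n = len(stops)
d = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in stops] for a in stops]
pheromone = [[1.0] * n for _ in range(n)]

def tour_length(tour):
    # Length of the closed loop through all stops.
    return sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))

def build_tour(alpha=1.0, beta=2.0):
    """One 'ant' builds a tour, favoring short edges with strong pheromone."""
    tour = [random.randrange(n)]
    unvisited = set(range(n)) - {tour[0]}
    while unvisited:
        i, cand = tour[-1], list(unvisited)
        weights = [pheromone[i][j] ** alpha / d[i][j] ** beta for j in cand]
        j = random.choices(cand, weights=weights)[0]
        tour.append(j)
        unvisited.remove(j)
    return tour

best, best_len = None, float("inf")
for _ in range(100):                       # 100 iterations of 10 ants each
    tours = [build_tour() for _ in range(10)]
    for row in pheromone:                  # evaporation
        for j in range(n):
            row[j] *= 0.9
    for t in tours:
        length = tour_length(t)
        if length < best_len:
            best, best_len = t, length
        for i in range(n):                 # shorter tours deposit more pheromone
            a, b = t[i], t[(i + 1) % n]
            pheromone[a][b] += 1.0 / length
            pheromone[b][a] += 1.0 / length

# Five stops are small enough to brute-force the true optimum for comparison.
opt_len = min(tour_length(p) for p in itertools.permutations(range(n)))
```

No single ant ever optimizes anything; the good route emerges from many simple, local choices plus the shared pheromone trail, which is exactly the point of the article.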

Something that was new to me in this article is “particle swarm optimization” (PSO), which is inspired by the behavior of flocking birds and swarming bees. According to the article, PSO was invented in the 1990s by James Kennedy and Russell Eberhart. Unlike the logistics problems, there may be no closed-form or analytically tractable solution to problems such as finding the optimal shape for an airplane wing. In that case, a simulation in which thousands of tiny flowing particles follow a few simple movement rules may be just the ticket.
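A bare-bones PSO makes the “few simple movement rules” concrete. Each particle keeps some momentum, is pulled toward the best point it has personally seen, and is pulled toward the best point the whole swarm has seen. The objective function and the weight values below are my own illustrative choices, not Kennedy and Eberhart’s exact formulation:

```python
import random

random.seed(1)

def f(x, y):
    # Objective to minimize: a simple bowl whose minimum is at (3, -2).
    return (x - 3) ** 2 + (y + 2) ** 2

N, STEPS = 30, 200
pos = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(N)]
vel = [(0.0, 0.0) for _ in range(N)]
pbest = list(pos)                          # each particle's best position so far
gbest = min(pos, key=lambda p: f(*p))      # the swarm's best position so far

w, c1, c2 = 0.7, 1.5, 1.5                  # inertia, cognitive, social weights
for _ in range(STEPS):
    for i in range(N):
        r1, r2 = random.random(), random.random()
        # Keep some momentum; pull toward personal best and swarm best.
        vx = (w * vel[i][0] + c1 * r1 * (pbest[i][0] - pos[i][0])
              + c2 * r2 * (gbest[0] - pos[i][0]))
        vy = (w * vel[i][1] + c1 * r1 * (pbest[i][1] - pos[i][1])
              + c2 * r2 * (gbest[1] - pos[i][1]))
        vel[i] = (vx, vy)
        pos[i] = (pos[i][0] + vx, pos[i][1] + vy)
        if f(*pos[i]) < f(*pbest[i]):
            pbest[i] = pos[i]
            if f(*pos[i]) < f(*gbest):
                gbest = pos[i]
```

Nothing in these rules knows anything about the shape of the objective, which is why the same recipe can be pointed at problems, like the airplane wing, that have no tidy analytical solution; the swarm’s shared best position simply drifts toward good regions of the search space.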

This stuff is fascinating, but it’s not clear that there are many useful applications for this type of modeling in marketing or marketing research, at least as long as the unit of analysis is the intersection of an individual “consumer” and a specific purchase or consumption occasion. Of course, if imitation and social contagion are at least as important in our purchase decisions as the intrinsic attributes of products and services (as research by Duncan Watts and his collaborators has shown in the case of popular music), then agent-based simulations may turn out to be one of the best ways to understand and predict consumer behavior.

Copyright 2010 by David G. Bakken.  All rights reserved.

The current issue of The Economist (January 30–February 5, 2010) features a 15-page special report on social networking. Typically thorough, the report covers history, the differences between major players (Facebook, Twitter, and MySpace), benefits for small businesses, potential sources of profit for social networking sites, and some of the “peripheral” issues–such as the impact on office productivity and privacy concerns. For any marketers who’ve been caught by surprise by the emergence of social media and social networking as marketing forces, or who’ve been watching out of the corner of their eye, this special report might be especially informative.

Have you heard about the Facebook Gross National Happiness Index?  On Monday, October 12, the Times ran an article (by Noam Cohen) reporting some of the findings based on analysis of two years’ worth of Facebook status updates from 100 million users in the U.S.  The index was created by Adam D. I. Kramer, a doctoral candidate in social psychology at the University of Oregon, and is based on counts of positive and negative words in status updates.  According to the article, classification of words as positive or negative is based on the Linguistic Inquiry and Word Count dictionary.
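The underlying computation is straightforward word counting. A toy version might look like the following, with tiny hand-made word lists standing in for the LIWC dictionary and invented status updates standing in for the Facebook data:

```python
# Hand-made stand-ins for the LIWC positive/negative word categories.
POSITIVE = {"happy", "great", "love", "awesome", "fun"}
NEGATIVE = {"sad", "terrible", "hate", "awful", "angry"}

def happiness_index(updates):
    """Mean positive-word rate minus mean negative-word rate across updates."""
    pos_rate = neg_rate = 0.0
    for text in updates:
        words = text.lower().split()
        if not words:
            continue
        pos_rate += sum(w in POSITIVE for w in words) / len(words)
        neg_rate += sum(w in NEGATIVE for w in words) / len(words)
    n = len(updates)
    return (pos_rate - neg_rate) / n if n else 0.0

# Invented examples: a "Friday" batch should score above a "Monday" batch.
friday = ["so happy it is friday", "great weekend plans love it"]
monday = ["i hate mondays", "feeling sad and angry today"]
```

Scaled up to two years of updates from 100 million users, a per-day version of this score is essentially what gets plotted as the index; the real work is in the dictionary and the normalization, not the arithmetic.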

Among the researchers’ conclusions:  we’re happier on Fridays than on Mondays; holidays also make Americans happy.  The premature death of a celebrity may make us sad.  According to a post by Mr. Kramer on the Facebook blog, the two “saddest” days–days with the highest numbers of negative words–were the days on which actor Heath Ledger and pop icon Michael Jackson died.  Mr. Kramer points out that, coincidentally, Mr. Ledger died on the day of the Asian stock market crash, which might have contributed to the degree of negativity.

We’re going to see a lot more of this kind of thing as researchers delve into the rich trove of information generated by users of search engines and web-enabled social networking.  The happiness index, based as it is on simple frequency analysis of words, is the tip of the iceberg.  At the moment, “social media”–I’m not exactly sure what that label means–is getting incredible attention in the marketing and marketing research community.  The question that has yet to be posed, let alone answered, is, “what exactly do we learn from all this information?”


In the October 4, 2009 edition of The NY Times “Sunday Business” section (“It’s Brand New, but Make It Sound Familiar“), Mary Tripsas, an associate professor at the Harvard Business School, writes about the challenge of finding the right consumer reference points for innovations. In a nutshell, consumers have a hard time figuring out an innovation unless they can compare it to something that is more familiar. One example offered in the column comes from Art Markman, a professor of psychology at the University of Texas at Austin: the less-than-blockbuster introduction of the Segway motorized personal transport device. In a similar vein, Dan Ariely (Predictably Irrational) argues that comparison is a fundamental process in consumer decision making.

Estimating demand for really new innovations may just be the most difficult endeavor in market research. A decade ago Robert Veryzer, Jr. identified six factors that make it difficult for consumers to react to innovation (“Key Factors Affecting Customer Evaluation of Discontinuous New Products,” Journal of Product Innovation Management, 1998, 15, 136-150). The first factor listed is “lack of familiarity with the product, with the way in which the product is used, or with the underlying technology.” And one way consumers try to understand a discontinuous product is by comparison with things they already know about.

By and large, I think marketers and market researchers underestimate the fundamental role of comparison and contrast in the way we make judgments about products. As Professor Tripsas makes clear, humans (consumers included) rely on categorization to understand the world. Looking at a new, discontinuous product, we’re likely to ask, is it this or that?

In my post titled “I can’t tell you what ‘insight’ looks like, but I’ll know it when I see it” (28 May 2009), I mentioned a study conducted by Jonah Berger and Gael Le Mens on the rise and fall of popularity in given names. The latest issue of Knowledge@Wharton describes this research and includes a link to the article, “How adoption speed affects the abandonment of cultural tastes.” For your convenience, clicking on that title here will take you to the article. A key finding from this study is that the faster a name rises in popularity, the more quickly it falls out of favor.