Editor’s note: Today we publish the final post in our series on the seven sins of consumer psychology, drawn from the presidential address of Professor Michel Tuan Pham at the recent conference of the Society for Consumer Psychology. Read the introduction here.
The final sin is certainly not a recent one, but it is still a major one. More than 35 years ago, in a JCR editorial titled “Research by Convenience,” Robert Ferber (1977) was already complaining about the over-reliance on student samples in studies purporting to be about consumers in general. Ferber questioned whether students were really the right respondents for certain topics, such as financial decision making or family purchases. He also questioned the degree to which, independent of the topic, results obtained from college students could be generalized to the broader population of consumers that our samples are meant to represent (see also Sears 1986).
A variant of the “Research by Convenience” criticism includes complaints that too much of our research is North-American-centric (Gorn 1997)—a criticism that has also been made about psychology in general (Arnett 2008). Another variant includes the criticism that too much of our theorizing is based on the upper end of the knowledge-expertise continuum, whereas large segments of the consumer population are bound to be less educated and less “intelligent” than the student population that we typically sample in our studies (Alba 2000).
On the surface, it would appear that new sources of inexpensive experimental respondents such as Mechanical Turk, which has become very popular in our field, should help address this research-by-convenience problem. Indeed, from a demographic point of view, MTurk participants seem to be a little more like “real consumers” than the typical college undergrad (Berinsky, Huber, and Lenz 2012). There is also some evidence that some well-known judgment biases can be replicated with MTurk participants (Goodman). However, before we declare the “sin of research by convenience” partially absolved by MTurk, we need to temper our optimism in three respects. First, regardless of what has been shown or claimed to date, it is not clear to me that a particular sample of individuals who self-selected to participate in this peculiar marketplace—that is, individuals who are willing to perform computer-mediated mindless tasks for a couple of dollars an hour—is necessarily any more representative of “real-world” consumers than are typical college undergrads. Second, there is disturbing evidence of increasing sophistication among MTurk workers in seeing through and “gaming” our studies (Chandler, Mueller, and Paolacci, ACR 2012). Finally, and most seriously, I see a real danger that the low data-collection costs of Mechanical Turk are gradually shifting our research agendas toward studies that can be done on MTurk—i.e., short, online, survey-type experiments—as opposed to studies that should be conducted to advance our field. This last point taps into another meaning of the phrase “Research by Convenience”—one that Ferber did not discuss but that is, in my opinion, perhaps even more serious.
Finally, it should be noted that the sin of research by convenience is not limited to the convenience of the samples of respondents that we study. It extends to the convenience of the instruments that we use to study them. Instead of studying actual consumption behavior, much of our research is based on vignette-like studies, in which respondents are asked to imagine a certain consumption situation and report how they would respond in such a situation. The real question is whether the observed responses in these studies are good representations of the actual responses that we would observe had actual consumption behavior been analyzed.
Our colleagues in economics often criticize such studies because vignette-based responses entail no costs and no rewards. “Without some incentive compatibility,” they would say, “this is just cheap talk.” I am not sure that this is the main problem, however. My concerns are a bit different. First, scenario-based studies tend to make the focal aspect of the treatment very prominent (e.g., “imagine buying insurance two years from now vs. next month”), thereby potentially exaggerating the strength of the effects. Second, I suspect that participants who are asked to project themselves into a certain consumption situation tend to adopt an overly analytical mindset that is not representative of how consumers would actually respond to the situation in real life (see, e.g., Dunn & Ashton-James, 2008; Snell, Gibbs, & Varey, 1995, for relevant findings). Finally, I believe that scenarios are poorly suited for studying the effects of “hot” variables such as emotional responses and motivational states (Pham, 2004), whose influence on our behavior is difficult to imagine without a genuine experience.
Conclusions: Increasing our Relevance and Impact
- Expand our research focus to non-purchase dimensions of consumer behavior, especially need and want activation, non-purchase modes of acquisition (sharing, borrowing, stealing), and every aspect of actual consumption.
- Embrace broader theoretical perspectives on consumer behavior beyond information processing and BDT, especially motivation, social aspects, and deep cultural aspects (as opposed to cross-cultural aspects), with less emphasis on unique and micro-level explanations.
- Expand our epistemology to encourage (a) further phenomenon-based research (provided that the phenomenon is robust and really grounded in CB), (b) more descriptive research, and (c) tests of popular industry theories.
- Greater attention to content aspects of CB with corresponding increase in domain specificity (and decrease in presumed generality). Key opportunity in area of motivational content.
- Lower tolerance for “theories of studies.”
- Greater emphasis on replication, robustness, and sensitivity testing.
- Greater reliance on studies with real consumers, as opposed to students or MTurk workers. Encouragement of field studies. Decreased reliance on scenarios, especially when studying hot processes of CB.
- CB syllabi need to be revamped (especially those structured in terms of information processing and JDM) to reflect broader theoretical perspectives.
- Greater substantive grounding in how we teach CB to our graduate students.
- Encourage PhD students to take or TA an MBA-level course in CB and in basic marketing.
- Encourage, to a limited extent (rather than strongly discourage), activities that strengthen our grounding in and understanding of business issues (executive teaching, consulting, book writing).
- Pay more attention to citations and impact, as opposed to mere publication counting, in promotions. (Simple new metric proposed: average citation percentile rank in a given journal in a given year.)
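The metric proposed in the last bullet lends itself to a quick sketch. Below is a minimal, illustrative Python version, assuming each paper is recorded as a (journal, year, citations) tuple; the function names and toy data are my own, not part of the address.

```python
from collections import defaultdict

def percentile_rank(value, cohort):
    """Percent of papers in the cohort whose citation count this paper
    meets or exceeds (0-100)."""
    return 100.0 * sum(c <= value for c in cohort) / len(cohort)

def avg_citation_percentile(author_papers, all_papers):
    """Average, over an author's papers, of each paper's citation
    percentile rank within its own journal-year cohort."""
    cohorts = defaultdict(list)
    for journal, year, cites in all_papers:
        cohorts[(journal, year)].append(cites)
    ranks = [percentile_rank(cites, cohorts[(journal, year)])
             for journal, year, cites in author_papers]
    return sum(ranks) / len(ranks)

# Toy example: one journal-year cohort of four papers; the author wrote
# the most-cited (percentile 100) and least-cited (percentile 25) of them.
cohort = [("JCR", 2010, c) for c in (5, 10, 20, 40)]
papers = [("JCR", 2010, 40), ("JCR", 2010, 5)]
print(avg_citation_percentile(papers, cohort))  # → 62.5
```

Because each paper is ranked only against its own journal and year, the metric controls for both field-level citation norms and the age of the paper, which raw citation counts do not.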
Editor’s comments: We’d like to thank Michel Tuan Pham for kindly letting us publish his presidential address, and would welcome comments from readers. What do you think about these sins?
I just want to jump in and clarify a few things about MTurk.
I am not sure that I agree with the approach of assuming that a particular group is *not* representative, especially with no consideration of whether differences are of theoretical importance to a particular research question. This assumption effectively requires people who use a particular sample to “prove the null” on an infinite number of unspecified dimensions. Similar arguments can be made about people inclined to participate in panel surveys, or to own landlines, and self-selection is inherent in any research method that relies on self-report.
While I agree that people who use MTurk do differ in measurable ways both from other samples and from the population at large (for example, they are younger, more educated, more socially anxious, and more likely to be unemployed – see Shapiro, Chandler & Mueller, 2013), I am not sure that these differences are relevant to all research questions.
Likewise, we should be careful in dismissing a population as unrepresentative because they perform “computer-mediated mindless work for a few dollars per hour.” First, workers frequently report finding research studies interesting and informative, reflecting some degree of intrinsic motivation. Second, getting a HIT completed in a timely fashion truly requires a wage of around 10 cents per minute. While $6 an hour is low, it is not so low as to be unreasonable, especially considering (i) the flexibility (in tasks and time commitment) afforded by MTurk, (ii) the employment alternatives currently available to many Americans, and (iii) the sheer volume of time people spend performing “computer-mediated mindless tasks” for free – or even at a cost (e.g., Angry Birds, Farmville, etc.).
Turning to the second point raised by this post, I am not sure whether it is appropriate to characterize workers as “gaming” the system. By and large, most of the problems we identify do not stem from “bad” workers. They stem from “lazy” researchers, who are further encouraged by the low cost of MTurk data. My favorite example of this is the use of “catch trials” or “Instructional Manipulation Checks” (Oppenheimer et al., 2010) to identify whether people are paying attention. The Oppenheimer measure asks people to perform a specific action in the instructions (e.g., follow a link embedded in the title) that is counter to the action strongly implied by other contextual cues (e.g., answer the questions on the page and proceed). Obviously, this measure only assesses attentiveness if people have not seen it before; otherwise, they do not need to read the instructions to know what to do. Despite this, you do not have to spend much time on MTurk to find researchers who use the example provided by Oppenheimer verbatim, with little consideration that they or other researchers have used it before.
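In analysis code, such a check typically reduces to a simple exclusion filter. Here is a minimal sketch, assuming each response already records whether the instructed (counter-intuitive) action was performed; the field names and toy data are hypothetical, not from any real study.

```python
# Minimal sketch of excluding respondents who failed an attention check
# (e.g., an Instructional Manipulation Check). Field names are illustrative.

responses = [
    {"id": 1, "followed_instruction": True,  "rating": 5},
    {"id": 2, "followed_instruction": False, "rating": 1},  # likely skimmed
    {"id": 3, "followed_instruction": True,  "rating": 4},
]

# Keep only respondents who performed the instructed action, i.e., those
# who demonstrably read the instructions rather than following the
# contextual cues on the page.
attentive = [r for r in responses if r["followed_instruction"]]

print([r["id"] for r in attentive])  # → [1, 3]
```

Note that this filter is only as good as the check itself: if workers have seen the same IMC before, `followed_instruction` no longer measures attentiveness, which is precisely the problem described above.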
The way that IMCs are actually used highlights and amplifies the final point of this post by identifying yet another form of research by convenience. Not only are we perhaps over-reliant on vignette studies and surveys, but we are over-reliant on the *same* vignettes, slavishly copied and pasted again and again in subtly different iterations. It is a little sad, really, because MTurk does not have to result in this lowest common denominator of research. At the end of the day it is a marketplace where researchers can find participants. What researchers do with them is their own business. While the majority of researchers have used MTurk to conduct simple vignette studies, there are opportunities to use it in all sorts of clever ways, including as a tool to collect longitudinal data, or to conduct real-time experiments in group dynamics and collaboration (see work by Winter Mason and Jeff Nickerson for examples). Moreover, MTurk is a pool from which people can selectively be drawn, so the characteristics of the population as a whole constrain, but do not determine, the sample that researchers can ultimately use.
In short, I largely agree with the spirit of this post, but I think we need to be careful about framing the problem as simply being about studying the wrong population and the solution as finding a new one, or believing that certain contexts make it impossible to conduct any kind of research that is *not* by convenience alone.