Research that is not reproducible is like pollution: We should treat it as such

“When a measure becomes a target, it ceases to be a good measure.” - Goodhart’s law

I was recently interviewed by Vinay Patel and Joe Meyer of Louisiana Tech University (shout out to the Bulldogs!) on the topic of open science in I-O psychology. In their article titled “We want open science in I-O!…Do we?” Vinay and Joe gave a broad overview of what open science is and why it is particularly relevant for I-O psychology.

As Vinay and Joe allude to, science seems to be going through some growing pains. For instance, the BBC reported on a survey by Nature, which found that over 70% of scientists have tried and failed to reproduce another scientist’s findings. Slate reported on a series of studies suggesting that when industry scientists tried to replicate effects from the published literature, they quite often failed to do so. One particular project (the Reproducibility Project: Cancer Biology) selected 50 influential studies for replication. Unfortunately, results for only 18 of those studies will ever be revealed. Why? Because a lack of methodological detail made it difficult to reproduce the original findings. Such detail is often lost in the publication process because of concerns that seem less relevant for the 21st century, such as journal space.

In my segment, I focused on how fighting the reproducibility crisis might be like fighting pollution. I think the problem of reproducibility is widespread throughout science…even in I-O psychology and the rest of the organizational sciences.

Keeping the problem of reproducibility in mind, I think we can learn something from viewing a certain cause of irreproducibility - questionable research practices, which are surprisingly common in psychological science - in a way akin to how we view pollution as a cause of climate change. Most of us would agree that pollution is undesirable (e.g., producing acid rain, smog, climate change, etc.). Pollution can be linked to a number of things that we do every day, the most obvious of which is driving a car. Because reducing pollution would require most of us to take actions we find undesirable (e.g., driving our cars less often or more efficiently), collectively compelling people to change their behavior is going to be difficult. In other words, pollution is a collective action problem: it is easy for people to take actions that harm the environment - such as driving cars - because the private benefits of doing so outweigh the private costs. Unfortunately, the social costs do not typically factor into our decision to buy a car.

The same might be said of scholarship writ large and of organizational scholarship in particular: publishing scholars can take steps (e.g., engaging in questionable research practices) that pollute the scientific literature to their own benefit (e.g., tenure, promotion, etc.)…and to the detriment of the field, practice, and society at large (Honig et al. 2018). Scholars should not be viewed as the villains, however. Rather, the system within which they operate deserves criticism. Scholars are merely responding to the incentives furnished by the system: publish in high-impact outlets…or perish (Honig et al. 2018).1

QRPs, which have been defined as “design, analytic, or reporting practices that have been questioned because of the potential for the practice to be employed with the purpose of presenting biased evidence in favor of an assertion” (G. C. Banks, Rogelberg, et al. 2016), are prevalent in applied psychology. Examples include selectively reporting results that are statistically significant, p-hacking, and adding or removing cases and variables to make results favor a particular point of view. Research by Banks and colleagues suggests that upward of 91% of studies contain evidence of questionable research practices. Scholars have even noted that QRPs are required to publish in top-tier outlets (G. C. Banks, O’Boyle, et al. 2016).
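To make the pollution metaphor concrete, here is a toy simulation of one common QRP: measuring several outcome variables and reporting whichever one happens to clear p < .05. This is my own illustrative sketch (the sample sizes and number of outcomes are arbitrary assumptions, not figures from any of the studies cited above), but it shows the mechanism: even when there is truly no effect, a little selective reporting inflates the false positive rate well beyond the nominal 5%.

```python
# Toy illustration (my own sketch, not code from any cited study):
# when there is truly NO effect, testing several outcomes and reporting
# only the "best" p-value inflates the false positive rate well past 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def one_null_study(n_per_group=30, n_outcomes=5):
    """Simulate one null study: two groups, several uncorrelated outcomes."""
    p_values = []
    for _ in range(n_outcomes):
        treatment = rng.normal(0, 1, n_per_group)  # no true effect
        control = rng.normal(0, 1, n_per_group)
        p_values.append(stats.ttest_ind(treatment, control).pvalue)
    return p_values

n_studies = 5_000
honest_hits = 0    # report only the first (pre-specified) outcome
p_hacked_hits = 0  # report whichever outcome happens to be significant

for _ in range(n_studies):
    ps = one_null_study()
    honest_hits += ps[0] < .05
    p_hacked_hits += min(ps) < .05

print(f"False positive rate, pre-specified outcome: {honest_hits / n_studies:.3f}")
print(f"False positive rate, best of 5 outcomes:    {p_hacked_hits / n_studies:.3f}")
# Typical output: ~0.05 vs. ~0.23 -- the same data-generating process,
# but very different "evidence" once selective reporting enters the picture.
```

The exact numbers don’t matter; the point is that the private benefit (a publishable p-value) is cheap to obtain, while the cost (a misleading literature) is spread across everyone downstream.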

To resolve the collective action problem of pollution, a number of policies have been proposed and enacted, such as incentivizing the purchase of more energy-efficient vehicles or the use of cleaner sources of energy (e.g., solar, wind)2 and imposing taxes on carbon. The former rewards environmentally conscious behavior, while the latter punishes undesired behavior by helping consumers see the social cost imposed by their decisions. The latter approach is preferred by economists writ large, who have suggested that any revenues generated by a carbon tax can be refunded via a flat, universal dividend (see Exchange 2019).

What if we viewed QRPs such as p-hacking (Wicherts et al. 2016) and failing to report all conducted analyses (Crede and Harms 2019) as a collective action problem? Indeed, this has been discussed elsewhere. What would be analogous to a subsidy or a carbon tax here? Incentivizing robust (see Grand et al. 2018) or open (see Nosek et al. 2015) research practices to reduce the prevalence of QRPs. Simple ideas include allocating greater weight in promotion, tenure, and the publication process to scholarship that is open and reproducible (Chambers 2017). This can occur within departments, colleges, and journals. One beneficial by-product would almost certainly be greater transfer over to practice (and back from practice): the models that academics test can be ‘test-driven’, so to speak, in practice (and vice versa).

Another idea involves tying grants and publishing to robust practices. For instance, Chambers (2017) noted that journals using a two-stage review process, whereby methods are critiqued prior to data collection, can grant conditional acceptance (i.e., publication will occur if the data are gathered as agreed upon during the review process). An example journal that does this in I-O psychology is the Journal of Business and Psychology. Such a practice should drastically cut down on QRPs in the reporting phase of a project (e.g., dropping results that don’t align with predictions). Additionally, and most importantly, funding (e.g., the Small Grants offered by the Society for Industrial and Organizational Psychology) can prioritize studies that are pre-registered and will be published. This helps to efficiently and effectively allocate resources to studies that are conducted in a truly scientific manner. Conversely, studies that do not conform to robust or open methodology should be penalized heavily in both academic and practitioner circles. Over time, things should improve.
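To see why results-blind, conditional acceptance matters, here is another toy sketch (again my own illustration with made-up parameters, not an analysis from Chambers 2017): when only significant results see print, the published literature overstates a modest true effect; when acceptance is granted before the results are known, it does not.

```python
# Toy sketch (my own illustration): many small studies estimate a modest
# true effect (d = 0.20). Compare a literature that publishes only
# significant results with one that publishes everything, as a
# registered-report model would.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_d, n_per_group, n_studies = 0.20, 40, 2_000

published_if_significant = []  # traditional, results-dependent publication
published_regardless = []      # registered-report-style publication

for _ in range(n_studies):
    treatment = rng.normal(true_d, 1, n_per_group)
    control = rng.normal(0.0, 1, n_per_group)
    d_hat = treatment.mean() - control.mean()  # SDs are 1, so this approximates d
    p = stats.ttest_ind(treatment, control).pvalue
    published_regardless.append(d_hat)
    if p < .05:
        published_if_significant.append(d_hat)

print(f"True effect:                          {true_d:.2f}")
print(f"Mean published effect (all studies):  {np.mean(published_regardless):.2f}")
print(f"Mean published effect (p < .05 only): {np.mean(published_if_significant):.2f}")
# Typical output: ~0.20 for the "publish everything" literature vs. roughly
# 0.5 or more when only significant results see print -- the inflation that
# results-blind review and pre-registration are meant to remove.
```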

What do you think?

References

Banks, George C., Ernest H. O’Boyle, Jeffrey M. Pollack, Charles D. White, John H. Batchelor, Christopher E. Whelpley, Kristie A. Abston, Andrew A. Bennett, and Cheryl L. Adkins. 2016. “Questions About Questionable Research Practices in the Field of Management: A Guest Commentary.” Journal of Management 42 (1): 5–20. https://doi.org/10.1177/0149206315619011.

Banks, George C., Steven G. Rogelberg, Haley M. Woznyj, Ronald S. Landis, and Deborah E. Rupp. 2016. “Editorial: Evidence on Questionable Research Practices: The Good, the Bad, and the Ugly.” Journal of Business and Psychology 31 (3): 323–38. https://doi.org/10.1007/s10869-016-9456-7.

Brembs, Björn. 2018. “Prestigious Science Journals Struggle to Reach Even Average Reliability.” Frontiers in Human Neuroscience 12. https://doi.org/10.3389/fnhum.2018.00037.

Brembs, Björn, Katherine Button, and Marcus Munafò. 2013. “Deep Impact: Unintended Consequences of Journal Rank.” Frontiers in Human Neuroscience 7. https://doi.org/10.3389/fnhum.2013.00291.

Chambers, Chris. 2017. The Seven Deadly Sins of Psychology: A Manifesto for Reforming the Culture of Scientific Practice. Princeton, NJ: Princeton University Press.

Crede, Marcus, and Peter Harms. 2019. “Questionable Research Practices When Using Confirmatory Factor Analysis.” Journal of Managerial Psychology 34 (1): 18–30. https://doi.org/10.1108/JMP-06-2018-0272.

Exchange, Free. 2019. “Brave New Deal.” The Economist, February, 67.

Grand, James A., Steven G. Rogelberg, Tammy D. Allen, Ronald S. Landis, Douglas H. Reynolds, John C. Scott, Scott Tonidandel, and Donald M. Truxillo. 2018. “A Systems-Based Approach to Fostering Robust Science in Industrial-Organizational Psychology.” Industrial and Organizational Psychology 11 (01): 4–42. https://doi.org/10.1017/iop.2017.55.

Honig, Benson, Joseph Lampel, Joel A. C. Baum, Mary Ann Glynn, Runtian Jing, Michael Lounsbury, Elke Schüßler, et al. 2018. “Reflections on Scientific Misconduct in Management: Unfortunate Incidents or a Normative Crisis?” Academy of Management Perspectives 32 (4): 412–42. https://doi.org/10.5465/amp.2015.0167.

Nosek, B. A., G. Alter, G. C. Banks, D. Borsboom, S. D. Bowman, S. J. Breckler, S. Buck, et al. 2015. “Promoting an Open Research Culture.” Science 348 (6242): 1422–5. https://doi.org/10.1126/science.aab2374.

Wicherts, Jelte M., Coosje L. S. Veldkamp, Hilde E. M. Augusteijn, Marjan Bakker, Robbie C. M. van Aert, and Marcel A. L. M. van Assen. 2016. “Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid P-Hacking.” Frontiers in Psychology 7 (November). https://doi.org/10.3389/fpsyg.2016.01832.


  1. Ironically, and in keeping with the quote that opens this post, impact factors do not seem to correlate with high-quality contributions. They do, however, predict retractions (Brembs, Button, and Munafò 2013) and are negatively correlated with replicability (Brembs 2018).

  2. On a personal note, I took advantage of this in buying myself a new car (a Honda Clarity).

Christopher Castille
Assistant Professor of Management