guest post: Reproducibility Project: Ecology and Evolutionary Biology

Written by: Hannah Fraser

The problem

As you probably already know, researchers in some fields are finding that it’s often not possible to reproduce others’ findings. Fields like psychology and cancer biology have undertaken large-scale coordinated projects aimed at determining how reproducible their research is. There has been no such attempt in ecology and evolutionary biology.

A starting point

Earlier this year Bruna, Chazdon, Errington and Nosek wrote an article arguing that we need to start this process by reproducing foundational studies. This echoes the early work of the psychology and cancer biology reproducibility projects, which attempted to reproduce those fields’ most influential findings. Bruna et al.’s focus was on tropical biology, but I say why not the whole of ecology and evolutionary biology!

There are many obstacles to this process, most notably obtaining funding and buy-in from researchers, but it is hard to obtain either of these without a clear plan of attack. First off, we need to decide which ‘influential’ findings we will try to replicate and how we are going to replicate them.

Deciding on what qualifies as an influential finding is tricky and can be controversial. The good news is that an article came out this year that has the potential to (either directly or indirectly) answer this question for us. Courchamp and Bradshaw’s (2017) “100 articles every ecologist should read” provides a neat list of candidate influential articles/findings. There are some issues with biases in the list which may make it unsuitable for our purposes, but at least one list is currently being compiled with the express purpose of redressing these biases. Once this is released it should be easy to use some combination of the two lists to identify – and try to replicate – influential findings.

What is unique about ecology and evolutionary biology?

In psychology and cancer biology, where reproducibility has been scrutinised, research is primarily conducted indoors and is based on experiments. Work in ecology and evolutionary biology is different in two ways: 1) it is often conducted outside, and 2) a substantial portion is observational.

Ecology and evolutionary biology are outdoor activities

Conducting research outdoors means that results are influenced by environmental conditions. Environmental conditions fluctuate through time, influencing the likelihood of reproducing a finding in different years. Further, climate change is causing directional changes in environmental conditions, which means you might not expect to reproduce a finding from 20 years ago this year. I’ve talked to a lot of ecologists about this troublesome variation and have been really interested to find two competing interpretations:

1) trying to reproduce findings is futile because you would never know whether any differences reflected the reliability of the original result or merely changes in environmental conditions

2) trying to reproduce findings is vital because there is so much environmental variation that findings might not generalise beyond the exact instance in space and time in which the data were collected – and if this is true the findings are not very useful.

Ecology and evolutionary biology use observation

Although some studies in ecology and evolutionary biology involve experimentation, many are based on observation. This adds even more variation and can limit and bias how sites/species are sampled. For example, in a study on the impacts of fire, ‘burnt’ sites are likely to be clustered together in space and to share characteristics that made them more susceptible to burning than the ‘unburnt’ sites, biasing the sample of sites. Also, the intensity of the fire may have differed even within a single fire, introducing uncontrolled variation. In some ways, the reliance on observational data is one of the greatest limitations of ecology and evolutionary biology. However, I think it is actually a huge asset because it could make it more feasible to attempt reproducing findings.

Previous reproducibility projects in experimental fields have either focussed on a) collecting and analysing the data exactly according to the methods of the original study, or b) using the data collected for the original analysis and re-running the original analysis. While ‘b’ is quite possible in ecology and evolutionary biology, this kind of test can only tell you whether the analyses are reproducible… not the pattern itself. Collecting the new data required for ‘a’ is expensive and labour intensive. Given limited funding and publishing opportunities for these ‘less novel’ studies, it seems unlikely that many researchers will be willing or able to collect new data to test whether a finding can be reproduced. In an experimental context, examining reproducibility is tied to these two options. However, in observational studies there is no need to reproduce an intervention, so only the measurements and the context of the study need to be replicated. Therefore, it should be possible to use data collected for other studies to evaluate how reproducible a particular finding is.

Even better, many measurements are standard and have already been collected in similar contexts by different researchers. For example, when writing the lit review for my PhD I collated 7 Australian studies that looked at the relationship between the number of woodland birds and tree cover, collected bird data using 2 ha, 20-minute bird counts, and recorded the size of the patches of vegetation. It should be possible to use the data from any one of these studies to test whether the findings of another study are reproducible.

Matching the context of the study is a bit trickier. Different inferences can be made from attempts to reproduce findings in studies with closely matching contexts than from those conducted in distinctly different contexts. For example, you might interpret a failure to reproduce a finding differently if it occurred in a very similar context (e.g. the same species in the same geographic and climatic region) than if the context was more different (e.g. a sister species in a different country with the same climatic conditions). To test the reliability of a finding you should match the context closely; to test the generalisability of a finding you should match the context less closely. However, determining what matches a study’s context is difficult. Do you try to match the conditions where the data were collected or the conditions that the article specifies it should generalise to? My feeling is that trying to replicate the latter is more relevant but potentially problematic.

In a perfect world, all articles would provide a considered statement about which conditions they would expect their results to generalise to (Simons et al 2017). Unfortunately, many articles overgeneralise to increase their probability of publication, which may mean that findings appear less reproducible than they would if the authors had been more realistic about their generalisability.

Where to from here?

This brings me to my grand plan!

I intend to wait a few months to allow the competing list (or possibly lists) of influential ecological articles to be completed and published.

I’ll augment these lists with information on the studies’ data requirements and (where possible) statements from the articles about the generalisability of their findings. I’ll share this list with you all via a blog (and a page that I will eventually create on the Open Science Framework).

Once that’s done I will call for people to check through their datasets to see whether they have any data that could be used to test whether the findings of these articles can be reproduced. I’m hoping that we can all work together to arrange replications of these findings (regardless of whether you have data and/or the time and inclination to re-analyse things).

My dream is to have the reproducibility of each finding/article tested across a range of datasets so that we can 1) calculate the overall reproducibility of these influential findings, 2) combine them using meta-analytic techniques to understand the overall effect, and 3) try to understand why they may or may not have been reproduced when using different datasets. Anyway, I’m very excited about this! Watch this space for further updates and feel free to contact me directly if you have suggestions or would like to be involved. My email is hannahsfraser@gmail.com.
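For the statistically minded, here is a minimal sketch of what step 2 might look like: a basic DerSimonian-Laird random-effects combination of the same effect estimated from several datasets. The numbers (and the Python implementation) are purely illustrative and not taken from any real replication.

```python
import numpy as np

# Hypothetical slope estimates for the same relationship (e.g. bird richness
# vs. tree cover) from five independent datasets, with their standard errors.
effects = np.array([0.42, 0.15, 0.33, -0.05, 0.28])
se = np.array([0.12, 0.20, 0.10, 0.18, 0.15])

w = 1 / se**2                                    # inverse-variance weights
fixed = np.sum(w * effects) / np.sum(w)          # fixed-effect pooled estimate

# DerSimonian-Laird estimate of the between-dataset variance (tau^2)
k = len(effects)
q = np.sum(w * (effects - fixed) ** 2)
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

# Random-effects pooled estimate and its standard error
w_re = 1 / (se**2 + tau2)
pooled = np.sum(w_re * effects) / np.sum(w_re)
pooled_se = np.sqrt(1 / np.sum(w_re))

print(f"tau^2 = {tau2:.3f}")
print(f"pooled effect = {pooled:.3f} (95% CI +/- {1.96 * pooled_se:.3f})")
```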

Why ‘MORE’ published research findings are false

In a classic article titled “Why most published research findings are false”, John Ioannidis explains 5 main reasons for just that. These reasons are largely related to large ‘false positive reporting probabilities’ (FPRP) in most studies and to ‘researcher degrees of freedom’, which facilitate practices such as ‘p-hacking’. If you aren’t familiar with these terms (FPRP, researcher degrees of freedom, and p-hacking), please read Tim Parker and his colleagues’ paper.

I would like to add one more important reason why research findings are often wrong (thus, the title of this blog post). Many researchers simply get their stats wrong. This point has been talked about less in the current discussion of the ‘reproducibility crisis’. There are many ways to get stats wrong, but I will discuss a few examples here.

In our lab’s recent article, we explore one way that biologists can produce unreliable results, especially when statistically accounting for body size. Problems arise when a researcher divides a non-size trait measurement by size (e.g., food intake by weight) and uses this derived variable in a statistical model (a worryingly common practice!). Traits are usually allometrically related to each other, meaning, for example, that food intake will not increase linearly with weight. In fact, food intake increases more slowly than weight. The consequence of using the derived variable is that we may find statistically significant results where no actual effect exists (see Fig 1). An easy solution is to log-transform the variables and fit the trait of interest as the response with size as a predictor (allometrically related traits are linear on a log-log scale).
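To make this concrete, here is a small simulation sketch (Python, with made-up numbers; it is not the analysis from our article). The two groups differ only in body weight, and intake follows the same allometric rule (intake scales with weight^0.75) in both, yet analysing the ratio intake/weight manufactures a ‘significant’ group difference, while the log-log model with weight as a predictor estimates the group effect near zero.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 50

# Two groups that differ in body weight but NOT in relative food intake:
# intake follows the same allometric rule (intake ~ weight^0.75) in both.
weight = np.concatenate([rng.normal(20, 2, n),    # group A, lighter animals
                         rng.normal(30, 2, n)])   # group B, heavier animals
group = np.repeat(["A", "B"], n)
intake = 2.0 * weight**0.75 * rng.lognormal(0, 0.1, 2 * n)

df = pd.DataFrame({"group": group, "weight": weight, "intake": intake,
                   "ratio": intake / weight})

# Flawed approach: analyse intake divided by weight. Because intake rises more
# slowly than weight, the heavier group has a smaller ratio, and a spurious
# "group effect" appears even though the underlying rule is identical.
ratio_fit = smf.ols("ratio ~ group", data=df).fit()
print("ratio model, p for group:     ", round(ratio_fit.pvalues["group[T.B]"], 4))

# Better: log-transform and fit size as a predictor; the group effect is now
# estimated near zero, as it should be for these simulated data.
allo_fit = smf.ols("np.log(intake) ~ np.log(weight) + group", data=df).fit()
print("allometric model, p for group:", round(allo_fit.pvalues["group[T.B]"], 4))
```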

But surprisingly, even this solution can lead to wrong conclusions. We discussed a situation where an experimental treatment affects both a trait of interest (food intake) and size (weight). In such a situation, size is known as an intermediate outcome, and fitting size as a predictor could result in wrongly estimating the experimental effect. I have made similar mistakes because it’s difficult to know when and how to control for size. It depends on both your question and the nature of the relationship between the trait of interest and size. For example, if the experiment affected both body size and appetite, and body size in turn influences appetite, then you do not want to control for body size. This is because the effect of body size on appetite is itself part of the experimental effect (complicated!).
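And a rough sketch of the intermediate-outcome trap (again with hypothetical numbers, not taken from our paper): the treatment raises intake partly by raising weight, so ‘adjusting’ for weight strips out much of the real experimental effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 200
treatment = rng.integers(0, 2, n)          # 0 = control, 1 = treated

# The treatment raises body weight, weight in turn raises intake, and the
# treatment also has a small direct effect on intake.
weight = 20 + 3.0 * treatment + rng.normal(0, 1, n)
intake = 5 + 0.8 * weight + 1.0 * treatment + rng.normal(0, 1, n)
df = pd.DataFrame({"treatment": treatment, "weight": weight, "intake": intake})

# Total experimental effect on intake (direct + via weight): about 1 + 0.8 * 3 = 3.4
total = smf.ols("intake ~ treatment", data=df).fit()

# "Controlling" for the intermediate outcome removes the part of the effect
# that flows through weight, leaving only the direct effect of about 1.0 --
# a serious underestimate if the question is about the whole treatment effect.
adjusted = smf.ols("intake ~ treatment + weight", data=df).fit()

print("total effect:          ", round(total.params["treatment"], 2))
print("weight-adjusted effect:", round(adjusted.params["treatment"], 2))
```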

Although I said ‘getting stats wrong’ is less talked about, there are exceptions. For instance, the pitfalls of pseudoreplication (statistical non-independence) have been known for many years, but researchers continue to overlook this problem. Recently my good friend Wolfgang Forstmeier and his colleagues devoted part of a paper on avoiding false positives (Type I errors) to explaining the importance of accounting for pseudoreplication in statistical models. If you work in a probabilistic discipline, this article is a must-read! As you will find out, not all pseudoreplication is obvious. Modelling pseudoreplication properly can reduce Type I error rates dramatically (BTW, we recently wrote about statistical non-independence and the importance of sensitivity analysis).
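Here is a toy simulation of that point (my own illustration, not code from Forstmeier and colleagues): measurements are clustered within sites, the predictor varies only between sites, and there is no true effect. Treating every measurement as independent gives a false-positive rate far above the nominal 5%, whereas analysing site means (or, more generally, fitting a mixed model with a random effect for site) keeps it close to 5%.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_sites, n_per_site, n_sims = 10, 20, 1000
naive_hits = means_hits = 0

for _ in range(n_sims):
    x_site = rng.normal(0, 1, n_sites)        # predictor measured once per site
    site_effect = rng.normal(0, 1, n_sites)   # shared site-level noise
    x = np.repeat(x_site, n_per_site)
    y = np.repeat(site_effect, n_per_site) + rng.normal(0, 1, n_sites * n_per_site)
    # note: x has NO true effect on y in this simulation

    # Naive analysis: pretend all 200 measurements are independent
    naive_p = stats.linregress(x, y).pvalue
    # Simple remedy: analyse site means (a mixed model with a random effect
    # for site is the more general fix)
    site_means = y.reshape(n_sites, n_per_site).mean(axis=1)
    means_p = stats.linregress(x_site, site_means).pvalue

    naive_hits += naive_p < 0.05
    means_hits += means_p < 0.05

print("false-positive rate, naive analysis:", naive_hits / n_sims)
print("false-positive rate, site means:    ", means_hits / n_sims)
```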

What can we do to prevent these stats mistakes? Doing stats right is more difficult than I used to think. When I was a PhD student, I thought I was good at statistical modelling, but I made many statistical mistakes, including the ones mentioned above. Statistical modelling is difficult because we need to understand both the statistics and the biological properties of the system under investigation. A statistician can help with the former but not the latter. If statisticians can’t recognize all our potential mistakes, I think this means that we as biologists should become better statisticians.

Luckily, we have some amazing resources. I would recommend all graduate students read Gelman and Hill’s book. Also, you will learn a lot from Andrew Gelman’s blog, where he often talks about common statistical mistakes and scientific reproducibility. Although no match for Gelman and Hill, I am doing my part to educate biology students about stats by writing a new kind of stats book, one based on conversations, i.e. a play!

I would like to thank Tim Parker for detailed comments on an earlier version of this blog.

Replication: step 1 in PhD research

Here are a few statements that won’t surprise anyone who knows me. I think replication has the potential to be really useful. I think we don’t do nearly enough of it and I think our understanding of the world suffers from this rarity. In this post I try to make the case for the utility of replication based on an anecdote from my own scientific past.

A couple of years ago Shinichi Nakagawa and I wrote a short opinion piece about replication in ecology and evolutionary biology. We talked about why we think replication is important and how we can interpret results from different sorts of replications, and we also discussed a few ideas for how replication might become more common. One of those ideas was for supervisors to expect graduate students to replicate part of the previously published work that inspired their project. When Shinichi and I were writing that piece, I didn’t take the time to investigate the extent to which this already happens, or even to think of examples of it happening.

Then out of the blue the other day, it occurred to me that I’d seen this happen up-close with one of my own findings. First some background. Bear with me and I’ll try to be brief. When I was a naïve master’s student (with a hands-off adviser who had at least one foot in retirement), I decided to test Tom Martin’s ideas about nest predators shaping bird species co-existence, but in a new study system: the shrub nesting bird community at Konza Prairie in Kansas (by the way, this anecdote is NOT about my choice to do a conceptual replication for my MSc work). Anyway, I was gathering all the data myself, trying to find as many nests as I could from multiple species, monitoring those nests to determine predation outcomes, and measuring vegetation around each nest. I bit off more than I could chew, but I wanted to be done in one field season. I was in a hurry for some reason – not a recipe for sufficient statistical power. Instead, it was a recipe for an ambiguous test of the hypothesis since I didn’t find many nests for most bird species. I did, however, find a decent number of nests of one species: Bell’s vireo. Among the more than 60 vireo nests I found, I noticed something striking – brood parasitic cowbirds laid eggs in many of them, and if a cowbird egg hatched in a vireo nest, all vireo chicks were outcompeted and died. What was really interesting is that vireos abandoned many parasitized nests before cowbird eggs hatched and these vireos appeared to re-nest up to seven times in a season. I first thought this was evidence of an adaptation in Bell’s vireos to avoid parasitism by cowbirds via re-nesting (that’s another story), but I ended up publishing a paper that pointed out that the number of vireo eggs in the nest (rather than the number of cowbird eggs) was the best predictor of vireo nest abandonment. Thus it seemed like a response to egg loss (cowbirds remove host eggs) by Bell’s vireos might explain their nest abandonment and therefore how they could persist despite high brood parasitism. Now on to the heart of the story.

Several years later, after doing a PhD elsewhere, I found myself back in Kansas. A new K-State PhD student (Karl Kosciuch – who was one of Brett Sandercock’s first students) arrived and was excited about the Bell’s vireo–cowbird results I had reported. Looking back on it, this is a textbook case of how exploratory work and replication should go together. I found a result I wasn’t looking for. Someone else came along and thought it was interesting and wanted to build on it but decided to replicate it first. Karl did several things for his PhD, but one of them was simply to replicate my observational data set with an even bigger sample. He found the same pattern, thus dramatically strengthening our understanding of this system, and strongly justifying follow-up experiments. I actually joined Karl for one of these experiments, and it was very satisfying behavioral ecology. It turned out that it really is loss of their own eggs that induces Bell’s vireos to abandon and that cowbird eggs do not induce nest abandonment on their own.

This study had a happy ending for all involved, but what if Karl’s replication of my correlative study had failed to support my result? Well, for one, it hopefully would have saved Karl the trouble of pursuing an experiment based on a pattern that wasn’t robust. Such an experiment would presumably have failed to produce a compelling result, and then would have left Karl wondering why. Were the experimental manipulations flawed? Was his sample size too small? Was there some unknown environmental moderator variable? Further, although the population of Bell’s vireo we studied is not endangered, the sub-species in Southern California is, and one of the primary threats to that endangered population has been cowbird parasitism. My result had been discussed as evidence that Bell’s vireo populations might be able to evolve nest abandonment as an adaptive response to cowbird parasitism. If no replication had been conducted and only an unconvincing experiment had been produced, this flawed hypothesis might have persisted, with harmful consequences for the management of Bell’s vireo in California.

I think there’s a clear take-home message here. Students benefit from replicating previously published studies that serve as the basis for their thesis research. Of course it’s not just students who can benefit here – anyone who replicates foundational work will reduce their risk of building on an unreliable foundation. And what’s more, we all benefit when we can better distinguish reliable and repeatable results from those which are not repeatable.

I’m curious to hear about other replications of previously published results that were conducted as part of the process of building on those previously published results.

Is overstatement of generality an Open Science issue?

I teach an undergraduate class in ecology and every week or two I have the students in that class read a paper from the primary literature. I want them to learn to extract important information and to critically evaluate that information. This involves distinguishing evidence from inference and identifying assumptions that link the two. I’m just scratching the surface of this process here, but the detail I want to emphasize in this post is that I ask the students to describe the scope of the inference. What was the sampled population? What conclusions are reasonable based on this sampling design? This may seem straightforward, but students find it difficult, at least in part because the authors of the papers rarely come right out and acknowledge limitations on the scope of their inference. Authors expend considerable ink arguing that their findings have broad implications, but in so doing they often cross the line between inference and hypothesis with nary a word. This doesn’t just make life difficult for undergraduates. If we’re honest with ourselves, we should admit that it’s sloppy writing, and by extension, sloppy science. That said, I’m certainly guilty of this sloppiness, and part of the reason is that I face incentives to promote the relevance of my work. We’re in the business of selling our papers (for impact factors, for grant money, etc.). Is this sloppiness a trivial outcome or a real problem of the business of selling papers? I think it may lean towards the latter. Having to train students to filter out the hype is a bad sign. And more to the point of this post, it turns out that our failure to constrain inferences may hinder interpretation of evidence that accumulates across studies.

For years my work to encourage recognition of constraints on inference has been limited to my interaction with students in my class. That changed recently when I heard about a movement to promote the inclusion of ‘Constraints on Generality’ (COG) statements in research papers. My colleagues Fiona Fidler and Hannah Fraser made the jaunt from Melbourne over to the US to attend ESA in August (to join me in promoting and exploring replication in ecology), but they first flew to Virginia to attend the 2nd annual SIPS (Society for the Improvement of Psychological Science) conference where they heard about COG statements (there’s now a published paper on the topic by Daniel Simons, Yuichi Shoda, and Stephen Lindsay). In psychology there’s a lot of reflection and deliberation regarding reducing bias and improving empirical progress, and the SIPS conference is a great place to feel that energy and to learn about new ideas. The idea for a paper on COG statements apparently emerged from the first SIPS meeting, and the COG statement pre-print got a lot of attention at the 2nd meeting this year. It’s easy to see the appeal of a COG statement from the standpoint of clarity. But there’s more than just clarity. One of the justifications for COG statements comes from a desire to more readily interpret replication studies. A perennial problem with replications is that if the new study appears to contradict the earlier study, the authors of the earlier study can point to the differences between the two studies and argue that the second study was not a valid test of the conclusions of the original. This may seem true. After all, whenever conditions differ between two studies (and conditions ALWAYS differ to some extent), we can’t eliminate the possibility that differences between the two studies’ results stem from the differences in conditions. However, we’re typically going to be interested in a result only if it generalizes beyond the narrow set of conditions found in a single study. In a COG statement, the authors state the set of conditions under which they expect their finding to apply. The COG statement then sets a target for replication. With this target set, we can ask: What replications are needed to assess the validity of the inference within the stated COG? What work would be needed to expand the boundaries of the stated COG? As evidence accumulates, we can then start to restrict or expand the originally stated generality.

In a COG statement, authors will face conflicting incentives. Authors will still want to sell the generality of their work, but if they overstate the generality of their work, they increase the chance of being contradicted by later replication. That said, it’s important to note that a COG doesn’t simply reflect the whims of the authors. Authors need to justify their COG with explicit reference to their sampling design and to existing theoretical and experimental understanding. A COG statement should be plausible to experts in the field.

I started this post by discussing the scope of inference that’s reasonable from a given study, but although this is clearly related to the constraints on generality, a COG statement could be broader than a statement about the scope of inference. Certainly as presented by Simons et al., COG statements will typically expand the scope of generality beyond the sampled population. I haven’t yet resolved my thinking on this difference, but right now I’m leaning towards the notion that we should include both a scope of inference statement and a constraints on generality statement in our papers, and that they should be explicitly linked. We could state the scope of our inference as imposed by our study design (locations, study taxa, conditions, etc.), but then we could argue for a broader COG based on additional lines of evidence. These additional lines of evidence might be effects reported by other studies of the same topic, or might be qualitatively different forms of evidence, for instance based on our knowledge of the biological mechanisms involved. Regardless, more explicit acknowledgements of the constraints on our inferences would clearly make our publications more scientific. I’d love to have some conversations on this topic. Please share comments below.

Before signing off, I want to briefly mention practical issues related to the adoption of COG (and/or scope of inference) statements. Because scientists face an incentive to generalize, it seems that a force other than just good intentions of scientists may be required for this practice to spread. This force could be requirements by journals. However, many journals also face incentives to promote over-generalization from study results. That said, there are far fewer journals than there are scientists, so it might be within the realm of possibility to convince editors, in the name of scientific quality, to add requirements for COG statements. I can think of roles that funders could play here too, but these would be less direct and maybe less effective than journal requirements. I’m curious what other ideas folks have for promoting COG / scope of inference statements. Please share your thoughts!

Ecologists and evolutionary biologists can and should pre-register their research

I wrote a draft of this post a few weeks ago, and now seems like a good time for it to see the light of day given the great new pre-print just posted on OSF Preprints by Brian Nosek, David Mellor, and co-authors. They describe the utility of pre-registration across a variety of circumstances. I do something similar here, though I focus on ecology and evolutionary biology and I don’t try to be as thorough as Nosek et al. For greater depth of analysis, check out their paper. On to my post…

Transparency initiatives are gaining traction in ecology and evolutionary biology. Some of these initiatives have become familiar – data archiving is quickly becoming business as usual – though others are still rare and strange to most of us. Pre-registration is squarely in this second category. Although I know a number of ecologists / evolutionary biologists who are starting to pre-register their work (and I’ve participated in a few pre-registrations myself), I would guess that most eco/evo folks don’t even know what pre-registration is, and many who do know probably wonder if it would even be worth doing. My goals here are to explain what pre-registration is, why it’s useful, and why most ecologists and evolutionary biologists could be using it on a regular basis.


-What is pre-registration?

At its most thorough, a pre-registration involves archiving a hypothesis and a detailed study design, including a data analysis plan, prior to gathering data. However, as you’ll read below, the data analysis plan is typically the core element of a useful pre-registration, and a pre-registration can happen after data gathering as long as the analysis plan is declared without knowledge of the outcome of the analysis or its alternatives. Pre-registrations are archived in a public registry (the Open Science Framework, OSF, for example) so that they can later be compared to the analysis that is ultimately conducted. Depending on the pre-registration archive, the pre-registration may be embargoed to maintain confidentiality of a research plan until it is completed. Once a pre-registration is filed, it cannot be edited, though it could potentially be updated with further pre-registrations. When a pre-registered study is published, the paper should cite (or better yet, link to) the pre-registration to show the extent to which the plan was followed.


-Why is pre-registration a useful component of transparency?

People (including all of us) are worryingly good at filtering available evidence so that we end up seeing the world we expect to see rather than the world as it actually is. In other circumstances, after noticing a pattern, we readily convince ourselves that we predicted (or would have predicted) that particular outcome. All the while, we fool ourselves into believing we’re being unbiased. Science is all about avoiding these biases and taking honest stock of available evidence, but in the absence of adequate safeguards, there is good evidence that scientists can fall prey to cognitive biases (for a striking example, see van Wilgenburg and Elgar 2013). Pre-registration is one of a number of tools that helps scientists take a clear-eyed view of evidence, and it helps those of us reading scientific papers to identify evidence that is less likely to have been run through a biased filter. When scientists fiddle with analyses and can see how that fiddling impacts results, there is a great temptation to choose the analyses that produce the most desirable outcome. If this biased subset of results gets published and other results go unreported, we get a biased understanding of the world. In my ignorant past I’ve conducted and presented analyses this way, and nearly every other ecologist and evolutionary biologist I’ve talked to about this admits to doing this sort of thing at least once. For this and other reasons (Fidler et al. 2016, Parker et al. 2016), I think this problem is common enough to reduce the average reliability of the published literature. Pre-registration could improve the average reliability of this literature and help us identify papers that are less likely to be biased.


-Why is pre-registration a viable tool for ecologists and evolutionary biologists?

I’ve written this section as a series of hypothetical concerns or questions from ecologists or evolutionary biologists, followed by responses to those concerns / questions.


“I work in the field and I have to refine my methods, or even my questions, over weeks or months through trial and error”

You can pre-register after your methods are finalized. When starting work in a new system or with a new method, you generally won’t be ready to complete a particularly useful pre-registration until you’ve gotten your hands dirty. You’ll need to figure out what works and what doesn’t work through trial and error. Unless you have excellent guidance from experts in the system / method, you probably want to hold off finalizing your pre-registration until you’ve been in the field and landed on a method that works. It would still be good to think long and hard about the project before heading to the field. Develop as detailed a methodological plan as is reasonable (in many cases, you’ll have done this already at the proposal stage) and talk to a statistician to develop a tentative analysis plan. Once you’ve begun to implement a set of methods you feel good about, then complete your pre-registration.


“What if I have to change my methods part way through the project?”

Of course, even if you go through the trouble of field testing your methods before finalizing your pre-registration, things still might change. You might come back a second year to find that conditions demand a revised protocol. If you have to scrap your first year’s data because you can’t continue, then you probably want to create an entirely new pre-registration based on your new methods. On the other hand, if your data from last year are still usable and you’ve just had to make modest changes, then you have some choices. You could just wait until you write the manuscript to explain why your data gathering methods changed, or you could file a new pre-registration that acknowledges (and links to) the earlier protocol but also introduces the new methods. The old protocol won’t disappear, but the evolution of your project is now transparent.


“I work with existing data (e.g., from long-term projects, from existing citizen science projects, from my own metaphorical file drawer, for meta-analysis, etc.), so I can’t pre-register prior to data gathering.”

Pre-registration can be useful at any point before you start to examine your data for biologically relevant patterns, either through examining data plots or through initial statistical analyses. If you haven’t peeked at the data yet, go for it. Pre-register a detailed analysis plan.


“What if I see patterns in my data that I want to follow-up on with analyses that I didn’t pre-register?”

Not a problem. Just distinguish your post hoc analyses from your preregistered analyses in your paper. Ideally you’d also report all your post hoc exploration and declare that you have done so. If you have too many to report in your paper, present them in supplementary material or even in a data repository.


“I focus on discovery. I don’t typically have a priori hypotheses when I start a project.”

Pre-registration can still be for you. The primary purpose of pre-registration is to promote transparency. Exploratory work is vital. We just want to know that we’re not being shown a biased subset of your exploratory outcomes. Thus if you pre-register your study and analysis plan and then present results from the full set of analyses laid out in that pre-registration, we know we’re not getting a biased subset.


“I don’t develop an analysis plan until I have my data so that I can see how they are distributed and how viable different modeling alternatives are with the real data”

There are several options here. You could develop a decision tree that anticipates modeling decisions you will need to make and lays out criteria for making those decisions. Other options include working with some form of your actual data in a trial phase. For instance, you could sacrifice a portion of your data for model exploration, select a set of models to test, pre-register those, and then assess them with your remaining (unexplored) data. Alternatively you could scramble your full data set, or add some sort of noise, refine your analysis plan with these ‘fake’ data, then pre-register and re-run the analysis with the real data.
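As a minimal illustration of the scrambling option (the file and column names here are hypothetical), you could shuffle the response column, explore models on the shuffled copy only, and reserve the untouched data for the pre-registered analysis:

```python
import pandas as pd

# Hypothetical file and column names: 'abundance' is the response we plan to
# model; the remaining columns are whatever predictors the study uses.
real = pd.read_csv("field_data.csv")

# Shuffle the response so that any relationship with the predictors is
# destroyed, while distributions, sample sizes and missingness stay realistic.
scrambled = real.copy()
scrambled["abundance"] = (
    scrambled["abundance"].sample(frac=1, random_state=1).to_numpy()
)
scrambled.to_csv("field_data_scrambled.csv", index=False)

# Refine the analysis plan on 'field_data_scrambled.csv' only, write and file
# the pre-registration, then run the chosen model once on 'field_data.csv'.
```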


“I don’t want to develop a detailed analysis plan. There are too many unforeseen circumstances and I’m bound to ultimately deviate from my plan”

I have two responses to this concern. The first is to see my previous reply – there are ways to pre-register after you have your data and have confirmed that an analysis is likely to be appropriate for your data. My second point is that, just as field methods change in response to circumstances, so do statistical methods. A pre-registration doesn’t prevent us from changing an analysis; it just helps us be transparent about these changes. Among other things, this transparency probably helps us make sure that when we do change our plan, we’re doing so for a good reason.


“If I can just pre-register an analysis plan after collecting my data, why should I bother to pre-register the other portions of my study methods?”

Although I think it’s much better to pre-register an analysis plan than to not pre-register at all, pre-registering the whole study design is helpful for a variety of reasons. For one, pre-registering prior to completion of data gathering (or better yet, before data gathering) helps make it clear that your pre-registered analysis plan could not have been influenced by any knowledge (conscious or unconscious) about patterns in the data. Early pre-registration also facilitates transparency about the project as a whole. Later, when you publish the results, other researchers can understand the scope of your work and can see (hopefully) that you’re not just publishing a subset (potentially a biased subset) of the project. And if you never publish your work, then your pre-registration is evidence that someone at least considered doing this project at some point, and this could be useful information to other researchers down the line. A well-executed pre-registration might also help set expectations for the role of individual collaborators.


“Pre-registration is just extra work”

In most cases, pre-registration should not dramatically change workload. If you’ve written a grant proposal, much of the work of pre-registration will already be done. If your grant proposal doesn’t include a detailed analysis plan, presumably the manuscript you write to report your results will include a detailed explanation of your analytic methods, and so a pre-registration just shifts the timing of this writing. Likewise, if this isn’t grant-funded research, some other parts of your methods, and presumably parts of your introduction, will be ready and waiting in draft form when you complete your pre-registered study and go to write it up. To the extent that you end up writing more about your analyses in a pre-registration than you would have in a paper that reported only a subset of your analyses, this is the price of doing transparent and reliable science. You should have been reporting all this information somewhere anyway.


“If I pre-register, I might be scooped”

You can embargo your pre-registration so that it’s private until you choose to share it. Pre-registrations on the site AsPredicted can remain private indefinitely. On the OSF, embargos are limited to four years.


“I’m a student just starting a project and so I don’t know enough about my system to pre-register”

If you’re mentored by someone familiar with this system, then you’ll want to work closely with your mentor to develop your pre-registration. If this isn’t possible, read through my suggestions above. There are various paths forward, from waiting until you’ve worked out the kinks in your methods to various ways of pre-registering after you have data. Think carefully and identify the path that’s best for you.


If you have other concerns or questions about how you could apply pre-registration to your work, I’d love to hear about them. Let’s have a discussion.

Not all work needs to be pre-registered, but most work could be pre-registered. And this is important because pre-registration will help ecologists and evolutionary biologists improve transparency and thus, I expect, reduce bias in a wide array of circumstances.


Ecological Society of America Ignite Session on Replication in Ecology

by Hannah Fraser

Fiona Fidler and Tim Parker organized an Ignite session on Replication in Ecology at the Ecological Society of America Annual Meeting 2017 in Portland, USA, a few weeks ago. Ignite sessions start with a series of 5-minute talks on a similar topic, which are followed by a panel discussion. At Fiona and Tim’s session more than 50 attendees listened to talks by Fiona Fidler, Clint Kelly, Kim LaPierre, David Mellor, Emery Boose, and Bill Michener.

Tim introduced the session by describing how it had arisen from discussions with journal editors. Tim and his colleagues have recently been successful in encouraging editors of many journals in ecology and evolutionary biology to support the Transparency and Openness Promotion guidelines, but one of these guidelines – the one which encourages the publication of articles that replicate previously published studies – has proven unpalatable to a number of journal editors. The aim of the Ignite session was to discuss the purpose and value of replication studies in ecology, to raise awareness of the thoughts shared by members of Transparency in Ecology and Evolution, and to take initial steps towards developing a consensus regarding the role of replication in ecology.

Fiona Fidler

Fiona spoke first, describing the tension between the importance of replication and the perception that it is boring and un-novel. Replication is sometimes viewed as the cornerstone of science (following Popper): without replicating studies it is impossible to either falsify or verify findings. In contrast, replication attempts are deemed boring if they find the same thing as the original study (“someone else has already shown this”) and meaningless if they find something different (“there could be millions of reasons for getting a different result”). However, there is actually a range of different types of replication studies, which differ in their usefulness for falsification, their novelty, and the amount of resources required to conduct them. Fiona broke this down into two scales: 1) from using the exact same data collection procedures to using completely different ones, and 2) from using the exact same analysis to using a completely different analysis. A study that uses completely different data collection and analysis methods to investigate the same question is often termed a conceptual replication. Conceptual replication is reasonably common in ecology: people investigate whether an effect holds in a new context. However, there are very few studies that attempt to replicate studies more directly (i.e. by using the exact same data collection procedures and data analyses). Moreover, when a conceptual replication finds a different result, this doesn’t lead to falsification or even, in many cases, scepticism about the findings of the original study, because there are often so many uncontrollable differences between the two studies and any of these could have caused the different results. Fiona suggested that one way to enhance the relevance of all replications, but particularly these conceptual replications, would be to adopt a proposal from psychology and include a statement in every article about the constraints on generality. If all articles described the circumstances in which the authors would and would not expect to find the same patterns, it would become possible to use conceptual replications to falsify studies, or to further delimit their relevance.

Clint Kelly

Previous work has shown that about 1% of studies in psychology are replication studies. Clint described some of his recent work aimed at determining how many replication studies there have been in ecology. He text-mined open access journals on PubMed for papers with the word ‘replic*’ anywhere in their full text. He found that only a handful of studies attempted to replicate a previous study’s findings, and of these, only 50% claimed to have found the same result as the original study. These analyses suggest that there are many fewer replication studies occurring in ecology than in psychology. However, this value only accounts for direct replications that state that they are replications of previous work. Conceptual replications are not included because it is not common practice to mention that they are replications, possibly because it makes the paper seem less novel. Nonetheless, this valuable work suggests that the rate of direct replication in ecology is abysmally low.
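As a rough illustration of this kind of search (not Clint’s actual code, and assuming the full texts have already been downloaded as local plain-text files), flagging papers that mention ‘replic*’ could look something like this:

```python
import re
from pathlib import Path

# Assumes full texts have already been downloaded into a local folder of
# plain-text files; this is only a toy version of the search described above.
pattern = re.compile(r"\breplic\w*", re.IGNORECASE)

mentions = []
for path in Path("fulltexts").glob("*.txt"):
    if pattern.search(path.read_text(errors="ignore")):
        mentions.append(path.name)

print(f"{len(mentions)} papers mention 'replic*'")
```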

Kim LaPierre

Kim discussed the Nutrient Network (NutNet) project, which she described as a coordinated, distributed experiment but which could equally be seen as concurrent direct replications of the same experiment. The NutNet project aims to “collect data from a broad range of sites in a consistent manner to allow direct comparisons of environment-productivity-diversity relationships among systems around the world”. The same experimental design is used at 93 sites on all continents except Antarctica. It is a massive effort that is achieved with almost no funding, as the participating researchers conduct the experiment using their existing resources.

David Mellor

David is from the Center for Open Science and discussed how to guarantee that the results of replication studies are meaningful regardless of their findings. Like Fiona, David advocated using constraints-on-generality statements in articles to describe the situations to which you would reasonably expect your results to extend. The majority of David’s talk, however, was about preregistration, which can be used for replication studies but is also useful in many other types of research. The idea is that, before you start your study, you ‘preregister’ your hypotheses (or research questions) and the methods you intend to use to collect and analyse the relevant data. This preregistration is then frozen in time and referenced in the final paper to clearly delineate what the original hypotheses were (ruling out HARKing – Hypothesizing After Results are Known) and which tests were planned (ruling out p-hacking). The Center for Open Science is currently running a promotion called the Preregistration Challenge, in which the authors of the first 1000 articles that pre-register and get published receive $1000.

Emery Boose

Emery discussed RDataTracker, a package that he helped develop that aids in creating reproducible workflows in R. RDataTracker records data provenance, including information on the hardware used to run the analyses and the versions of all relevant software. The package allows you to see what every step of the analysis does and what intermediate values are produced at each point of the analysis. This can be really useful for de-bugging your own code as well as for determining whether someone else’s code is operating correctly.

Bill Michener

Bill Michener is from DataOne, an organization that aims to create tools to support data replication. They have developed workflow software (DMPtool.org) that makes it easier to collate metadata along with data files and analyses. The software links with Ecological Metadata, GitHub, Dryad and the Open Science Framework, among other tools.


The talks were compelling and most attendees stayed to listen and take part in the discussion afterwards. Although the session was planned as a forum for discussing opposing views, there was no outright resistance to replication. It is probably the case that the session attracted people who were already supportive of the idea. However, it’s also possible that the strong opinions expressed in favour of replication made people reluctant to raise critical points, or (perhaps wishfully) maybe the arguments made in the Ignite session were sufficiently compelling to convince the audience of its importance. In any case, it was inspiring to be surrounded by so many voices expressing support for replication. The possibility of replication studies becoming more common in ecology feels real!