One pretty certain fact about science is that much of the public is more interested in its findings than in the methods by which they are generated. Science generates some “wow, cool!” findings, so this is (somewhat) understandable. Science writers generally oblige this interest by focusing on findings over methods, thereby implicitly trusting that whatever (perhaps mysterious) methods were used to obtain them were legitimate. Scientists themselves look at the matter quite differently, though.
I, for example, am almost always more interested in the validity of a study’s methods than in the findings themselves, which is why, as soon as I understand the essentials being reported, I go straight to the methods section. That’s because another fact in science is that if your methods are not valid, you will come to wrong conclusions. And a third fact, and this one’s key, is that analytical methods used in studies can be complex, wildly varying, poorly described, and most importantly, not well evaluated before specific application. So I think it’s important to outline some concepts that determine scientific practice, that is, some epistemology, however boring or obvious that might seem.
Inference, roughly, is the process of reaching conclusions about the world based on observations of it. Different branches of science have different inferential constraints; these occupy central places in how they operate, but there are some common logical considerations across all. We start from the general observation that the world is complex: an enormous number of objects interact in a huge number of ways. The most powerful way to arrive at an understanding of cause and effect in any system is to conduct manipulative experiments, varying system components or processes systematically and observing the results—think high school physics or chemistry lab. In a now classic paper 50 years ago, Platt (1964) termed this general approach “strong inference”, and its power derives from its efficient ability to separate and estimate the effects of individual causes (“drivers”), and their various interactions, a task which is often otherwise difficult to impossible. From these estimates one progressively builds a model of the system that tries to explain the dominant sources of variation therein. This process also goes, in philosophy, by the name of reductionism.
The problem with this (one that Platt avoids), is that you very often can’t execute anything like it, for either physical or practical (e.g. cost) reasons. Such limits are obvious for any phenomenon operating at large scales of space or time, which includes much of the earth and environmental sciences; if earth scientists could make 50 earths and run 500 year experiments on them, they likely would. So you have to come up with some other way of identifying and quantifying the magnitudes of the various system interactions. Now this is a truly critical point, because if you can’t do so, you can and likely will come to mistaken conclusions about reality, because a common characteristic of nature is that many variables co-vary with each other to varying degrees. This issue’s importance corresponds to a given system’s complexity, and I would argue that many of the major scientific mistakes result from this problem.
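To make the co-variation problem concrete, here is a minimal simulated sketch. Everything in it is made up for illustration (the variable names, the noise levels, the sample size): a hidden driver influences both x and y, while x has no effect on y at all. Observed passively, x and y are strongly correlated; once x is set independently of the driver, as it would be in a manipulative experiment, the correlation essentially disappears.

```python
import random

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5

random.seed(1)
n = 2000

# Hypothetical hidden driver that influences both x and y.
driver = [random.gauss(0, 1) for _ in range(n)]

# Observational setting: x and y both follow the driver; y never depends on x.
x_observed = [d + random.gauss(0, 0.5) for d in driver]
y_observed = [d + random.gauss(0, 0.5) for d in driver]

# Manipulative experiment: x is set by the experimenter, independently of the driver.
x_manipulated = [random.gauss(0, 1) for _ in range(n)]
y_experiment = [d + random.gauss(0, 0.5) for d in driver]

r_observational = correlation(x_observed, y_observed)
r_experimental = correlation(x_manipulated, y_experiment)

print(f"observational r = {r_observational:.2f}")
print(f"experimental  r = {r_experimental:.2f}")
```

A reader who saw only the observational correlation could easily conclude that x drives y, which is exactly the kind of mistake described above.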
One can conceive of (at least) two major conceptual approaches to resolving the problem. The first recognizes that it’s not always necessary to fully abandon the idea of a controlled experiment. In an approach known as a “natural experiment”, one can sometimes take advantage of the fact that nature at times provides situations in which certain important system drivers vary naturally, while others do not. These variations may not be perfectly ideal, as they might be in a manipulative experiment, but still good enough to give important insights into cause and effect. Using a fire ecology example, you might for example have a landscape across which wildfires burn at varying intensities, or seasons, but in which a number of other factors known to be of potential importance in fire behavior are roughly equal, e.g. topography, starting biomass, or relative humidity. Intentionally burning large landscape areas at different intensities is not politically feasible, but since unplanned wildfires are common, valuable information can often be obtained.
I would argue that variations on this basic idea are the most powerful way of quantifying cause and effect in complex systems not amenable to controlled experimentation, and that in fact humans employ this idea commonly in assessing various things in everyday life. But natural experiments of high enough quality do not always present themselves. What then? Well, that’s right where things start to get analytically hairy, resulting in a large army of statistical techniques and approaches, often discipline specific and jargony, and thus where one has to start being very careful as a reader. And statisticians have frankly not been good at making complex techniques easily understandable to the scientists who need to use them.
One approach is to switch the focus from the classical, step-wise analysis of single elements of the system to a synthetic analysis of the entire system simultaneously. Mathematically, this is accomplished by the use of multivariate statistical methods, which, very roughly, replace the original system variables with a new set of “synthetic” variables that capture, in descending order, the major patterns of variation in those original variables, but which, unlike the originals, are uncorrelated with each other. This can be done with either or both of the response and driving variables, resulting in a description of the system dynamics at a more synthetic, i.e. more inclusive, level of organization. The most common of these techniques in climate science is known as principal component analysis (PCA), while in ecology that and several other techniques are common. It’s an interesting and powerful approach, but it necessarily sacrifices an understanding of interactions between individual system components. And that unfortunately is likely to be a big problem whenever several processes affect a given response variable, as is common in biology or most any complex system for that matter.
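As a rough sketch of what “synthetic, uncorrelated variables in descending order of variation” means, here is PCA for the simplest possible case of two correlated variables, using the closed-form rotation that diagonalizes their 2x2 covariance matrix. The variable names and the simulated data are hypothetical, and real analyses would use a linear algebra library on many variables; this is only meant to show the idea.

```python
import math
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def pca_2d(xs, ys):
    """Return the two principal component scores for a pair of variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]  # centered copies of the originals
    dy = [y - mean_y for y in ys]
    sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    # Rotation angle that makes the covariance between the new axes zero.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * a + s * b for a, b in zip(dx, dy)]   # direction of largest variance
    pc2 = [-s * a + c * b for a, b in zip(dx, dy)]  # remaining, orthogonal direction
    return pc1, pc2

# Hypothetical example data: two strongly (negatively) correlated variables.
random.seed(0)
temperature = [random.gauss(0, 1) for _ in range(500)]
humidity = [-0.8 * t + random.gauss(0, 0.5) for t in temperature]

pc1, pc2 = pca_2d(temperature, humidity)

print("cov(original variables):", round(covariance(temperature, humidity), 3))
print("cov(pc1, pc2):          ", round(covariance(pc1, pc2), 3))
print("var(pc1), var(pc2):     ", round(covariance(pc1, pc1), 3), round(covariance(pc2, pc2), 3))
```

The two components carry the same total variance as the originals, but neither one corresponds to temperature or humidity alone, which is the loss of component-level interpretation noted above.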
Another, more sophisticated, class of multivariate statistical techniques is designed to address that problem. These use detailed analyses of correlations of all system variables with each other to hypothesize cause and effect dynamics between all of them. (Note that I said “hypothesize” there.) They include techniques known as structural equation modeling and path analysis. In these methods, system observations themselves, rather than outside information (i.e. pre-existing or theoretical knowledge of some type), determine the estimates of the system processes during model building. Scientific insight and training are crucial in choosing between these two approaches, and this decision process is not easily described or explained to non-scientists, because it involves context-specific decisions. It constitutes the important art of knowing which information is most relevant to the particular problem at hand, which in turn involves questions of how broadly generalizable one’s conclusions will be and issues of data quality and availability, among others. Opinions and debate about such decisions can and should form a core part of legitimate scientific debate, because they are by no means always clear-cut, and thus disagreements can readily arise.
Once a model is constructed, by whatever method, then controlled experimentation can begin, by systematically varying the model’s parameters, or even the more basic structures of its equations. In so doing, we have exchanged the experimentation we would like to do on the actual system for experimentation on the quantitative model(s) of that system. This practice throws a lot of people, especially non-scientists, who wonder what the point is if you don’t really know how well your model(s) reflect reality to begin with. This concern is entirely valid, and we should not use or trust any model that does not get the critical things right, either as a basis for further scientific studies, or for real-world predictions.
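A toy sketch of what “experimenting on the model” looks like in practice: hold everything fixed except one parameter, vary that parameter systematically, and compare the outputs. The model here, discrete-time logistic growth, is just a simple stand-in chosen for illustration, and the parameter values are arbitrary assumptions rather than anything drawn from a real system.

```python
def simulate_logistic(growth_rate, capacity, initial=1.0, steps=50):
    """Discrete-time logistic growth, used here only as a stand-in system model."""
    population = initial
    for _ in range(steps):
        population += growth_rate * population * (1 - population / capacity)
    return population

# The "controlled experiment": vary one parameter, hold all others constant.
growth_rates = [0.1, 0.2, 0.4, 0.8]
results = {r: simulate_logistic(r, capacity=100.0) for r in growth_rates}

for r, final_population in results.items():
    print(f"growth_rate={r:<4} final population={final_population:.2f}")
```

The same loop could vary the carrying capacity, the starting state, or the form of the growth equation itself, and each set of outputs can then be gauged against whatever observations are available.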
But there’s more to it than just that; for one, what it means to get something “right” with a model is a persistent source of confusion. But more generally, in the process of experimenting with a model, you often learn highly important things. Your understanding improves as you inspect model output and gauge it against observations, in a variety of formal and informal ways. It informs you on how equations interact with each other in producing complex output. It can show you where certain kinds of observations are especially needed, thereby making monitoring system evolution more efficient (and thus, e.g., saving money).
There are many more, often detailed, considerations than just these in conducting research; this is just a framework of some important concepts and approaches.
Platt, J. R. (1964). Strong inference. Science 146(3642): 347-353.