The Data Lies. The Crisis in Observational Science and the Virtue of Strong Theory
The problem with data fetishists is that they choke down a daily flagon of numerical drivel without ever analyzing the brew. One of the things a good scientist knows is how to interrogate the numbers, not waterboard them. The truth is that useful models improve flaky data and the statistical treatment thereof.
Eli's introduction to this was a talk Drew Shindell gave, twenty or more years ago, with a title that ran "Which should you trust, the data or the models?", about global temperature data in the late 19th century. The useful conclusion was to trust neither, but to use them together to produce understanding and improve both. Yes, theory can improve measurements and data.
A nice example is how NIST's acoustic thermometer can be used to establish the thermodynamic temperature scale. Start with the theoretical result for the speed of sound in an ideal gas as a function of temperature (theory). A carefully built device that measures that speed can then be used to build a model of the response of platinum resistance thermometers as a function of temperature, and by applying the model, PRTs can be used to calibrate other thermometers more accurately.
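A minimal sketch of the zeroth-order physics, leaving out the virial, boundary-layer, and resonator corrections a real acoustic thermometer needs: the ideal-gas relation c = sqrt(γRT/M) is simply inverted to turn a measured speed of sound in argon into a thermodynamic temperature.

```python
# Idealized sketch: invert the ideal-gas speed of sound,
#   c = sqrt(gamma * R * T / M)   =>   T = c**2 * M / (gamma * R)
# Real acoustic thermometry extrapolates to zero pressure and applies a stack
# of corrections; this is only the leading-order relation.

R = 8.314462618          # molar gas constant, J/(mol K)
M_ARGON = 39.948e-3      # molar mass of argon, kg/mol
GAMMA = 5.0 / 3.0        # heat capacity ratio for a monatomic ideal gas

def temperature_from_sound_speed(c: float) -> float:
    """Thermodynamic temperature (K) implied by a speed of sound c (m/s) in argon."""
    return c**2 * M_ARGON / (GAMMA * R)

# Roughly 307.8 m/s is the speed of sound in low-pressure argon near the triple
# point of water, so the inversion should land close to 273.16 K.
print(f"{temperature_from_sound_speed(307.8):.1f} K")
```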
How about statistics? Well, most of what passes for statistical analysis these days is unconstrained, so it can wander off into never-never land, where "never" is stuff like thermodynamics and conservation laws. Bart had a nice example of this when discussing the usual nonsense about how observed temperature anomaly data could be explained as a random walk:
As you can see, the theory is valid: My weight has indeed remained between the blue lines. And for the next few years, my weight will be between 55 and 105 kg, irrespective of what I eat and how much I sport! After all, that would be deterministic, wouldn’t it? (i.e. my eating and other habits determining my weight)

To which one commenter replied:
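Just how little the random walk "prediction" is worth takes only a few lines to see. A minimal sketch (starting weight and step size invented purely for illustration): an unconstrained random walk knows nothing about energy balance, so its spread grows like the square root of time and, given long enough, it wanders anywhere.

```python
import numpy as np

# Hypothetical illustration: treat daily weight change as an unconstrained
# Gaussian random walk with a 0.3 kg/day step (a made-up figure).  Nothing in
# the model knows about energy balance, so the spread grows like sqrt(time).
rng = np.random.default_rng(0)
step_std = 0.3                                   # kg/day, assumed
n_walks, n_days = 2000, 3 * 365                  # simulate 3 years of daily steps
walks = 80.0 + np.cumsum(rng.normal(0.0, step_std, (n_walks, n_days)), axis=1)

print(f"simulated spread after 3 years: +/- {walks[:, -1].std():.1f} kg")
for years in (1, 3, 10, 30):
    print(f"random-walk spread after {years:>2} yr: "
          f"+/- {step_std * np.sqrt(365 * years):.0f} kg")
```

The 55 to 105 kg band is only "safe" because a few years is not long enough for the walk to drift that far; wait long enough and it will.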
Wow, if that’s the case, then I’ll stop my carrot juice diet right now and run to the corner store for a box of mars bars!! And I’ll cancel further consultations with my dietician. Energy balance… such nonsense. Never thought I’d be so happy with a root!

The other side of this is the replication crisis hitting the social sciences, most prominently psychology, but other fields as well. To disagree with the first link, unlike the physical sciences, psychology has no well-established theoretical consensus against which nutso outcomes can be evaluated. Science is about coherence (a no on that, as Alice’s Queen would say), consilience (baskets full of papers having nothing to do with each other but taken together mutually supporting), and consensus (everybunny with a clue agrees on climate change, or at least 97%).
So the question really is what a lagomorph's prior for statistical validity should be. Clearly, if all you have is the data, the standard of proof for any assertion about the data has to be very high. Wrong answers at low levels of proof are the reason that, out on the edge, physicists demand 5 sigma before accepting that a new particle has been found; that standard says there is about 1 chance in 3.5 million that random fluctuations alone would produce a signal that strong.
On the other hand, in the well-established interior of a field, where there is a lot of supporting, consilient work, a whole bunch of basic theory, and multiple data sets, 5 chances in 100 can do the job, or even 10 in a hundred. Of course, 30 in 100 is pushing it.
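For concreteness, those thresholds are just Gaussian tail areas; a quick conversion (one-sided, the way the particle physicists quote it):

```python
from scipy.stats import norm

# One-sided Gaussian tail areas: the probability that noise alone produces a
# fluctuation at least this many sigma above zero.
for sigma in (1.645, 2.0, 3.0, 5.0):
    p = norm.sf(sigma)                      # survival function = 1 - CDF
    print(f"{sigma:>5} sigma -> p = {p:.2e}  (about 1 in {1/p:,.0f})")
```

On this scale, 5 in 100 sits at only about 1.6 sigma, which is why it does the job only where a lot of consilient theory and data already constrain the answer.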
Andrew Gelman has a useful set of criteria for priors (the same considerations hold for frequentist approaches). Among his recommendations is that weakly informative priors
should contain enough information to regularize: the idea is that the prior rules out unreasonable parameter values but is not so strong as to rule out values that might make sense

and those priors should be
Weakly informative rather than fully informative: the idea is that the loss in precision by making the prior a bit too weak (compared to the true population distribution of parameters or the current expert state of knowledge) is less serious than the gain in robustness by including parts of parameter space that might be relevant. It's been hard for us to formalize this idea.
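A toy sketch of what that regularization buys (data invented; a Normal(0, 2.5) prior is used in the spirit of Gelman's suggestions, though he has also proposed a Cauchy(0, 2.5) for logistic coefficients, and the exact family is not the point): with a completely separated dataset, the flat-prior (maximum likelihood) estimate of a logistic regression coefficient runs off to infinity, while a weakly informative prior keeps it finite without pretending to know the answer precisely.

```python
import numpy as np
from scipy.special import expit          # logistic sigmoid
from scipy.stats import norm

# Toy data with complete separation: y is perfectly predicted by the sign of x,
# so the maximum likelihood estimate of the logistic coefficient b is infinite.
x = np.array([-1, -1, -1, 1, 1, 1])
y = np.array([ 0,  0,  0, 1, 1, 1])

# Grid approximation to the posterior of b under P(y = 1) = expit(b * x).
b_grid = np.linspace(-20, 20, 4001)
p = expit(np.outer(b_grid, x))                        # (grid point, data point)
log_lik = (y * np.log(p) + (1 - y) * np.log1p(-p)).sum(axis=1)

def posterior_mean(log_prior):
    """Normalize likelihood * prior on the grid and return the posterior mean of b."""
    log_post = log_lik + log_prior
    w = np.exp(log_post - log_post.max())
    return (w * b_grid).sum() / w.sum()

flat = posterior_mean(np.zeros_like(b_grid))          # "know nothing" flat prior
weak = posterior_mean(norm.logpdf(b_grid, 0, 2.5))    # weakly informative prior

print(f"flat prior:           b = {flat:5.1f}  (depends on where the grid ends)")
print(f"Normal(0, 2.5) prior: b = {weak:5.1f}")
```

The grid is only there to keep the sketch dependency-light; in practice the same weakly informative prior would simply go into a Stan or PyMC model.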