Wednesday, February 12, 2014

"Psychological plausibility" considered harmful

(goto is considered harmful in programming languages.)

The fundamental enterprise of cognitive science is to create a theory of what minds are. A major component of this enterprise is the creation of models – explicit theoretical descriptions that capture aspects of mind, usually by reference to the correspondence between their properties and some set of data. These models can be posed in a wide variety of frameworks or formalisms, from symbolic architectures to neural networks and probabilistic models.

Superficially, there are many arguments one can make against a particular model of mind. You can say that it doesn't fit the data, that it's overfit, that there are many possible alternative models, that it predicts absurd consequences, that it has a hack to capture some phenomenon, that it has too many free parameters, and so forth. But nearly all of these superficially different arguments boil down to well-posed statistical criticisms.

Consider a theory to be a compression of a large mass of data into a more parsimonious form, as in the "minimum description length" framework. For a given set of data, the total description length is the length of the theory (including its parameters) plus the length needed to encode how the data deviate from the theory's predictions. Under this kind of setup, the critiques above boil down to the following two:
  1. There's a theory that compresses the data more, either (a) by having fewer free parameters, or (b) by being overall more parsimonious.
  2. If we add more data, your theory won't compress the new data well at all, where the new data are either (a) other pre-existing experiments that weren't considered in the initial analysis, or (b) genuinely new data ("model predictions"). Concerns about overfitting and generalization fall squarely into this bucket.
Of course, we don't have a single description language for all theories, and so it's often hard to compare across different frameworks or formalisms. But within a formalism, it's typically pretty easy to say "this change to a model increased or decreased fit and/or parsimony." In linear regression, AIC and BIC are metrics for doing this sort of model comparison and selection. In the general Bayesian or statistical framework, the tradeoff between parsimony and fit to data is a natural consequence of the paradigm and has been formalized to good effect.
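
To make the regression case concrete, here is a minimal sketch of that kind of comparison. It assumes Gaussian noise and uses simulated data; the function names, the toy dataset, and the two candidate models (a constant versus a straight line) are all made up for illustration and are not drawn from any particular analysis.

```python
import math
import random


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, with variance set to its MLE (RSS / n)."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


def fit_constant(xs, ys):
    """Model 1: y = mean. Free parameters: the mean and the noise variance."""
    mean_y = sum(ys) / len(ys)
    return [y - mean_y for y in ys], 2


def fit_line(xs, ys):
    """Model 2: y = a + b*x by least squares. Free parameters: a, b, noise variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)], 3


def compare(xs, ys):
    """Score both hypothetical models on the same data."""
    n = len(xs)
    scores = {}
    for name, fit in (("constant", fit_constant), ("line", fit_line)):
        residuals, k = fit(xs, ys)
        log_lik = gaussian_log_likelihood(residuals)
        scores[name] = {"aic": aic(log_lik, k), "bic": bic(log_lik, k, n)}
    return scores


# Simulated data (an assumption for the demo): a noisy straight line.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]
scores = compare(xs, ys)
```

Both scores trade fit (the log-likelihood term) against parsimony (the parameter-count term), and lower is better for each. BIC's penalty per parameter is ln n rather than 2, so it punishes extra parameters more heavily than AIC once there are more than about seven data points.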

In this context, I want to call out one kind of critique as distinct from this set: the critique that a model is not "psychologically plausible." In my view, any way you read this kind of critique, it's harmful and should be replaced with other language. Here are some possible interpretations:

1. "Model X is psychologically implausible" means "model X is inconsistent with other data." This is perhaps the most common argument from plausibility. For example, "your model assumes that people can remember everything they hear." Often this is an instance of argument (2a) above, only with an appeal to uncited, often non-quantitative data, so it is impossible to argue against. If there is an argument on the basis of memory/computation limits, citing the data makes it possible to have a discussion about possible modifications to model architecture (and the rationale for doing so). And often it becomes clearer what is at stake, as in the case of asking a model of word segmentation to account for data about explicit memory (discussion here and here) when the phenomenon itself may rely on implicit mechanisms.

2. "Model X is psychologically implausible" means "model X doesn't fit with framework Y." Different computational frameworks have radically different limitations, e.g. parallel frameworks make some computations easy while symbolic architectures make others easy. Consider Marr & Poggio's 1976 paper on stereo disparity, which shows that a computation that could be intractable using one model of "plausible" resources actually turns out to be very doable with a localist neural net.* We don't know what the brain can compute. Limiting your interpretation of one model by reference to some other model (which is in turn necessarily wrong) creates circularity. Perhaps these arguments are best thought of as "poverty of the imagination" arguments.

3. "Model X is psychologically implausible" means "model X is posed at a higher/lower level of abstraction than other models I have in mind." To me, this is a standard question about the level of abstraction at which a model is posed – is it at the level of what neurons are doing, what psychological processes are involved, or the structure of the computation necessary in a particular environment? (This is the standard set of distinctions proposed by Marr, but there may be other useful ones to make.) As I've recently argued, from a scientific perspective I think it's pretty clear we want descriptions of mind at every level of abstraction. Perhaps some of these arguments are in fact requests to be clearer about levels of description (or even rejections of the levels of description framework).

In other words, arguments from psychological plausibility are harmful. Some possible interpretations of such arguments are reasonable – that a model should account for more data or be integrated with other frameworks. In these cases, the argument should be stated directly in a way that allows a response via modification of the model or the data that are considered. But other interpretations of plausibility arguments are circular claims or confusions about level of analysis. Either way, such arguments lump together a number of different possibilities without providing the clarity necessary to move the discussion forward.

Thanks to Ed Vul, Steve Piantadosi, Tim Brady, and Noah Goodman for helpful discussion, and * = HT Vikash Mansinghka. (Small edits 2/12/14.)