How do we show that a model is appropriate?

In the background reading I'm doing to determine exactly what my new research project should be, certain questions keep popping up. Selfishly, this is a good thing, because anything that really seems to be unanswered by the literature is a potential starting point for interesting research. Today, though, I'm going to write about one that worries me because I feel like it should have been answered already.

The issue is that for any given biological system, there are many possible ways to model that system. To a large extent, model selection has to depend on what the model is intended to achieve: missing out crucial details will invalidate any conclusions, but including more detail than necessary makes experimental data harder to interpret. We're also constrained by how much is known. For instance, no-one can make a gene regulation model that realistically accounts for the speed of gene transcription until someone finds a way to discover that speed. But this isn't really what I'm worried about. The question I'm interested in is how one can determine whether or not a given model captures all that is relevant about the process or system being modelled. This worries me most with respect to Boolean network models of genetic regulatory networks, because there are many such models about, but they make some really big simplifications.
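To make those simplifications concrete, here is a minimal sketch of a Boolean network in Python. Everything in it is an assumption made for illustration: the three genes (A, B, C), their update rules, and the choice of synchronous updating are invented, not taken from any real organism or published model. Each gene is either on or off, and all genes update at once in discrete steps, so timing, expression levels and noise are all thrown away.

```python
from itertools import product

# Hypothetical three-gene network; the rules below are invented for illustration.
def step(state):
    """Synchronously update every gene from the current (a, b, c) state."""
    a, b, c = state
    return (
        not c,    # A is on unless C represses it
        a,        # B is switched on by A
        a and b,  # C needs both A and B
    )

def find_attractor(state):
    """Follow the trajectory from `state` until it repeats; return the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return tuple(seen[seen.index(state):])

def all_attractors():
    """Enumerate all 2**3 start states and collect the distinct attractors."""
    attractors = set()
    for start in product([False, True], repeat=3):
        cycle = find_attractor(start)
        # Rotate each cycle to start at its smallest state so duplicates merge.
        i = cycle.index(min(cycle))
        attractors.add(cycle[i:] + cycle[:i])
    return attractors

if __name__ == "__main__":
    for cycle in all_attractors():
        print(len(cycle), cycle)
```

Exhaustive enumeration like this is only feasible for tiny networks, since the state space doubles with every gene added. The attractors are what such a model would typically be compared against observed expression patterns, which is exactly where the question of what has been simplified away starts to matter.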

So far, the only papers I've read that really address this do so for the specific model they are presenting, by showing that its output matches observed biological data. There are two pitfalls with this approach: it doesn't tell us anything about model appropriateness in general, and it doesn't really guarantee that novel findings based on the model are meaningful. There's always some danger that the model is overfitted, and this is only made worse by the patchy nature of the biological data available for any given organism.

Over time, a model can be validated by new wet-lab data that confirms its predictions, but how can one demonstrate that it is a sensible choice of model before this happens? I would like to be confident that I'm not investing my time in building models that next year's data will discredit. Given that this is a pretty much inevitable question when presenting work (either at a conference or at a thesis defence), I'd like to have a confidence-bestowing answer to it worked out.
