What to try next with coevolution
I'm waiting for a lot of experiments to finish, but interim results from the coevolutionary runs continue to look as unimpressive as those I described at the end of my last post. I'll let the current crop run out for another day, but meanwhile I need to come up with some hypotheses about why these runs are doing so badly, and tests for those hypotheses. This can guide the experiments I set up tomorrow and leave running over the weekend. Ideas after the cut.
7 thin years hypothesis
My leading candidate explanation is that the constant replacement of least-difficult trials is leaving a set that agents simply can't succeed on. The most likely way for this to happen is for the trials to begin with a block of successive presentations of bad food so long that the agent would die before any good food appears.
This is relatively quick to test for - I just need to have the simulator output the contents of all trials [this option is already there for debugging, but turned off for normal runs because it's very slow and eats up disk space], and run some experiments where I keep track. So I can do that this afternoon.
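A minimal sketch of the check I have in mind, written in Python for post-processing the dumped trials rather than as part of the simulator. It assumes each dumped batch can be reduced to an ordered list of booleans (True for good food), and that there is a known number of timesteps an agent can last without good food. Both `batches` and `survival_limit` are hypothetical names, not things the simulator currently provides.

```python
from typing import List, Sequence


def leading_bad_run(batch: Sequence[bool]) -> int:
    """Count the consecutive bad-food presentations at the start of a batch."""
    run = 0
    for is_good in batch:
        if is_good:
            break
        run += 1
    return run


def unsurvivable_batches(batches: Sequence[Sequence[bool]],
                         survival_limit: int) -> List[int]:
    """Return indices of batches whose opening run of bad food is at least
    survival_limit long (survival_limit is an assumed parameter)."""
    return [i for i, batch in enumerate(batches)
            if leading_bad_run(batch) >= survival_limit]


# Illustrative data only: True = good food, False = bad food.
batches = [
    [False, False, True, False],
    [False, False, False, False, True],
    [True, False, False],
]
flagged = unsurvivable_batches(batches, survival_limit=3)
```

If the flagged fraction climbs over the generations of a coevolutionary run, that would support the hypothesis. If it stays near zero, the explanation lies elsewhere.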
If this hypothesis is correct then it should be possible to fix the problem by adjusting the generation of trials in a way that I have discussed before: forcing every batch of trials to include at least one presentation of good food in the first n timesteps, with n sufficiently small.
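The constraint could look something like the following Python sketch. It is not the simulator's actual generator: `generate_batch`, `p_good` and the boolean representation of a presentation are all assumptions made for illustration.

```python
import random
from typing import List, Optional


def generate_batch(length: int, p_good: float, n: int,
                   rng: Optional[random.Random] = None) -> List[bool]:
    """Generate a batch of presentations (True = good food) with at least
    one good presentation guaranteed in the first n timesteps."""
    if not 1 <= n <= length:
        raise ValueError("n must be between 1 and length")
    if rng is None:
        rng = random.Random()
    batch = [rng.random() < p_good for _ in range(length)]
    # If the opening window has no good food, force one at a random position
    # inside that window.
    if not any(batch[:n]):
        batch[rng.randrange(n)] = True
    return batch
```

The same repair step would have to be applied whenever coevolution replaces or mutates a trial in the first n positions, otherwise the guarantee only holds for the initial set. Forcing a good presentation also raises the proportion of good food slightly above `p_good`, which may matter when comparing against unconstrained runs.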
Blind spot hypothesis
This is the idea that for some stupid reason—it could be a bug in my code or a quirk of C++'s floating-point handling, or whatever—there's a consistent region of the space of food quality-sensory signal mappings on which these agents will always fail. If such a thing exists, it does make sense that the vectors of trials would eventually evolve to exploit it. I think this is relatively unlikely, because in my observations so far individual agents have had quite different failure points.
I can test this in a similar way to the previous hypothesis: if the trials do turn out to converge on a particular range of values then I have a problem, otherwise I can discard this hypothesis.
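One rough way to quantify "converging on a particular range" is to watch the spread of the trial values shrink over generations. The Python sketch below assumes each trial can be summarised by a single number drawn from the mapping space, which is a simplification, and the `shrink_factor` cut-off of 0.25 is an arbitrary placeholder rather than a calibrated value.

```python
from statistics import pstdev
from typing import List, Sequence


def spread_by_generation(generations: Sequence[Sequence[float]]) -> List[float]:
    """Population standard deviation of the trial values in each generation."""
    return [pstdev(values) for values in generations]


def has_converged(generations: Sequence[Sequence[float]],
                  shrink_factor: float = 0.25) -> bool:
    """True if the final generation's spread is at most shrink_factor times
    the first generation's spread (shrink_factor is an assumed threshold)."""
    spreads = spread_by_generation(generations)
    return spreads[0] > 0 and spreads[-1] <= shrink_factor * spreads[0]
```

If the mapping has several dimensions, the same check can be run on each one separately. A collapse to the same region across independent runs would point to a blind spot. A collapse to different regions in different runs would look more like ordinary coevolutionary drift.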
If this is the issue, then I have a monstrous debugging session ahead of me, but at least in the short term I should be able to work around it by fudging the generation of trials to avoid the bug.
The task may just be too hard
It may be that the population of agents actually needs the easy trials in a randomly-generated set, to scaffold evolution up to the point of reaching controllers that can deal with the difficult trials.
This one would be harder to test, but I feel like it's almost a null hypothesis - if I can't find anything else wrong, I should probably act on the assumption that this is the right explanation, and see what happens.
If I find myself in this situation, I think the thing to try is more judicious use of thresholds on the coevolution of trials. I already have some experiments going which only replace trials when the fittest agent scores above a threshold, but a quirk of the parameters I've used ensures that the first 200-250 generations all meet the criterion, so there's quite a lot of movement in the trial set at first. I need to use parameters chosen to avoid this problem.
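For reference, the threshold rule and a quick parameter check can be sketched as follows. This is Python pseudocode for the idea, not the simulator's implementation: the function names, the per-trial `difficulty` scores and the `new_trial` factory are all hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def maybe_replace_easiest(trials: List[T], difficulty: Sequence[float],
                          best_fitness: float, threshold: float,
                          new_trial: Callable[[], T]) -> bool:
    """Replace the least-difficult trial in place, but only when the fittest
    agent scores above the threshold. Returns True if a replacement happened."""
    if best_fitness <= threshold:
        return False
    easiest = min(range(len(trials)), key=lambda i: difficulty[i])
    trials[easiest] = new_trial()
    return True


def initial_generations_above(best_fitness_log: Sequence[float],
                              threshold: float) -> int:
    """Count how many generations at the start of a run meet the criterion."""
    count = 0
    for fitness in best_fitness_log:
        if fitness <= threshold:
            break
        count += 1
    return count
```

Running `initial_generations_above` over the best-fitness log of an existing run, for a range of candidate thresholds, gives a cheap way to pick a value for which the early generations do not all trigger replacement.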
