Entries for February 2006

Catching up

I'm in a hurry because I'm going on holiday tomorrow, it's past midnight, and I've only just finished tying up some loose ends and setting a large number of experiments running on various computers, so I'm just going to copy-and-paste the update I sent to my lab recently.

Continue reading "Catching up"

Cheering on the Public Library of Science

I am a big fan of the Public Library of Science initiative.

Continue reading "Cheering on the Public Library of Science"

More on drift

Looking at more best-agent vs last-agent-in-search comparisons today has made me realise something: the one pattern that seems to be emerging is that the last agent tends to be much more consistent in its behaviour than the agent from 1000 generations earlier. The last agents achieve this by being less affected by sensory input, though looking at their genotypes doesn't reveal exactly how, so it's nothing as straightforward as the setting of gains to 0 that I saw with the catcher agent.

On a given batch of trials, this often (but not consistently) causes a marginal improvement in the agent's performance, yet in the search this never registered as an increase in fitness. This seems to be because the 'best' agent in a search is actually the agent that gets luckiest in terms of the exact batch of trials it encounters, so on the particular batch that's in use in its generation it actually benefits from the noise. But then, when I compare it to other agents using a standardised batch of trials, it's very unlikely that this batch will happen to favour it like the one it encountered during evolution. Of course, the fitness improvements I'm seeing are all very small (typically in the third significant figure), because if they were bigger then they would swamp the randomness effect, and we'd have a new 'best agent'.
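To make the 'luckiest agent' effect concrete, here is a minimal sketch in Python. It is not my actual search code: the function names, the Gaussian noise model, and all the parameter values are assumptions chosen purely for illustration, and each evaluation gets its own independent noise rather than a shared batch. Every simulated agent has exactly the same true fitness, so any gap between the winner's score during the search and its score on a much larger re-test is due to luck alone.

```python
import random
import statistics


def noisy_fitness(true_fitness, rng, noise_sd=0.05, n_trials=20):
    """Mean score over a batch of noisy trials (hypothetical noise model)."""
    return statistics.fmean(
        rng.gauss(true_fitness, noise_sd) for _ in range(n_trials)
    )


def mean_winner_gap(n_agents=50, n_repeats=100, seed=0):
    """Average of (winner's score in the search) - (winner's re-test score).

    All agents share the same true fitness, so the winner of each
    simulated generation is simply the one with the luckiest evaluation.
    """
    rng = random.Random(seed)
    true_fitness = 0.5  # assumed value; identical for every agent
    gaps = []
    for _ in range(n_repeats):
        # One noisy evaluation per agent, as in a single generation.
        search_scores = [noisy_fitness(true_fitness, rng) for _ in range(n_agents)]
        best_score = max(search_scores)
        # Re-test the winner on a much larger, independent batch of trials.
        retest_score = noisy_fitness(true_fitness, rng, n_trials=500)
        gaps.append(best_score - retest_score)
    return statistics.fmean(gaps)


if __name__ == "__main__":
    print(f"mean (search score - re-test score) for the winner: {mean_winner_gap():.4f}")
```

Under these assumptions the gap comes out positive on average, which is the same selection effect described above: the score that wins a generation overstates how the agent does on a standardised batch.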

Anyway, the upshot of this is that I'm not sure what's happening really counts as random drift, because there seems to have still been some fitness improvement.

The interaction between evaluation noise and neutral networks

Neutral networks in an evolutionary search are the networks of genotypes that all evaluate to the same fitness value. They are important because they allow the population to drift and diversify, which can get a search out of a dead end. Until recently, I had only ever thought about neutral networks in terms of what I'll refer to as 'absolute' neutrality: the case where all genotypes on the network are of exactly equal fitness. [note: is this the ordinary usage of "neutral network", or was I assuming this incorrectly?]

The presence of evaluation noise in my searches has made me re-think this. If there is any randomness at all in the evaluations, then it doesn't make sense to talk about an absolute fitness value, which in turn means that there can't be absolute neutrality. Yet, when I compare the agent that scored the highest fitness in a search with the best agent from the final generation, I often find behaviour that looks like drift.

Continue reading "The interaction between evaluation noise and neutral networks"

More interneurons don't help either

I tried running the same set of experiments, but giving agents 10 interneurons instead of 3. To be honest, I wasn't expecting this to help, because I suspect that the stupid threshold strategy is a local optimum that would need some change in the environment or trial structure to get around, but I had to try this anyway. If the agents with more interneurons performed significantly better or came up with qualitatively different strategies, it would tell me something useful about the difficulty of the task, and be strong evidence against my hypothesis that this version of the task doesn't support the evolution of learning because it's too easy to perform reasonably well with a non-learning strategy.

As it turns out, the average performance of 10-interneuron agents was marginally better, but again not statistically significantly so. I did the same kinds of comparisons, with fitnesses normalised in the same way, as for the data I reported yesterday. On average, the best agent from a 10-interneuron run outperformed that from the equivalent 3-interneuron run by 0.0398, and the final agents were better by 0.03323. However, the standard deviations were 0.08486 and 0.08784 respectively, and a t-test gives a p-value of 0.25. Qualitatively, the agents are all using the same basic strategy, so my interpretation of these data is that the larger number of interneurons simply supports slightly better tuning of the thresholds, and even that claim is too weakly supported by the data to be publishable without caveats.
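For anyone who wants to see how figures like these feed into a t statistic, here is a rough sketch. It assumes a paired comparison (each 10-interneuron run matched with its equivalent 3-interneuron run) and treats the quoted standard deviation as the standard deviation of the paired differences; both are assumptions for illustration. The number of pairs is not given here, so `n_pairs` is a hypothetical parameter and the loop simply shows how the statistic scales with it.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n_pairs):
    """t statistic for a paired comparison: mean difference / standard error."""
    if n_pairs < 2:
        raise ValueError("need at least two pairs")
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error


if __name__ == "__main__":
    # Mean difference and SD for the best agents, as quoted above.
    # The n_pairs values are hypothetical, not the real number of runs.
    for n_pairs in (5, 10, 20):
        t = paired_t_statistic(0.0398, 0.08486, n_pairs)
        print(f"n_pairs={n_pairs}: t={t:.3f} with {n_pairs - 1} degrees of freedom")
```

Turning t into a p-value needs the t distribution with `n_pairs - 1` degrees of freedom, which a statistics package or a printed table will provide.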

Continue reading "More interneurons don't help either"

I was wrong about trial selection

After a few hiccups due to my own errors and some hardware trouble, I now have all the results I was waiting for from the temporal-correlation experiments with 3 interneurons. They're not very impressive, but I have learned some things from them.

There's only one really surprising result, which is that my preliminary report that the 'comprehensive trials' experiments were doing better than ones with randomly generated trials was in fact wrong. It is true that the fitness scores reported during the search are much closer to the results from a more thorough re-testing if the search used the comprehensive trial generation system, but that's the only advantage. More detail, and some hand-waving about why, behind the cut.

Continue reading "I was wrong about trial selection"

More interneurons, shaping experiments, sensory input

Nothing dramatic to report this week, but things are ticking over. Here's a summary of what I've been up to:

Continue reading "More interneurons, shaping experiments, sensory input"

CPU appeal

I have reached a point where I would benefit from having access to more CPU time than I do right now. So I have two questions:

  1. If I can get hold of a little money, what would be the cheapest way to get hold of some hardware to run my own cluster? The ability to customise heavily would probably be an advantage, because my experiments are so heavily CPU-bound that I can afford to skimp on things like hard drive speed.
  2. Assuming that I can't get hold of my own hardware, does anyone have some spare computers they could loan me some time on? I just need a C++ compiler, a remote shell and a practical way to transfer files. Bear in mind that any money I can get for this will go on buying my own hardware, so if I'm borrowing machines I won't have any funding to pay for the time. I think the only resources that would be suitable are spare computers that are really sitting idle for a lot of the day.