Continuous-sensory experiments

As I set out almost a month ago, I'm now running experiments in which the agents receive sensory input on a continuous scale, and must respond appropriately to that in order to perform well. So far I've found one good agent among a general background of unsuccessful runs, but I'm still tinkering with parameters to try and make the process more reliable. Detailed method and a brief overview of results so far are behind the cut.

For each trial, an n-dimensional vector of random real numbers in the range 0-1 is generated; I'll refer to this vector as the offsets. They define the association between sensory input and food quality as follows, where i denotes individual dimensions: $\text{food quality} = \sum_i \sin[2\pi(\text{input}_i + \text{offset}_i)]$. Each $\text{input}_i$ is the signal (also continuous from 0 to 1) given to one input channel of the agent's controller. The controller has n external inputs, plus one channel for the presence of food (0 or 1) and one for the agent's own energy level. So the agent's external input is a vector of real numbers that are correlated with the quality of the food before it, in a way that is consistent within a trial but randomised between trials.
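To make the mapping concrete, here is a minimal Python sketch of the food quality formula above. The function names (`make_offsets`, `food_quality`) are my own for illustration and are not taken from the experiment code, and I'm assuming the offsets are drawn uniformly.

```python
import math
import random


def make_offsets(n, rng):
    """Draw one trial's offsets: n reals from [0, 1) (uniform is an assumption)."""
    return [rng.random() for _ in range(n)]


def food_quality(inputs, offsets):
    """Sum over dimensions i of sin(2*pi*(input_i + offset_i))."""
    if len(inputs) != len(offsets):
        raise ValueError("inputs and offsets must have the same length")
    return sum(math.sin(2 * math.pi * (x + o)) for x, o in zip(inputs, offsets))


# One sensory channel, as in the current experiments.
rng = random.Random(0)
offsets = make_offsets(1, rng)
for x in (0.0, 0.25, 0.5, 0.75):
    print(f"input={x:.2f} quality={food_quality([x], offsets):+.3f}")
```

With a single channel the quality ranges from -1 to 1, and the offset simply shifts which part of the input range corresponds to good food.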

A trial consists of a set of presentations of food (500 in all the current experiments). In each presentation food appears, and the agent's controller output (measured as the activation value of a designated neuron) after 20 time steps determines whether it is said to have eaten or ignored the food. The agent's energy level declines by a small amount for every presentation in which it does nothing, and for presentations in which it eats the food, the quality of the food determines the effect (both magnitude and direction) on its energy level. Agents are scored on their average energy level through a trial, so to perform well agents have to initially sample haphazardly, and subsequently learn which sensory input should be followed by eating the food in the current trial.
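The trial structure described above can be sketched roughly as follows. This is not the actual experiment code: the controller is abstracted into a hypothetical `decide(inputs, energy)` callable that returns an activation value (standing in for the designated neuron's output after 20 time steps), and the starting energy, idle cost, gain, eating threshold, and clamping of energy to 0-1 are all illustrative assumptions. Only the 500 presentations and the mean-energy score come from the description.

```python
import math
import random


def food_quality(inputs, offsets):
    """Sum over dimensions i of sin(2*pi*(input_i + offset_i))."""
    return sum(math.sin(2 * math.pi * (x + o)) for x, o in zip(inputs, offsets))


def run_trial(decide, offsets, rng, presentations=500,
              idle_cost=0.005, gain=0.05, threshold=0.5):
    """Return the agent's mean energy over one trial.

    idle_cost, gain, threshold, the starting energy and the clamp
    are assumed values chosen only for illustration.
    """
    energy = 0.5  # assumed starting level
    total = 0.0
    for _ in range(presentations):
        inputs = [rng.random() for _ in offsets]
        if decide(inputs, energy) > threshold:
            # Eating: quality sets both size and sign of the change.
            energy += gain * food_quality(inputs, offsets)
        else:
            # Ignoring the food: small fixed decline.
            energy -= idle_cost
        energy = min(1.0, max(0.0, energy))  # assumed clamp to [0, 1]
        total += energy
    return total / presentations


offsets = [0.3]
never_eat = lambda inputs, energy: 0.0
always_eat = lambda inputs, energy: 1.0
print("never eats: ", run_trial(never_eat, offsets, random.Random(1)))
print("always eats:", run_trial(always_eat, offsets, random.Random(1)))
```

A real agent does not get to see the offsets, so it has to discover within the trial which inputs go with good food; the two fixed strategies here just bracket the behaviour.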

So far I'm using agents with 10 interneurons, and only one sensory input channel because I want to make sure I can get the simplest setup working before complicating anything. At present I'm running a set of experiments to try and work out how many trials to present each agent with in each generation, and how to update the trials from one generation to the next. All of the agents in a given generation are exposed to the same batch of trials, but it's always a sampling of all possible trials, and deciding how large a sample to make it is not trivial. If the sample changes too dramatically from one generation to the next then evaluation noise completely swamps any real fitness differences, but if it changes too little then agents simply evolve to work well on the particular set of inputs they're seeing. Meanwhile if the sample is too small then either the overfitting is worsened (if there's little churn) or the trial noise is worsened (if there's enough change in the trials to avoid the overfitting problem). But then if the sample's too big a single run takes forever....

I've tested trial set sizes from 5 to 85, in increments of 5, and replacement schemes of replacing either 1 trial or half of the trials each generation (replacing the oldest trial[s] in either case with new randomly generated ones). So far performance is entirely unsatisfying: all but one of the runs that have finished have produced nothing of interest, and while the runs with the largest sets of trials seem to be a bit more promising, they have been going for over 10 days and have not yet finished, so they're of little practical use. I think that realistically this tells me that I need to do something more sophisticated than just shuffling through the trials in a fixed way; I'll discuss ideas on exactly how to improve on this tomorrow.
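For clarity, the two replacement schemes amount to something like the sketch below. It assumes a trial is fully described by its offsets vector and that the trial set is kept in age order; the names `new_trial`, `initial_trial_set` and `advance_generation` are hypothetical.

```python
import random
from collections import deque


def new_trial(rng, n_inputs=1):
    """A trial is represented here just by its offsets vector (an assumption)."""
    return tuple(rng.random() for _ in range(n_inputs))


def initial_trial_set(size, rng, n_inputs=1):
    """Build the first generation's trial set, oldest trial at the left."""
    return deque(new_trial(rng, n_inputs) for _ in range(size))


def advance_generation(trials, n_replace, rng, n_inputs=1):
    """Drop the n_replace oldest trials and append fresh random ones, in place."""
    for _ in range(n_replace):
        trials.popleft()
        trials.append(new_trial(rng, n_inputs))
    return trials


rng = random.Random(0)
trials = initial_trial_set(10, rng)
advance_generation(trials, 1, rng)                 # scheme 1: replace one trial
advance_generation(trials, len(trials) // 2, rng)  # scheme 2: replace half
print(len(trials))
```

Replacing one trial per generation means any given trial stays in the set for as many generations as the set has trials, while replacing half means it stays for only two.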

Meanwhile, I do at least have this one high-performing agent, which is tremendously reassuring. For one thing, it proves that the task is at least tractable with the sort of controller I'm using. For another, analysing this one's behaviour gives me something else useful to do while I wait for evolution runs to finish.

Except for the one run that produced the good agent, the best agents from these runs score between 0.55 and 0.75 [scale from 0 to 1, where an agent that never makes a mistake scores 0.99] on a fresh, randomly-generated batch of trials. The good agent scores 0.93 on 1000 random trials, suggesting that its high fitness isn't just some artefact of the run that produced it. All the same, there are individual trials in that set on which its performance is as low as 0.25, and I have not yet managed to work out what is special about these trials that defeats the agent. I know that it isn't anything as obvious as those trials just happening not to contain any good food, and I think trying to understand what it is about them will be the first step to understanding exactly how this agent works.

