search

### Random number generators

Starting to use some tools I haven't used before for original work, I'm wondering about details that didn't matter when I was only using Matlab for course exercises. Specifically, I'm trying to figure out whether to trust its random number generator, considering that issues with the C++ random number generator have caused me trouble before, and in the near future I'm likely to find myself using monte carlo methods to estimate the properties of systems.

As far as I know, there are two issues with the built-in C++ random number generator: it's slow, and it actually produces a somewhat lumpy sampling of the space of possible values. The slowness rarely really matters, and at least when it is a problem the issue is obvious while running the experiments. The unevenness of the random numbers produced, however, is a subtle problem that can go unnoticed for a while but cause whole sets of data to be seriously misleading.

I discovered this in the preliminary work for a paper on selective attention. The agents described in that paper had to catch two ballistically falling objects with different speeds and trajectories, and I evaluated each agent by exposing it to a large number of trials in which the starting locations, speeds and trajectories of the objects were randomly generated. After several weeks of data gathering using the built-in C++ random number generator to generate trials, I serendipitously discovered that there were some special combinations of trial parameters that caused many agents to fail. Unfortunately the random trial generation was reliably failing to generate a single trial that fit these criteria, so agents were evolving without ever being exposed to them, and as a result I could come up with a set of trials on which every evolved agent's performance was reliably terrible.

Obviously for a specific set of trials that I knew about, it was always possible to fix the trial generation and make sure they would be covered, but this leaves open the possibility that there are other such cases I have as yet to discover, so it's not a satisfactory solution. The solution I took was to replace the random number generator with one my advisor had written, drawing on Numerical Recipes in C. As far as I have been able to tell, this samples the space of possible trials considerably more evenly, avoiding any further problems like the one I described above.

Unfortunately I don't know how to evaluate a random number generator myself, beyond just using it and seeing if any problems arise. There must be a principled way to do this, but it's outside my current knowledge. So I'm left wondering if I'm going to run into the same problem with Matlab or am safe relying on the provided functions.

Trackback URL for this entry is: http://blog.case.edu/exg39/mt-tb.cgi/12809

I'm a big fan of the xkcd algorithm but if you want something really good consider using a chaotic source like LavaRnd.

Posted: February 20, 2007 05:27 PM

Well, the xkcd algorithm would at least have the merit of being fast....

Thanks for the LavaRnd tip. I don't think I really need to go as far as to set up a chaotic source—this isn't for a cryptographic application, after all, plus it's sometimes useful for experiments to be precisely repeatable—but there's a lot of useful information on that site. The section on evaluating LavaRnd will be particularly useful if I decide to test the Matlab generator myself, which may well be the only way to get answer to my questions with which I am satisfied.

Posted: February 20, 2007 07:14 PM