Dan is the founder of Rectangular Software, an independent UK software company that provides development services and builds its own mobile and web applications. Dan has over a decade of experience in the software industry, building a wide variety of systems including casino, poker and spread-betting platforms, mobile applications, e-commerce websites, and network security software. His other programming interests include artificial intelligence, particularly evolutionary computation, and functional programming in Haskell. He has authored, or contributed to, a number of open source Java projects. Dan has posted 34 posts at DZone. You can read more from them at their website. View Full User Profile

A Java Programmer’s Guide to Random Numbers. Part 2: Not just coins and dice

05.14.2008
| 18971 views |
  • submit to reddit

This is the second in a series of articles about random numbers in Java programs. The series covers the following topics and also serves as an introduction to the random numbers package of the Uncommons Maths library:

  • True Random Numbers and Pseudorandom Numbers
  • Statistical Quality (making sure the dice are not loaded)
  • Performance (because sometimes you really do need half a million random numbers a second)
  • Different kinds of randomness (because not everything can be modelled by tossing a coin)
  • Degrees of Freedom (is every possible outcome actually possible?)
  • Security and Integrity (when not losing money depends on nobody knowing what happens next)

Part 1 of this series discussed different kinds of random number generators (RNGs), highlighted the issues with using the default Java RNGs (java.util.Random and java.security.SecureRandom) and introduced the Uncommons Maths library, which provides three fast, high-quality alternative Java RNGs.

But there is more to the random numbers package than just replacement RNGs. Uncommons Maths also provides convenient classes for working with several different probability distributions, which is pretty much essential for modelling anything other than the most simple random processes.

The Uncommons Maths random numbers demo application (WebStart required) demonstrates many of the concepts discussed in this article. It also serves as an example of the performance differences between different RNGs as discussed in the previous article.

Different Kinds of Randomness

Uniform Distribution

The Uniform Distribution is what you get by default from java.util.Random (and its sub-classes - including the Uncommons Maths RNGs). The discrete uniform distribution says that there are a certain number of possible outcomes and each is equally likely. The obvious example is rolling a dice. You can get a 1, 2, 3, 4, 5 or 6 and no single outcome is more likely than any of the others. If you record the results of 6 million dice rolls you would expect to get pretty close to a million of each number.

The uniform distribution is useful for some problems, such as shuffling a deck of cards, but it is not a good choice for many other scenarios that don’t involve a set of equally likely outcomes.

Normal (Gaussian) Distribution

The Normal Distribution (also known as the Gaussian Distribution) is the familiar bell curve. It is a good model for many real world phenomena. In a normal distribution, the average outcome is more likely than any other outcome. A slightly above or below average outcome is almost as likely, and extreme outcomes, while still possible, are very unlikely.

An example of this would be the distribution of IQ test scores. Most people would achieve somewhere close to an average score (maybe slightly above or below), but a few people would achieve very low scores and a similarly small number would achieve very high scores.

The nextGaussian() method of the java.util.Random class provides rudimentary support for obtaining normally-distributed values. Uncommons Maths’ GaussianGenerator class builds on this, making it easy to create distributions with different means and standard deviations. The standard deviation parameter controls how spread out the values are. A low standard deviation means most values will be very close to the mean (on either side). A high standard deviation increases the likelihood that a value will be a long way from the mean.

GaussianGenerator is a wrapper around a standard RNG (you can use any of the Uncommons Maths RNGs, or even one of the standard Java ones if you so choose). Once you have created a GaussianGenerator by specifying a mean, standard deviation and source RNG, you can call the nextValue() method ad infinitum:

Random rng = new MersenneTwisterRNG();
GaussianGenerator gen = new GaussianGenerator(100, 15, rng);
while (true)
{
    System.out.println(gen.nextValue());
}

If we were to generate thousands of simulated IQ test scores with this code, we would see that the vast majority of scores would be close to the average (100). Around 68% of the values would be within one standard deviation (15) of the mean (i.e. in the range 85 - 115), and approximately 95% would be within two standard deviations (between 70 and 130). These percentages are a well-known property of normal distributions.

The demo shows how the distribution of generated values compares to a theoretical normal distribution (the more values you generate, the closer the fit will be).

Published at DZone with permission of its author, Dan Dyer.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags: