How Does Your Computer Generate Random Numbers with Numpy ?

Rédigé par Tanguy Marchand

What you should know about numpy and pseudo random number generators (PRNG).

The Mersenne-Twister is a pseudo random number generator (PRNG) used by the numpy library. This article explains all you need to know before using it.

As a data scientist, I need to generate random numbers for my algorithms. It puzzled me to know that deterministic processes in my computer could generate random sequences. I had a deeper look into it, and here is what I’ve learned:

Your computer does not generate truly random numbers but uses algorithm such as the Mersenne-Twister to get pseudo random sequences.
It is important to know when and how the seed of your pseudo random generator is set, otherwise you might have bad surprises, especially when developing multiprocess applications.
Standard random generators are highly predictable and should not be used for cryptographic or game gambling purposes.

Generating random numbers with numpy

With numpy, the quickest way to obtain random numbers between 0 and 1 is to use the following:

A first random number: 0.2332029758567754
A second random number: 0.7277750980801885

Or (almost) equivalently:

A first random number: 0.8492693008307766
A second random number: 0.9858307170084044

In both ways, we are using what we call a pseudo random number generator or PRNG. Indeed, whenever we call a python function, such as np.random.rand() the output can only be deterministic and cannot be truly random. Hence, numpy has to come up with a trick to generate sequences of numbers that look like random and behave as if they came from a purely random source, and this is what PRNG are.

Mersenne-Twister: the PRNG you may have used dozens of times without knowing it!

The most spread PRNG is the Mersenne Twister (MT) generator that was designed in 1997 by M. Matsumoto and T. Nishimura. It works schematically as the following:

Step 1: We initialize our random generator with random_state= np.random.RandomState() and we generate an internal state of 624 integers.
Step 2: Each time we call random_state.rand() we apply two operations: first, we perform a twist on the internal state to get a new internal state. Then we apply a tempering function on the newly obtained state that generates a float between 0 and 1 which is return by the random_state.rand(). These two operations are performed each time we ask for a new random number.

Schematic representation of the Mersenne Twister

The twister and the temper functions are totally deterministic. Hence, the value of the first occurrence of random_state.rand() only depends on the initial internal state which therefore should be generated… randomly. In practice, RandomState() generates the initial internal state from a seed value which is taken from the internal clock of your computer. Otherwise, you can also specify the seed value by doing RandomState(seed = your_favorite_seed_value) which can be useful in order to have a deterministic code.

What are these mysterious twister and tempering functions?

The twister and tempering functions are made of elementary bitwise and matrix operations. You can find their definitions in the original paper of the Mersenne Twister, or more simply in their C implementation in the numpy github repository:

rk_state is defined as C structure whose only relevant fields for our case are a list of 624 integers representing the state stored in rk_state.key and a position whose value is rk_state.pos . The twister function from line 14 to 29 is only applied once every 624 called when the state->pos reaches 624. In fact, the internal state stored in state->key contains enough information to generate 624 different random numbers. The tempering function consists of the bitwise operations lines 33 to 36.

Why is the Mersenne Twister PRNG so popular?

The Mersenne Twister is used so much for two simple reasons:

It is fast to evaluate: few nanoseconds on a recent laptop according to time it.
It produces sequences of numbers that look pretty random. In order to evaluate the randomness of a sequence, we can perform different statistical tests such as the (have a look at it!) that are passed by the Mersenne Twister PGNR.

The two caveats of the numpy.random module

MT PRNG should not be used as such for cryptography purposes! In fact, the temper function can be easily invertible. It means that for each output of the rk_random function we can infer one column of state->key and after 624 iterations of rk_random we know perfectly the state of the generator and can predict the remaining of the pseudo random sequence. In conclusion, do not use numpy.random in any cryptography or gambling game development, but look rather for specialized modules such as the secrets, or Crypto.Random.
Be careful when you use numpy.random in multiprocess application as it can lead to misleading behaviors as shown in the following:

Output:

'Bad practice: '
[array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
 array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
 array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
 array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
 array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9])]
'Good practice: '
[array([8, 9, 4, 5, 1, 0, 8, 1, 5, 4]),
 array([5, 1, 3, 3, 3, 0, 0, 1, 0, 8]),
 array([1, 9, 9, 9, 2, 9, 4, 3, 2, 1]),
 array([4, 3, 6, 2, 6, 1, 2, 9, 5, 2]),
 array([6, 3, 5, 9, 7, 1, 7, 4, 8, 5])]

You can see that the different calls to bad_practice in our multiprocess template always generate the same output. In fact, the RandomState is not re-initialized for each thread and all the threads share the same initial seed and initial internal state. While in the good multiprocess practice case, the random state is initialized for each thread differently.

Let’s finish by a fun game!

In one of the square, I have randomly drawn in red 100 000 couples of points (x,y) with a uniform distribution in [0,1], while the other square contains 100 000 points coming from another stochastic process. Can you guess which one is the uniformly distributed one?

Actually, the uniformly random one is the left one! It is quite misleading as uniformly random processes might create more clusters and gaps than what we think. The image on the right was generated by first putting red dots on a regular grid and shifting each dots by random small steps (the output is therefore stochastic but it is not a uniform distribution). In fact, the human brain is quite bad for generating and/or detecting random processes, but that would be the topic of another post!

If you are looking for Machine Learning expert's, don't hesitate to contact us !

Thanks to Pierre Marcenac, Pierre Marcenac, Dan Ringwald, and Nicolas Jean.

Cet article a été écrit par

Tanguy Marchand