Weighted Random Choices (JS)

In programming, we often need to generate random numbers, especially for games. Almost any game will use at least some kind of random number generator (RNG), and most need lots of them to drive the AI, make new levels, and so on.

Any programming language worth using (for games, anyway) will have some way of making an RNG. C has rand(), Javascript has Math.random(), and so on. Game engines usually add in their own ways, like Unity’s Random.value, constructed out of the base provided by whatever language they’re written in. All of these work in basically the same way: you ask the RNG for a value, and it gives you one. Usually, it’s a floating-point number between 0 and 1, which is good if that’s what you need.

Beginners starting out in game development quickly learn how to turn the float value from 0 to 1 (not so useful) into a number in the right range (more useful). The code for rolling a regular, six-sided die usually looks something like this (in Javascript):

var roll = Math.floor(Math.random() * 6) + 1;

It’s a pretty standard technique, and many languages and game engines now have functions that do this for you. And, again, if that’s what you need, then it’s perfect. Sometimes, though, it’s not what you need. This post is for one of those times. Specifically, the case where you need to choose one value from a list.

Simple Choices

When you simply need to pick a single value out of a list (array, vector, whatever), that’s easy. All you’re really doing is the same thing as rolling a die:

var index = Math.floor(Math.random() * array.length);
var choice = array[index];

In effect, we’re choosing a random array index. Easy. Like rolling a die, it gives you an equal chance for each possible value. (If your array contains the numbers from 1 to 6, then you’ve just created a less efficient method of rolling a die.)

Harder Choices

Not everything has an equal probability, though. (Statistics as a science wouldn’t exist if it did. Whether that’s a good or bad thing depends on the way you look at it, I guess.)

Some things can be approximated with the die-rolling method above. Lotteries, for example, work the same way, as do raffles. For some games, that may be all you really need. But other games require more from their RNGs than this.

Take the case of rolling multiple dice. Anybody who has played a role-playing game, or craps in a casino, or even Monopoly knows that, while every number on a die has an equal chance of popping up, things get more complicated when you add more dice. With two dice, for example, a total of 7 is more common than a 2 or 12, because there are more ways to roll numbers that add up to 7: 1+6, 6+1, 2+5, 5+2, 3+4, and 4+3. Add more dice, and the numbers get bigger, the range gets wider, but the idea stays the same: numbers “in the middle” are more likely than those on the outside. (Due to what’s called the central limit theorem, the more dice you roll, the more the graph of possible outcomes starts to resemble a bell curve.)

Rolling a lot of dice is impractical, even on a computer. The probabilities aren’t always so nice and neat that a bell curve works. Maybe you need to choose from a set where the ends are more common than the middle, or a list with a weird distribution of frequencies like the letters in English text. (One out of every eight letters, on average, is E, but Z is over a hundred times less common.) A word game, for instance, would certainly need to do something like this.

Now, there are plenty of different ways of generating random numbers based on frequencies. Here, I’m only going to describe what I think is the simplest. First, we need a problem. Let’s say you’re making a word game that, for whatever reason, uses the letter tiles from Scrabble. (You probably wouldn’t be making an actual Scrabble game, because some people or lawyers might not like that, but we’ll say you’re using the letter frequencies.) Looking at that link, you can probably see that just using random choices won’t help you. We need something different.

First things first, let’s define the set we’re choosing from (note the space at the end, which represents the blank tiles):

var letters = ['a','b','c','d','e','f','g','h',
    'i','j','k','l','m','n','o','p','q','r','s',
    't','u','v','w','x','y','z',' '];

Since this is an uneven distribution, we need some way of representing each probability. In this method, we do this by “weighting” each value. We’ll store these weights in their own array:

var weights = [9, 2, 2, 4, 12, 2, 3, 2,
    9, 1, 1, 4, 2, 6, 8, 2, 1, 6, 4,
    6, 4, 2, 2, 1, 2, 1, 2];

In this case, the sum of all the weights is 102 (100 letters, 2 blanks). Therefore, the ratio of each weight to that sum is the frequency of the letter. (For example, there are 12 E tiles, so E’s frequency is 12/102, or about 11.8%.) That’s the key to this method. Basically, we do something like this:

function randomLetter() {
    var totalWeight = 0;

    for (var w in weights) {
        totalWeight += w;
    }

    var random = Math.floor(Math.random() * totalWeight);

    for (var i = 0; i < letters.length; i++) {
        random -= weights[i];

        if (random < 0) {
            return letters[i];
        }
    }
}

(Obviously, in a real game, we’d need something much more robust, with better error handling, etc., but this is enough to illustrate the point. Also, this isn’t normally how I’d write this function, but I’ve simplified for the same reason.)

The function works by picking a random number from 0 up to the sum of the weights. We then use that as an “index”, but not directly into the array. Instead, we count down from our chosen number, each time subtracting successive weights, until we go below zero, where we produce the corresponding letter.

Let’s say that our random number is 15. We go through the weights array, starting with 9. Subtracting 9 from 15 leaves 6, so we keep going, down by 2 to 4, then by 2 again to 2, then by 4 down to -2. That’s below zero, so that’s where we stop, returning 'd'.

This method isn’t just limited to choosing letters. You can use it anywhere you need a biased sample. Think of a game that has five different types of enemies, each with different chances of spawning. Set up a list of enemy types, another holding their appearance frequencies, and the same method will give you waves of bad guys.

(Note: I’m not claiming credit for any of this. It was all figured out a long time ago, and certainly not by me. I’m not even the first to write about it online. But it’s definitely something beginners should learn, and I hope to post more little “tutorial” articles like this in the coming months, because we were all young, once.)

Simple Choices

Harder Choices

2 thoughts on “Weighted Random Choices (JS)”

Leave a Reply Cancel reply