Shuffle

Tags: [math], [statistics]

Link: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm

Given an array, please write an algorithm to shuffle it (randomize it).


Solution: In-place

import random

def shuffle(arr):
    if not arr:
        return arr

    arr_len = len(arr)
    for i in xrange(arr_len - 1):
        random_index = random.randint(i, arr_len - 1)
        arr[i], arr[random_index] = arr[random_index], arr[i]

    return arr

Revelation:

  • To prove the algorithm generates 100% random shuffle: For example, the original array we can call it 'arr' (such as [A, B, C, D, E, F]), and the shuffled array, we can called 'shuffle_arr'(such as [B, A, E, C, D, F]) (although this algorithm is modifying the array in-place, here we still use two names to mark the array we shuffled before and after). First we can think about just one element, what's the probability the 'A' was put in the position shuffle_arr[1]? The answer is P = (probability of 'A' was not picked at first time * probability of A was picked at the second time) = (1 - (1 / n)) * (1 / n - 1) = ((n - 1) / n) * (1 / n - 1) = 1 / n. We can use this way to calculate each element, finally we will find the probability of each element at each position are all 1 / n, which prove that this algorithm can generate 100% random shuffle.

Note:

  • Time complexity = O(n), n is the number of elements of the given array.
  • For the last element, we don't need to swap it with others, because there is no remaining elements to swap with it, and also, it has been given opportunity to be swapped.

Solution: not in-place

import random

def shuffle(arr):
    if not arr:
        return arr

    arr_len = len(arr)
    index_set = set()
    shuffle = []

    while len(shuffle) < arr_len:
        random_index = None

        while True:
            r_i = random.randint(0, arr_len - 1)
            if r_i not in index_set:
                random_index = r_i
                break

        index_set.add(random_index)
        shuffle.append(arr[random_index])

    return shuffle

Revelation:

  • Sometimes, we may be required not to modify the given array, so we may have two ways to achieve this go, the first one is using the in-place algorithm, but not modify the given array, we can copy the given array first, and modify the copied one. The second way is using not-in-place approach.

Note:

  • It's a little hard to calculate the time complexity for above algorithm, because we don't know how many times it take to generate each random_index.
  • To prove the above algorithm can generate 100% random shuffle, we can use the same way (as in-place solution) to analyze the correctness of the above algorithm.

results matching ""

    No results matching ""