from random import choice from collections import Counter def flip(times): return Counter( choice(['H','T']) for _ in xrange(times) ) for _ in xrange(5): print flip(1000)
Counter({'H': 506, 'T': 494}) Counter({'T': 512, 'H': 488}) Counter({'H': 500, 'T': 500}) Counter({'H': 521, 'T': 479}) Counter({'T': 559, 'H': 441})The result can be evenly balanced but it can also vary wildly. Where the results absolutely must be balanced, then block randomisation can help.
The basic idea behind block randomisation is to "pre-flip" the coin for small ranges (the "blocks") in a way that ensures the range is balanced. Sticking with coin flipping as the example, if we had a block size of 4, there are 6 possible sequences of flips that would give us a balanced result:
H H T T H T H T H T T H T H H T T H T H T T H HSo rather than flip a coin for each item, we randomly select one of those sequences, and uses its values for the next 4 items:
from random import choice from collections import Counter from itertools import permutations def get_block(choices, size): return choice( list(permutations(choices * (size/len(choices)))) ) for _ in xrange(5): print get_block(['H','T'], 4)
('H', 'T', 'H', 'T') ('T', 'H', 'H', 'T') ('H', 'H', 'T', 'T') ('T', 'H', 'T', 'H') ('H', 'T', 'T', 'H')So the first item would be heads, then tails, then head, tails, tails, head, head etc. As long as the sample size is evenly divisible by the block size, the distribution will always be balanced:
def block_flip(times, block_size): assert (times % block_size) == 0 choices = ['H','T'] return Counter( flip for _ in xrange(0, times, block_size) for flip in get_block(choices, block_size) ) for sample_size in [100, 1000, 10000, 100000]: print block_flip(sample_size, 8)
Counter({'H': 50, 'T': 50}) Counter({'H': 500, 'T': 500}) Counter({'H': 5000, 'T': 5000}) Counter({'H': 50000, 'T': 50000})Perfect!
No comments:
Post a Comment