Thursday, October 06, 2011

Stratified Randomisation

The next randomisation protocol to be implemented is used to ensure treatment balance across 'strata': participant characteristics that are relevant to the trial.

For example, the study might wish to monitor treatment differences between men and women (one stratum), and between two age groups: under 30, or 30 and over (another stratum). That gives us two strata, but four categories we're concerned with:

Men <30, Men >=30, Women <30, Women >=30. 
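
As a rough illustration of how a participant's characteristics map onto those four categories, here is a minimal sketch. The function name `stratum_for` and the 'M'/'F' sex codes are assumptions made for this example; the labels follow the ones used in the test code later in the post.

```python
def stratum_for(sex, age):
    """Return the category label for one participant.

    Hypothetical helper: sex is assumed to be 'M' or 'F', age a number of years.
    """
    if sex not in ('M', 'F'):
        raise ValueError("sex must be 'M' or 'F'")
    age_group = '<30' if age < 30 else '>=30'
    return sex + age_group


# two characteristics with two levels each give 2 x 2 = 4 categories
participants = [('M', 24), ('M', 30), ('F', 29), ('F', 61)]
labels = [stratum_for(sex, age) for sex, age in participants]
print(labels)
```

Each participant lands in exactly one category, which is what allows every category to be handled as its own schedule.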

What we want is an allocation between treatments that is as balanced as possible within each of those categories. Effectively, we can treat each category as a separate schedule and use block randomisation to assign treatments within it.
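
# this builds on the code from the previous post
# (get_block is assumed to come from that code)
from collections import defaultdict

class Stratified(object):
    def __init__(self, strata, arms, block_size):
        self.strata = strata
        self.arms = arms
        self.block_size = block_size
        self.blocks = dict()             # stratum -> treatments left in its current block
        self.record = defaultdict(list)  # stratum -> treatments allocated so far

    def new_block(self):
        return list(get_block(self.arms, self.block_size))

    def add(self, stratum):
        # a used-up block is an empty list, so fall back to a new block
        # whenever the stored one is missing or empty
        block = self.blocks.get(stratum) or self.new_block()
        flip, block = block[0], block[1:]
        # assumed completion: keep the rest of the block and log the
        # allocation, since record is read per stratum in the test code
        self.blocks[stratum] = block
        self.record[stratum].append(flip)
        return flip
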
The Stratified class maintains the current block state for each of the categories; the actual scheduling uses the block randomisation functions.
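
To make the per-category bookkeeping concrete, here is a self-contained sketch that can be run on its own. `get_block` belongs to the previous post and is not shown here, so `shuffled_block` below is a hypothetical stand-in that assumes a block holds an equal number of each arm in random order; `StratifiedSketch` and `allocate` are likewise names invented for this example. Under that assumption, with two arms and a block size of 4, the arm counts within any one category can never differ by more than 2.

```python
import random
from collections import Counter, defaultdict


def shuffled_block(arms, block_size, rng=random):
    """Hypothetical stand-in for get_block: equal arms, random order."""
    if block_size % len(arms) != 0:
        raise ValueError('block_size must be a multiple of the number of arms')
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block


class StratifiedSketch(object):
    """One independent block-randomised schedule per stratum."""

    def __init__(self, arms, block_size, rng=random):
        self.arms = arms
        self.block_size = block_size
        self.rng = rng
        self.blocks = {}                 # stratum -> remainder of current block
        self.record = defaultdict(list)  # stratum -> allocations so far

    def allocate(self, stratum):
        # start a fresh block when the stratum is new or its block is used up
        block = self.blocks.get(stratum) or shuffled_block(
            self.arms, self.block_size, self.rng)
        arm, self.blocks[stratum] = block[0], block[1:]
        self.record[stratum].append(arm)
        return arm


strata = ['M<30', 'M>=30', 'F<30', 'F>=30']
rng = random.Random(2011)  # fixed seed so that reruns give the same schedule
schedule = StratifiedSketch(['H', 'T'], 4, rng)
for _ in range(10000):
    schedule.allocate(rng.choice(strata))

for stratum in strata:
    count = Counter(schedule.record[stratum])
    print(stratum, count['H'], count['T'], abs(count['H'] - count['T']))
```

Because each category draws from its own block, the order in which participants arrive affects only how far through its current block each category is, not the balance of the blocks already completed.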

Unlike block randomisation, there can still be imbalance amongst the categories, as allocation is dependent on participant characteristics.

[Please note that this is incorrect. See the follow-up article for the correction.]

To test this, the schedule is fed a stream of participants with randomised characteristics:
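
    import random
    from collections import Counter

    strata = ['M<30', 'M>=30', 'F<30', 'F>=30']
    arms = ['H','T']
    stratrand = Stratified(strata, arms, 4)
    for _ in range(10000):
        # loop body assumed: allocate a participant from a randomly chosen stratum
        stratrand.add(random.choice(strata))

    for stratum in strata:
        count = Counter(stratrand.record[stratum])
        count['diff'] = abs(count['H']-count['T'])
        print(stratum + ' H=%(H)d T=%(T)d Diff: %(diff)d' % count)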

A sample run produced:

    M<30 H=1241 T=1262 Diff: 21
    M>=30 H=1241 T=1241 Diff: 0
    F<30 H=1278 T=1244 Diff: 34
    F>=30 H=1249 T=1244 Diff: 5

The larger the sample size, the smaller the imbalance tends to be. The final protocol, minimisation, is effectively stratified randomisation with a biased coin (I think) to reduce any imbalance even further.
