100 operations is a really small sample for something that happens once in 100.

If the chance of death is 1/100, then with a sample of 100 you have to have 0 or 1 deaths to fit within 60%, but with a sample of 200 you can have 0, 1, 2 or 3 deaths and still fit.

The point is that 60% is a harsh limit for small samples. There should be some lower limit on the sample size you have to gather, for a given probability, before you start judging. Either that, or the threshold should depend on sample size and probability, but a minimum sample size might be easier for the general public to understand.
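A minimal sketch of the arithmetic behind this, with assumptions of my own: I take "fit" to mean not getting flagged by a fixed cutoff on the observed death rate, and use an illustrative cutoff of 2% (twice the true 1% risk); the article's actual rule may differ. It shows the point above: with a true risk of 1/100, a fixed threshold flags a perfectly average surgeon far more often at 100 operations than at 200 or 1,000.

    from math import ceil, comb

    def binom_pmf(k, n, p):
        """P(exactly k deaths in n operations) with per-operation risk p."""
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    def flag_probability(n, p=0.01, cutoff_rate=0.02):
        """Chance a surgeon with true risk p sees an observed death rate of
        at least cutoff_rate after n operations, purely by bad luck."""
        k_min = ceil(cutoff_rate * n)  # fewest deaths that reach the cutoff
        return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

    for n in (100, 200, 1000):
        print(f"{n:5d} operations: flagged {flag_probability(n):.1%} of the time")
    # With these assumptions: roughly 26% at 100 operations, 14% at 200,
    # and well under 1% at 1,000.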
As with everything, there are always multiple factors at play, one of them being chance, and this article (interactive, no less) seems to highlight that very well.

I guess those of us not in the hospital business can use this as a reminder to be wary of statistics and data presented to us, even when they come from a reliable source. When gathering our own statistics, it's often necessary to do so over a prolonged period in order to let the data reveal what is and isn't pure chance.
They are describing a basic Monte Carlo simulation. It shows that, given a certain distribution, some subset of trials will exceed any arbitrary cutoff. But, well, of course! This must be topical in Britain - is there some flap about death rates going on there? Perhaps there is concern about too-stringent cutoffs being used to try to catch bad physicians?

The real take-home here is that if you set your warning threshold too low, you'll get a lot of warnings. Here, they used an arbitrary cutoff of 8 as their warning threshold. A fairer move would have been to let some data come back, infer a distribution and its parameters, and then set future cutoffs based on those parameters. One could even recalculate continuously as new data come in.
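Here is a rough Monte Carlo sketch of that idea, not the article's actual code: the per-surgeon numbers (300 operations, a true 1.5% death risk) are made-up illustrative values, and only the warning cutoff of 8 comes from the comment above. It first shows how often identical, perfectly average surgeons trip the fixed cutoff by chance alone, then derives a cutoff from the simulated data itself, as suggested.

    import random

    random.seed(0)

    N_SURGEONS = 10_000       # simulated surgeons, all with the same true risk
    N_OPERATIONS = 300        # assumed operations per surgeon (illustrative)
    TRUE_RISK = 0.015         # assumed per-operation death risk (illustrative)
    ARBITRARY_CUTOFF = 8      # the fixed warning threshold mentioned above

    # Simulate death counts for identical, perfectly average surgeons.
    deaths = [
        sum(random.random() < TRUE_RISK for _ in range(N_OPERATIONS))
        for _ in range(N_SURGEONS)
    ]

    flagged = sum(d > ARBITRARY_CUTOFF for d in deaths)
    print(f"Flagged by the fixed cutoff of {ARBITRARY_CUTOFF}: "
          f"{flagged / N_SURGEONS:.1%} of identical surgeons")

    # The data-driven alternative: estimate the rate from the observed data,
    # then place the cutoff at an upper quantile of the observed distribution
    # (or of a binomial fitted with the estimated risk) instead of at a fixed
    # number chosen in advance. Recomputing this as data arrive gives the
    # continuously updated threshold described above.
    estimated_risk = sum(deaths) / (N_SURGEONS * N_OPERATIONS)
    data_driven_cutoff = sorted(deaths)[int(0.99 * N_SURGEONS)]
    print(f"Estimated per-operation risk: {estimated_risk:.3%}")
    print(f"99th-percentile cutoff learned from the data: {data_driven_cutoff} deaths")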