Statistical Process Control: A Practitioner's Guide (2022)

83 points by jpalomaki about 1 year ago

7 comments

roenxi about 1 year ago

A few related observations:

Software development is not a stable process. Either a team is always building new things, in which case there isn't a consistent process to measure, or there is a see-saw as people release new features, deal with the bugs in the features, then go back to building. That isn't a controlled process; it is going to oscillate in statistically weird ways.

If SPC is applied to bugs, it will be monitoring the relevant manager's habits. That is to say, if you show me a nice in-control timeseries of bug resolution, all that says is that when a bug blows out horribly the manager splits it into 2x tickets or something similar. It isn't necessarily a bad outcome (small tickets are happy tickets and gently stressing managers is a good idea), but don't expect the devs to behave differently.

It is good to have a grounding in SPC, just don't try to apply it to every timeseries that you see. Bugs are a timeseries, but they aren't expected to be a controlled process, so SPC's assumptions break down and the logic doesn't work. If it does work, it is probably measuring something other than the software development aspect of the process.

lifeisstillgood about 1 year ago

For interest: https://en.m.wikipedia.org/wiki/Stable_process

And

“Until we can define what a stable process is, we are doomed to argue forever all use of any statistical metric. For the love of a all science, please help!”

https://www.isixsigma.com/variation/what-stable-process/

jacques_chester about 1 year ago

This is one of my favorite introductions to this topic area, especially since it gets away from the dominance of manufacturing applications for SPC.

shadowsun7 about 1 year ago

A couple of quick notes, from someone who has actually put this into practice, and in a non-manufacturing context, to boot!

(From a brief reading of this thread, it seems like kqr, jacques_chester, and I are the only ones who have put this into practice in non-manufacturing contexts, though correct me if I'm wrong.)

The bulk of the debate in this HN thread seems to be centred around what is or isn't a 'stable process'. I think this is partially a terminology issue, which Donald Wheeler called out in the appendix of Understanding Variation. He recommends not using words like 'stable' or 'in-control', or even 'special cause variation', as the words are confusing ... and in his experience lead people to unfruitful discussions.

Instead, he suggests:

- Instead of calling this 'Statistical Process Control', call this 'Methods of Continual Improvement'.
- Use the terms 'routine variation' and 'exceptional variation' whenever possible. In practice, I tend to use 'special variation' in discussion, not 'exceptional variation', simply because it's easier to say.
- Use the term 'process behaviour chart' instead of 'process control chart': we use these charts to characterise the behaviour of a process, not merely to 'control' it.
- Use 'predictable process' and 'unpredictable process' (instead of 'stable'/'in-control' vs 'unstable'/'out-of-control' processes) because these are more reflective of the process behaviours (e.g. a predictable process should reliably show us data between two limit lines).

Using this terminology, the right question to ask is: are there processes in software development that display routine variation? And the answer is yes, absolutely. kqr has given a list in this comment: https://news.ycombinator.com/item?id=39638491

In my experience, people who haven't actually tried to apply SPC techniques outside of manufacturing do not typically have a good sense for what kinds of processes display routine variation. I would urge you to see for yourself: collect data, and then plot it on an XmR chart. It usually takes only a couple of seconds to see whether it applies, at which point you may discard the chart if you do not find it useful. But you should discover that a surprisingly large chunk of processes do display some form of routine variation. (Source: I've taught this to a handful of folk by now, in various marketing/sales and software engineering roles, and they typically find some way to use XmR charts relatively quickly within their work domains.)

[Note: this 'XmR charts are surprisingly useful' point is actually one of the major themes in Wheeler's Making Sense of Data, which was written specifically for usage in non-manufacturing contexts; the subtitle of the book is 'SPC for the Service Sector'. You should buy that book if you are serious about application!]

I realise that a bigger challenge with getting SPC adopted is as follows: why should I even use these techniques? What benefits might there be for me? If you don't think SPC is a powerful toolkit, you won't be bothered to look past the janky terminology or the weird statistics.

So here's my pitch: every Wednesday morning, Amazon's leaders get together to go through 400-500 metrics within one hour. This is the Amazon-style Weekly Business Review, or WBR. The WBR draws directly from SPC (early Amazon exec Colin Bryar told me that the WBR is but a 'process control tool' ... and the truth is that it stems from the same style of thinking that gives you the process behaviour chart). What is it good for? Well, the WBR helps Amazon's leaders build a shared causal model of their business, at which point they may loop on that model to turn the screws on their competition and to drive them out of business.

But in order to understand and implement the WBR, you must first understand some of the ideas of SPC.

If that whets your interest, here is a 9000 word essay I wrote to do exactly that, which stems from 1.5 years of personal research, and then practice, and then bad attempts at teaching it to other startup operator friends: https://commoncog.com/becoming-data-driven-first-principles/

I don't get into it too much, but the essay calls out various other applications of these ideas, amongst them the Toyota Production System (which was bootstrapped off a combination of ideas taught by W Edwards Deming, including the SPC theory of variation), Koch Industries's rise to powerful conglomerate, Iams pet foods, etc.
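
For readers who want to try the "collect data, then plot it on an XmR chart" suggestion, here is a minimal sketch of the arithmetic behind the chart. It uses the usual XmR scaling constants (2.66 for the individual-value limits and 3.268 for the upper moving-range limit). The function names and the weekly_deploys numbers are made up for illustration and do not come from the thread or the linked essay.

```python
from statistics import mean


def xmr_limits(values):
    """Return the centre line and limits of an XmR chart for individual values."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Moving ranges: absolute differences between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "centre": centre,
        "lower": centre - 2.66 * mr_bar,  # lower natural process limit
        "upper": centre + 2.66 * mr_bar,  # upper natural process limit
        "mr_bar": mr_bar,
        "mr_upper": 3.268 * mr_bar,       # upper limit for the moving ranges
    }


def exceptional_points(values):
    """Return (index, value) pairs that fall outside the natural process limits."""
    limits = xmr_limits(values)
    return [
        (i, v)
        for i, v in enumerate(values)
        if v < limits["lower"] or v > limits["upper"]
    ]


# Hypothetical data: deployments per week, with one unusual week.
weekly_deploys = [12, 14, 11, 13, 12, 15, 13, 12, 14, 30, 13, 12]

if __name__ == "__main__":
    print(xmr_limits(weekly_deploys))
    print(exceptional_points(weekly_deploys))
```

Points inside the limits would be read as routine variation, and points outside as candidates for exceptional variation worth investigating. This sketch only checks the simplest rule (a point outside the limits); practitioners typically also look at runs of points near a limit or on one side of the centre line.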

fjkdlsjflkds about 1 year ago

I stopped reading at this point:

> Roughly half of your measurements will be above average, and the other half below it.

This is simply not true for most definitions of "average" (i.e., for all definitions of "average" except the median).

Example:

    > (data <- c(1:10,100))
    [1] 1 2 3 4 5 6 7 8 9 10 100
    > sum(as.numeric(data > mean(data)))/length(data)
    [1] 0.09090909

A 90/10 split is hardly "roughly half of your measurements".
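
The same check can be sketched in Python, with the median added for comparison to illustrate the "except the median" remark. The helper name share_above is made up for this example.

```python
from statistics import mean, median

# Same data as the R example: 1 through 10, plus one large outlier.
data = list(range(1, 11)) + [100]


def share_above(values, centre):
    """Fraction of values strictly greater than the given centre."""
    return sum(v > centre for v in values) / len(values)


above_mean = share_above(data, mean(data))      # only the outlier exceeds the mean
above_median = share_above(data, median(data))  # close to half exceed the median

if __name__ == "__main__":
    print(f"share above mean:   {above_mean:.4f}")
    print(f"share above median: {above_median:.4f}")
```

With an odd number of distinct values, the median itself is one of the data points, so the share strictly above it is slightly under one half (5 of 11 here).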

BizOpsZen about 1 year ago

Came across this on twitter. Here's why I think SPC is related to Software Development (and Agile concepts, particularly burndown charts) more generally:

To be clear, they are related but not the same use case. IMO, both Agile and SPC leverage the same insight: variation is inevitable; what matters is not that it exists, but how you deal with it.

With SPC, you are establishing a normal variation so that you can identify abnormal activity that warrants further investigation.

With Agile you're not really looking for outliers per se; it's more that you want to get to a place where your "normal" variation is a much smaller range. Because a smaller range leads to better quality and more output:

Variation in the software dev context is the difference between your estimates and the actual work required to deliver a feature, etc. High variation means you're constantly in a rush, need to cut corners, need to cut scope, etc.

This has a lot of downstream impacts in terms of quality but also in the actual scope of what you can deliver. In short, you need to spend more time fixing bigger problems.

Less variation means smaller problems and less time spent fixing --> more time is allocated to new feature development.

(And a separate topic, but variation in software dev has a special property where it only accrues on the "takes longer" side vs. the "takes less time" side. You never "make up time" because something is quicker than your estimate. See the note below.)

So the burndown chart is less about enabling you to see outliers, and more about providing visibility into the variation so that you can work towards making it smaller. If you're constantly loading work at the end of the sprint, you have a problem with the scoping process.

How does that track back to Agile?

One of the key elements of the Agile process is breaking work down into smaller batches --> and breaking things down into smaller batches is the key mechanism for reducing variability.

NOTE: Software pretty much only takes longer than expected because there is high visibility into the fastest something can be done, but very little visibility into the unexpected things that can add scope to the project. So it's extremely rare for something to happen that makes it take less effort than your estimates, but very common for things to add scope.

It's similar to estimating how long it will take to drive somewhere: you can get a pretty accurate sense of the fastest it will take based on distance and speed. But the things that extend the duration of the trip, like a car accident or unexpected road work, are just much more unpredictable. So if you were to plot that variation on a chart, you only see it move in one direction.
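
The driving analogy can be sketched as a small simulation. This is only an illustration under made-up assumptions: the 30% delay probability, the exponentially distributed delay with a 20-minute mean, and the function name simulate_trip_minutes are all hypothetical, chosen to show variation that accrues in one direction only.

```python
import random


def simulate_trip_minutes(distance_km, speed_kmh, rng,
                          delay_probability=0.3, mean_delay_minutes=20.0):
    """Best-case driving time plus an occasional, non-negative random delay."""
    best_case = distance_km / speed_kmh * 60
    if rng.random() < delay_probability:
        # Assumed delay model: exponential, so delays are always >= 0.
        delay = rng.expovariate(1 / mean_delay_minutes)
    else:
        delay = 0.0
    return best_case + delay


rng = random.Random(42)   # fixed seed so the run is repeatable
best_case = 30 / 60 * 60  # 30 km at 60 km/h takes 30 minutes at best
trips = [simulate_trip_minutes(30, 60, rng) for _ in range(1000)]
overruns = [t - best_case for t in trips]

if __name__ == "__main__":
    print(f"shortest trip: {min(trips):.1f} min (estimate {best_case:.1f})")
    print(f"longest trip:  {max(trips):.1f} min")
    print(f"trips slower than the estimate: {sum(o > 0 for o in overruns)}")
```

Under these assumptions no trip ever beats the best-case estimate, while a minority of trips overrun it by a wide margin, which is the one-sided shape described in the note.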

ilayn about 1 year ago

Another post to drive a control theory person up the wall.