TechEcho

Nelson Rules

219 points by misterdata, almost 10 years ago

11 comments

repsilat (almost 10 years ago)

Out of curiosity, with a regular normal distribution I wonder what the probability is that the most recent point finds a problem. I guess you could calculate these things separately for an approximation, but I'd probably just want to simulate it...

For a first cut:

- Rule 1: 0.3% of samples are more than 3 standard deviations from the mean.
- Rule 2: 1/2^8 = 0.4% chance the previous 8 points were on the same side of the mean as the most recent one.
- Rule 5: 2.5% chance of being above 2sd on either side, 3 choose 2 is 3, 2 sides, so 0.375% of exactly 2. "2 or 3" is not much higher.
- Rule 6: More than 0.55%, if I've done my maths right.
- Rule 7: 0.3%

I guess you're going to get a lot of false positives if you're sampling reasonably frequently -- maybe one in 50?
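The back-of-the-envelope numbers above are easy to sanity-check with a quick Monte Carlo run. This is a hedged sketch (my own, not repsilat's code): it draws i.i.d. standard-normal series and counts how often the most recent point trips Rule 1 (beyond 3 sigma) or Rule 2 (nine in a row on one side of the mean, which matches the 1/2^8 figure above).

```python
import random

random.seed(0)

def rule1_hit(x, mean=0.0, sd=1.0):
    # Rule 1: the most recent point is more than 3 standard deviations from the mean.
    return abs(x[-1] - mean) > 3 * sd

def rule2_hit(x, mean=0.0):
    # Rule 2: nine points in a row on the same side of the mean
    # (the previous 8 on the same side as the most recent one).
    last9 = x[-9:]
    return all(v > mean for v in last9) or all(v < mean for v in last9)

trials = 100_000
r1 = r2 = 0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(9)]
    r1 += rule1_hit(xs)
    r2 += rule2_hit(xs)

print(r1 / trials)  # ≈ 0.0027, the 0.3% quoted above
print(r2 / trials)  # ≈ 2 * 0.5**9 ≈ 0.0039, the 0.4% quoted above
```

Each rule alone is rare, but summed over eight rules and evaluated at every new sample, the false-positive rate per point does land in the "one in 50" ballpark.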
jameshart (almost 10 years ago)

Harry Nyquist would probably have something to say about the validity of rule four. Fourteen points in a row of alternating increasing and decreasing isn't just indicative of an oscillation, but an oscillation that's close enough to your measurement frequency that you aren't actually able to measure it accurately. It could be a real oscillation, or it could be an artifact of much higher frequency behavior.
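The aliasing point can be illustrated in a few lines (a hedged sketch, not anything from the thread): a sine just below the Nyquist frequency, sampled once per unit time, shows exactly the alternating up/down pattern Rule 4 looks for, even though the underlying signal is a smooth oscillation you are barely resolving.

```python
import math

# A signal oscillating just below the Nyquist limit (0.5 cycles per sample)
# looks like a pure up/down alternation when sampled once per step.
f_signal = 0.49  # cycles per sample
samples = [math.sin(2 * math.pi * f_signal * n) for n in range(15)]

# Count the sign changes between consecutive differences, which is
# the alternating pattern Rule 4 checks for over 14 points.
diffs = [b - a for a, b in zip(samples, samples[1:])]
alternations = sum(1 for d1, d2 in zip(diffs, diffs[1:]) if d1 * d2 < 0)
print(alternations)  # → 13: every consecutive pair of differences flips sign
```

All 13 adjacent difference pairs alternate, so Rule 4 fires, yet nothing in the 15 samples tells you whether the real frequency is 0.49 or some higher frequency aliased down to it.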
jacquesm (almost 10 years ago)

Related: https://en.wikipedia.org/wiki/Moscow_Rules

Which includes my favorite: "Once is an accident. Twice is coincidence. Three times is an enemy action."
learnstats2 (almost 10 years ago)

Thanks for this reference!

GCSE Statistics (UK school exams at 16 years old) teaches a simpler system of process-control rules, closer to the Western Electric rules (https://en.wikipedia.org/wiki/Western_Electric_rules), and that is the only place I have ever come across them.

Is this in current practical use?
calinet6 (almost 10 years ago)

Oh yes. Yes yes. Learn this and understand how it applies to your systems, your processes, and especially (surprise) your people.

This is one quarter of how W. Edwards Deming promoted organizational quality control: understanding how variation works, period. (The other three being understanding psychology, understanding systems, and understanding the theory of knowledge or scientific method.)

This applies directly to understanding whether observed variation has a common cause (is a natural pattern of the system) or a special cause (something unexpected): https://en.wikipedia.org/wiki/Common_cause_and_special_cause_(statistics) and this impacts how you handle the variation.

For those criticizing validity, I'll say this is a way to mentally model how to understand variation, and is not meant to be 100% accurate. You're trading perfect math for intuitive modeling. But it will allow you to get close in a back-of-the-napkin quick way so you can identify patterns to study in more depth. Also, think of this in the context of many types of systems, not just a tight electrical signal pattern (which is an easy system to understand): systems of people doing software development, machines in manufacturing processes, complex network error patterns, and so on.

People don't often have a good idea of what's important and what's noise, especially when you don't even have a control chart but are just using intuition and a few data points. We see outliers and variations all the time in processes, especially in human processes like those we encounter in most software companies: estimation and delays, developer performance, load failures; all kinds of complex systems that exhibit variation that people are usually "winging it" to understand.

Instead of understanding the variation and the data, people often handle every large variation in the same way, trying to "fix" it or peg it on some obvious correlation they think they observe. This says: hold on, understand what you're looking at intuitively first. Then gather more data. Don't act without understanding. Deming was fond of saying, "don't just do something, stand there!" Lots to be learned from that, and much to be gained from the simple intuitive understanding of patterns in variation.
IshKebab (almost 10 years ago)

This seems terribly fragile and ad hoc. It doesn't even take into account sampling rate, and it clearly depends on it.

I guarantee there are better methods.
thanatropism (almost 10 years ago)

Since I was preempted for the comment "this is typical quality-control ad-hockery BS", I'll play devil's advocate and argue that the point of quality control is to identify two components of a mixture distribution [0]: a bounded distribution of uncertainty, which can be modeled as a beta ("PERT", in some patois), and an unbounded "error" term that functions more like a Poisson or even a Pareto.

This is already an ad-hockish simplification of something like Mandelbrot's seven regimes of randomness [1], which is itself, well, an oversimplification of his own work. But it formalizes the insight that quality control is trying to impart -- the identification of inconsistencies among consistent variation.

So let's run some simulations in Matlab. We'll generate M numbers distributed like a beta and N like a Pareto (a "long tail", "black swan" distribution) with identical mean and standard deviation, and shuffle them before we interpret them as a time series. Then we'll check Nelson's rules. Since we know how many ordinaries there are, we have a target.

In 10^4 repeated simulations, each with samples of 180 ordinary betas and 20 Paretos, we expect to identify 10% of abnormals. Now, my samples are shuffled, and Nelson rules rely on time structure (but this is precisely their weak spot); my code [2] also has visible bugs I didn't bother to fix because they'd involve thinking too hard and didn't seem so serious in large samples. Still, here are the identification rates:

- Rule 1: 0.5%. This will be counter-intuitive to students of the normal distribution, but recall that the ordinary observations have a bounded distribution and we're really catching only the abnormals. Now, we've missed a lot of the 10% by this rule.
- Rule 2: 6.32535%
- Rule 3: 3.4421%
- Rule 4: 4.484%
- Rule 5: 27.83735%. This is the "medium tendency for samples to be mediumly out of control".
- Rule 6: 8.49465%
- Rule 7: 4.38775%
- Rule 8: 2.1294%

(Edited after some bug fixes that, comfortingly, didn't change the results by much!)

[0] https://en.wikipedia.org/wiki/Mixture_distribution
[1] https://en.wikipedia.org/wiki/Seven_states_of_randomness
[2] http://lpaste.net/137664
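For readers without Matlab, the flavor of this experiment is easy to approximate in Python. This is a hedged re-sketch of the setup described above, not a port of the linked code: it skips the mean/standard-deviation matching step, estimates the control limits from the pooled series itself (a simplification), and checks only Rule 1.

```python
import random

random.seed(1)

def sample_series(n_ordinary=180, n_abnormal=20):
    # "Ordinary" bounded draws plus a few heavy-tailed "abnormal" ones,
    # shuffled so the series has no time structure.
    ordinary = [random.betavariate(2, 2) for _ in range(n_ordinary)]   # bounded in [0, 1]
    abnormal = [random.paretovariate(1.5) for _ in range(n_abnormal)]  # heavy tail, min 1.0
    xs = ordinary + abnormal
    random.shuffle(xs)
    return xs

def rule1_flags(xs):
    # Rule 1 with limits estimated from the series itself (population sd).
    mean = sum(xs) / len(xs)
    sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return [x for x in xs if abs(x - mean) > 3 * sd]

flags = rule1_flags(sample_series())
# The heavy tail inflates the pooled sd, so only the most extreme Pareto
# draws clear the 3-sigma bar and most of the 20 abnormals are missed --
# consistent with the low Rule 1 hit rate reported above.
print(len(flags))
```

This also shows why estimating limits from contaminated data is itself a trap: the abnormals widen the very band that is supposed to catch them.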
darkhorn (almost 10 years ago)

I have a B.S. in Statistics and I have never heard the term "Nelson rules". However, all of this information was taught under other names when we were dealing with "normality". Also, don't forget to convert your data to the standard normal distribution (and it is not that simple; you have to check some tests as well!). And of course you will always make mistakes, because even 4th-year Statistics students make mistakes...
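The conversion step darkhorn mentions is mechanical; here is a minimal sketch (my own illustration, with made-up numbers) of z-scoring a series so that rules stated in sigma units apply directly. The normality checks darkhorn warns about are a separate, prior step that this snippet does not perform.

```python
def standardize(xs):
    # Convert raw measurements to z-scores: subtract the sample mean,
    # then divide by the sample standard deviation (Bessel-corrected).
    mean = sum(xs) / len(xs)
    sd = (sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5
    return [(x - mean) / sd for x in xs]

zs = standardize([10.0, 12.0, 11.0, 13.0, 9.0])
print([round(z, 2) for z in zs])  # → [-0.63, 0.63, 0.0, 1.26, -1.26]
```

After standardization the Nelson thresholds (1, 2, and 3 standard deviations) can be compared against the z-scores directly, but the transform does nothing to make a non-normal series normal.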
anonu (almost 10 years ago)

I wonder what application, if any, this has to finance/stock charting?
niels_olson (almost 10 years ago)

We live by these things in the clinical lab, though we know them as the copyrighted version, the "Westgard rules". Glad to have the original, thanks!
zwerdlds (almost 10 years ago)

There is also a MILSPEC for general quality-control processes, though I can't find the particular document at the moment.

If you're one of the people in the thread describing this as too subjective or strict, the MILSPEC is probably more appropriate for your process.