> <i>Not to be confused with the Soviet shoe factories that produced only one type and orientation of shoe - with the other half of the pair being shipped in from 500 miles away.</i><p>This reminded me of something our anti-counterfeiting startup learned when integrating into manufacturing for a high-end outdoor apparel brand.<p>There are efficiencies that the factory gets by, say, parallelizing work on the sleeves of a jacket, while other parts are also worked on. <i>But</i> each batch of dyed outer material may have slight variation in color, no matter how hard they calibrate.<p>So, their processes are designed to ensure that, when the jacket pieces come together, they all came from the same dyed batch of outer material.<p>(There were many interesting tidbits like that, and those were just the ones relevant to us, in terms we could understand. I'm sure the factory also knew a million other things.)<p>Probably the Soviet shoes of TFA had worse problems than slightly mismatched color.
A similar thing I've noticed in the last ~6 years concerns smartphones, and in particular their cameras.<p>Android is pretty standardized these days, and even a fairly cheap Android phone will probably do most smartphone things (make calls, send texts, watch YouTube, etc.) reasonably well, but a corner that seems to be cut in cheaper phones is almost always the camera, and that kind of makes sense.<p>If you're buying your phone online, you can look at the more-or-less objective specs of the phone like how much RAM it has, or how fast the CPU is, but you can't really tell how the photos will look [1], and as such it's easy to put a cheaper sensor or lens in there, especially in the cheaper phones which (I think) have lower margins.<p>[1] Even if the listing has sample pictures, you don't know how representative they are of the actual experience; they could just have taken the sample photos with ideal lighting, with a tripod, in a controlled studio, for example.
This is one of the reasons ML benchmarks are most useful soon after their creation. Once they become a target, they cease to meaningfully measure quality.<p>The corollary for the Soviet shoe factory is you need incentives that propagate to guide production. You need the market to guide you in all the little details and you have to discover little by little what it is actually demanding.<p>For ML, we need something similar if not the same.
Tangentially, I think the mostly attribution-free discussion style of the C2 wiki has some merit. It's nice to read opinions and contributions on a topic without the cognitive noise of associated user accounts and reaction counts.
This has been talked about endlessly so why don't we skip to the conclusion. To avoid the problem of gaming metrics you either need to<p>1. Put a lot of time and thought into developing ungameable metrics (whether this is possible is unclear) and ways to measure them.<p>2. Admit defeat and not be transparent with evaluations and/or go off of vibes.
> All features as found on the specification are included, but the user interface is awkward. The vendor can claim, "we have all the features you need; thus, pick us."<p>Describes every Enterprise software I've ever used.
My last company started stack ranking people on number of GitHub commits - as if each commit is equivalent and the only thing that mattered is the quantity.<p>Soon enough, we started seeing lots of politics, a cutthroat environment, management pressure, management meetings looking at dashboards of commits, stack ranks, PIPs and firings, but also a significant rise in the number of commits.<p>One year later, the commits metric won. But the business lost customers tired of buggy software.<p>You get what you measure, but quantity does not substitute for quality.
Turns out that $ profit is one of the least gameable metrics.<p>Not completely ungameable, but for a consumer product manufacturer in a free market it's the least gameable metric invented so far.
I can recommend the book Red Plenty, if you want to learn more about the successes and failures of the Soviet economy:<p><a href="https://chris-said.io/2016/05/11/optimizing-things-in-the-ussr/" rel="nofollow">https://chris-said.io/2016/05/11/optimizing-things-in-the-us...</a>
This principle, as well as related principles like the McNamara Fallacy and Goodhart's Law, essentially boils down to one lesson I've come to realize in life: "numbers", "metrics" or even a "system" are never a substitute for actual humans caring about doing the right thing. If the humans involved care to do the right thing, they will do it, even without a system (although some systems make it easier). If they don't care about doing the right thing, no system or data-driven approach can fix that.<p>Which is also why I have a sneaking suspicion that most economic theory is complete bullshit. All the debates over privatization vs. public services, monopoly vs. competition, autocratic vs. democratic styles of leadership etc. are mostly irrelevant. Either side can work. It all boils down to whether you have good human stakeholders who have the integrity and agency to do the right thing. Counter to most economic theories, for example, a monopoly that is run by people who actually care about doing the right thing might be better run than fiercely competitive startups who just want to make a quick buck.
This reminds me of a brilliant Soviet satirical cartoon that illustrated the same principle with hats instead of shoes: <a href="https://youtu.be/gSpjDi2BrQk?si=CtJwQTHkm0HfNrxx" rel="nofollow">https://youtu.be/gSpjDi2BrQk?si=CtJwQTHkm0HfNrxx</a>
A Soviet cartoon with a similar story: a greedy customer wants a fur hat from a sheepskin, and he asks whether the hat tailor can make 2 hats from the same skin, and the tailor says sure he can ... that goes up to 7 hats, and when the customer comes to collect the completed order he receives the 7 hats as ordered <a href="https://youtu.be/gSpjDi2BrQk?t=193" rel="nofollow">https://youtu.be/gSpjDi2BrQk?t=193</a>
Measurement Dysfunction is a better name for this, tho less colorful and insulting<p><a href="https://en.m.wikipedia.org/wiki/Measurement_dysfunction" rel="nofollow">https://en.m.wikipedia.org/wiki/Measurement_dysfunction</a>
The Theranos Startup Model is just the capitalist equivalent of SSFP:
Instead of making useless shoes, they make useless software or fake hardware.
Instead of meeting state quotas, they hit VC funding goals.
Instead of the government propping them up, investors keep them afloat.<p>2. Theranos as a “Startup Shoe Factory”<p>Theranos is the perfect capitalist version of the Soviet Shoe Factory:<p>Soviet Model == Theranos Model (Capitalist Equivalent)<p>Shoe factories produced small sizes to meet quotas. == Theranos faked blood test results to meet investor expectations.<p>Factories cut material costs and quality to meet output goals. == Theranos lied about its technology because real innovation was too slow.<p>Central planners rewarded metrics over reality. == VC investors rewarded hype over real products.<p>Factories looked productive on paper, even if they made useless products. == Theranos looked like a billion-dollar startup, even though it had no working product.<p>The economy was distorted by planned quotas. == The startup world is distorted by fake valuations and exit-driven funding.