While I agree completely with the premise of this article, I'm also weighing the relatively robust findings by Meehl et al. They find, time and time again, across all sorts of fields, that extremely parsimonious models like an equal-weighted linear regression on one or two predictors outperform expert judgment[1].<p>One would think this is cognitively dissonant enough, but it gets worse:<p>This article, with its thesis that good arguments are more important than data, is based on, well, a good argument – not much data. On the other hand, the work by Meehl et al., claiming pretty much the opposite, is based on, well, a lot of data, and maybe not much intuitive reasoning. (There's some, yes, but the main reason I believe it is that variants of the experiment have been replicated reliably.)<p>I don't know what to believe. Fortunately, as I've grown older, I've become more comfortable with holding completely dissonant opinions in my head at the same time.<p>----<p>Edit a few minutes later: this actually prompted me to refresh on the subject. It might be that Meehl is actually making the same argument as this article, and it only gets distorted when repeated. Some things are reliably measurable; for those, be data-driven. Other things are not; for those, use your expertise.<p>----<p>[1]: Here's just one relatively early example: <a href="http://apsychoserver.psych.arizona.edu/JJBAReprints/PSYC621/Dawes_Faust_Meehl_Clinical_vs_actuarial_assessments_1989.pdf" rel="nofollow">http://apsychoserver.psych.arizona.edu/JJBAReprints/PSYC621/...</a>
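To make the "improper linear model" idea concrete, here's a minimal sketch with synthetic data (all numbers and variable names are made up for illustration, not taken from Meehl's studies): z-score each predictor, add them with equal weights, and compare against a confidently lopsided weighting.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    x1 = rng.normal(size=n)                # predictor 1, e.g. a test score
    x2 = rng.normal(size=n)                # predictor 2, e.g. an interview rating
    outcome = 0.6 * x1 + 0.4 * x2 + rng.normal(size=n)

    def z(v):                              # standardize a predictor
        return (v - v.mean()) / v.std()

    equal_weights = z(x1) + z(x2)          # no fitted coefficients at all
    lopsided = 0.9 * z(x1) + 0.1 * z(x2)   # an over-confident "expert" weighting

    print("equal weights r:   ", np.corrcoef(equal_weights, outcome)[0, 1])
    print("lopsided weights r:", np.corrcoef(lopsided, outcome)[0, 1])

The equal-weighted sum tracks the outcome about as well as the lopsided weighting, which is the gist of why such parsimonious models are hard to beat.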
> Are you prepared to do some very very fancy statistics?<p>I'd extend this with "... while understanding what you're doing?"<p>I've seen it so many times already: someone does some A/B test and then presents a very fancy-looking slide deck with all kinds of crazy-looking math. But if you start asking questions, it quickly becomes obvious that they didn't really understand what they were doing, and that very often it doesn't really matter to them in the first place; it's all about reaching a decision using some pseudo-scientific method that nobody dares to question because 'data' and 'science', without having to take responsibility.
I have experienced this first hand, so this article resonates a lot with me.<p>I worked with a manager who prioritized work that was easily measurable, so he could report the good numbers to leadership and get career points out of it. Unfortunately the project we took on was a demanding and technically challenging problem, and in almost a year of work by a team of engineers we made barely any real progress or any actual difference, but the numbers were great and people were satisfied during presentations. I ended up feeling completely disconnected from my job and losing all motivation to work there.
> I originally claimed that data-driven culture leads bad arguments involving data to be favored over good arguments that don’t<p>This is symptomatic of the deeper problem of thinking in terms of bumper stickers and slogans, instead of thinking from first principles. When it afflicts educated people, usually you hear slogans like "an anecdote is not data", or "that's the slippery slope fallacy". Instead of grappling with noisy reality, they have sharp cognitive categories with firm boundaries between concepts, then they try to squeeze things into these categories in order to make cognition easier because the relations between the categories are already understood. This gives them the illusion of rigorous and clear thought.
This entire discussion makes a good case for why the general populace would benefit from being taught the basics of philosophy.<p>In this case the topic of value is the often fraught relationship between <i>empiricism</i> and <i>rationalism</i>, and the impact each has on the scientific process, research, education, and how we go about understanding the world.<p>To operate with one in the complete absence of the other is to expose yourself to huge, often fundamental gaps in your thinking, your arguments, and your plans. This is what the author is ultimately getting at from the direction of the empirical: data, in the form of a large collection of discrete observations, can be used to justify a sea of mutually exclusive claims that may or may not be in accordance with reality, and that's to say nothing of the quality of the data itself.
I often experience the inverse: people come up with hypotheses and theories that should show up in observable data – but no one bothers to look, and instead everyone argues around logical constructs etc.
This reminds me a lot of the discussion of the scientific method by Karl Popper, and David Deutsch who was very influenced by Popper. "Being data-driven" sounds very <i>empirical</i>. Just look at the data, and see what you find in it.<p>But you can't just let the data "speak for itself" without an explanation or a theory that interprets the data. Popper in <i>Conjectures and Refutations</i>:<p>> Observation is always selective. It needs a chosen object, a definite task, an interest, a point of view, a problem. And its description presupposes a descriptive language ... which in its turn presupposes interests, points of view, and problems.<p>Deutsch, in <i>The Beginning of Infinity</i>, emphasizes the importance of conjecture, and the role of observation as refuting or criticising those conjectures:<p>> Where does [knowledge] come from? Empiricism said that we derive it from sensory experience. This is false. The real source of our theories is conjecture, and the real source of our knowledge is conjecture alternating with criticism. We create theories by rearranging, combining, altering and adding to existing ideas with the intention of improving upon them. The role of experiment and observation is to choose between existing theories, not to be the source of new ones. We interpret experiences through explanatory theories, but true explanations are not obvious.<p>To bring this back to the subject of the article, I might suggest that it's possible to be "data driven" without a sound explanation or theory that the data is either interpreted through, or used to criticise. Or maybe such theories do exist, but are left implicit.
I won’t belabor the point because others have already made it: this article assumes there is some way to sort through good and bad arguments in the absence of data - a pretty big leap. The reality is all of our arguments are appealing to some sort of data (eg previous experience), it’s just that it doesn’t always fit in a neat definition of data.<p>Obligatory: <a href="https://en.m.wikipedia.org/wiki/All_models_are_wrong" rel="nofollow">https://en.m.wikipedia.org/wiki/All_models_are_wrong</a>
The related problem that I see actually more often is the "you don't have big data" problem.<p>You know, in data science, you see people spending hours writing pandas scripts that replicate a few clicks in Excel for a one-off analysis. You see datasets of a few gigabytes being processed with Spark when SQL would be fine. You see ML techniques being thrown at questions that could be answered simply and reliably with basic statistical tests.<p>Especially in the B2C space, a lot of companies, departments, and products don't actually have a lot of customers, and certainly not many decision makers. The N number is always going to be low. You can just talk to people. Let's say you are doing pretty well and running a SaaS with 1000 corporate customers paying a million each – that's a billion dollars of revenue – you can just talk to them. Certainly you can just talk to every single person who signs the cheque, and those are the only people that matter.<p>And which is easier – putting together a thorough suite of A/B tests, or getting some real customers to use your app on video and talking to them about what they are finding annoying, useful, missing? I see fewer people do that than you'd think.
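As a sketch of the "basic statistical test" point: for a small-N A/B question, a plain chi-squared test on a 2x2 table is often all you need (the conversion counts below are invented, and this assumes scipy is available):

    from scipy.stats import chi2_contingency

    # converted vs. not converted, per variant (made-up counts)
    control = [48, 452]
    variant = [63, 437]

    chi2, p, dof, expected = chi2_contingency([control, variant])
    print(f"chi2={chi2:.2f}, p={p:.3f}")

No pipeline, no model, and the answer is just as defensible.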
To use Clayton Christensen's theory of innovation here: for sustaining innovation, businesses tend to be purely data-driven. They continue to grow and make more money based on choices made with pure data.<p>For disruptive innovation, however, there needs to be an "argument" or opinion, informed by industry trends, to drive that data. Companies then take the risk of delivering something new and good enough to the market – that's what makes it disruptive.<p>This has shifted the idea of being data-driven towards being "data-inspired".<p>Anyone can make the same dataset fall in their favor. That's the problem with being purely data-driven. Another way to think of it, in the US especially: our two-party system draws wildly different conclusions from the same data. What's preventing businesses from doing the same?
To the author... I'd suggest a rewrite of what you're trying to communicate, because your usage of <i>"good-argument-driven"</i> is a textbook example of Begging The Question: <a href="https://en.wikipedia.org/wiki/Begging_the_question" rel="nofollow">https://en.wikipedia.org/wiki/Begging_the_question</a><p>For discussion's sake, let's go along with <i>excluding data/metrics/science</i> when pushing for arguments. In this framework, what exactly is a "good" argument based on? Gut feel? Opinion?<p>There was a famous quote by Jim Barksdale, the former CEO of Netscape: <i>"If we have data, let’s look at the data. If all we have are opinions, let’s go with mine."</i><p>(So the tie-breaker between competing arguments in that case was "hierarchy-of-arguer-driven".)<p>So Jane and Bob disagree on the next action to take. Jane thinks her argument is a "good argument" but has no data. Bob likewise thinks he has a "good argument" but has no data.<p>How does this thread's blog post help resolve that scenario? (The blog's answer: you're driven by the one that has the good argument.) ... which is circular.
This is not what the data shows<p><a href="https://www.google.com/search?q=data+driven+companies+more+profitable&oq=data+driven+companies+more+profitable&aqs=chrome..69i57j0i546j0i30i546j0i546l3.4938j0j7&sourceid=chrome&ie=UTF-8" rel="nofollow">https://www.google.com/search?q=data+driven+companies+more+p...</a><p>Any good-argument-driven case you attempt to make is almost always based on political motivating factors rather than on what is good for the business.<p>Intuition-driven decisions work when the market is behaving normally; however, they are generally too slow in a fast-changing market like the one we have been in since the start of COVID.
I think the problem is that people chronically underestimate how hard good science is.<p>Professors get this wrong all the time, despite being some of the smartest people we have around, despite decades of experience and education, despite a career and reputation on the line, and despite a system of peer review to catch mistakes before they get published.<p>Designing experiments is really difficult.<p>Interpreting experiments is difficult and unintuitive.<p>Statistics is difficult. You can't just look at whether the number went up. You need a deep understanding of significance, power and effect size, and you should probably be running an ANOVA or some such.
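For example, a quick power calculation (illustrative numbers, assuming statsmodels is available) shows why "the number went up" on a small sample rarely means anything:

    from statsmodels.stats.power import TTestIndPower

    # How many samples per group to detect a small effect (Cohen's d = 0.2)
    # at alpha = 0.05 with 80% power? Roughly 400 per group.
    n_per_group = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)
    print(f"samples needed per group: {n_per_group:.0f}")

If your experiment ran on 50 users per arm, an observed bump of that size is mostly noise, no matter how fancy the slide deck.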
> A weak argument founded on poorly-interpreted data is not better than a well-reasoned argument founded on observation and theory.<p>So a good argument is founded on... good data and a good understanding of data?<p>More seriously, the article makes the mistake of begging the question: it presupposes a known classifier of good and bad arguments, and then goes on to say that bad arguments with data are worse than good arguments without it. But how do you tell good arguments from bad arguments in the first place? What makes a good argument, if not empirical data?
In Range, David Epstein talks about NASA and some of its disasters, like the explosion of Challenger. NASA is entirely encased in specialized knowledge and has a completely data-driven mindset, with no room for logic. If you can't prove it with data, they wouldn't even consider it. He explains that, “Reason without numbers was not accepted. In the face of an unfamiliar challenge, NASA managers failed to drop their familiar tools... The Challenger managers made mistakes of conformity. They stuck to the usual tools in the face of an unusual challenge.” Even though the mistake that led to the Challenger disaster could have been caught, it was the uniformity of thinking that led to an organizational blind spot, and that uniformity was an excessive focus on data-driven arguments.<p>There is a famous call prior to the disaster on which engineers raised concerns, but those concerns were based on intuition and a few cherry-picked samples, not a full set of data – and this was the night before the launch. Because of the lack of data, they went ahead with it, and we all know the tragedy that ensued. Moreover, other engineers who <i>agreed</i> that there was an issue didn't speak up, because they too lacked the data and knew that management wouldn't care.
One of the big reasons data-driven approaches are so seductive is that it's very difficult, in the moment, to distinguish between a good argument and a well-crafted rationalization.
1. If there are no good arguments in the collective, there's no retrospective, and it's primarily a management and psychological issue. No one is able to fully self-reflect, and it breaks the existing delegation / escalation chains.<p>2. If there are no viable data sources – ones where it can be shown that there's a correlation with actual business processes – it's a management problem. People can't establish viable metrics, once again mostly due to 1.<p>This is something any company of any size and any budget can struggle with, due to lack of experience and the usual deficiency in collective experience accumulation / knowledge sharing. You can't self-reflect on something you haven't learned about yet. And due to 1, this is a closed loop: the lack of experience can't be escalated accordingly, and most of the time it's also a workplace-deviance factor.<p>3. Practically, it ends up in a bouquet of workplace deviance, because in the end no one will be willing to take the blame and the actual responsibility to fix anything.<p>Any problem-vs-solution type of culture will worsen things a lot, i.e. "all the blame and no compassion". Companies are usually forced to adopt some Teal stuff in the end, maybe for no other good reason than to keep on growing.<p>The idea of hiring HR that can "work by the book" and actually build up a personal profile of how anyone could fit into all this mess is impossible by definition – due to employee silence and broken retros, no one will be willing to expose everything that is happening in the first place... So most of the time I see kitchen-sink companies with volatile outcomes, where there is really no one who could even listen to any arguments in the first place.<p>Google's internal ML-driven productivity metrics have already become a meme for all the reasons described above. You can't reason with toxic and inadequate people.<p>Also, Asana's claim that social loafing is a myth and everything else is a retro deficiency is really wrong – retros can prevent and surface certain glorious occasions, but they're not the root cause of any psychological effect by definition.
Good article. When your only tool is a hammer, every problem looks like a thumb.<p>While we're at it: I've actually been in scrums where the "burndown rate" was analyzed as if it was actually A Thing. It is not A Thing.
Key idea is the "data maturity" of the topic under discussion.<p>Where there is data, you should use it and be smart about it.<p>For a lot of big decisions, especially in companies doing something new, there is no good data at first. You have to reason about it based on experience and analogy.<p>Then, once you commit to a path, you can start gathering data to see if your hypothesis was correct. The further you go, the more you can rely on data, assuming you know how to think about it.<p>Discussions about being data-driven that don't take into account the "data maturity" of the situation are nonsensical.<p>Being "data driven" when you're considering something radically new is either delusional or a cop out.<p>Ignoring data when it could correct your biases is either lazy or wrong or both.<p>And finally, lots of people who claim to be "data driven" are not smart about data. To paraphrase Wilde, "data is rarely pure and never simple." It doesn't just reveal truths you can treat as dogma. It's ambiguous and takes a lot of work to interpret. A lot of "data driven" teams aren't doing that work.
I'm surprised there's no mention of Goodhart's Law [0].<p>Even if the metric is "well understood and free from human/social factors", once you start using it as a target that will no longer be the case.<p>[0]: <a href="https://en.wikipedia.org/wiki/Goodhart%27s_law" rel="nofollow">https://en.wikipedia.org/wiki/Goodhart%27s_law</a>
I couldn't agree with this more. I feel like the author took some of the arguments straight from my brain—I'm exhausted by pseudoscientific "data-driven" arguments.<p>From my experience, most of these try to distill an incredibly complex problem space down to a one-dimensional, black-and-white decision. But the real world doesn't work like that – it's full of grey areas and things we can't effectively measure. If you're trying to slice and dice data down to a happy one-dimensional decision point, you're often missing or ignoring important detail.<p>At work, I'm far happier with postmortems that produce general, open "good/bad" lists of after-the-fact feedback, which we then use to decide how to prioritize and design what comes next.
Being data-driven for the sake of being data-driven is indeed becoming an issue. The resources spent measuring and analysing data are overwhelmingly larger than they should be in most cases. Cohorts of "data scientists" and "managers" dive head-on into data without much (if any!) first-principles thinking. People tend to replicate metrics without much thought about their relevance to the specific situation. Thinking properly is a very hard skill to acquire (the hardest?), and most do everything they can to avoid it.<p>"What you measure affects what you do. If you don't measure the right thing, you don't do the right thing." -- Joseph Stiglitz
Great article, but I think it somewhat misunderstands the impetus for the concept. "Data has its place" sounds obvious precisely because "data-driven" has been such a successful concept. The alternative perspective, which used to be very common in our industry and still pops up from time to time, is that metrics are something you write for debugging and business decisions are made by gut feeling or abstract philosophical analysis. (Most software companies <i>had</i> to make decisions this way in the pre-cloud era, because it wasn't usually feasible to collect usage metrics.)
The hidden assumption here is that things go well if and only if (you think) you understand all the factors that influence your metrics, can do experiments and are prepared to use fancy statistics.<p>Which I reckon is a bit iffy. Special relativity was thought out well before any experiments to test it were feasible, and if understanding everything that influences your metric is a prerequisite then you can blame all failures on insufficient understanding without having any way of knowing when you have <i>enough</i> understanding.
I’ll probably be buried in all these comments, but my position is that data is only as good as how it is collected. Sloppy data collection gives rise to sloppy conclusions through unknown biases.<p>The key is to understand the ‘data generation process’ so you can identify biases. My experience suggests that doing so side-steps some common pitfalls.<p>I recommend reaching for ‘The Book of Why’ by Judea Pearl. He includes many real-life examples that are surprisingly applicable to modern data science.
David Deutsch (father of quantum computation and one of the most brilliant human beings alive) has a really great way of thinking about these kinds of discussions.<p>He calls it good explanations.<p>A good explanation is something that is hard to vary while still solving the problem it purports to solve.<p>He is against most uses of Bayesianism for prediction.<p>Great presentation here:<p><a href="https://www.youtube.com/watch?v=EVwjofV5TgU" rel="nofollow">https://www.youtube.com/watch?v=EVwjofV5TgU</a>
A major exception to this reasoning is performance. Argument driven performance suggestions are wrong more than 80% of the time and likely wrong by several orders of magnitude. You can’t know just how wrong you are without appropriate data.<p>This makes for a good litmus test of whether people are lying to you about software or, more likely, have absolutely no idea what they are doing.
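A minimal sketch of what "appropriate data" can look like here: time the alternatives instead of arguing about them. The snippets are arbitrary examples chosen for illustration, not a claim about any real codebase.

    import timeit

    # two ways of building the same list; measure instead of guessing
    loop = "l = []\nfor i in range(1000): l.append(i * i)"
    comp = "[i * i for i in range(1000)]"

    print("append loop:       ", timeit.timeit(loop, number=2000))
    print("list comprehension:", timeit.timeit(comp, number=2000))

Even a crude measurement like this usually beats a confident guess about where the time goes.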
A whole book was written on this very topic:
"The Tyranny of Metrics" by Jerry Z. Muller
<a href="https://press.princeton.edu/books/hardcover/9780691174952/the-tyranny-of-metrics" rel="nofollow">https://press.princeton.edu/books/hardcover/9780691174952/th...</a>
Sure, but the thing with "good arguments" is that when two hypotheses oppose each other, supporters on each side are sure they are the ones behind the "good argument", so...<p>Data doesn't lie; it can be nuanced, yes, but if it's truthful then you cannot really argue against it.
This reminds me a lot of the Principal Skinner meme. In this case: first pondering whether he is wrong, only to conclude that it's the data that's wrong.<p>I know that's not what the article says per se, but it's only one slightly abstracted reinterpretation removed, as OP's title demonstrates.
Be data-driven, and question the provenance of your data all the time. Otherwise you will end up like economics: a field with prettier models and more mathematics than almost any engineering field, and yet one that gets every major prediction wrong.
I've seen a lot of good arguments put to rest with a good test.<p>The key is collecting and looking at the data correctly.<p>Data without a keen understanding of why you need it and what you're looking to solve with it is not much use.
Yes, data is useless without a qualitative explanation. There are simply too many possible confounding factors that you cannot eliminate without understanding what they may be.
Be politically driven (company politics, that is).<p>Good arguments should take into account people's ambitions and political aspirations, especially at big Fortune 500 companies.<p>Startups can be more honest.
Being argument driven gives control to the organization's 'lawyers'. People can be very persuasive <i>independent</i> of the reality of the situation.