Well, these sound like perfect tasks for GPT:<p>"Participants responded to
a total of 18 tasks (or as many as they could within the given time frame). These
tasks spanned various domains. Specifically, they can be categorized into four types:
creativity (e.g., “Propose at least 10 ideas for a new shoe targeting an underserved
market or sport.”), analytical thinking (e.g., “Segment the footwear industry market
based on users.”), writing proficiency (e.g., “Draft a press release marketing copy for
your product.”), and persuasiveness (e.g., “Pen an inspirational memo to employees
detailing why your product would outshine competitors.”)."<p>Here is the GPT response to the first task: <a href="https://chat.openai.com/share/db7556f7-6036-4b3d-a61a-9cd253f094fd" rel="nofollow noreferrer">https://chat.openai.com/share/db7556f7-6036-4b3d-a61a-9cd253...</a><p>A confident GPT hallucination is almost indistinguishable from typical management consulting material...
Well, they buried the lede with this one. Using LLMs was better for some tasks and actually made performance worse for others.<p>The first task was a generalist task ("inside the frontier", as they refer to it), which I'm not surprised improved performance, as it was purposely made to fall into an LLM's areas of strength: research into well-defined areas where you might not have strong domain knowledge. This is also the mainstay of early consultants' work: they are generalists early in their careers – usually as business analysts or similar – until they become more valuable and specialise later on.<p>LLMs are strong in this area of general research because they have generalised a lot of information. But this generalisation is also their weakness. A good way to think about it is that an LLM is like a journalist doing research. If you've ever read a newspaper, you often think you're getting a lot of insight. However, as soon as you read an article in your area of specialisation, you realise the analysis has many flaws; they don't understand your subject anywhere near the level you would.<p>The second task (outside the frontier) required analysis of a spreadsheet, interviews and a more deeply analytical take with evidence to back it up. These are all things LLMs aren't currently strong at. Unsurprisingly, the non-LLM group scored 84.5%, while LLM users scored between 60% and 70.6%.<p>The takeaway should be that LLMs are great for generalised research but less good for specialist analytical tasks.
This is hilarious. As impressive as GPT-3/4 has been at writing, what's more shocking is just how bullshit-y human writing is. And a "business consultant" is the epitome of a role requiring bullshit writing. ChatGPT could certainly out-business-consultant the very best business consultants.<p>Sometimes, to be taken seriously at work, you need to take some concise idea or data and fluff it up into multiple pages or a slide deck JUST so that others can immediately see how much work you put in.<p>The ideal role for ChatGPT at this moment is probably to take concise writing and expand it into something way larger and full of filler. On the receiving end, people will endure your long-winded document or slide deck, recognize you "put in the work", and then feed it back into ChatGPT to get the original key points summarized.
LLMs are <i>stunningly</i> good at language tasks: almost all of what us old-timers called NLP is just crushed these days. Summarization, Q&A, sentiment, the list goes on and on. Truly remarkable stuff.<p>And where there isn’t a bright line around “fact”, and where it doesn’t need to come together like a Pynchon novel, the generative stuff is smoking hot: short-form fiction, opinion pieces, product copy? Massive productivity booster, you can prototype 20 ideas in one minute.<p>But that’s about where we are: lift natural language into a latent space with some clear notion of separability, do some affine (ish) transformations, lower back down.<p>Fucking impressive for a computer. But if it can really carry water for an expensive Penn grad?<p>You’re paying for something other than blindingly insightful product strategy.
Not surprised. It's frighteningly good, and a perfect match for programming.<p>I often ask GPT-4 to write code for something and test whether it works, but I seldom copy and paste the code it writes – I rewrite it myself to fit into the context of the codebase. Still, it saves me a lot of time when I'm unsure about how to do something.<p>Other times I don't like the suggestion at all, but that's useful as well, as it often clarifies the problem space in my head.
Having been a consultant, what strikes me about this is the next, to me seemingly obvious question: What if you just removed the consultants entirely and just had GPT-4 do the work directly for the client?<p>If you’re a client and need a consultant to do something, you have to explain the requirement to them, review the work, give feedback, and so forth. There will likely be a few meetings in there.<p>But if GPT-4 can make consultants so much better, I imagine it can also do their work for them. And if you combine this with the reduction in communications overhead that comes from not working with an outside group, why wouldn’t clients just accrue all the benefits to themselves, plus the benefit of not paying outside consultants or dealing with the overhead of managing them?<p>This is especially the case when the client is already a domain expert but just needs some additional horsepower. For example, marketing brand managers may work with marketing consultants even though they know their products and marketing very well. They just need more resources, which can come in the form of consultants for reasons such as internal head-count restrictions.<p>Anyway, I just wonder if BCG thought through the implications of participating in this study. To me it feels like a very short step from “helps consultants help their clients” to “helps clients directly and shows consultants aren’t really necessary.”<p>Especially so if the client just hires an intern and gives them GPT-4.
HN is so bad at predictions. Just a few months ago HN was awash with comments that confidently claimed LLMs were no more than stochastic parrots and unlikely to amount to anything.<p>> I can't help but think the next AI winter is around the corner. [0]<p>Yeah, right.<p>[0] <a href="https://news.ycombinator.com/item?id=23886325">https://news.ycombinator.com/item?id=23886325</a>
There is a lot of office work that will, over time, be optimized using GPT-like services. I was tech-savvy enough to know that a lot of the office work I do is repeatable and can be done using scripts, but not good enough to write those scripts myself. Using ChatGPT allowed me to write them; it took me, I think, 15-20 hours to get the scripts working perfectly. I knew just a little bit of Python scripting and did not know anything about pandas or XlsxWriter, but I was able to create something that saves me, I would estimate, 20-25 hours a week.<p>In my opinion, a lot of people here on Hacker News, being good at programming themselves, underestimate how services like ChatGPT can open a new world to non-programmers. They also probably make the non-inquisitive learn less. Previously, to learn how to stop multiple snapd services with a script, I would have googled and then cobbled something together; today I just ask ChatGPT and get a working script in less than a minute.
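To give a concrete flavour of the kind of spreadsheet chore described above, here is a minimal stdlib-only sketch (the file name, column names, and `summarize_sales` function are hypothetical illustrations; the commenter's actual scripts used pandas and XlsxWriter):

```python
import csv
from collections import defaultdict

def summarize_sales(in_path: str, out_path: str) -> dict:
    """Roll a raw CSV export up into per-region totals and write a summary file."""
    totals = defaultdict(float)
    with open(in_path, newline="") as f:
        # Each row is expected to have "region" and "amount" columns.
        for row in csv.DictReader(f):
            totals[row["region"]] += float(row["amount"])
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "total"])
        for region, total in sorted(totals.items()):
            writer.writerow([region, total])
    return dict(totals)
```

A few dozen lines like this, run on a schedule, is often all it takes to replace hours of manual copy-paste-and-sum work.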
Two things mentioned in the abstract that are worth pointing out.<p>> For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities<p>They specifically picked tasks that GPT-4 was capable of doing. GPT-4 could not do many tasks, so when we say that performance was significantly increased <i>this is only for tasks GPT-4 is well suited to</i>. There is still value here but let's put these results into context.<p>> Consultants across the skills distribution benefited significantly from having AI augmentation, with those below the average performance threshold increasing by 43% and those above increasing by 17% compared to their own scores<p>Even when cherry-picking tasks that GPT-4 is particularly suited for, above average performers only increased performance by 17%. This increase is still impressive, were it to be seen across the board. But I do think that 17% is a lot less than some people are trying to sell.
Pipe /dev/random, transform to decimal, and you just got an amazing increase in performance for calculating decimals of Pi. Nobody said precision was important anyway.
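Taken literally, the jab above looks something like this (a toy sketch; "performance" here means raw throughput, not accuracy):

```python
import random

def fast_pi(n_digits: int, seed: int = 0) -> str:
    """Sarcastic "fast pi": sample random decimal digits instead of computing them.
    Generation is blazingly fast; any digit that matches pi does so by luck."""
    rng = random.Random(seed)
    return "3." + "".join(str(rng.randrange(10)) for _ in range(n_digits))

PI_20 = "3.14159265358979323846"  # reference digits for comparison
```

The point of the satire: a benchmark that only measures output volume, not correctness, rewards exactly this.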
More details in this blog post by a Wharton professor: <a href="https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the-jagged" rel="nofollow noreferrer">https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the...</a><p>My questions to naysayers:<p>* Do you or anyone you know use GPT-4 (not the free GPT-3.5) to do productive tasks like coding and found it to help in many cases?<p>* If you insist it’s useless, why do millions of people pay $20 a month to access GPT-4 and plugins?
My prediction? In about 6 months, every test, task, or use of a LLM for anything that requires a modicum of creativity is going to find that it only has a fixed set of "ideas" before it starts regurgitating them. [0] I can easily imagine this in their hypothetical shoe pitch question, and many models going for more factual answers have been rapidly showing this bias by design.<p>[0] <a href="https://www.marktechpost.com/2023/06/16/this-paper-tests-chatgpts-sense-of-humor-over-90-of-chatgpt-generated-jokes-were-the-same-25-jokes/" rel="nofollow noreferrer">https://www.marktechpost.com/2023/06/16/this-paper-tests-cha...</a>
Can confirm. I popped the 20 bucks for GPT-4 and have been using it more and more, every day for 3 weeks. Not sure how I could get by without it now. It's just so easy to have a normal conversation and get answers. It's like having an expert friend across the hall you can shout questions to, and ask for simple reminders and recommendations.<p>Who cares if it gets things wrong sometimes? You would double-check your co-workers' answers too. And there are times when I insist I am correct, GPT argues back, and eventually I find I was wrong.
I bet early search engines had similar or even better figures under similar conditions.<p>I suppose this because I recall how much search improved my productivity over flipping through books and I know how for certain tasks ChatGPT is a better source of knowledge on how to do it than search. While often the GPT output isn’t entirely correct, more often than not it suffices to make the correct solution obvious thus saving a lot of time.
The actual research article: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321" rel="nofollow noreferrer">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321</a><p>Summary: <a href="https://pdf2gpt.com/?summary=84ff84d4b98b4f0c985a17d07db482c9" rel="nofollow noreferrer">https://pdf2gpt.com/?summary=84ff84d4b98b4f0c985a17d07db482c...</a>
Guilty! GPT is the best colleague I ever had, but boy does it talk. You can't just copy-paste, but by treating its responses as input I find myself less dependent on other senior consultants sharing their insights. It also makes me more confident in my assessments and deliveries.<p>The purpose of technology is to enhance our performance, and GPT is very much doing so – but with great power comes great responsibility.
BCG : We know layoffs are in fashion and we'd just like you to know that if you need industrial grade ass covering excuses from a legitimate-ish sounding authority to justify what you were planning to do anyway, our 23 year old consultants and their PowerPoint presentations have got you covered.
If this is how so called consultants use AI… they should be very concerned.
A moderately skilled intern with ChatGPT Enterprise connected to their data will make them quickly obsolete. Maybe they have some potential in building their own fine-tuned model, but surely they will screw that up.
I haven't had many interactions with BCG so far, but in both that we had, I was surprised at how much money they get for reshuffling information that is all common knowledge and available on the net. I can see that this is something LLMs can do really well. It's exactly the kind of "creativity" LLMs can do: "apply concept X to market/niche Y and give ideas on monetizing".<p>I don't blame BCG for doing this; they are giving an outside, politically uninfluenced view (except for the party that pays the tab).
The output of many professions is bag-of-words emotional persuasion, e.g. politicians, consultants, sociologists, psychologists, writers, economists, TV talking heads, the media in general.<p>A characteristic of these professions is that there is no accountability for the output they produce. It is not like a profession that builds an engine for a car. They can bullshit with confidence and get away with it.<p>ChatGPT will replace all of them - as ChatGPT itself can bullshit with the best of them.
... for a set of tasks selected to be answerable by AI<p>Also access to AI significantly increased (!) incorrect answers in the case where the tasks were outside of AI capabilities.
I was sort of wondering this with the latest (I think now resolved) writer's strike. The union wanted reassurance that they wouldn't be replaced by AI; however, if I was the studios, I would have said `sounds good` - knowing full well that the union members will likely be turning to it. Unless the union polices its members, the appeal to use it is just too high.
Where ChatGPT could excel is early education, where the ideas are simple, universally agreed upon, and written about online. As you go to higher levels, the chance of hallucination increases, and you could be taught the wrong thing without knowing the risks.
Funnily enough, as a business consultant I use GPT to create executive summaries and sell people on the idea that my reports are as short as they possibly can be without information loss.
<i>"The study introduces the concept of a “jagged technological frontier,” where AI excels in some tasks but falls short in others."</i><p>D'oh.
I always wondered if some of the biggest fear mongers against GPT are those who worry they’ll be outed as frauds.<p>If your job is to generate nonsense… well…
Dumb off-topic question: is there any way to ask GPT-4 to summarize an article online?<p>I tried giving it the URL and it was a disaster. Is there a plug-in?
The headline "GPT-4 increased BCG consultants’ performance by over 40%" is misleading since it implies that they became more productive in their actual work, when this is a carefully controlled study that separates tasks by an "AI frontier". Only inside the frontier did the "quality" of work increase by 40%, while they completed 12% more tasks on average.
> Two distinct patterns of AI use emerged: “Centaurs,” who divided and delegated tasks between themselves and the AI, and “Cyborgs,” who integrated their workflow with the AI.<p>It was nice of them to explain that the article was total horse dung before having to read the whole paper
What LLMs do for me is make me a pro in every programming language.<p>"How do I do X in language Y" always gives me the knowledge I need. Within seconds, I can continue coding.<p>After more than 10 years of coding full-time, I know some languages very well, like PHP and JavaScript. But even in those, LLMs often come up with a better solution than what I wrote, because they know every fricking thing about those languages.