Well, these sound like perfect tasks for GPT:<p>"Participants responded to
a total of 18 tasks (or as many as they could within the given time frame). These
tasks spanned various domains. Specifically, they can be categorized into four types:
creativity (e.g., “Propose at least 10 ideas for a new shoe targeting an underserved
market or sport.”), analytical thinking (e.g., “Segment the footwear industry market
based on users.”), writing proficiency (e.g., “Draft a press release marketing copy for
your product.”), and persuasiveness (e.g., “Pen an inspirational memo to employees
detailing why your product would outshine competitors.”)."<p>Here is the GPT response to the first task: <a href="https://chat.openai.com/share/db7556f7-6036-4b3d-a61a-9cd253f094fd" rel="nofollow noreferrer">https://chat.openai.com/share/db7556f7-6036-4b3d-a61a-9cd253...</a><p>A confident GPT hallucination is almost indistinguishable from typical management consulting material...
Well, they buried the lede with this one. Using LLMs was better for some tasks and actually made performance worse for others.<p>The first task was a generalist task ("inside the frontier", as they refer to it), which I'm not surprised improved performance, as it was purposely made to fall into an LLM's areas of strength: research into well-defined areas where you might not have strong domain knowledge. This is also the mainstay of early consultants' work: they are generalists early in their careers – usually as business analysts or similar – until they become more valuable and specialise later on.<p>LLMs are strong in this area of general research because they have generalised a lot of information. But this generalisation is also their weakness. A good way to think about it is that an LLM is like a journalist doing research. If you've ever read a newspaper, you often think you're getting a lot of insight. However, as soon as you read an article in your area of specialisation, you realise the analysis has many flaws; they don't understand your subject anywhere near the level you would.<p>The second task (outside the frontier) required analysis of a spreadsheet, interviews and a more deeply analytical take with evidence to back it up. These are all things LLMs aren't currently strong at. Unsurprisingly, the non-LLM group scored 84.5%, while LLM users scored between 60% and 70.6%.<p>The takeaway should be that LLMs are great for generalised research but less good for specialist analytical tasks.
This is hilarious. As impressive as GPT-3/4 has been at writing, what's more shocking is just how bullshit-y human writing is. And a "business consultant" is the epitome of a role requiring bullshit writing. ChatGPT could certainly out-business-consultant the very best business consultants.<p>Sometimes, to be taken seriously at work, you need to take some concise idea or data and fluff it up into multiple pages or a slide deck JUST so that others can immediately see how much work you put in.<p>The ideal role for ChatGPT at this moment is probably to take concise writing and expand it into something way larger and full of filler. On the receiving end, people will endure your long-winded document or slide deck, recognize you "put in the work", and then feed it back into ChatGPT to get the original key points summarized.
LLMs are <i>stunningly</i> good at language tasks: almost all of what us old-timers called NLP is just crushed these days. Summarization, Q&A, sentiment, the list goes on and on. Truly remarkable stuff.<p>And where there isn’t a bright line around “fact”, and where it doesn’t need to come together like a Pynchon novel, the generative stuff is smoking hot: short-form fiction, opinion pieces, product copy? Massive productivity booster, you can prototype 20 ideas in one minute.<p>But that’s about where we are: lift natural language into a latent space with some clear notion of separability, do some affine (ish) transformations, lower back down.<p>Fucking impressive for a computer. But if it can really carry water for an expensive Penn grad?<p>You’re paying for something other than blindingly insightful product strategy.
Not surprised. It's frighteningly good, and a perfect match for programming.<p>I often ask GPT-4 to write code for something and test whether it works, but I seldom copy and paste the code it writes – I rewrite it myself to fit into the context of the codebase. Still, it saves me a lot of time when I'm unsure about how to do something.<p>Other times I don't like the suggestion at all, but that's useful as well, as it often clarifies the problem space in my head.
Having been a consultant, what strikes me about this is the next, to me seemingly obvious question: What if you just removed the consultants entirely and just had GPT-4 do the work directly for the client?<p>If you’re a client and need a consultant to do something, you have to explain the requirement to them, review the work, give feedback, and so forth. There will likely be a few meetings in there.<p>But if GPT-4 can make consultants so much better, I imagine it can also do their work for them. And if you combine this with the reduction in communications overhead that comes from not working with an outside group, why wouldn’t clients just accrue all the benefits to themselves, plus the benefit of not paying outside consultants or dealing with the overhead of managing them?<p>This is especially the case when the client is already a domain expert but just needs some additional horsepower. For example, marketing brand managers may work with marketing consultants even though they know their products and marketing very well. They just need more resources, which can come in the form of consultants for reasons such as internal head-count restrictions.<p>Anyway, I just wonder if BCG thought through the implications of participating in this study. To me it feels like a very short step from “helps consultants help their clients” to “helps clients directly and shows consultants aren’t really necessary.”<p>Especially so if the client just hires an intern and gives them GPT-4.
HN is so bad at predictions. Just a few months ago HN was awash with comments that confidently claimed LLMs were no more than stochastic parrots and unlikely to amount to anything.<p>> I can't help but think the next AI winter is around the corner. [0]<p>Yeah, right.<p>[0] <a href="https://news.ycombinator.com/item?id=23886325">https://news.ycombinator.com/item?id=23886325</a>
There is a lot of office work that will, over time, be optimized using GPT-like services. I was tech-savvy enough to know that a lot of the office work I do is repeatable and can be done using scripts, but not good enough to write those scripts myself. Using ChatGPT allowed me to write them; it took me, I think, 15-20 hours to get the scripts working perfectly. I knew just a little bit of Python scripting and did not know anything about pandas or XlsxWriter, but I was able to create something that saves me, I would estimate, 20-25 hours a week.<p>In my opinion, a lot of people here on Hacker News, being good at programming themselves, underestimate how services like ChatGPT can open a new world to non-programmers. They also probably make the non-inquisitive learn less. Previously, to learn how to stop multiple snapd services with a script, I would have googled and then cobbled something together; today I just ask ChatGPT and get a working script in less than a minute.
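To give a concrete flavour of the kind of spreadsheet chore described above, here is a minimal stdlib-only sketch (the file name, column names, and `summarize_sales` function are hypothetical illustrations; the commenter's actual scripts used pandas and XlsxWriter):

```python
import csv
from collections import defaultdict

def summarize_sales(in_path: str, out_path: str) -> dict:
    """Roll a raw CSV export up into per-region totals and write a summary file."""
    totals = defaultdict(float)
    with open(in_path, newline="") as f:
        # Each row is expected to have "region" and "amount" columns.
        for row in csv.DictReader(f):
            totals[row["region"]] += float(row["amount"])
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "total"])
        for region, total in sorted(totals.items()):
            writer.writerow([region, total])
    return dict(totals)
```

A few dozen lines like this, run on a schedule, is often all it takes to replace hours of manual copy-paste-and-sum work.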
Two things mentioned in the abstract that are worth pointing out.<p>> For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities<p>They specifically picked tasks that GPT-4 was capable of doing. GPT-4 could not do many tasks, so when we say that performance was significantly increased <i>this is only for tasks GPT-4 is well suited to</i>. There is still value here but let's put these results into context.<p>> Consultants across the skills distribution benefited significantly from having AI augmentation, with those below the average performance threshold increasing by 43% and those above increasing by 17% compared to their own scores<p>Even when cherry-picking tasks that GPT-4 is particularly suited for, above average performers only increased performance by 17%. This increase is still impressive, were it to be seen across the board. But I do think that 17% is a lot less than some people are trying to sell.
Pipe /dev/random, transform to decimal, and you just got an amazing increase in performance for calculating decimals of Pi. Nobody said precision was important anyway.
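Taken literally, the jab above looks something like this (a toy sketch; "performance" here means raw throughput, not accuracy):

```python
import random

def fast_pi(n_digits: int, seed: int = 0) -> str:
    """Sarcastic "fast pi": sample random decimal digits instead of computing them.
    Generation is blazingly fast; any digit that matches pi does so by luck."""
    rng = random.Random(seed)
    return "3." + "".join(str(rng.randrange(10)) for _ in range(n_digits))

PI_20 = "3.14159265358979323846"  # reference digits for comparison
```

The point of the satire: a benchmark that only measures output volume, not correctness, rewards exactly this.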
More details in this blog post by a Wharton professor: <a href="https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the-jagged" rel="nofollow noreferrer">https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the...</a><p>My questions to naysayers:<p>* Do you or anyone you know use GPT-4 (not the free GPT-3.5) to do productive tasks like coding and found it to help in many cases?<p>* If you insist it’s useless, why do millions of people pay $20 a month to access GPT-4 and plugins?
My prediction? In about 6 months, every test, task, or use of a LLM for anything that requires a modicum of creativity is going to find that it only has a fixed set of "ideas" before it starts regurgitating them. [0] I can easily imagine this in their hypothetical shoe pitch question, and many models going for more factual answers have been rapidly showing this bias by design.<p>[0] <a href="https://www.marktechpost.com/2023/06/16/this-paper-tests-chatgpts-sense-of-humor-over-90-of-chatgpt-generated-jokes-were-the-same-25-jokes/" rel="nofollow noreferrer">https://www.marktechpost.com/2023/06/16/this-paper-tests-cha...</a>
Can confirm. I popped the 20 bucks for GPT-4 and have been using it more and more, every day for 3 weeks. Not sure how I could get by without it now. It's just so easy to have a normal conversation and get answers. It's like having an expert friend across the hall you can shout questions to, and ask for simple reminders and recommendations.<p>Who cares if it gets things wrong sometimes? You would double-check your co-workers' answers too. And there are times when I insist I am correct, GPT argues back, and eventually I find I was wrong.
I bet early search engines had similar or even better figures under similar conditions.<p>I suppose this because I recall how much search improved my productivity over flipping through books and I know how for certain tasks ChatGPT is a better source of knowledge on how to do it than search. While often the GPT output isn’t entirely correct, more often than not it suffices to make the correct solution obvious thus saving a lot of time.
The actual research article: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321" rel="nofollow noreferrer">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321</a><p>Summary: <a href="https://pdf2gpt.com/?summary=84ff84d4b98b4f0c985a17d07db482c9" rel="nofollow noreferrer">https://pdf2gpt.com/?summary=84ff84d4b98b4f0c985a17d07db482c...</a>
Guilty! GPT is the best colleague I ever had, but boy does it talk. You can't just copy-paste, but by treating its responses as input I find myself less dependent on other senior consultants sharing their insights. It also makes me more confident in my assessments and deliveries.<p>The purpose of technology is to enhance our performance, and GPT is very much doing so – but with great power comes great responsibility.
BCG : We know layoffs are in fashion and we'd just like you to know that if you need industrial grade ass covering excuses from a legitimate-ish sounding authority to justify what you were planning to do anyway, our 23 year old consultants and their PowerPoint presentations have got you covered.
If this is how so called consultants use AI… they should be very concerned.
A moderately skilled intern with ChatGPT Enterprise connected to their data will make them quickly obsolete. Maybe they have some potential in building their own fine-tuned model, but surely they will screw that up.
I haven't had many interactions with BCG so far, but in both that we had, I was surprised at how much money they get for reshuffling information that is all common knowledge and available on the net. I can see that this is something LLMs can do really well. It's exactly the kind of "creativity" LLMs can do: "apply concept X to market/niche Y and give ideas on monetizing".<p>I don't blame BCG for doing this; they are giving an outside, politically uninfluenced view (except for the party that pays the tab).
The output of many professions is bag-of-words emotional persuasion, e.g. politicians, consultants, sociologists, psychologists, writers, economists, TV talking heads, the media in general.<p>A characteristic of these professions is that there is no accountability for the output they produce. It is not like a profession that builds an engine for a car. They can bullshit with confidence and get away with it.<p>ChatGPT will replace all of them - as ChatGPT itself can bullshit with the best of them.
... for a set of tasks selected to be answerable by AI<p>Also access to AI significantly increased (!) incorrect answers in the case where the tasks were outside of AI capabilities.
I was sort of wondering this with the latest (I think now resolved) writer's strike. The union wanted reassurance that they wouldn't be replaced by AI; however, if I was the studios, I would have said `sounds good` - knowing full well that the union members will likely be turning to it. Unless the union polices its members, the appeal to use it is just too high.
Where ChatGPT could excel is early education, where the ideas are simple, universally agreed upon, and written about online. As you go to higher levels, the chance of hallucination increases, and you could be taught the wrong thing without knowing the risks.
Funnily enough, as a business consultant I use GPT to create executive summaries and sell people on the idea that my reports are as short as they possibly can be without information loss.
<i>"The study introduces the concept of a “jagged technological frontier,” where AI excels in some tasks but falls short in others."</i><p>D'oh.
I always wondered if some of the biggest fear mongers against GPT are those who worry they’ll be outed as frauds.<p>If your job is to generate nonsense… well…
Dumb off-topic question: is there any way to ask GPT-4 to summarize an article online?<p>I tried giving it the URL and it was a disaster. Is there a plug-in?
The headline "GPT-4 increased BCG consultants’ performance by over 40%" is misleading since it implies that they became more productive in their actual work, when this is a carefully controlled study that separates tasks by an "AI frontier". Only inside the frontier did the "quality" of work increase by 40%, while they completed 12% more tasks on average.
> Two distinct patterns of AI use emerged: “Centaurs,” who divided and delegated tasks between themselves and the AI, and “Cyborgs,” who integrated their workflow with the AI.<p>It was nice of them to explain that the article was total horse dung before having to read the whole paper
What LLMs do for me is make me a pro in every programming language.<p>"How do I do X in language Y" always gives me the knowledge I need. Within seconds, I can continue coding.<p>After more than 10 years of coding full-time, I know some languages very well, like PHP and JavaScript. But even in those, LLMs often come up with a better solution than what I wrote, because they know every fricking thing about those languages.