Perplexity Deep Research

368 points by vinni2 3 months ago

27 comments

alexvitkov 3 months ago

Every week we get a new AI that, according to the AI-goodness benchmarks, is 20% better than the old AI, yet the utility of these latest SOTA models is only marginally higher than the first ChatGPT version released to the public a few years back.

These things have the reasoning skills of a toddler, yet we keep fine-tuning their writing style to be more and more authoritative - this one is only missing the font and color scheme; other than that, the output is formatted exactly like a research paper.

CSMastermind 3 months ago

I'm super happy that these types of deep research applications are being released because it seems like such an obvious use case for LLMs.

I ran Perplexity through some of my test queries for these.

One query that it choked hard on was, "List the college majors of all of the Fortune 100 CEOs".

OpenAI and Gemini both handle this somewhat gracefully, producing a table of results (though it takes a few follow-ups to get a correct list). Perplexity just kind of rambles generally about the topic.

There are other examples I can give of similar failures.

Seems like generally it's good at summarizing a single question (Who are the current Fortune 100 CEOs) but as soon as you need to then look up a second list of data and marry the results, it kind of falls apart.

simonw 3 months ago

That's the third product to use "Deep Research" in its name.

The first was Gemini Deep Research: https://blog.google/products/gemini/google-gemini-deep-research/ - December 11th 2024

Then ChatGPT Deep Research: https://openai.com/index/introducing-deep-research/ - February 2nd 2025

Now Perplexity Deep Research: https://www.perplexity.ai/hub/blog/introducing-perplexity-deep-research - February 14th 2025.

melvinmelih 3 months ago

In the roughly 2 weeks since OpenAI launched their $200/mo version of Deep Research, it has already been open sourced (by Hugging Face, within 24 hours) and is now being offered for free by Perplexity. The pace of disruption is mind-boggling and makes you wonder if OpenAI has any moats left.

rchaud 3 months ago

As with all of these tools, my question is the same: where is the dogfooding? Where is the evidence that Perplexity, OAI, etc. actually use these tools in their own business?

I'm not particularly impressed with the examples they provided. Queries like "Top 20 biotech startups" can be answered by anything from Motley Fool or Seeking Alpha, Marketwatch or a million other free-to-read sources online. You have to go several levels deeper to separate the signal from the noise, especially with financial/investment info. Paperboys in 1929 sharing stock tips and all that.

larsiusprime 3 months ago

I tried using this to create a fifty-state table of local laws and policies and tax rates and legal obstacles for my pet interest (land value tax). I gave it the same prompts I gave OpenAI DR. Perplexity gave equally good results, and unlike OpenAI didn't bungle the CSV downloads. Recommended!

ankit219 3 months ago

Every time OpenAI comes up with a new product and a new interaction mechanism / UX, lo and behold, others copy it, sometimes using the same name as well.

It happened with ChatGPT - a chat-oriented way to use Gen AI models (a phenomenal success and the right level of abstraction), then code interpreter, the talking thing (that somehow hasn't scaled), the reasoning models in chat (which I feel is a confusing UX when you have report generators; a better UX would be to just keep editing the source prompt), and now deep research. [1] Yes, Google did it first, and now OpenAI followed, but what about the many startups that were working on similar problems in these verticals?

I love how OpenAI is introducing new UX paradigms, but somehow all the rest have one idea, which is to follow what they are doing? The only thing outside this I see is Cursor, which I think is confusing UX too, but that's a discussion for another day.

[1]: I am keeping Operator/MCP/browser use out of this because 1/ it requires finetuning on a base model for more accurate results 2/ admittedly all labs are working on it separately, so you were bound to see similar ideas.

afro88 3 months ago

This is great. I haven't tried OpenAI or Google's Deep Research, so maybe I'm not seeing the relative crapness that others in the comments are seeing.

But for the query "what made the Amiga 500 sound chip special" it wrote a fantastic and detailed article: https://www.perplexity.ai/search/what-made-the-amiga-500-sound-6SON9.D6SqeJzWu_Ai0wYQ

For me personally it was a great read and I learnt a few things I didn't know before about it.

XenophileJKO 3 months ago

I'm unimpressed. I gave it specifications for a recommender system that I am building and asked for recommendations, and it just smooshed together some stuff, but didn't really think about it or try to create a reasonable solution. I had claude.ai review it against the conversation we had. I think the review is accurate.

----

This feels like it was generated by looking at common recommendation system papers/blogs and synthesizing their language, rather than thinking through the actual problems and solutions like we did.

nathanbrunner 3 months ago

Tried it and it is worse than OpenAI deep search (one query only, will need to try it more I guess...)

NewUser76312 3 months ago

It's great to see the foundation model companies having their product offerings commoditized so fast - we as the users definitely win. Unless you're applying to be an intern analyst of some type somewhere... good luck in the next few years.

I'm just starting to wonder where we as the entrepreneurs end up fitting in.

Every majorly useful app on top of LLMs has been done or is being done by the model companies:

- RAG and custom data apps were hot; well, now we see file upload and understanding features from OAI and everyone else. Not to mention longer context lengths.
- Vision Language Models: nobody really has the resources to compete with the model companies; they'll gladly take ideas from the next hot open source library and throw their huge datasets and GPU farm at it, to keep improving GPT-4o etc.
- Deep Research: imo this one always seemed a bit more trivial, so not surprised to see many companies, even smaller ones, offering it for free.
- Agents, Browser Use, Computer Use: the next frontier. I don't see any startups getting ahead of Anthropic and OAI on this, which is scary because this is the 'remote coworker' stage of AI. Similar story to Vision LMs: they'll gladly gobble up the best ideas and use their existing resources to leap ahead of anyone smaller.

Serious question: can anyone point to a recent YC vertical AI SaaS company that's not on the chopping block once the model companies turn their attention to it, or the models themselves just become good enough to out-do the narrow application engineering?

See e.g. https://lukaspetersson.com/blog/2025/bitter-vertical/

nextworddev 3 months ago

I tried it, but it seems to be biased toward generating shorter reports compared to OpenAI's Deep Research. Perhaps it's a feature.

submeta 3 months ago

It ends its research in a few seconds. Can this even be thorough? ChatGPT's Deep Research does its job for five minutes or more.

NeatoJn 3 months ago

Tried a trending topic; I must say the output is quite underwhelming. It went through many "reasoning and searching" steps, but the final write-up was still shallow descriptive text, covering all aspects with no emphasis on the most important part.

Agraillo 3 months ago

It's interesting. Recently I came up with a question that I posted to different LLMs, with different results. It's about the ratio of GDP (PPP adjusted) to general GDP. ChatGPT was good, but only because it found a dedicated web page with exactly this data and comparison, so it just rephrased the answer. General perplexity.ai, when asked, hallucinated significantly, showing Luxembourg as the leader and pointing to some random GDP-related resources. But this version of Perplexity gave very good "research" on the prompt "I would like to research countries about the ratio between GDP adjusted to purchasing power and the universal GDP. Please, show the top ones and look for other regularities". It took about 3 minutes.

Lws803 3 months ago

Curious to hear folks' thoughts about Gergely's (The Pragmatic Engineer) tweet though: https://x.com/GergelyOrosz/status/1891084838469308593

I do wonder if this will push web publishers to start pay-walling up. I think the economics for deep research or AI search in general don't add up. Web publishers and site owners are losing traffic and human eyeballs from their sites.

daveguy 3 months ago

This seems like magic, but I can't find a research paper that explains how it works. And "expert-level analysis across a range of complex subject matters." is quite the promise. Does anyone have a link to a research paper that describes how they achieve such a feat? Have any experts compared deep research to known domains? I would appreciate accounts from existing experts on how they perform.

In the meantime, I hope the bean counters are keeping track of revenue vs LLM use.

marban 3 months ago

Same link got flagged yesterday. @dang?

https://news.ycombinator.com/item?id=43056072

alecco 3 months ago

I just tried it and the result was pretty bad.

"How to do X combining Y and Z" (in a long detailed paragraph, my prompt-fu is decent). The sources it picked were reasonable but not the best. The answer was along the lines of "You do X with Y and Z", basically repeating the prompt with more words but not actually explaining how to address the problem, never mind how to implement it.

cc62cf4a4f20 3 months ago

Don't forget gpt-researcher and STORM, which have been out since well before any of these.

transformi 3 months ago

Since Google, everyone has been trying to replicate this feature (OpenAI, HF...).

It's powerful, yes, as is asking an AI and letting it synthesize everything it has been fed.

I guess the air is out of the balloon for the big players, since they lack novel innovation in their latest products.

SubiculumCode 3 months ago

Are there good benchmarks for this type of tool? It seems not?

Also, I'd compare with the output of phind (with thinking and multiple searches selected).

Kalanos 3 months ago

It's producing more in-depth answers than alternatives, but the results are not as accurate as alternatives.

pbarry25 3 months ago

Never forget that their CEO was happy to cross picket lines: https://techcrunch.com/2024/11/04/perplexity-ceo-offers-ai-companys-services-to-replace-striking-nyt-staff/

bsaul 3 months ago

Can someone explain what Perplexity's value is? They seem like a thin wrapper on top of big AI names, and yet I often find them mentioned as equivalent to the likes of OpenAI / Anthropic / etc., which build foundational models.

It's very confusing.

joshdavham 3 months ago

Unrelated question: would most people consider Perplexity to have reached product-market fit?

SubiculumCode 3 months ago

Any evaluation of hallucination?
