TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Improving Postgres text search speed

131 点作者 kuzee将近 3 年前

9 条评论

zasdffaa将近 3 年前
&gt; A key gotcha that tripped us up: when querying with a list of primary keys, Postgres will not return the records in the same order as the IDs provided in a “where … in” clause.<p>That&#x27;s frightening they don&#x27;t know that. So, burn this into your minds: no ordering is guaranteed in SQL unless an ORDER BY is applied to it (and any ordering in a subquery is lost). Even if it seems to work, it will fail. No guarantees on order unless... scratch that onto your eyeballs so you never forget it.<p>Also, will people please stop posting rainbow-on-black screenshots, especially with the screenshotted text size smaller than the main text.
评论 #32448735 未加载
评论 #32450307 未加载
评论 #32449280 未加载
评论 #32452190 未加载
评论 #32448672 未加载
jrochkind1将近 3 年前
How can querying for PK&#x27;s than doing a second query on the FK be faster than a join that&#x27;s semantically equivalent? Postgres optimizer gone terribly wrong or something? Or am I misunderstanding what&#x27;s going on?<p>It would never have occurred to me to even try this, I would have assumed postgres would do as well as could be done at what looks like a very standard join. So I guess that&#x27;s a lesson not to make assumptions?<p>Not sure, without even having a theory of why pg would behave this way, I&#x27;d be worried that as the data changes it might regress again. Or if the phase of the moon changes or something -- since it seems like magic!
评论 #32447945 未加载
评论 #32447601 未加载
评论 #32449856 未加载
评论 #32452210 未加载
评论 #32447304 未加载
评论 #32449333 未加载
kuzee将近 3 年前
I wrote up my recent experience optimizing Postgres text search on a database with a few million records without relying on a new service like Elastic Search.
评论 #32446029 未加载
norwalkbear将近 3 年前
Has anyone overcome the 16382 positional limits of tsvector?<p>That and the automatic stemming and lemming of search words even in phrase searches makes postgres awful for any software where accurate search is critical.
评论 #32446530 未加载
评论 #32448272 未加载
评论 #32446812 未加载
评论 #32447792 未加载
eric4smith将近 3 年前
PostgreSQL text search is awesome - for English and Roman type languages.<p>But Asian languages such as Thai, Japanese, Korean, etc are not going to work at all.<p>PostgreSQL is weird about joins. Joining on certain columns could be super fast but others dog slow. And this can flip depending on size of table and this index Cardinality.<p>That’s why it’s important on databases that grow quickly to check the performance of even simple queries as those can balloon in execution time as the profile of the data changes.
评论 #32446948 未加载
评论 #32446361 未加载
评论 #32446350 未加载
rdoherty将近 3 年前
I&#x27;d love to see some details on the why using EXPLAIN ANALYZE on each query and schema. It seems like the changes were done with a hunch as to why they were slow?
评论 #32446030 未加载
vosper将近 3 年前
PGSync might be useful for those who don&#x27;t mind also running Elasticsearch<p><a href="https:&#x2F;&#x2F;github.com&#x2F;toluaina&#x2F;pgsync" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;toluaina&#x2F;pgsync</a>
评论 #32447240 未加载
评论 #32447337 未加载
评论 #32446576 未加载
maverwa将近 3 年前
I remember achieving some quite good results when implementing a simple „fuzzyesque“ search for some B2B e-commerce system a few years back, but what hit us hard was the german dictionary used for stemming for the ts_vector index. The original one coming with psql did not support german compound words correctly and the „fix“ was to import a 3rd party one. I learned the hard ways that this comes at the cost of Postgres loading the dictionary every time a new connection uses the index. Running a simple, small php app that could not reuse&#x2F;pool connections, every search query done was coming in on a clean connection and hit a &gt;1s fee for having to load the dict.<p>iirc pgbouncer saved the day there.<p>Otherwise it worked fine.
anshul将近 3 年前
You might want to test using the first query as a sub-query or cte in the second one. That would likely give you the same &#x2F; better perf. It would avoid the join and save a round trip.
评论 #32446972 未加载