TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

How We Built a Cost-Based SQL Optimizer

198 pointsby andydbover 6 years ago

13 comments

jzelinskieover 6 years ago
I tend to dislike cost-based optimizers because they add a layer of uncertainty when you're attempting to optimize database queries. I know what I want the optimizer to do. The problem is that when I run test queries against a local database or even a staging database, the statistics used to calculate costs differ. This means production can do something totally undesirable.
评论 #18410188 未加载
评论 #18411397 未加载
评论 #18413553 未加载
评论 #18410280 未加载
DreamSpinnerover 6 years ago
People generally interested in this topic might also find the CMU Intro to DB Systems youtube of interest.<p>Note that it&#x27;s an introduction to <i>building</i> database systems (not just using them).<p><a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;playlist?list=PLSE8ODhjZXja3hgmuwhf89qboV1kOxMx7" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;playlist?list=PLSE8ODhjZXja3hgmuwhf8...</a>
edmundhuberover 6 years ago
Great article, and it sounds like it took a pretty big leap of faith to go from heuristics to cost-based optimization. If this was addressed in the article I apologize: do you only ever select transformations that yield equivalent results, with immediately lower cost, or do you also explore transformations that incur in immediate cost increase, but then later pay off? i.e. are you doing simple hill climbing, or something more interesting?
评论 #18409947 未加载
评论 #18409592 未加载
shooover 6 years ago
Maybe of interest - postgres planner documentation<p><a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;planner-optimizer.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;planner-optimizer.ht...</a><p>Genetic query optimiser: <a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;geqo.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;geqo.html</a><p>postgres has a number of hooks, which can be used to override the default behaviour. In particular, there are a bunch of hooks that can be used to install a custom query planner. <a href="https:&#x2F;&#x2F;github.com&#x2F;AmatanHead&#x2F;psql-hooks&#x2F;blob&#x2F;master&#x2F;Readme.md#planner-hooks" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;AmatanHead&#x2F;psql-hooks&#x2F;blob&#x2F;master&#x2F;Readme....</a><p>more generally, ignoring database query optimisation specifically, if you are interested in learning about discrete optimisation techniques, I recommend this course: <a href="https:&#x2F;&#x2F;www.coursera.org&#x2F;learn&#x2F;discrete-optimization" rel="nofollow">https:&#x2F;&#x2F;www.coursera.org&#x2F;learn&#x2F;discrete-optimization</a>
评论 #18410593 未加载
millrawrover 6 years ago
&gt; outside expert on database optimizers run a months-long bootcamp, with the goal of teaching our developers how state-of-the-art optimizers work, complete with homework assignments to read seminal papers on the subject<p>Any chance you could share more about this? This general area of how to build a production grade SQL optimizer seems to be a thing that&#x27;s more scattered in tiny pieces across a wide number of papers, or held in peoples&#x27; heads, than aggregated in a teaching manner. It seemed that the realistic general advice on how to build a SQL optimizer was to poach someone from the SQL Server team. ;)<p>I&#x27;ve generally just gone referring back to the unfinished Building Query Compilers[1] when pondering this subject. Not that the hundreds of references don&#x27;t provide sufficient material to read though as well, but it&#x27;d be interesting to hear what a bootcamp like this presented as state of the art.<p>[1]: <a href="http:&#x2F;&#x2F;pi3.informatik.uni-mannheim.de&#x2F;~moer&#x2F;querycompiler.pdf" rel="nofollow">http:&#x2F;&#x2F;pi3.informatik.uni-mannheim.de&#x2F;~moer&#x2F;querycompiler.pd...</a>
cy6erlionover 6 years ago
This is like studying chess, the approach to studying the complexities that arise from the number of routes a query can be executed is like how one will study an opening, also the tree data structure mentioned is reminiscent to chess, statistics also hence studying GM games. I feel like someone with a good understanding of chess will enjoy such a project. Anyways great post and breakdown.
评论 #18415902 未加载
altitudinousover 6 years ago
Ah, my past life and knowledge developing Oracle systems comes flooding back.<p>I don&#x27;t know if Oracle still uses CBO now, or even SQL or PL&#x2F;SQL but I am sure that a large chunk of their revenue still comes from supporting these legacy systems.
atombenderover 6 years ago
I&#x27;ve been looking for books that cover database implementation techniques such as this — the &quot;memo&quot; algorithm was new to me, but I&#x27;m also generally interested in query planning and optimization. Anyone got any recommendations? I started a thread: <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=18410692" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=18410692</a>.
评论 #18415921 未加载
评论 #18411119 未加载
dzidolover 6 years ago
Shouldn&#x27;t &quot;unless both inputs are sorted on the same key&quot; be &quot;unless both inputs are sorted on the join key&quot;? And similarly later.
评论 #18411225 未加载
radiowaveover 6 years ago
Excellent write-up. Thanks.<p>Has any thought been given to whether this query planner could be adapted (much further down the road, I&#x27;d guess) to support dynamic replanning? (That is, altering the plan mid-query, if it should be found that the row-count estimates were way off.)
评论 #18415972 未加载
ryanworlover 6 years ago
How do you manage statistics in the system catalog in CockroachDB?
评论 #18409831 未加载
georgewfraserover 6 years ago
Is the cost based planner relevant to OLTP-style queries, where everything is an indexed lookup, or only to OLAP-style queries that involve scans?
评论 #18412818 未加载
kuntharover 6 years ago
is still no avail to use sqlalchemy within your cockroach? ok go ahead with plans.
评论 #18411135 未加载
评论 #18421414 未加载