Spending $5k to learn how database indexes work

236 points by anglinb, over 3 years ago

27 comments

nrmitchi, over 3 years ago
As the author touches on, the main problem here isn't learning about indexes. It's about "infinity scaling" working *too well* for people who do not understand the consequences.

In no sane version of the world should "not adding a db index" lead to getting a 50x bill at the end of the month without knowing.

I am a strong believer that services that are based on "scale infinitely" really need hard budget controls, and slower scaling (unless explicitly overridden/allowed, of course).

If I accidentally push very non-performant code, I kind of expect my service to get less performant, so that I quickly realize the problem and fix it. I don't expect a service to seemingly magically detect my poor code, increase my bill by a couple orders of magnitude, and only alert me hours (if not days) later.

Grimm1, over 3 years ago
I feel like indexes are a pretty fundamental type of DB knowledge. In fact I'd say it's table-stakes knowledge you should have if you're working with them. Furthermore, knowing that ForeignKeys typically apply an index to that column is also, in my head, basic knowledge. I'm sorry you got burnt, and congrats on learning a lesson, but you could have gotten the same knowledge by ever googling MySQL ForeignKeys and saved yourself a headache.

In fact it's like a big bullet point near the top of the docs page.

"MySQL requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. Such an index is created on the referencing table automatically if it does not exist. This index might be silently dropped later if you create another index that can be used to enforce the foreign key constraint. index_name, if given, is used as described previously."

I'm not entirely sure why buzz around "developer learns basic knowledge" has this on the front page.

samlambert, over 3 years ago
This is definitely a lesson in the importance of indexes in general. We are well aware of the potential pitfalls with our current pricing. I’m happy to say we are nearly done modeling different metering rates for the product which would mean significantly lower bills for our users and avoid issues like this.

It’s core to our mission that our product’s pricing is accessible and friendly to small teams. Part of being in beta was us wanting to figure out the best pricing based on usage patterns. That work is nearly done. As the post mentions we’ve credited back the amount.

xupybd, over 3 years ago
No foreign keys to make migrations easier. That doesn't sound like the best trade-off to me.

Having the database constrained as much as possible makes maintenance so much easier. Many bugs don't escape into production as they're caught by the database constraints. Those that do get out do less damage to the data.

I know scale comes with trade-offs but that seems extreme to me.

hodgesrm, over 3 years ago
My company runs a cloud service for ClickHouse. We've spent a lot of time thinking about pricing. In the end we arrived at (VMs + allocated storage) * management uplift + support fee.

It's not a newfangled serverless pricing model, but it's something I can reason about as a multi-decade developer of database apps. I feel comfortable that our users--mostly devs--feel the same way. We work to help people optimize the compute and storage down to the lowest levels that meet their SLAs. The most important property of the model is that costs are capped.

One of the things that I hear a lot from users of products like BigQuery is that they get nailed on consumption costs that they can't relate in a meaningful way to application behavior. There's a lot of innovation around SaaS pricing for data services but I'm still not convinced that the more abstract models really help users. We ourselves get nailed by "weird shit" expenses like use cases that hammer Zookeeper in bad ways across availability zones. We eat them because we don't think users should need to understand internals to figure out costs. The best SaaS services abstract away operational details and have a simple billing model that doesn't break when something unexpected happens on your apps.

Would love to hear alternative viewpoints. It's not an easy problem.
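
The pricing formula in the comment above, (VMs + allocated storage) * management uplift + support fee, can be written out as a small function. This is only a sketch: the parameter names and the dollar figures are illustrative assumptions, not the commenter's actual rates.

```python
def monthly_price(vm_cost: float, storage_cost: float,
                  management_uplift: float, support_fee: float) -> float:
    """(VMs + allocated storage) * management uplift + support fee."""
    return (vm_cost + storage_cost) * management_uplift + support_fee


# Hypothetical inputs: $400 of VMs, $100 of storage, a 1.5x uplift, $200 support.
price = monthly_price(vm_cost=400.0, storage_cost=100.0,
                      management_uplift=1.5, support_fee=200.0)
print(f"monthly price: ${price:,.2f}")
```

None of the inputs depend on how many rows or queries the application runs, which is why a bill computed this way stays capped when a query misbehaves.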

radu_floricica, over 3 years ago
I'm just leaving this here:

https://www.hetzner.com/dedicated-rootserver/ax161/configurator

Drag that nice red slider all the way to the right. No, it's not storage. Yeah, it's actually affordable. Yeah, that was a sexual sound you just made.

You do have to be prepared to know some basic sysadmin, or pay somebody to do it for you. My newest server has about 60 cores and half a terabyte of RAM. Surprisingly, it's not uber sharp - I went with high core count so individual queries actually got about 20% slower. But that load... you can't even tell if the CPU load gauge is working. I can't wait to fill it up :D Maybe this Black Friday season I'll get it to 10%.

foreigner, over 3 years ago
The real answer here is cost limiting. I don't want my cloud provider to keep working at the cost of an order of magnitude higher bill than I was expecting because of a bug in my code. I want to be able to set a billing limit and have them degrade or stop their service if I exceed the limit.

AFAIK AWS doesn't have that. They do have the ability to send me alerts if my bill is unexpectedly high, but they still keep working until I go bankrupt. It's possible to use those alerts to implement your own "broke man's switch", but they don't have it built in.
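
A minimal sketch of the "broke man's switch" idea described above, assuming the application can estimate the cost of each unit of work before running it. `SpendGuard` and `BudgetExceeded` are hypothetical names invented for illustration; this is not AWS's or any other provider's API.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spending past the configured limit."""


class SpendGuard:
    """Hypothetical client-side hard cap; not any cloud provider's API."""

    def __init__(self, monthly_limit_usd: float) -> None:
        self.monthly_limit_usd = monthly_limit_usd
        self.spent = 0.0

    def charge(self, cost_usd: float) -> None:
        # Refuse the work before it is billed, rather than alerting afterwards.
        if self.spent + cost_usd > self.monthly_limit_usd:
            raise BudgetExceeded(
                f"limit ${self.monthly_limit_usd:.2f} would be exceeded "
                f"(spent ${self.spent:.2f}, requested ${cost_usd:.2f})"
            )
        self.spent += cost_usd


guard = SpendGuard(monthly_limit_usd=100.0)
guard.charge(60.0)          # within budget, accepted
try:
    guard.charge(50.0)      # would reach $110, so the guard stops the work
except BudgetExceeded as exc:
    print("degrading service:", exc)
```

The point of the design is that the failure mode is a degraded service rather than a larger bill, which is the trade the comment asks for.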

dreyfan, over 3 years ago
Don’t use DB providers that charge for rows/data scanned. Use Amazon RDS or Google Cloud SQL or just install it yourself on a VM. Pay for CPU, memory, and storage instead.

fabian2k, over 3 years ago
That pricing model seems rather inherently tricky to me, and also quite expensive. At $1.50 per 10 million rows read this can get very expensive the moment you do a full table scan on any non-trivial table. And while this example is a trivial case where you only need minimal database knowledge to ensure that no full table scan is necessary, many real world cases are much more complex.

It also seems very expensive compared to just renting DBs by instance, if you put any real load onto this. I can see this being attractive if your use case only queries single rows by key, but it's essentially a big minefield for any query more complex than that. A database with a rather opaque query planner doesn't seem like a good fit for this kind of pricing.
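
To put rough numbers on the $1.50 per 10 million rows read quoted above, here is a back-of-the-envelope sketch. The table size and query count are made-up assumptions for illustration, not figures from the original post.

```python
# Rate quoted in the comment above; the workload below is hypothetical.
PRICE_PER_10M_ROWS_USD = 1.50


def rows_read_cost(rows_scanned_per_query: int, queries: int) -> float:
    """Estimated bill in USD when every scanned row is billed as a row read."""
    total_rows = rows_scanned_per_query * queries
    return total_rows / 10_000_000 * PRICE_PER_10M_ROWS_USD


# Assumed workload: 1 million queries a month against a 750,000-row table.
full_scan = rows_read_cost(rows_scanned_per_query=750_000, queries=1_000_000)
indexed = rows_read_cost(rows_scanned_per_query=1, queries=1_000_000)
print(f"full scan: ${full_scan:,.2f}, indexed lookup: ${indexed:,.2f}")
```

Under these assumptions the same million queries cost about $112,500 as full table scans and about $0.15 as single-row indexed lookups, which is the minefield the comment describes.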

bigbillheck, over 3 years ago
I'm not a DB expert, but "750k users in a month." doesn't sound like a quantity that you'd need to use some kind of fancy special tooling for.

j3th9n, over 3 years ago
Are StackOverflow topics now eligible for HN as soon as you mention the savings? Or is mentioning some numbers about the users enough? Or did I just click on an advertisement? So many questions.

rabuse, over 3 years ago
Gotta love cloud pricing. This is why I colocate.

AnotherGoodName, over 3 years ago
I've seen things you people wouldn't believe. Millions burnt on consultants and licensing Oracle. I watched C series startups throwing it all away in a move to NoSQL. All those Amazon RDS fees will be lost in time.

XCSme, over 3 years ago
I was actually considering PlanetScale, but them saying "Every time a query retrieves a row from the database, it is counted as a row read." when it's actually all the scanned rows sounds intentionally confusing. "Retrieving" sounds like it should only count rows returned by a query.

NicoJuicy, over 3 years ago
$0.15 per query... The world has gone insane.

mekster, over 3 years ago
Why do people blindly get on stuff that some well-known people use?

That is such a bad habit, like everyone getting on git and getting burned, and now it's irreversible with all the existing ecosystem.

How hard is it to just spin up a beefy cloud instance and run a MySQL of your own with whatever backup strategy you've got, and do things the established way, rather than getting bitten by using stuff you're not even familiar with?

smoldesu, over 3 years ago
Huh, learning about this "Superwall" product counts as my horror story of the day. It's paywalling as a service, just what the industry needed. Thankfully it appears to be quarantined to iOS right now, but God does it feel like we're headed right back into Stallman's predictions about how SaaSS will ruin the landscape of commercial technology.

racl101, over 3 years ago
Yeah, it cost me two bad months of high RDS fees to learn about indexes. $900 in total.

Then a bro showed me the magic of indexes one night. 5 minutes' worth of advice saved me hundreds of dollars per month in the future and all he asked in return was some beer and chicken wings.

Now that is a good bro.

I'm happy to say I've paid it forward myself.

Aeolun, over 3 years ago
What a crazy way to do billing though. At larger scales (more rows, more customers, more queries) the costs become absolutely insane.

revskill, over 3 years ago
Thanks. At least I'll never use PlanetScale. A good service should have config that lets me alert on or prevent these kinds of money-wasting cases.

Imagine how much wasted $$$ they have earned from common mistakes that they should be preventing for customers instead.

crorella, over 3 years ago
It amazes me that things as basic and fundamental as understanding the way indexes work are often overlooked or not leveraged.

bborud, over 3 years ago
One shouldn't assume people know anything (even the most basic thing) about databases just because they say they do.

max_hammer, over 3 years ago
I hate this pricing model.

My company onboarded `fivetran` to source data from different tools.

The budget got exhausted sourcing `iterable` data.

ihusasmiiu, over 3 years ago
Let me understand please. These people are selling a commercial product and their team has no idea whatsoever of what an index is? And this is news?

arpa, over 3 years ago
Well this was an embarrassing read.

smsm42, over 3 years ago
TLDR: author forgot to create indexes in a cloud-based MySQL database and paid too much for the queries, which were run as full-table scans.

Interestingly enough, some DBs (like Cassandra) would refuse scan-type queries unless specifically asked to. I wonder if cloud-based DBs which charge per row inspected could have such a mode... Though of course their incentive is not to.
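
The difference between a full-table scan and an indexed lookup can be seen in a query plan. The sketch below uses Python's built-in `sqlite3` module as a stand-in, since it runs without a server; the thread is about MySQL, where `EXPLAIN` plays the same role but prints different output. The `events` table and `idx_events_user_id` index are hypothetical names.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database in the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 1000, "x") for i in range(10_000)],
)


def plan(sql: str) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT payload FROM events WHERE user_id = 42"
before = plan(query)  # no index on user_id yet: the planner scans the table
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
after = plan(query)   # with the index: the planner searches via the index
print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of `events` and the second a search using `idx_events_user_id`. On a service that bills per row read, the first form is billed for every row in the table on every query.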

sushsjsuauahab, over 3 years ago
SSH into an EC2 instance, install MySQL, and you'll never pay more than $7.50 a month!