We have been dealing with a huge MySQL database on one of our projects.<p>Total byte size: > 150GB, Record count: > 100M (and growing daily)<p>The primary issue is performance (this is a live site); it is currently hosted on a (fairly powerful) VPS.<p>Not having had prior in-house experience with such data volumes on MySQL, we hacked together our own table-partitioning architecture of sorts and optimized the queries, table structures, etc. to the best of our abilities. All this has been working reasonably well for us so far.<p>But now, with rising traffic levels, things are beginning to slow down again.<p>At this point, we're considering the following 3 options:<p>1) Throw more hardware resources at it
Pro: Instant gratification :) Cons: Cost + vicious cycle(?)<p>2) Try to optimize the database/config, table structures and queries further
Pro: Always useful Cons: Time + Effort<p>and<p>3) Migrate to PostgreSQL (or something else?)
Pros/Cons: Still studying<p>What do HNers with experience in such matters suggest? We're open to all suggestions. Thanks in advance.
There is no silver bullet.<p>1) Analyze your indexes on all tables. Start with the high-volume insert/read tables.<p>2) Look at your slow query log. Pick the slowest three queries. Optimize them. Set up a recurring meeting with your team to do this every week.<p>3) Start looking at what data is less important, can be flattened, and could be stuck in a key-value datastore.<p>4) Consider adding a replica (slave) server. Offload some of the high-volume read queries to the replica; it also serves as redundancy.<p>5) Consider where it may make sense to move tables to InnoDB. Table-level locking can be extremely time-consuming when traffic is high; row-level locking can help here.<p>6) Figure out what your hardware upgrade path looks like. You will have to do this eventually. Identify when you might hit a hardware ceiling with your hosting provider and get a plan in place. Coordinate with your marketing/sales team and upgrade before you hit a traffic peak, not during.
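For (1), a rough sketch of the kind of check I mean (the table and column names are made up; substitute your own hot tables):<p><pre><code>-- See what indexes a high-volume table already has
SHOW INDEX FROM orders;

-- If a frequent query filters/sorts on columns that aren't covered
-- (verify with EXPLAIN), add an index for them
ALTER TABLE orders ADD INDEX idx_customer_created (customer_id, created_at);
</code></pre>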
I have a little bit of experience dealing with large datasets and MySQL. This is going to be a little counter-intuitive to standard startup-culture advice, but generally, the engineering effort to optimize databases past the initial live flow is not worth it compared to simply throwing money at the problem. You have to weigh the fixed cost of the optimization and the variable cost of the upkeep of the system against simply patching the system for now.<p>If you are willing to throw money at the problem, which you seem to be wavering on, here are some options for how to do that:<p>1. If you are willing to be hands-on and maintain your MySQL cluster yourselves, you should move away from EC2 (I am presuming) and to a local colo where you can install very fat SSD machines with massive RAID setups that will greatly improve your performance per server (on the order of up to 10x).<p>2. You can do a hybrid solution with your existing infrastructure. It sounds like a lot of the data is not necessarily going to be used that much (tracking all events, probably, or something similarly big-data intensive); you can throw that into a separate database infrastructure.<p>If my assumption is correct and you are using a large portion of the database for stat tracking rather than user information, then the "proper" way to offload it is to put it into S3 as fat logs that you can later process with something like MapReduce (rough sketch at the end of this comment). Other options for more efficient storage are Redis and MongoDB.<p>3. We have taken this to the extreme and actually moved all of our live data to DynamoDB. It is a NoSQL database storage system operated by Amazon. This has let us concentrate on features and the user experience. The cost is that our database is probably 5-10 times more expensive than a self-rolled, SSD-backed Redis equivalent hosted at our local colo.<p>------<p>Just some other general database scaling advice for MySQL. The steps to scaling your MySQL database for live use are as follows:<p>1. Do not use any relational calls (joins) on your live data. Relational calls are fine for databases that aren't managing live web transactions; on a live system they can seriously back things up.<p>2. Put indexes on all of your heavily used columns for finding data.<p>3. Look at the slow queries and optimize those queries out.<p>4. Call up Percona and ask for a full consultation to make sure your configs and systems are set up properly for MySQL.<p><a href="http://www.percona.com/products/mysql-consulting/overview" rel="nofollow">http://www.percona.com/products/mysql-consulting/overview</a><p>5. Email me if you have more questions: charlesju gmail
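Re: getting the event/stat data out as fat logs, one cheap way is to dump old rows to flat files on a schedule (ship them to S3 afterwards) and purge them from the live table. Very rough sketch; the table, column, and cutoff date are assumptions:<p><pre><code>-- Export old event rows to a flat file on the DB server, then purge them
SELECT *
  INTO OUTFILE '/tmp/events_2012_06.csv'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
FROM events
WHERE created_at < '2012-07-01';

DELETE FROM events WHERE created_at < '2012-07-01';
</code></pre>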
"Not having had in-house prior experience with such data volumes on MySQL ... At this point, we're considering the following 3 options:"<p>4. Hire or contract a consultant/expert. Part of their assignment could be to teach you.
What exactly is getting slow? Inserts? Updates? Queries? Which ones? Have you looked at the slow query log? Have you checked the plans for the slow queries?<p>Are these MyISAM tables or InnoDB?<p>150GB and 100M records is not huge, not even close. You are looking for a quick fix and there isn't one. You need to start with (2).
Caching is definitely a must-have: both an external cache (Memcached, Redis) and tuning MySQL's own caches in the configuration. If you are using InnoDB, you can configure a large buffer pool in the MySQL config to speed things up considerably.<p>Disk speed could also be a huge issue, especially on a VPS where resources are shared between many virtual machines. Consider migrating your database to a dedicated server (preferably one with an SSD); depending on where you are in the world, OVH might be a good choice.
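A quick sanity check on whether the buffer pool is big enough (just a sketch of what to look at; the variable itself lives in my.cnf and needs a restart to change on the MySQL versions of that era):<p><pre><code>-- Current buffer pool size
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Logical reads vs. reads that had to go to disk; if the second number
-- grows quickly relative to the first, the pool is too small
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
</code></pre>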
Here are some guiding questions:<p>1) Do you actively need all 100M records, or is there a period after which you can archive them?<p>2) Have you partitioned data across multiple servers, or just multiple tables on one server?<p>3) What is the nature of your expensive queries? Are you generating reports? User dashboards?<p>I think a general idea of what your application is doing would help us provide some guidance.
In my experience, neither the size of the database nor the number of records really matters. It's the number of queries that you are handling at one time that can really affect performance.<p>Sure, if you have a single table with a ton of records this can slow things down - but not nearly as much as dealing with a high number of queries.<p>I highly recommend this book - <a href="http://www.amazon.com/dp/0596101716/?tag=stackoverfl08-20" rel="nofollow">http://www.amazon.com/dp/0596101716/?tag=stackoverfl08-20</a><p>Yes, caching can really help. Logically placed indexes can too. Switching to Postgres will not help; the bottleneck will still be there. The key is to find that bottleneck before you start tuning/optimizing: <a href="http://www.singlehop.com/blog/analyzing-mysql-performance-and-bottlenecks/" rel="nofollow">http://www.singlehop.com/blog/analyzing-mysql-performance-an...</a><p>Good luck!
I would not migrate to Postgres without knowing the issue first. Here are some suggestions: Enable master/slave replication (one or more slaves), assuming you have a read-heavy architecture. Enable the "<i>Slow Query Log</i>" [1] (a quick sketch of turning it on is at the end of this comment), which will give you something like /var/log/mysql-slowquery.log; you can then look in there to find queries that take a long time. Find out what the bottleneck is. I'm willing to bet you have limited I/O and that is the source of your issues; can you use memcached [2] to limit the database hit rate? You want to remove/limit direct access to the database for reads that keep pulling out the same data.<p>Before you start to optimize I would profile things to get a baseline for how many selects/inserts you are doing, how long they are taking, system load, etc. Based off your 100M statement - if that were your daily insert volume - you would be doing roughly 1157.4 inserts/s. Maybe it's really 250 inserts/s during a slow time and 3,000 inserts/s at peak, but it would be nice to know.<p><pre><code> 100,000,000 (inserts) / 86,400 (seconds/day)
= 1157.4 (inserts/s)
</code></pre>
I guess it all depends on the insert size and index but you should be able to scale this. If you cannot then you need to partition/shard your data. After reading all this, you are probably thinking, I should just throw more hardware at the problem ;)<p>ps. Have you stopped to ask if you really need all this data?<p>[1] <a href="http://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html" rel="nofollow">http://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html</a><p>[2] <a href="http://memcached.org/" rel="nofollow">http://memcached.org/</a>
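Rough sketch of turning the slow query log on at runtime (the path and threshold are just examples):<p><pre><code>SET GLOBAL slow_query_log = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql-slowquery.log';
SET GLOBAL long_query_time = 1;  -- seconds; lower it to catch more queries

-- Quick check that it's actually catching things
SHOW GLOBAL STATUS LIKE 'Slow_queries';
</code></pre>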
Hmm.<p>Bringing in a consultant is probably your best bet. Database optimization generally needs to take into account YOUR data and usage patterns.<p>If you want to learn a bit yourself, pick up a copy of "High Performance MySQL." I've found that to be the most useful MySQL scaling book I own.<p>Figure out your slow queries. Run "SHOW PROCESSLIST" on the server and see what's typically running.<p>Look at the actual queries - if you're frequently running a query that looks through most rows of your biggest table, that's going to be a tough query to optimize. Instead, look to either caching or regular pre-computation of the results.<p>I've found that an unfortunate number of people aren't aware of "EXPLAIN" and its use in figuring out query issues. Learn it and use it.<p>You're on a VPS. That may be fine, but it may have horrible I/O throughput. If you're writing a lot of data, or have queries that aren't hitting some sort of MySQL or OS cache, I/O will be your bottleneck. Make sure it's fast enough. Look at average I/O wait times. Test max throughput.<p>MyISAM tables may be an issue. MyISAM uses table-level locking - only one session at a time will be able to update the table. This is quite possibly a problem if you're doing any updates or inserts to a table that is also frequently read from. Look to move to InnoDB soon. See: <a href="http://dev.mysql.com/doc/refman/5.1/en/internal-locking.html" rel="nofollow">http://dev.mysql.com/doc/refman/5.1/en/internal-locking.html</a>
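To make those two concrete (the query and tables below are hypothetical):<p><pre><code>-- What's running right now; watch for threads stuck in "Locked" or "Copying to tmp table"
SHOW FULL PROCESSLIST;

-- Check the plan for a suspect query: "type: ALL" with a NULL "key" means a
-- full table scan; "Using temporary" / "Using filesort" in Extra are also worth chasing
EXPLAIN SELECT u.name, COUNT(*)
FROM page_views pv
JOIN users u ON u.id = pv.user_id
WHERE pv.viewed_at > NOW() - INTERVAL 1 DAY
GROUP BY u.name;
</code></pre>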
It sounds like your database is roughly the same size as the database that we work with, and MySQL performance has been a recurring issue for us. Assuming that you've done a reasonable job of option 2, I gravitate towards option 1 and maybe option 4 (depending on how confident you are of your team's technical capabilities and how expensive consultants are relative to additional hardware).<p>These are intermediate steps that have helped, giving us some time to figure out whether we want to migrate to a different database:<p>1) We started using InnoDB a few years ago.<p>2) One of our common queries joined parts of multiple large tables; we essentially cached the useful parts of the join and put the results into a memory table.<p>3) Our non-production server is under a lot of load from our back-end processes, and we decided to move the database to a RAM disk, which greatly improved performance. It's important to note that, while inconvenient, a power outage is not catastrophic in our case (e.g. we don't deal with financial transactions); we do daily backups to non-volatile storage. For around $5,000, you can build your own server that has over 250 GB of RAM and then put it into a co-location facility.<p>We seriously considered migrating to Postgres, but #3 has bought us some time on that front. It might be that we eventually still migrate, but we prefer to do it after the company is on better financial footing. It might also be that while we wait, other less painful options open up over time (e.g. maybe MariaDB will suffice).<p>I hope the situation resolves well.
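For #2, the gist is something like this (a sketch; the table names are made up, and the memory table is rebuilt on a schedule rather than computed per request):<p><pre><code>-- Pre-compute the expensive join once and keep the result in RAM
CREATE TABLE cached_order_totals ENGINE=MEMORY AS
SELECT c.id AS customer_id, c.name, SUM(o.total) AS lifetime_total
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

-- Rebuild via cron or the event scheduler; note MEMORY tables can't hold TEXT/BLOB columns
</code></pre>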
Consider that 37Signals probably has a much larger dataset and is still happily scaling vertically due to Moore's law.<p><a href="http://highscalability.com/blog/2012/1/30/37signals-still-happily-scaling-on-moore-ram-and-ssds.html" rel="nofollow">http://highscalability.com/blog/2012/1/30/37signals-still-ha...</a><p>It's probably the cheapest, quickest, and least risky option if your engineers' time is expensive (which it most likely is).<p>Of course, you'll ultimately also want to find bottlenecks, tune the DB, and cache the hell out of everything, but scaling vertically will at least buy you some good time so you can do that on a more relaxed timetable.<p>Related links:<p><a href="http://37signals.com/svn/posts/3202-behind-the-scenes-the-hardware-that-powers-basecamp-campfire-and-highrise" rel="nofollow">http://37signals.com/svn/posts/3202-behind-the-scenes-the-ha...</a><p><a href="http://37signals.com/svn/posts/3090-basecamp-nexts-caching-hardware" rel="nofollow">http://37signals.com/svn/posts/3090-basecamp-nexts-caching-h...</a><p><a href="http://37signals.com/svn/posts/3112-how-basecamp-next-got-to-be-so-damn-fast-without-using-much-client-side-ui" rel="nofollow">http://37signals.com/svn/posts/3112-how-basecamp-next-got-to...</a>
Hi!<p>There are some good suggestions in this thread already. Step #1 is to determine why it is slow.<p>If it is slow under load, then I am going to suspect that you are hitting table locks - which means you need to migrate to InnoDB (row locking + MVCC). Important to point out: a bigger server may not actually help enough here, because while you are waiting on locks nothing gets done, even though there is free capacity sitting idle.<p>If it is just generally slow due to growth, it can probably be improved by indexing - which will help you reduce how much data needs to be in RAM.<p>In either case, two third-party tools to suggest:
- pt-query-digest: aggregates your slow query log. You want to use 0 seconds as the threshold and record 20 minutes of queries during regular activity. Methodology described here: <a href="http://gtowey.blogspot.ca/2012/07/slow-query-log-is-not.html" rel="nofollow">http://gtowey.blogspot.ca/2012/07/slow-query-log-is-not.html</a><p>- pt-online-schema-change: lets you migrate from MyISAM to InnoDB online via triggers (a plain ALTER TABLE is not online).<p>Both tools are part of <a href="http://www.percona.com/software/percona-toolkit" rel="nofollow">http://www.percona.com/software/percona-toolkit</a>
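Roughly, the workflow with those two tools looks like this (the database/table names and file paths are placeholders):<p><pre><code># Summarize a slow log captured with long_query_time = 0 for ~20 minutes
pt-query-digest /var/log/mysql-slowquery.log > digest-report.txt

# Convert a big, hot table to InnoDB without taking it offline
pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=big_table --execute
</code></pre>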
In line with questions 2 & 3 - what's the bottleneck? Reading, writing, or both? If you haven't already, it's worth looking into a master (writes) / slave (reads) setup (rough sketch at the end of this comment) - it provides a bit of redundancy and lets you scale reads very easily by adding more slaves. It also helps with writes because it can take a lot of pressure off the master.<p>...assuming you're webvet.com, it looks like you're scaling Drupal, so these would be my first questions: are you using the Views module? If so, kill that, as well as anything else by merlin (Panels, etc.), and look at your indexes. Setting up Pressflow + Percona, a master & slave DB, and reverse proxy caching (Varnish or nginx) are all good places to start. Host-wise, if you're not already on Amazon, you can get a lot of mileage out of a service like Voxel that lets you mix VPSs (cheap webheads) with physical hardware. It's not great, and it may well be worth biting the more expensive bullet and going to AWS for long-term needs, but it's a lot better than Linode for an underwater DB.<p>My email is in my profile if you'd like more detail - Drupal is a bear at first, but it can be scaled for a while.
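On the master/slave point, the wiring itself is only a few statements once binary logging is enabled on the master and the slave starts from a data snapshot; a rough sketch, with hostnames, credentials and log coordinates as placeholders:<p><pre><code>-- On the master: create a replication user
CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On the slave: point it at the master (log file/position come from
-- SHOW MASTER STATUS taken when the snapshot was made)
CHANGE MASTER TO
  MASTER_HOST='db-master.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_LOG_FILE='mysql-bin.000042',
  MASTER_LOG_POS=107;
START SLAVE;
SHOW SLAVE STATUS\G
</code></pre>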
One of the less-documented tuning parameters I found is the thread cache size (thread_cache_size). Normally it is a very small value relative to the number of simultaneous connections you likely support; increasing it can eliminate random pauses/slowdowns.<p>In general, though, it is impossible to just "tune MySQL". You can allow it to use more system resources, and in some cases you can make it less durable to gain write performance, but to get order-of-magnitude increases you need to restructure your data. Try sharding or partitioning data. Increase or reduce the number of indexes. Denormalize data. Archive logs to flat files. Put the OS, /var/lib, and /var/log on separate physical drives. Run MySQL on bare metal, since disk I/O is often a bottleneck and virtualization can add overhead. Restructure the data. At one point a write-heavy app I was optimizing was logging things to the database (that needed to be queried). Each log entry contained extra data we did not query. One speed-up that worked was throwing the majority of the extra data into a single gzipped JSON blob. This made it easier for MySQL to fetch fewer pages. Less I/O, faster operation.
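Concretely, the thread cache and the blob trick look roughly like this (values and table/column names are illustrative; COMPRESS() is zlib rather than gzip and is just a stand-in here, the blob can equally be built in the application):<p><pre><code>-- If Threads_created keeps climbing, the thread cache is too small
SHOW GLOBAL STATUS LIKE 'Threads_created';
SET GLOBAL thread_cache_size = 64;

-- Collapse rarely-queried columns into a single compressed blob
ALTER TABLE event_log ADD COLUMN extra MEDIUMBLOB;
UPDATE event_log SET extra = COMPRESS(extra_json);  -- batch this on a big table
</code></pre>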
Have you considered rewriting MySQL in Go?<p>jk. What is the real nature of the queries, and what kind of caching is being used? MySQL should not really get hit that often if you have 150GB of data and a site, unless it's not just serving content but running some sort of deep processing. In that case MySQL seems like the wrong tool for the job.
Without doing an analysis, which is really warranted, here are some coarse-grained recommendations:<p>1) Get off the VPS if you can. If you can't, check out linode.com; they offer better price/performance than most cloud services.<p>2) Take a look at the Percona build of MySQL - it is faster, especially with more cores.<p>3) Use in-memory tables if possible.<p>4) Switch to InnoDB or XtraDB (in the case of Percona) tables... MyISAM implements table-level locking, whereas InnoDB and XtraDB implement only row-level locking.
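For 4), a quick way to see which tables are still MyISAM and how big they are, so you can convert the worst offenders first:<p><pre><code>SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, TABLE_ROWS,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS size_mb
FROM information_schema.TABLES
WHERE ENGINE = 'MyISAM'
  AND TABLE_SCHEMA NOT IN ('mysql', 'information_schema')
ORDER BY size_mb DESC;
</code></pre>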
My first recommendation would be to get a dedicated server with a decent amount of RAM (32GB or more) and put MySQL on that.<p>1. Greater memory bandwidth
2. Dedicated disk resources
3. You can ensure that the extra RAM is used as read and write cache.<p>I am not the cheapest provider by far, and I rent this kind of system for $200 a month. If you go with someone else, the price could be under $100 a month.
Cleanup - Probably one of the most effective but often-ignored tips for DB management. Do you really need all that data in real time? Are you querying all of it in your app? If not, it makes sense to archive parts of it to another disk. The size reduction would boost performance tremendously.
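In practice that can be as simple as moving rows older than some cutoff into an archive table on a schedule (the table names and cutoff date are placeholders):<p><pre><code>-- Same structure, kept out of the hot path (it can live on another disk or server)
CREATE TABLE orders_archive LIKE orders;

INSERT INTO orders_archive
SELECT * FROM orders WHERE created_at < '2012-01-01';

DELETE FROM orders WHERE created_at < '2012-01-01';
</code></pre>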
Not particularly large for MySQL. Consider using master / slave to separate reads and writes, cache heavily and consider sharding if appropriate.<p>Alternatively, as suggested elsewhere, give Percona a ring.