Update on InfluxDB Clustering, High Availability and Monetization

124 points by KyleBrandt about 9 years ago

21 comments

23david about 9 years ago

I've advocated for and implemented several InfluxDB installations in production over the last year+, and one of the considerations was always that non-alpha (prod-ready) clustering was promised in the 'next version' that was just around the corner.

Several months ago it seemed clear that the team was overly optimistic, and it's just disappointing to see that now the clustering will be available only in a paid (minimum $400!) option or on their hosted service.

I understand the business considerations here, but it feels like a bait n' switch for all the people who evaluated/used InfluxDB in single-node operation as a temporary measure while giving the team ample time to work out the clustering kinks.

Lesson learned I guess... but dang what an expensive lesson.

jrv about 9 years ago

This makes me think: an open-source project can be better off if it's not controlled by any one company. While in Prometheus (http://prometheus.io) it might still take us a while before we have clustered remote long-term storage, we'd never prevent it, because the project is independent of any company and we'd want the open-source project to be as good as it can be.

Also, there were some tentative thoughts about using InfluxDB as the main long-term storage backend for Prometheus, but that has become pretty much uninteresting now that clustering support (needed for LTS and durability) is basically cancelled for the open-source version.

Still, I guess I can understand that when you're a company, you need to focus on making money.

KyleBrandt about 9 years ago

I'm interested in what the pricing will be like at various scales; on Twitter the CEO (Paul Dix) indicated $400 a month will be the basic offering for limited cores.

I guess time will tell, but a bigger deal than the money to me is: will the clustering actually work well? Clustered databases are an extremely difficult problem. I've seen it get better over the years in things like MsSQL, but even with that sort of resources it took a long time (years) for the newer availability model to become stable.

At Stack Overflow we are still using OpenTSDB behind Bosun. But HBase sucks to manage if you don't have any other reason to be using the technology. So I see a lot of users interested in using InfluxDB as the backend (and some do, Bosun can talk to it) because it is easier to get started. But if you know you will have to scale up eventually, all the options right now are not appealing in TSDB land :-/ So if some $$ really does get a good TSDB that scales and is reasonable to manage then great, but I'm skeptical.

noir-york about 9 years ago

This basically kills Influxdb for me. We're evaluating influxdb in stand-alone mode to collect limited metrics (grafana) with a view to eventually moving more and more data to Influxdb once clustering became available. $399 per month is ridiculous. Clustering is table stakes.

Lesson learnt: before deploying a new OSS, check if they have a credible plan to support the project. Otherwise skip.

ericb about 9 years ago

Ugh, we are a startup, we invested in influxdb on faith that it would "get there", developed around it, and now we can't afford what they want to charge for the scalable version.

Maybe we can avoid clustering with some workaround (sharding), but I feel tricked.

gtirloni about 9 years ago

People will find increasingly clever ways to work around the lack of clustering, as they always did. This could mean only the top-tier users will be paying, which must be a very small portion of the pie.

Even though this announcement is sold as something that will empower InfluxData to add even more cool features to the open source version, I'm actually worried that it won't be around as an open source product for very long. Let me explain.

InfluxDB is a very fine product but I don't think it has all the momentum it needs to keep going in the long term yet. That means it's still building that critical mass of supporters in an open source ecosystem. With this announcement, InfluxData is eroding the trust that small to medium-sized users had in it, so the momentum slows down. If InfluxDB was a no-brainer for anyone starting a time-series project, it is not anymore. All the other alternatives need to be carefully analyzed and, even if InfluxDB is chosen, there is that voice in the back of your head saying you'll be in trouble if you exceed a single node's capacity. Eventually people will jump to the next TSDB solution that offers clustering as soon as it becomes available. Then where are all those paying customers going? It'll then be easier and cheaper for InfluxDB to become a proprietary software company.

While there are some comparisons being made to what Nginx does with its Nginx Plus offering, I think it's the opposite situation. IIRC, since the early days, Nginx has offered a paid product with more features. And recently they have started to add those features to the open source version, so people got happier. InfluxDB has always promised clustering (it was a major selling point, go watch any presentation about it from 2 years ago), shipped the code (even if it's half-working as of now) and now announced it's removing the functionality. Suddenly any InfluxDB node you have (even if it's running just fine without clustering) looks like a lemon.

It's all such bad PR. And it's 2016, haven't hundreds of OSS companies been through this already? Oh well.

damm about 9 years ago

I really don't see a point to worry about. So as of 0.12 the OSS product will change; however, a new product will come out to bring in replication. Somehow they will produce a private binary that will do the same thing.

Single node usage is great for most people; sure, it can be useful to cluster, and you have that option.

If there are limitations, it's open source; those can be removed. Then people would likely use the fork that doesn't have that limitation.

I don't see evil here. Just trying to work on their product.

kev009 about 9 years ago

"less than ½ of a percent are running active clusters." because it doesn't work. They have been TheatricDB to me for a long time, this is just another nail in the coffin.

sp1982 about 9 years ago

At Square we use Blueflood, developed by Rackspace, which uses Cassandra, coupled with a query layer called MQE (https://github.com/square/metrics). While it's actively developed, I definitely recommend taking a look if you are interested in a highly scalable metrics system with a decent strategy for rollups (100k+ metrics/sec).

hoov about 9 years ago

I'm pretty damn angry right now, but I'm sure that in a few days I'll get over it.

I run the tech side of a decent sized startup right now. Having joined ~4 years into the venture, one of my immediate concerns was the lack of visibility into how our system works. Keeping tabs on the performance of several hundred ETL pipelines is not super easy. I decided to double down on InfluxDB, changing the road-map of my infrastructure team.

Then, InfluxData introduced the TICK stack. I can get monitoring and alerting as well? Let's double down again!

I bought hardware, set up a cluster (painful), we filed bug reports, learned what sorts of queries not to run, and we were in good shape. I purchased tickets for a training session and a flight (out of pocket; we don't have travel budget). And then I saw the blog post last night.

All along, I knew that the promise of all of this great technology for free was too good to be true. There was always a little voice in my head asking me about how they actually made money -- I knew that a paid offering + professional services was not sustainable.

I'm angry, but we'll be fine. We're going to break apart our cluster and shard. We've already got code written to do a backup/restore that actually works (slowly) from one cluster to another. If we hit the point where sharding doesn't work, then it'll also probably be financially viable to pay the $399/month (on top of server costs). I'll still go to the training, but I'm not sure what I'll get out of it. The reason why I was going to the training was the section on "Cluster Administration".

The only thing that I'm raw about is the lack of apology. You have to make money, and you had to do a bait-and-switch as a result. I totally understand, but the tone of that blog post was too defensive and unapologetic.

tlipcon about 9 years ago

Hopefully this isn't too "pitch"-y, but: if you're looking for a database that's good at time series, will always be open source, and does support scale-out and HA, you might be interested in Apache Kudu (incubating).

Feel free to drop by our Slack (http://getkudu-slack.herokuapp.com) if you have any questions.

terom about 9 years ago

Congrats on the decision. Standalone InfluxDB can be scaled up just fine to meet most use cases, and it's better to have the long-term project sustainability that a hosted/enterprise offering can bring to the standalone offering.

InfluxDB 0.9 still had plenty of bugs, and I'd rather see a high-quality standalone server than any not-quite-there-yet clustered version.

spotman about 9 years ago

> ... "customers eventually drop support as their infrastructures mature and they look to reduce operating costs"

This is part of the game called software. In some worlds this is actually the goal: that the software works so well and is so reliable that your customers eventually don't all need to keep paying you.

So you find new customers and add new, never-before-advertised features, enterprise clients with SLA support contracts, etc.

It's very understandable that it's hard to monetize, but giving people the impression clustering was going to be included long term and then taking it away is not going to score you points.

Furthermore, you want all the folks doing a startup on a budget to hear a story that works for them. Most (especially new or young) cofounder programmer types rarely plan to not need clustering, even if the reality is most won't need or use it. Saying this is for the big kids only is a turn-off at this point in the Influx story.

Wish you the best of luck.

troyk about 9 years ago

I understand the need for businesses to make money, but I do not understand how a business can understand OSS and implement the OSS core-only model. If your product has demand, at some point, an OSS project will arise to displace you.

Are there many winners with this model to name? Maybe NGINX, MongoDB?

rodionos about 9 years ago

Axibase Time Series Database is free for pseudo-cluster installations and ships with SQL and visualization included [0].

We also have a fully functional Grafana driver in case the customer prefers it over the programmable visualization that we ship.

By programmable visualization I mean a way of building dashboards using toml-flavored configuration [1] and templating language, as opposed to manual design.

[0] https://axibase.com/products/axibase-time-series-database/

[1] https://apps.axibase.com/chartlab/2ef08f32

fangjin about 9 years ago

Druid (http://druid.io/druid-powered.html) is another option for similar workloads. Druid is a community-led open source data store used by many technology companies at very large scale. Comes with multiple visualization/open source applications, SQL interfaces, Grafana extensions, and a community to help with issues.

bussetta about 9 years ago

It appears that Paul Dix deleted his answer to this Stack Overflow question: http://web.archive.org/web/20150416175827/http://stackoverflow.com/questions/25540722/how-to-make-a-choice-between-opentsdb-and-influxdb-or-other-tsds

icbm504 about 9 years ago

It seems like there are 2 OSS models out there: 1) community supported; 2) company led.

Most of the "community supported" projects that I have used are libraries. Since shifting from developer to the devops team, nearly everything that I rely on is "company led": saltstack, docker, grafana, influxdb, ELK, nginx, sensu, rabbitmq, etc.

That being said, we are now re-evaluating the use of influxdb.

siscia about 9 years ago

An interesting way to monetize software could be to have people pay to access official docker images.

You want to pull the latest images an unlimited number of times? Sure, it is just XXX$/month.

Nothing will stop people from building open source containers of the same product, but those won't be "official" and there won't be any update guarantees...

reubensutton about 9 years ago

I wonder if clustering will be available in the cloud service?

We use a single server on Influx's cloud service for some internal metrics and it has been running without any interruptions for >6mo.

shaneduan about 9 years ago

Hello, I was directed here by the mention of blueflood. I hope it is OK to share some product information. Please let me know if there is a more appropriate way to do it. My only goal is to offer a solution to the pain we frequently run into ourselves.

Rackspace Metrics (the product team behind blueflood) is currently doing many times more than 300 metrics per second. Our goal this year is to reach a new scale beyond where we are now, driven by the needs of big internal adoption this year. So the blueflood project will be alive and kicking.

Here is the most recent update on the Rackspace Metrics product. All the functional work is done through blueflood: http://bit.ly/rax-metrics-mvp

Also, there is a hidden gem in monitoring that is unknown to most startups. My other product, Rackspace Monitoring, is a SaaS product designed to assist Rackspace's business rather than to make big bucks. If you have a Rackspace cloud account, which requires a minimum $50 service charge, you will be able to use Rackspace Monitoring for free. You can use it to monitor any other servers you host anywhere. This means that if your monitoring budget is already above $50 (all costs considered), you really should look into Rackspace Monitoring.

To learn about Rackspace Monitoring, please go here: https://www.rackspace.com/cloud/monitoring/

Please also feel free to reach out to me. We are located at the same location as the Rackspace Startup Program in San Francisco. We talk to startups all the time and always enjoy the learning.