
We decided to go for the big rewrite

167 points by duijf over 5 years ago

16 comments

royjacobs over 5 years ago
Reading this article, it seems like yet another example of "you don't have big data". Most of the features that are unique to Spark (or Spark-like setups) were not needed, so in the end it's mostly... just an app talking to Postgres?

I'm not sure, but reading other articles [0] on the blog, it seems like they've jumped on bandwagons before, so it's probably good to revisit those decisions every now and again.

Edit: Not trying to come off as too snarky, though I've found that this type of thing is pretty common in startups where everyone from the CTO on down has *some* experience but not *a lot* of experience. I've fallen into that trap too, at some point saying "Sure, Scala will work great! It's future-proof and everyone will love it!" *cue crickets*

[0] https://tech.channable.com/posts/2017-02-24-how-we-secretly-introduced-haskell-and-got-away-with-it.html

alexpotato over 5 years ago
Rewriting an application can mean different things:

1. "We are going to start over from scratch and rewrite the whole thing!"

Joel Spolsky famously said to "Never do this!"

2. "We are going to slowly refactor the whole codebase."

This can, eventually, lead you to a place where none of the original code is there, so it's like a rewrite but much simpler.

3. "We are going to slowly add new pieces to replace the old system till there is nothing left of the old system."

This is called the "Strangler Fig Application" model as described by Martin Fowler (https://martinfowler.com/bliki/StranglerFigApplication.html).

Granted, for some reason, "I retired a legacy system and rolled out a brand new one" seems to look better on a resume than "I refactored a legacy system into a better system," so YMMV.

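A minimal sketch of the strangler-fig idea described in option 3: a thin routing layer sends already-migrated endpoints to the new service and everything else to the legacy system. The hostnames and path prefixes here are hypothetical, and a real deployment would more likely use a reverse proxy (nginx, Envoy) than application code.

```python
# Strangler-fig routing sketch: migrated paths go to the new service,
# everything else keeps hitting the legacy system. All names are made up.
import urllib.request

MIGRATED_PREFIXES = ("/reports", "/exports")      # endpoints already rewritten
LEGACY_BACKEND = "http://legacy.internal:8080"    # hypothetical hosts
NEW_BACKEND = "http://new-service.internal:8080"

def route(path: str) -> str:
    """Pick the backend that should serve this request path."""
    backend = NEW_BACKEND if path.startswith(MIGRATED_PREFIXES) else LEGACY_BACKEND
    return backend + path

def fetch(path: str) -> bytes:
    # Forward a GET to whichever system currently owns this endpoint.
    with urllib.request.urlopen(route(path)) as resp:
        return resp.read()
```

As more endpoints are rewritten, prefixes move into the migrated list until the legacy backend serves nothing and can be retired.
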
gtsteve over 5 years ago
This doesn't really sound like a big-bang rewrite as such, but an incremental development process. The sort of rewrite that would be a major strategic mistake is where you start with a new repository and begin re-implementing the entire product, but this seems all perfectly sensible to me.

This just sounds like the sort of incremental Ship of Theseus [0] development that many of us are doing. The product I'm working on has had enough key internal portions rewritten over a long enough time (including, interestingly, the job management system) that you could say it's a rewrite compared to the product from 2 years ago.

[0] https://en.wikipedia.org/wiki/Ship_of_Theseus

Darkstryder over 5 years ago
> Prematurely designing systems "for scale" is just another instance of premature optimization

> Examples abound: (...) using a distributed database when Postgres would do

This is the only part of the article that bugged me a little, because in my experience the choice between single-machine and distributed databases is not so much about scale as it is about availability and avoiding a single point of failure.

Even if your database server is fairly stable (a VM in a robust cloud, for instance), if you use Postgres or MySQL and you need to upgrade to a newer version of the database (say, for an urgent security update), you have no choice but to completely stop the service for a few seconds or minutes (assuming the service cannot work without its database).

Depending on the service and its users, this mandatory downtime might or might not be acceptable.

Anecdotally, I suspect services requiring high SLAs are more common than ones requiring petabyte-scale storage.

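One common way to soften the brief outage described above, short of going distributed, is to retry connections while the database restarts. A rough sketch assuming psycopg2 as the driver; the DSN, attempt count, and delay are invented for illustration.

```python
# Retry a query while Postgres is briefly down (e.g. restarting for a
# minor-version security update). DSN and timings are hypothetical.
import time
import psycopg2

DSN = "dbname=app user=app host=db.internal"  # placeholder connection string

def query_with_retry(sql, params=(), attempts=10, delay=1.0):
    """Run a read-only query, backing off while the database is unavailable."""
    for attempt in range(attempts):
        try:
            conn = psycopg2.connect(DSN)
            try:
                with conn.cursor() as cur:
                    cur.execute(sql, params)
                    return cur.fetchall()
            finally:
                conn.close()
        except psycopg2.OperationalError:
            if attempt == attempts - 1:
                raise                 # still down after all attempts
            time.sleep(delay)         # database restarting; try again shortly
```

This only hides restarts of a few seconds; longer maintenance windows or writes that cannot wait are where replication or a distributed setup starts to pay for itself.
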
jermaustin1 over 5 years ago
This is a story of one of my first product launches, and the inevitable rewrites that ensued. You can read more of it here: https://jeremyaboyd.micro.blog/2016/11/05/my-first-product.html

Years ago I was building SEO software. One of the products was originally written as an internal tool, and handled our workload without skipping a beat. Then we decided to release it to the public, so I did a small refactor to implement accounts. We launched with it hosted on a small Dell PC under my desk (where it had been running as our internal tool). Within 2 hours of launch, it was completely overwhelmed and shutting down due to overheating.

It was "rewrite" time.

While doing that, I had to come up with SOME workaround. So I opened the case and stuck a box fan on it to try and exhaust some of the heat. That lasted about 8 more hours before the server shut down, and I got a call from the boss.

I went in to the office in the middle of the night and started profiling the application. I found a VPS host, quickly spun up the largest Windows VM they offered, and that helped for a few days while I rewrote large swaths of the application. Even after a ~80% rewrite and splitting the application in two, we had more users than we'd ever anticipated and I was out of my depth with scaling. So we got a few (much larger) physical servers at Softlayer.

This was the setup the website ran under for the next couple of years with minor tweaks (more space with an iSCSI array, more RAM, migrating to more CPUs, etc.), but all staying at Softlayer. Eventually, when the hosting bill was getting into the high four figures a month, we reevaluated and decided a rewrite was in order to switch to Microsoft Azure, utilizing Azure SQL, Azure Table Storage, and Azure Queue Service, and offloading all of the complicated tasks from the web server onto the Azure infrastructure. For all I know it is still on Azure.

goto11 over 5 years ago
Is the current system so badly architected that it cannot be refactored gradually or rewritten piecemeal? Then the forces which caused these problems will also be in effect during the rewrite, so it will end in the same place when it reaches feature parity.

I can only think of a few places where a full rewrite is justified:

* You lost the source code.

* The application is almost purely integration with some 3rd-party platform or component, and you need to replace that platform. (E.g. you are developing a registry cleaner and need to port it to Mac.)

* You don't have any customers or users yet and time-to-market is not a concern.

* You are not a business and are writing the code purely for your own enjoyment.

But these are business-level considerations. For individual developers there may be compelling reasons to push for a rewrite:

* You find it more fun to work on green-field projects than to perform maintenance development.

* The new platform is more exciting, or looks better on the CV, than the old one.

mark_l_watson over 5 years ago
I have also found Spark (and Hadoop before that) a little clunky to prototype and develop on, but when you need to handle very large datasets with good throughput performance, systems like Spark/Hadoop are great. One problem with them was maintaining the infrastructure, and to be honest, when I used MapReduce as a contractor at Google, or AWS Elastic MapReduce as a consultant, I didn't have to deal too much with infrastructure.

Anyway, it makes sense that they backed off from Spark and HDFS, given the size of their datasets.

The original poster mentioned that their data analytics software is written in Haskell. I would like to see a write-up on that.

EDIT: I see that they do have two articles on their blog about their use of Haskell.

z3t4 over 5 years ago
"A creature from another dimension would see us as noise, as our atoms are constantly being replaced."

Don't wait for a big rewrite. Constantly keep deleting and rewriting. Just make sure you are solving real problems while doing it.

lifeisstillgood over 5 years ago
>>> One of our main reasons for choosing Apache Spark had been its ability to handle very large datasets (larger than what you can fit into memory on a single node) and its ability to distribute computations over a whole cluster of machines ... We cannot fit all of our datasets in memory on one node, but that is also not necessary, since we can trivially shard datasets of different projects over different servers, because they are all independent of one another.

So this seems to be the massive takeaway: if you need to operate on a *whole* dataset that is larger than one node's memory capacity, then you have to go distributed. Otherwise it still seems an overhead barely worth the effort.

So Google: the dataset is all web pages on the internet; yes, that's too large, go distributed.

Tesco / Walmart: the dataset might be all the sales for a year. Probably too large. But could you do with sales per week? Per day?

Having the raw data of all your transactions etc. lying around waiting for your spiffo business query sounds good, but... is it?

I would be interested in hearing folks' cut-off points for going full Big Data vs "we don't really need this".

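A minimal sketch of the "shard independent project datasets over servers" idea from the quoted passage. The server names are hypothetical, and a real setup would also need rebalancing when nodes are added or removed.

```python
# Map each project's dataset to one server, since projects are independent.
# Server names are made up; this only illustrates the idea.
import hashlib

SERVERS = ["worker-1.internal", "worker-2.internal", "worker-3.internal"]

def server_for_project(project_id: str) -> str:
    """Deterministically assign a project (and all its data) to a single node."""
    digest = hashlib.sha256(project_id.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

# No single node ever needs every dataset in memory at once; each node only
# holds the projects that hash to it.
print(server_for_project("project-42"))
```
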
jimbokun over 5 years ago
We decided to go for a Big Rewrite for a completely different reason. The initial license for the proprietary NoSQL database we had negotiated was about to expire, and the company was going to charge us an order of magnitude more to renew.

So we immediately set out redesigning our system to use other, fully open-source technologies. It also gave us an opportunity to reconsider architecture decisions that had not scaled well. In our case, moving from a monolith to microservices has had major benefits, maybe the biggest being able to quickly see which microservice is the bottleneck and needs to be scaled up to handle the load. With the monolith, if it got slow, it was very difficult to figure out which part of the workload was making it slow.

bluedino over 5 years ago
Our company runs on a pile of VBA/Access. At least it talks to a MariaDB server on Linux.

The biggest problems are trying to run/develop this code on machines made in the last ten years, and the fact that it's a horrible, horrible codebase, full of code practices from the early 90's.

To make things worse, objects are 'evil', all HTML/SQL/XML is built by appending strings, and there are no data sanity checks...

I started a proof-of-concept replacement system that was written in Python and ran on the web.

It was met with "We can't go with a web-based system, since if anything changes in the browser we'll be up shit creek."

:-/

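To make the string-appending complaint concrete, here is a small illustrative comparison in Python rather than VBA; the table, column, and values are made up, and sqlite3 merely stands in for whatever driver the real system uses.

```python
# Why building SQL by appending strings is fragile, versus letting the
# driver parameterize it. sqlite3 stands in for any DB-API driver.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.execute("INSERT INTO orders VALUES (1, ?)", ("O'Brien",))

customer = "O'Brien"  # the embedded quote breaks naive string concatenation

# Fragile: query text assembled by hand, no escaping, injectable.
#   "SELECT * FROM orders WHERE customer = '" + customer + "'"

# Safer: the driver handles quoting and type conversion.
rows = conn.execute(
    "SELECT * FROM orders WHERE customer = ?", (customer,)
).fetchall()
print(rows)  # [(1, "O'Brien")]
```
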
sandGorgon over 5 years ago
In case you are using PySpark, a good framework to move to is Dask: https://docs.dask.org/en/latest/

It is also natively integrated with K8s (https://github.com/dask/dask-kubernetes) and Yarn (https://yarn.dask.org/en/latest/).

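For readers coming from PySpark, a tiny sketch of what the Dask dataframe API mentioned above looks like; the file paths and column names are invented for the example.

```python
# Minimal Dask dataframe example: lazy reads and aggregations, executed
# only when .compute() is called. Paths and columns are hypothetical.
import dask.dataframe as dd

# Lazily read a directory of CSV files, similar in spirit to spark.read.csv(...).
df = dd.read_csv("data/events-*.csv")

# Build a lazy aggregation: total bytes per project.
per_project = df.groupby("project_id")["bytes"].sum()

# Trigger the actual computation (runs on threads, processes, or a cluster).
print(per_project.compute())
```
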
roland35 over 5 years ago
I am currently evaluating if we should rewrite a large chunk of our embedded controller which handles motion control right now, so this is a timely write-up! I think the lessons here are the same in embedded code - we can keep the existing black box (consultant written!) code for now while the new motion control code is written in a parallel branch. The more modular the project the easier this is luckily!
sebastianconcpt over 5 years ago
> This begs the question: in which situations is it appropriate to decide on a full rewrite?

> In theory, there is an easy answer to this question: If the cost of the rewrite, in terms of money, time, and opportunity cost, is less than the cost of fixing the issues with the old system, then one should go for the rewrite.

> In our case, the answer to all of these questions was yes.

> One of our original mistakes (back in 2014) had been that we had tried to "future-proof" our system by trying to predict our future requirements. One of our main reasons for choosing Apache Spark had been its ability to handle very large datasets (larger than what you can fit into memory on a single node) and its ability to distribute computations over a whole cluster of machines [4]. At the time, we did not have any datasets that were this large. In fact, 5 years later, we still do not. Our datasets have grown by a lot for sure, both in size and quantity, but we can still easily fit each individual dataset into memory on a single node, and this is unlikely to change any time soon [5]. We cannot fit all of our datasets in memory on one node, but that is also not necessary, since we can trivially shard datasets of different projects over different servers, because they are all independent of one another.

> With hindsight, it seems obvious that divining future requirements is a fool's errand. Prematurely designing systems "for scale" is just another instance of premature optimization, which many development teams seem to run into at one point or another.

AzzieElbab over 5 years ago
This article is insane. I wouldn't even know how to begin configuring an HDFS/Spark cluster for 10 GB of data.
apta over 5 years ago
Next up: why we decided to re-write our Haskell re-write in $HYPED_LANGUAGE.