btreecat · about 5 years ago
So I have been thinking about software projects lately, and I have come to the conclusion that a lot of these tools/solutions exist to "build houses" when most of us are just throwing together lean-to sheds and dog houses.

Software projects today are naturally more complex and have more complex tooling, the same way building a house today requires more knowledge and skill than it did 50 years ago.

Then there are some folks/organizations building cathedrals, with the associated tooling (React, Angular, Maven, etc.), and the rest of us look up in awe and think, "well, I guess if I want to be that good I need to use those tools on this dog house."

But your dog house doesn't need to host parties, provide security, or even offer real weather protection beyond a roof to keep the rain and sun out. Yet we all try to build our dog houses in ways that might pay off if they are one day converted into proper living quarters, even though they will likely never need running water or windows.
lostcolony · about 5 years ago
Failure is such a fun thing to think about, and it gets handwaved away so often. So many devs, architects, product owners, etc., focus only on the happy path and leave failure unspecced and unhandled, just hoping it never happens. Then they boast about 99% uptime, but once you start questioning them you find out they get weekly pages they have to investigate (and really the system is behaving weirdly a solid 10% of the time, but they don't know what to do about it, it eventually resolves itself, and they don't count "pageable weirdness" in their failure metric).

It's actually one of the things I love about Erlang, and how it's changed my thinking. Think about failures. Or rather: don't. Assume they'll happen, in ways you can't plan for. Instead think about what acceptable degraded behavior looks like, how best to ensure it in the event of failure, and how to recover automatically.
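Not real OTP, but a minimal Python sketch of that "decide what degraded behavior looks like up front" idea (the exchange-rate function, retry counts, and fallback value here are all made up for illustration):

```python
import random
import time

def fetch_exchange_rate():
    # Stand-in for a call that can fail in ways we didn't plan for.
    if random.random() < 0.3:
        raise ConnectionError("upstream unavailable")
    return 1.08

def rate_with_fallback(cached_rate, retries=3, base_delay=0.5):
    """Try the real thing a few times, then fall back to an explicitly
    chosen degraded value instead of taking the whole workflow down."""
    for attempt in range(retries):
        try:
            return fetch_exchange_rate(), "fresh"
        except Exception:
            time.sleep(base_delay * 2 ** attempt)  # back off before retrying
    return cached_rate, "stale"  # degraded, but defined and recoverable

rate, quality = rate_with_fallback(cached_rate=1.05)
print(f"using {quality} rate: {rate}")
```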
chubot · about 5 years ago
I think git is a good model for what would otherwise be "laggy async and mismatched" distributed systems. It has a fast sync algorithm, and after you sync, everything works locally on a fast file system. You explicitly know when you're hitting the network, rather than hitting it ALL THE TIME.

I would like to use something like git to store the source code of every piece of software I use, and the binaries. That is, most of a whole Linux distro.

I have been loosely following some "git for binary data" projects for a number of years. I looked at IPFS about 5 years ago, but it seems to have gone off the rails. The dat project seems to have morphed into something else?

Are there any new storage projects in that vein? I think the OP is identifying a real problem: distributed systems are unreliable, and you can get a lot done on a single machine. But we are missing some primitives that would enable that. Every application is littered with buggy and difficult network logic, rather than relying on a single tool like git (or rsync) that handles the problem in a focused and fast way.

It would be like if Vim/Emacs and GCC/Clang were all "network-enabled"... that doesn't really make sense. Instead they all use the file system, and the file system can be synced as an orthogonal issue.

Sort of related is a fast distro I'm looking at: https://michael.stapelberg.ch/posts/2019-08-17-introducing-distri/
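A rough sketch of that "sync explicitly, then do everything on the local filesystem" pattern, driving plain git from Python (the mirror path and remote URL are placeholders, not anything from the comment):

```python
import subprocess
from pathlib import Path

MIRROR = Path.home() / "mirrors" / "some-project"   # hypothetical local mirror
REMOTE = "https://example.com/some-project.git"     # placeholder remote

def sync():
    """The only step that touches the network, and it is explicit."""
    if MIRROR.exists():
        subprocess.run(["git", "-C", str(MIRROR), "pull", "--ff-only"], check=True)
    else:
        subprocess.run(["git", "clone", REMOTE, str(MIRROR)], check=True)

def grep_locally(pattern):
    """After the sync, everything is a fast local filesystem operation."""
    result = subprocess.run(["git", "-C", str(MIRROR), "grep", "-n", pattern],
                            capture_output=True, text=True)
    return result.stdout

sync()
print(grep_locally("TODO"))
```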
adrianmonk · about 5 years ago
See also Peter Deutsch's "Fallacies of Distributed Computing" list (https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing).

There's some overlap, but also some new stuff. In particular, "pipes" isn't covered by the Fallacies list, and it's consistently a pain point you end up facing in some way. "Asynchronous" isn't covered by the Fallacies list either.
smitty1e · about 5 years ago
> Sometimes a distributed system is unavoidable, such as if you want extreme availability or computing power, but other times it's totally avoidable.

But so much of our sales pitch involves these shiny cloud systems.

Who ever won business by telling the customer: "Your use case really isn't exciting, and a boring batch-driven process is completely appropriate"?
tlarkworthy · about 5 years ago
I love the categorization. But decent software should be distributed. I dislike single teams that beget tens of microservices, but you should buy, not build, features from specialized 3rd parties. Thus a decent modern installation should be leaning on a ton of 3rd-party services (e.g. identity providers, databases, caches), because they all do a better job than a hand-rolled local one. It's how you outsource expertise.

The vision of the service mesh is to make unreliability and security no longer the job of the application binary. Even without a service mesh, you can put a lot of common functionality into a reverse proxy. Personally I am loving OpenResty for the simplicity of writing adapters and OAuth at the proxy layer with good performance.
ChrisMarshallNY · about 5 years ago
This article has a point.

But, as in all things software, "it depends."

It depends on what the tools are, and what we are writing.

In my own case, I have the luxury of writing fully native Swift code for Apple devices. I don't need to work with anything off the device, except for fairly direct interfaces, like USB, TCP/IP or Bluetooth.

Usually.

I have written "full stack" systems that included self-authored SDKs in Swift, and self-authored servers in PHP/JS.

I avoid dependencies like the plague. Some of them are excellent, and well worth the effort, but I have encountered very few that really make my life as an Apple device programmer that much easier. The rare ones I do use (like SOAPEngine, or ffmpeg, for instance) are local to the development environment, and usually quite well written and supported.

If I were writing an app that reflected server-provided utility on a local device, then there's a really good chance that I'd use an SDK/dependency with network connectivity, like GraphQL, or MapBox. These are great services, but ones that I don't use (at the moment).

I'm skeptical of a lot of "Big Social Media" SDKs. I believe that we just had an issue with the FB SDK.

That said, if I were writing an app that leveraged FB services, I don't see how I could avoid their SDK.

So I write fully native software with Swift, and avoid dependencies. That seems to make my life easier.

But Xcode is still a really crashy toolbox.
perfunctory · about 5 years ago
Just reading the title, I assumed it was a post about business processes and communication between teams. Because that is how working for a big corp sometimes feels.
lowbloodsugar · about 5 years ago
I really like the curiosity and thought behind this article. A couple of thoughts:

> probably upwards of 80% of my time is spent on things I wouldn't need to do if it weren't distributed.

Sure. Do it on one giant machine. Then you'll be spending 80% of your time doing things you wouldn't need to do if it weren't monolithic.

At the end of the day, if your customer is on the other end of the internet, then all of those complaints apply. If you solve that by running an app on their device, then oh boy are you going to have fun testing.

I prefer scaling out. The Stack Overflow peeps prefer scaling up. There are some great write-ups about how they scale; I found this one [1] after some quick googling, but I am certain there are more. So it's really about choosing your poison.

> I think people should be more willing to try and write performance-sensitive code as a (potentially multi-threaded) process on one machine in a fast language if it'll fit rather than try and distribute a slower implementation over multiple machines.

Sure. I once replaced a system that ran on ten 32-core machines with one that ran on four cores on a single machine and did the work in the same time. Another time the system had 96 cores and even more threads, and I replaced it with one that had three threads and was faster.

But both of those solutions were evolutionary dead ends. The tasks were very specific and not subject to change. The first one was a single C file. The latter was actually Java, but with hand-rolled hash tables and optimistic locks. I doubt I could follow the first one now.

My point is, you can have understandable systems that good people (as opposed to geniuses) can work on, evolve, and adapt, and that have well-understood failure modes and scaling cliffs. Or you can have bonkers code that everyone is afraid to touch, and which fails in production when it hits a cliff you didn't know about, and now your site is dead for eight days.

If you can strike a good balance, then you'll probably have some combination of distributed and brute force.

[1] http://highscalability.com/stack-overflow-architecture
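A toy Python sketch of the "one beefy machine first" shape, just to make it concrete (the workload, chunk sizes, and core counts are invented; real performance-sensitive code would likely be in a faster language):

```python
from concurrent.futures import ProcessPoolExecutor
import math

def crunch(chunk):
    # Stand-in for real CPU-bound work on one slice of the data.
    return sum(math.sqrt(x) for x in chunk)

def total(data, workers=8, chunk_size=100_000):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # One machine, many cores: no wire formats, retries, or partial
    # failures between "nodes" to reason about.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(crunch, chunks))

if __name__ == "__main__":
    print(total(list(range(1_000_000))))
```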
evadne · about 5 years ago
Do you have a moment to talk about our Lord and Saviour, Erlang/OTP?
crazygringo · about 5 years ago
Of course they do. But there's no alternative.

No matter how fast or beefy your server is, these days if your product becomes a success, 99% of the time it will outgrow what's possible on a single server. (Not to mention needs for redundancy, geographic latency, etc.) And by the time you see the trend heading upwards so you can predict what day that will happen, you already won't have the time for the massive rewrite necessary.

So yes, it's tons slower to write distributed servers/systems. But what other choice do you usually have?

Though, as much as possible, you can *try* to avoid the microservices route, and integrate everything as much as possible into monolithic, replicable "full-stack servers" that never talk to *each other*, but rather rely entirely on things like cloud storage and cloud databases, where you're paying your cloud provider $$$ not to fail rather than handling it yourself. Sometimes this will work for you, sometimes it won't.
nitwit005 · about 5 years ago
I've found people have these problems inside their datacenter, where there is reliable low-latency bandwidth, but where things might be rebooted due to upgrades or maintenance.

A common example is data being pushed between systems with HTTP. Take the simplest case: propagating a boolean value. You toggle some setting in the UI, and it sends an update to another system with an HTTP request, retrying on a delay if it can't connect. This has two problems. The first is that if the user toggles a setting on and then off, you can have two sets of retries going, producing a random result when the far end can be connected to again. The second is that the machine doing the retries might get rebooted, and people often fail to persist the fact that a change needs to be pushed to the other system.

I've seen this issue between two processes on the same machine, so technically you don't even need a network.
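One way to patch both holes, sketched in Python (the endpoint, file name, and versioning scheme are all hypothetical, not from the comment): persist the latest desired value with a version before any push, so a reboot can't forget it and a stale retry can't overwrite a newer toggle on the receiving side.

```python
import json
import time
import urllib.request
from pathlib import Path

STATE_FILE = Path("desired_setting.json")             # hypothetical local persistence
TARGET_URL = "http://other-system.internal/setting"   # placeholder endpoint

def set_flag(value: bool):
    # Record intent durably, with a version, before doing any network I/O.
    state = {"value": value, "version": int(time.time() * 1000)}
    STATE_FILE.write_text(json.dumps(state))

def push_pending():
    # Always push the *latest* persisted state; safe to call after a reboot
    # or from a periodic retry loop.
    desired = json.loads(STATE_FILE.read_text())
    req = urllib.request.Request(
        TARGET_URL,
        data=json.dumps(desired).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except OSError:
        pass  # the state file is still there; a later retry sends the newest value

# The receiver would ignore any update whose version is older than the last one it applied.
```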
jancsika · about 5 years ago
> Untrusted: If you don't want everything to be taken down by one malfunction you need to defend against invalid inputs and being overwhelmed. Sometimes you also need to defend against actual attackers.

Sorry, but unless your centralized alternative is only used internally by troglodytes, you have to at least defend against invalid inputs.
robbrown451 · about 5 years ago
The title reminded me of the turboencabulator.
https://www.thechiefstoryteller.com/2014/07/16/turbo-encabulator-best-worst-jargon/
lazyjones · about 5 years ago
Networking and distributed software are well understood nowadays. The lower productivity of web companies vs. SpaceX etc. comes from all the unstable (both ever-changing and buggy) software they need to use. Most modern software is affected by this, but the web has it worse, because of security issues and because of the way browsers are evolving (on purpose, one has to add, because a cartel of large players on the web is trying to stifle competition). SpaceX doesn't get some innovative new alloy they didn't order every couple of weeks, and even the games industry has fewer obstructing external dependencies (hardware vendors and their drivers being one).

At least that's my experience from about 20 years of web development (15 professionally).
at_a_remove · about 5 years ago
Yes, I have a little Python library for managing network shares in Windows.

It has things like automatic retries that "back off" slowly, switching to cached IPs in case DNS is down, and checking whether all of the drive letters are taken and then either re-using a letter or creating a "letter-less" share. I had to develop it during a period of great instability within our network. It's... large and over-engineered, but it just keeps on truckin'.

On the other hand, it has been quite useful going forward, so that's a plus.

I tend to program fairly defensively, in layers, right down to the much-maligned Pokemon exception handling. The results don't have the, ah, velocity that is so often praised, but they'll be there ticking along years later.
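Something in the same spirit (not the actual library, just a hypothetical Python sketch of the slow backed-off retries it describes):

```python
import random
import time
from functools import wraps

def retry(attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry on OSError with exponential backoff plus jitter."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for i in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except OSError:
                    if i == attempts - 1:
                        raise  # out of patience; let the caller see the failure
                    # "back off" slowly, with jitter so retries don't stampede
                    time.sleep(random.uniform(0, min(max_delay, base_delay * 2 ** i)))
        return wrapper
    return decorate

@retry()
def mount_share(unc_path):
    # Stand-in for the real mapping logic (net use, cached-IP fallback, etc.).
    raise OSError(f"could not reach {unc_path}")

# mount_share(r"\\fileserver\projects")  # would retry a few times, then raise
```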
hn_throwaway_99 · about 5 years ago
Hah, I saw the title "Fragile narrow laggy asynchronous mismatched pipes kill productivity" and thought it was about the pitfalls of trying to coordinate remote teams across disparate time zones.
carapace · about 5 years ago
> I hope this leads you to think about the ways that your work could be more productive if you had better tools to deal with distributed systems, and what those might be.

We have tools. The Promela/SPIN model checker is one, just off the top of my head.
Andaith · about 5 years ago
Just some fun with the English language:

> probably upwards of 80% of my time is spent on things I wouldn't need to do if it weren't distributed.

If you don't ever plan on distributing your software you can save a _lot_ of time :)
wpietri · about 5 years ago
Funnily, I thought the headline was talking about the development process, as that also describes how a lot of places (mis)handle the flow of what gets worked on.
_bxg1 · about 5 years ago
> I also think all these costs mean you should try really hard to avoid making your system distributed if you don't have to.

There's a point here about microservices.
emmelaich · about 5 years ago
It's perfectly applicable to people too: one stickler for the rules, or one slow worker, in a critical role (say, security officer or change board chair) can kill productivity.
ahh · about 5 years ago
I feel so attacked right now.
RandyRanderson · about 5 years ago
TL;DR: he means microservices. We now have a generation that doesn't even recall a non-microservices world.

Tristan, the generation between us has created the IT world you describe. You'll probably spend the next 20 years of your career dealing with that mess. Sorry about that.
gfxgirl · about 5 years ago
doh! I thought this was going to be about remote work.