Swap (her solution), xor the kernel should just OOM kill Chrome; but spinning the disk for half an hour to accomplish borderline nothing for a human who has long since gotten bored and left is pointless. The behavior noted in the LKML post is severely annoying, and not productive.

My previous team was also a "no swap in prod" shop, and this behavior bit us more than I care to admit. The devs were occasionally on the side of "swap for safety", ops was religiously no-swap, and ugh. It can take 10+, even 30+ minutes for systems encountering this to resolve to some meaningful conclusion, and half the time I'm desperately trying to ssh in so as to kill -9 the errant task *anyway*, but ssh is paged out, and I wish the OOM killer would just do it for me instead of Linux trying to page everything through what feels like a single 4KiB page. I need to play around with sysctls more on some sort of test rig.

On AWS instances with EBS disks (most instances), disk is basically network.

I once suggested "cgroup'ing" (loosely speaking) the entire system into two rough buckets: one for SSH, with enough dedicated RAM that ssh will never get swapped, and one for everything else.

Also, I feel like very few devs understand that mmap'd files, including binaries and libraries, are basically mini swap files when memory pressure is high; more than once I've diagnosed a machine as "page thrashing" and gotten back "what? but it has no swap, that cannot be?". Well, pgmajfault and disk I/O metrics don't lie.
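For the SSH bucket, a systemd drop-in is roughly the shape of what I had in mind. This is only a sketch: the numbers are made up, it assumes cgroup v2 and a reasonably recent systemd, and the unit is ssh.service on Debian-ish systems (sshd.service elsewhere).

```sh
# Sketch: keep sshd resident under memory pressure and exempt it from the OOM
# killer. Assumes cgroup v2 and a recent systemd; numbers are illustrative.
sudo mkdir -p /etc/systemd/system/ssh.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/ssh.service.d/memory.conf
[Service]
# cgroup v2: memory below this floor is not reclaimed under pressure
MemoryMin=64M
# never pick sshd as the OOM victim
OOMScoreAdjust=-1000
EOF
sudo systemctl daemon-reload
sudo systemctl restart ssh
```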
Swap, the sub-religion.

I've wobbled back and forth on swap. In the early days I used to be annoyed by how much disk space it took (an 8 gig disk with 1 gig for swap is too much).

I've run an 8 core machine with only 2 gigs of RAM and tried to compile something with Boost in it. Swap allowed me to kill it and recover the system.

I've run VMs with no swap, some swap, and loads.

However, what I've never done is actually benchmark the same workload on machines with no, some, and loads of swap. So I generally defer to Rachel, because Rachel has been there and been bitten by that before.

On this point, http://rachelbythebay.com/w/2018/04/28/meta/ (which everyone should read and digest), this remark jumped out:

> "This is so easy to test for"

If I ever say this and don't qualify it, it should read: "haha, yeah, I made that same mistake, that's why I test for it now."

The only reason I am "better" [I hope] than my younger self is that I've made a bucketload of mistakes before. Some of them are technical, but to be honest a load of them are societal (as in, bleating like Cassandra and not being able to effect change).

If we, as "engineers", are to grow as a class of people, we have to actually learn from other people's mistakes, not just use them as confirmation of our biases. This is why I like blog posts that lay out the problem, fallout, cause, workaround, and eventual solution.
A non-swap comment:

> For everyone else, you'd probably cry too. I sure did.

I remember a colleague (employee) crying because a third-party vendor screwed up and tried to blame us, which would have sent a multimillion dollar project down the tubes. What saddened me was that she felt the need to apologize. It was a pure expression of frustration, anger, sadness, and exhaustion on a project we were all deeply committed to, produced by the brazen unfairness of this contract house.

It's not good to live in a culture that denigrates human expression. I'm glad rachelbythebay was able to express this.

* We were able to apportion blame properly, get a proper result from someone else, and make the regulators happy with no funny business.
Desktop machine? Swap. For some reason you have a single server and don't care about performance? Swap. Running an application that might have a working set that's larger than RAM and doesn't know how to do its own disk paging? Swap's good there!

Larger scale systems with redundancy? No swap.

Having swap in systems like this still doesn't make sense to me. It treads heavily on the "cattle not pets" philosophy. I shouldn't be ssh-ing into a machine that's swapping to see what's up. It should be killed. One server in the cluster starts swapping and falls out of step with its peers? It should be killed. When a machine starts swapping it falls into a whole different performance regime than the rest of your systems, and now you've got more variance in your response times. Not good when you care about your response times. Unless you have memory-pretending-to-be-disk for swap (in which case, why isn't it just memory?).

I've never seen a machine 'act funny' because it didn't have swap; it's always the other way around. I don't think I've ever encountered a machine that used so much memory that the kernel didn't have buffers, but not so much that it invoked the OOM killer. Unless there was a woefully misconfigured process running on the machine.

If a machine is well utilized CPU-wise, it is going to get absolutely crushed when it starts swapping.

Time and time again I see swap being an issue. For the past year I've been in a large-scale shop which for some ungodly reason has swap (nowhere else I've been in the past 10 years has had swap as a general rule).

Don't even get me started on EBS IOPS exhaustion when you start swapping onto an EBS volume.
I wonder why we talk so often about swap but rarely about using zram. I mean, isn't it much simpler to add some zram as swap instead of messing with partitions? In the end it should solve the problem equally well, shouldn't it?

I have seen this done on Android devices and wondered why it is so rarely used in other areas (desktops/servers).
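For reference, the manual setup is only a handful of commands. This is a rough sketch; the size and compressor are illustrative, and many distros now ship zram-generator or zram-tools that do it for you.

```sh
# Create a compressed RAM-backed swap device and prefer it over any disk swap.
sudo modprobe zram
echo lz4 | sudo tee /sys/block/zram0/comp_algorithm   # set compressor before disksize
echo 2G  | sudo tee /sys/block/zram0/disksize
sudo mkswap /dev/zram0
sudo swapon -p 100 /dev/zram0   # higher priority than any disk-backed swap
swapon --show
```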
I don’t know if this is just the big corporations in Silicon Valley; guys in general around here (in tech) seem like that. There’s a whole movement around empathy and vulnerability, but that just makes the competition more veiled.
I’ve run clusters of several thousand machines with petabytes of RAM installed and no swap (or even disks).

It works just fine; however, you need to keep appropriate headroom so the kernel can do its thing with caches, as indicated, otherwise things get very weird very quickly.

Containers are very helpful in this regard, for explicitly dividing a machine up between processes without allowing any one of them to get out of hand.
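Concretely, the "divide the machine up" part is just hard per-workload memory limits. With Docker it looks something like the sketch below; the image name and sizes are placeholders.

```sh
# Cap one workload so it can't eat the headroom the kernel needs for caches;
# setting --memory-swap equal to --memory means no swap for this container.
# Image name and limits are illustrative.
docker run -d --name worker \
  --memory=4g --memory-swap=4g \
  myorg/worker:latest
```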
On the swap aspect:

I absolutely hate Linux's behavior with swap enabled, as described in a previous thread: https://news.ycombinator.com/item?id=20479622

It makes sense that it can also be broken with swap disabled: paging out too many file-backed pages can also lead to an unresponsive system.

> Earlier this week, a post to the linux-kernel mailing list talking about what happens when Linux is low on memory and doesn't have swap started making the rounds. ... Now, here we are in 2019, and we have a fresh set of people still fighting over it, like it's some kind of brand new dilemma. It's not.

The problem isn't new, but the approach I saw them discussing (use the new PSI stuff to OOM kill early) is new; PSI was only added about a year ago, iirc. So I think this comment is unnecessarily dismissive.

I've personally seen systems behave badly with swap. I don't see the bad swapless behavior as often, but I believe it exists. (In particular, I haven't tried the reproduction instructions in the lkml thread.) I don't know how the "tinyswap" approach is supposed to help; I'd love details. Swapless with PSI-based OOM killing is an approach that actually makes sense to me in theory.
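For anyone who hasn't looked at PSI: the kernel exposes memory stall percentages in /proc/pressure/memory (4.20+ with PSI enabled), and the proposals boil down to killing something when the "full" stall numbers stay high. Here's a toy illustration of that idea only (not oomd/systemd-oomd itself), with a made-up threshold:

```sh
# Watch memory pressure; "full avg10" is the % of the last 10s during which
# all runnable tasks were stalled waiting on memory.
while sleep 5; do
  avg10=$(awk -F'avg10=' '/^full/ {split($2, a, " "); print a[1]}' /proc/pressure/memory)
  awk -v v="$avg10" 'BEGIN { exit !(v > 20) }' && echo "memory pressure high: full avg10=$avg10"
done
```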
> I stand by my original position: have some swap. Not a lot. Just a little.

Is there some exact figure on this? Like, what percentage of RAM should be allocated as swap space?
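Even if the answer is just "a gig or two", it's at least cheap to experiment with, since a swapfile doesn't require repartitioning. A sketch, with an arbitrary size:

```sh
# Add a small swapfile; some filesystems want dd instead of fallocate for swap.
sudo fallocate -l 1G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # persist across reboots
```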
If you don't allocate swap, you have to do other things to compensate, like reducing or eliminating overcommit.

At one company where I worked over a decade ago, we ran some Linux-based equipment without swap as well. To prevent executables from being evicted under memory pressure, I put a hack into the kernel: executables and shared libs were mapped such that they were nailed into memory (MAP_LOCKED).
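The overcommit part, at least, is doable from userspace with sysctls; a sketch, with illustrative values:

```sh
# Strict accounting: the kernel refuses allocations past the commit limit
# instead of overcommitting and OOM-killing later.
sudo sysctl vm.overcommit_memory=2
sudo sysctl vm.overcommit_ratio=80   # commit limit = swap + 80% of RAM
```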
I would assume the consensus is clear these days: swap is good and should be enabled in most cases for Linux > 4.0.

Of course, real life is often different from theory. Does your machine have a spinning disk or an SSD? I am much quicker to enable swap on an SSD, since it won't be painfully slow should we ever get into a situation where our RAM is saturated.

What happens in cloud VMs? These use network disk storage (transparent to us), and writes often need to be sent over the network more than once (for redundancy). How would extensive swapping behave in such an environment?

As for saying no: it's important to set some rules to avoid chaos, but it's also important to trust our senior people to make decisions. If they need to go against a rule, I would expect a good explanation in their commit (because, infrastructure as code) and documentation. If a junior wants to go against a rule, they can consult a senior. Issuing a "no" and expecting everyone to follow it blindly is the worst form of micromanagement. :)
It seems to me that if you hit the point where you really need swap, then you're already in trouble. Maybe that swap gives you a little buffer before things get really bad, but chances are it will just keep you unaware of your impending problems until they go critical (unless you have lots of good monitoring/alerting).
> Item: If you allocate all of the RAM on the machine, you have screwed the kernel out of buffer cache it sorely needs. Back off.

Why not just permanently allocate enough RAM for the kernel? If I have 16GB of RAM but the kernel needs 1GB to do its job, then just tell me that I have 15GB to work with.
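The closest existing knob I know of is vm.min_free_kbytes, which sets a floor of free memory the kernel defends via its reclaim watermarks. It's not a clean "15GB usable" split, and the value here is illustrative:

```sh
# Keep roughly 256 MiB free for the kernel's own needs (value in KiB).
sudo sysctl vm.min_free_kbytes=262144
echo 'vm.min_free_kbytes = 262144' | sudo tee /etc/sysctl.d/90-minfree.conf   # persist
```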
This is an example of where desktop and server engineers could benefit from having embedded design experience.

It's certainly possible to create a small, protected area of memory that contains a kernel-level interrupt handler (which itself allocates no memory) whose sole job is to run a couple of times a second and check for thrashing and OOM. If it sees memory problems, it takes over the computer, determines which processes are using the most memory, and kills the ones that are expendable. ("Expendable" is a list configurable by the user, and yeah, Chrome would be right at the top for a desktop system.)

Embedded systems designers routinely build such watchdogs into their systems. It could probably be added to Linux as a kernel patch.
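You can approximate the idea from userspace today (earlyoom and oomd are real implementations). A deliberately crude sketch of the watchdog loop, with the process name and threshold as placeholders:

```sh
# Placeholder: whatever is on your "expendable" list.
EXPENDABLE="chrome"
while sleep 2; do
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  if [ "$avail_kb" -lt 262144 ]; then          # under ~256 MiB available
    echo "low memory, killing $EXPENDABLE" | logger -t memwatch
    pkill -9 -x "$EXPENDABLE"
  fi
done
```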
In a first for one of these "bad places to work" stories, I recognized the project she described in the "A patch which wasn't good enough (until it was)" post linked from this one, so I looked up the history. Sure enough, I know the developer she was complaining about in both posts.

In the patch case, he asked about testing, and they realized the ssh/scp versions she tested with weren't the same as the ones the code was using. She promised to follow up with best-practice testing and didn't. (Without knowing the reason, this isn't unusual: people get busy and drop things all the time.) I didn't get the same sense of rejection or hostility she did. And the second developer (who got her patch accepted) credited her in the code review, tested in a middle way (better than she originally did, worse than she promised to do later), requested the review from a different person than she had (why, I don't know), and got a review question with a similar tone before it was accepted. None of the parties' behavior looked unusual or red-flag-worthy to me.

I don't fault her for imperfectly describing an interaction that was five years old when she wrote that post and is twelve years old now. I'm trying to figure out what the lesson is and who should be learning it. A few unorganized ideas:

* Much of what people are thinking and feeling is left unwritten/unsaid, so two people can have very different ideas of what happened. (A reminder, I suppose, to listen to both sides before making a judgement on something.)

* I don't want to dismiss her feeling about bad team dynamics, even if I don't see them in this particular interaction. "At the end of the day people won't remember what you said or did, they will remember how you made them feel." - Maya Angelou

* A (imo typical) code review question can seem intimidating or hostile coming from a senior developer when "you're already not sure you belong there at all". Maybe an in-person follow-up would have helped, either then or later ("hey, did you have a chance to try writing that test? can I help? I want to get your change in"). I've been on both sides of this one. The junior developer often wants some extra help and attention, and the senior developer is often feeling overwhelmed by the volume of questionable-quality things coming in, such that they go into more of a gatekeeper role than trying to mentor each person thoroughly in each interaction. (I think this is what she's talking about with "Any lazy fool can deny a request and get you to 'no.' It takes actual effort to appreciate and recognize what they're trying to accomplish and try to help them get to a different 'yes'.")
On every dedicated box I keep a swap partition, with an alert raised when it's used beyond a certain threshold. For all VMs, no swap, because as far as they're concerned disk = network.

Then again, anything above 80% memory utilisation and we begin looking at adding another box to the cluster, because occasional spikes in usage can easily put us beyond what swap can protect against, and that just causes a shit storm.
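The check itself is trivial; something like this is what the alert hangs off, with an arbitrary threshold:

```sh
# Percentage of swap in use; alert above 25%.
used_pct=$(free | awk '/^Swap:/ { if ($2 > 0) printf "%d", $3*100/$2; else print 0 }')
[ "$used_pct" -gt 25 ] && echo "swap ${used_pct}% used on $(hostname)" | logger -t swap-alert
```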
Situations like the ones described are the reason that the Bastard Operator From Hell is still a wet dream for some of us.

http://bofh.bjash.com/
One problem with Linux in low-memory situations is that the OOM killer is a really blunt instrument. It would be nice if it were a lot more configurable. Simple OOM scores don't cut it, IMO.
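Today, "configurable" mostly amounts to per-process score nudges via /proc, which is exactly the blunt part. For example (the batch-job name is a placeholder):

```sh
# -1000 exempts a process from OOM killing entirely; +1000 makes it the preferred victim.
echo -1000 | sudo tee /proc/"$(pidof -s sshd)"/oom_score_adj
echo 1000  | sudo tee /proc/"$(pidof -s some-batch-job)"/oom_score_adj   # hypothetical process
```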
Did I miss the point? Is this a rant about incorrectly configuring swap space, or is this a rant about some kind of bad team dynamics?

Anyway, isn't this the kind of argument that should be replaced by gathering objective data? Otherwise, the low/no swap space problems really appear to be symptoms of someone irresponsibly experimenting in production.