Yes, this is why we (Netflix) default to tsc over the xen clocksource. I found the xen clocksource had become a problem a few years ago, quantified using flame graphs, and investigated using my own microbenchmark.

Summarized details here:

https://www.slideshare.net/brendangregg/performance-tuning-ec2-instances/42
Another option is to reduce usage of gettimeofday() when possible. It is not always free.

Roughly 10 years ago, when I was the driver author for one of the first full-speed 10GbE NICs, we'd get complaints from customers who were sure our NIC could not do 10Gb/s, as iperf showed it was limited to 3Gb/s or less. I would ask them to re-try with netperf, and they'd see full bandwidth. I eventually figured out that the complaints were coming from customers running distros without the vDSO stuff, and/or running other OSes which (at the time) didn't support it (Mac OS, FreeBSD). It turns out that the difference was that iperf would call gettimeofday() around every socket write to measure bandwidth, while netperf would only issue gettimeofday() calls at the start and the end of the benchmark, so iperf was effectively gettimeofday-bound. Ugh.
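The difference in call counts looks roughly like this (a hypothetical sketch, not iperf's or netperf's actual code): timing every write issues a pair of gettimeofday() calls per socket write, while timing only the endpoints issues two calls total, regardless of how many writes are made.

    /* Illustrative only: two measurement styles for a throughput loop. */
    #include <stddef.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* iperf-style: gettimeofday() around every write */
    static void send_timed_per_write(int fd, const char *buf, size_t len, long iters)
    {
        struct timeval before, after;
        for (long i = 0; i < iters; i++) {
            gettimeofday(&before, NULL);
            (void)write(fd, buf, len);
            gettimeofday(&after, NULL);
            /* ... accumulate per-interval bandwidth stats from before/after ... */
        }
    }

    /* netperf-style: gettimeofday() only at the start and the end */
    static void send_timed_endpoints(int fd, const char *buf, size_t len, long iters)
    {
        struct timeval start, end;
        gettimeofday(&start, NULL);
        for (long i = 0; i < iters; i++)
            (void)write(fd, buf, len);
        gettimeofday(&end, NULL);
        /* ... compute overall bandwidth from start/end ... */
    }

When gettimeofday() is cheap (vDSO), the extra calls in the first style are in the noise; when every call is a real syscall, they dominate.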
The title is misleading. 77% slower sounds like the system calls take 1.77x the time on EC2. In fact, the results indicate that the normal calls are 77% faster - in other words, a normal call takes only about 23% of the EC2 time, so EC2 gettimeofday and clock_gettime calls take *nearly 4.5x longer* (1/0.23 ≈ 4.3x) to run than they do on ordinary systems.

This is a *big* speed hit. Some programs can use gettimeofday extremely frequently - for example, many programs call timing functions when logging, performing sleeps, or even constantly during computations (e.g. to implement a poor man's computation timeout).

The article suggests changing the time source to tsc as a workaround, but also warns that it could cause unwanted backwards time warps - making it dangerous to use in production. I'd be curious to hear from those who are using it in production how they avoided the "time warp" issue.
I prefer the way Solaris solved this problem:

1) first, by eliminating the need for a context switch for libc calls such as gettimeofday(), gethrtime(), etc. (there is no public/supported interface on Solaris for syscalls, so libc would be used)

2) by providing additional, specific interfaces with certain guarantees:

https://docs.oracle.com/cd/E53394_01/html/E54766/get-sec-fromepoch-3c.html

This was accomplished by creating a shared page at system startup, which the kernel keeps updated with the current time. At process exec time that page is mapped into every process address space.

Solaris' libc was of course updated to simply read directly from this memory page. Of course, this is more practical on Solaris because libc and the kernel are tightly integrated, and because system calls are not public interfaces, but this seems greatly preferable to the VDSO mechanism.
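The general "shared time page" idea looks roughly like this (an illustrative sketch only: the struct layout and names are hypothetical, not the actual Solaris or Linux interface, and a real implementation also needs memory barriers):

    #include <stdint.h>

    /* Hypothetical layout of a kernel-maintained time page. */
    struct time_page {
        volatile uint32_t seq;   /* incremented by the kernel; odd while an update is in progress */
        volatile int64_t  sec;   /* seconds since the epoch */
        volatile int64_t  nsec;  /* nanoseconds */
    };

    /* Userspace read: spin until a consistent snapshot is observed, no syscall needed. */
    static void read_shared_time(const struct time_page *tp, int64_t *sec, int64_t *nsec)
    {
        uint32_t s;
        do {
            s = tp->seq;
            while (s & 1)          /* update in progress; wait for it to finish */
                s = tp->seq;
            *sec  = tp->sec;
            *nsec = tp->nsec;
        } while (tp->seq != s);    /* retry if the kernel updated meanwhile */
    }

This is essentially the same seqlock-style read the Linux vDSO does; the difference being argued here is where the interface boundary sits (libc vs. a public syscall ABI).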
Author here, greetings. Anyone who finds this interesting may also enjoy our writeup describing every Linux system call method in detail [1].

[1]: https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/
For anyone looking at the mentions of KVM "under some circumstances" having the same issue and wondering how to avoid it with KVM: KVM appears to support fast vDSO-based time calls as long as:

- You have a stable hardware TSC (you can check this in /proc/cpuinfo on the host, but all reasonably recent hardware should support this).

- The host has the host-side bits of the KVM pvclock enabled.

Meet both of those and the guest should get the vDSO fast path.
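A quick guest-side sanity check can look like the sketch below. It only covers the guest-visible signals (the active clocksource and the TSC flags); the host-side pvclock configuration still has to be confirmed on the host, as noted above.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[4096];
        FILE *f;

        /* The clocksource the kernel is actually using (e.g. "kvm-clock", "tsc", "xen"). */
        f = fopen("/sys/devices/system/clocksource/clocksource0/current_clocksource", "r");
        if (f) {
            if (fgets(line, sizeof line, f))
                printf("current clocksource: %s", line);
            fclose(f);
        }

        /* TSC-related CPU flags; look for constant_tsc / nonstop_tsc. */
        f = fopen("/proc/cpuinfo", "r");
        if (f) {
            while (fgets(line, sizeof line, f)) {
                if (strncmp(line, "flags", 5) == 0) {
                    printf("constant_tsc: %s\n", strstr(line, "constant_tsc") ? "yes" : "no");
                    printf("nonstop_tsc:  %s\n", strstr(line, "nonstop_tsc") ? "yes" : "no");
                    break;
                }
            }
            fclose(f);
        }
        return 0;
    }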
So… it's not that the syscalls themselves are slower, it's that the Linux-specific mechanism (the vDSO) the kernel uses to avoid actually performing these calls does not currently work on Xen (and thus EC2).
This was also presented at the last AWS re:Invent in December. See the AWS EC2 Deep Dive: https://de.slideshare.net/mobile/AmazonWebServices/aws-reinvent-2016-deep-dive-on-amazon-ec2-instances-featuring-performance-optimization-best-practices-cmp301
Interesting way to find out the version of the hypervisor kernel. Whether the gtod call returns faster than the direct syscall for it tells you whether the kernel includes the patch fixing the issue under Xen.

I expect there are many such patches that you could use to narrow down the version range of the host kernel. Once you have that information, you may be in a better position to exploit it, knowing which bugs are and are not patched.
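A rough sketch of that comparison (the iteration count and the use of CLOCK_MONOTONIC for measurement are my own choices, not from the article or the comment): if the libc path is dramatically cheaper than the explicit syscall, the vDSO fast path is active.

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    static double elapsed_ns(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
    }

    int main(void)
    {
        enum { N = 1000000 };
        struct timeval tv;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++)
            gettimeofday(&tv, NULL);              /* vDSO fast path when available */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("libc gettimeofday: %.0f ns/call\n", elapsed_ns(t0, t1) / N);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++)
            syscall(SYS_gettimeofday, &tv, NULL); /* always a real syscall */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("raw syscall:       %.0f ns/call\n", elapsed_ns(t0, t1) / N);

        return 0;
    }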
vDSO maintainer here.<p>There are patches floating around to support vDSO timing on Xen.<p>But isn't AWS moving away from Xen or are they just moving away from Xen PV?
Does anyone have any intuition around how this affects typical workloads? I imagine that these two syscalls are disproportionately likely to affect benchmarks more than real-world usage. How many times is this syscall happening on a system doing things like serving HTTP, running batch jobs, or hosting a database?
I wonder if they tried this:
https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/
Is this just an EC2 problem, or does it affect any Xen/KVM guest?

I ran the test program on a Hyper-V VM running CentOS 7 and got the same result: 100 calls to the gettimeofday syscall. Conversely, I tested a vSphere guest (also running CentOS 7), which didn't call gettimeofday at all.
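For anyone who wants to repeat the check, a minimal loop along these lines (not the article's exact test program) run under strace shows whether gettimeofday() stays in the vDSO or traps into the kernel:

    /* Build with: cc -o gtod gtod.c
     * Then run:   strace -c -e trace=gettimeofday ./gtod
     * With a vDSO-capable clocksource, strace should report zero gettimeofday
     * syscalls; with the xen clocksource it reports one per call. */
    #include <stdio.h>
    #include <sys/time.h>

    int main(void)
    {
        struct timeval tv;
        for (int i = 0; i < 100; i++)
            gettimeofday(&tv, NULL);
        printf("last: %ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec);
        return 0;
    }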
Wasn't a workaround posted for this some time ago that involves setting the TZ environment variable?

https://news.ycombinator.com/item?id=13697555

It seems very closely related, unless I am mistaken.
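For reference, the workaround in that thread (and the packagecloud post linked above) boils down to this: with TZ unset, glibc re-examines /etc/localtime on every localtime() call, and setting TZ avoids those repeated stat()/open() syscalls. A rough sketch of the idea (glibc-specific behavior; worth verifying with strace on your own system):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        /* Pointing TZ at the same file glibc would use anyway lets it cache
         * the timezone instead of re-checking /etc/localtime per call. */
        setenv("TZ", ":/etc/localtime", 1);
        tzset();

        time_t now = time(NULL);
        struct tm tm;
        localtime_r(&now, &tm);

        char buf[64];
        strftime(buf, sizeof buf, "%F %T %Z", &tm);
        printf("%s\n", buf);
        return 0;
    }

Note this addresses the extra syscalls from localtime()-style conversions rather than the gettimeofday()/clock_gettime() vDSO issue itself, so it's related but distinct.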
How common are get-time calls, such that they would actually be an issue?

I've worked on quite a few systems and can't think of a case where an API for getting the time would have been called so often that it affected performance.
OpenJDK has an open issue about this in their JVM: https://bugs.openjdk.java.net/browse/JDK-8165437
> All programmers deploying software to production environments should regularly strace their applications in development mode and question all output they find.

Or, instead, you could just not do that. Then you could go back to being productive, instead of wasting time tracking down unstable small tweaks for edge cases that you can barely notice after looping the same syscall 5 million times in a row.

When will people learn not to micro-optimize?