An analogy that was useful for explaining part of this to my (non-technical) father. Maybe others will find it helpful as well.<p>Imagine that you want to know whether someone has checked out a particular library book. The library refuses to give you access to their records and does not keep a slip inside the front cover. You can only see the record of which books you have checked out.<p>What you do is follow the person of interest into the library whenever they return a book. You then ask the librarian for the book you want to know whether the person has checked out. If the librarian looks down and says "You are in luck, I have a copy right here!" then you know the person had checked out that book. If the librarian has to go look in the stacks and comes back 5 minutes later with the book, you know that the person didn't check out that book (this time).<p>The way to make the library secure against this kind of attack is to require that all books be reshelved before they can be lent out again, unless the current borrower is requesting an extension.<p>There are many other ways to use the behavior of the librarian and the time it takes to retrieve a book to figure out which books a person is reading.<p>edit: A closer variant: call the library pretending to be the person and ask for a book to be put on hold. Then watch how long their next visit takes. If they got that book, they will be in and out in a minute (and perhaps a bit confused); if they didn't take that book, it will take 5 minutes.
Papers describing each attack:<p><a href="https://meltdownattack.com/meltdown.pdf" rel="nofollow">https://meltdownattack.com/meltdown.pdf</a><p><a href="https://spectreattack.com/spectre.pdf" rel="nofollow">https://spectreattack.com/spectre.pdf</a><p>From the Spectre paper:<p>> As a proof-of-concept, JavaScript code was written that, when run in the Google Chrome browser, allows JavaScript to read private memory from the process in which it runs (cf. Listing 2).<p>Scary stuff.
"AMD chips are affected by some but not all of the vulnerabilities. AMD said that there is a "near zero risk to AMD processors at this time." British chipmaker ARM told news site Axios prior to this report that some of its processors, including its Cortex-A chips, are affected."<p>- <a href="http://www.zdnet.com/article/security-flaws-affect-every-intel-chip-since-1995-arm-processors-vulnerable/" rel="nofollow">http://www.zdnet.com/article/security-flaws-affect-every-int...</a><p>* Edit:<p>From <a href="https://meltdownattack.com/" rel="nofollow">https://meltdownattack.com/</a><p>Which systems are affected by Meltdown?<p>"Desktop, Laptop, and Cloud computers may be affected by Meltdown. More technically, every Intel processor which implements out-of-order execution is potentially affected, which is effectively every processor since 1995 (except Intel Itanium and Intel Atom before 2013). We successfully tested Meltdown on Intel processor generations released as early as 2011. Currently, we have only verified Meltdown on Intel processors. At the moment, it is unclear whether ARM and AMD processors are also affected by Meltdown.<p>Which systems are affected by Spectre?<p>Almost every system is affected by Spectre: Desktops, Laptops, Cloud Servers, as well as Smartphones. More specifically, all modern processors capable of keeping many instructions in flight are potentially vulnerable. In particular, we have verified Spectre on Intel, AMD, and ARM processors."
Hard to find a good spot for this, but: Thanks to anyone involved! From grasping the magnitude of this vulnerability to coordinating it with all major OS vendors, including Open Source ones that do all of their stuff more or less "in the open", it was almost a miracle that the flaw was leaked "only" a few days before the embargo - and we'll all have patches to protect our infrastructure just in time.<p>Interestingly, it also put the LKML developers in an ethical grey zone, as they had to deceive the public into thinking the patch was fixing something else (they did the good and right thing there, IMHO).<p>Despite all the slight problems along the way, kudos to all of the White Hats dealing with this mess over the last months and handling it super gracefully!
I'm not that savvy with security so I need a little help understanding this. According to the google security blog:<p>> Google Chrome<p>> Some user or customer action needed. More information here (<a href="https://support.google.com/faqs/answer/7622138#chrome" rel="nofollow">https://support.google.com/faqs/answer/7622138#chrome</a>).<p>And the "here" link says:<p>>Google Chrome Browser<p>>Current stable versions of Chrome include an optional feature called Site Isolation which can be enabled to provide mitigation by isolating websites into separate address spaces. Learn more about Site Isolation and how to take action to enable it.<p>>Chrome 64, due to be released on January 23, will contain mitigations to protect against exploitation.<p>>Additional mitigations are planned for future versions of Chrome. Learn more about Chrome's response.<p>>Desktop (all platforms), Chrome 63:<p>> Full Site Isolation can be turned on by enabling a flag found at chrome://flags/#enable-site-per-process.
> Enterprise policies are available to turn on Site Isolation for all sites, or just those in a specified list. Learn more about Site Isolation by policy.<p>Does that mean that if I don't enable this feature using chrome://flags (and tell my grandma to do this complicated procedure), I (or she) will be susceptible to having our passwords stolen?
From a recently posted patch set:<p>Subject: Avoid speculative indirect calls in kernel<p>Any speculative indirect calls in the kernel can be tricked to execute any kernel code, which may allow side channel attacks that can leak arbitrary kernel data.<p>So we want to avoid speculative indirect calls in the kernel.<p>There's a special code sequence called a retpoline that can do indirect calls without speculation. We use a new compiler option -mindirect-branch=thunk-extern (gcc patch will be released separately) to recompile the kernel with this new sequence.<p>We also patch all the assembler code in the kernel to use the new sequence.
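For reference, the retpoline thunk that this compiler option expands indirect branches into looks roughly like the following (sketched from the public patch postings; labels are illustrative, and the real branch target is assumed to be in %rax):

```asm
    call .Lset_up_target      # pushes a return address, then jumps below
.Lcapture_spec:
    pause                     # speculation following the stale return
    lfence                    # prediction spins harmlessly in this loop
    jmp .Lcapture_spec
.Lset_up_target:
    mov %rax, (%rsp)          # overwrite the pushed return address with the real target
    ret                       # "return" to the target; the return predictor
                              # only ever sees .Lcapture_spec
```

The trick is that the indirect branch becomes a `ret`, and the return stack buffer has been deliberately primed to point speculation at the empty capture loop rather than at an attacker-trained target.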
"Before the issues described here were publicly disclosed, Daniel Gruss, Moritz Lipp, Yuval Yarom, Paul Kocher, Daniel Genkin, Michael Schwarz, Mike Hamburg, Stefan Mangard, Thomas Prescher and Werner Haas also reported them; their [writeups/blogposts/paper drafts] are at"<p>Does anyone have any color/details on how this came to be? A major fundamental flaw exists that affects all chips for ~10 years, and multiple independent groups discovered them roughly around the same time this past summer?<p>My hunch is that someone published some sort of speculative paper / gave a talk ("this flaw could exist in theory") and then everyone was off to the races.<p>But would be curious if anyone knows the real version?
Azure's response: <a href="https://azure.microsoft.com/en-us/blog/securing-azure-customers-from-cpu-vulnerability/" rel="nofollow">https://azure.microsoft.com/en-us/blog/securing-azure-custom...</a><p>This part is interesting considering the performance concerns:<p>"The majority of Azure customers should not see a noticeable performance impact with this update. We’ve worked to optimize the CPU and disk I/O path and are not seeing noticeable performance impact after the fix has been applied. A small set of customers may experience some networking performance impact. This can be addressed by turning on Azure Accelerated Networking (Windows, Linux), which is a free capability available to all Azure customers."
Someone correct me if I understood this wrong. The way they are exploiting speculative execution is to load values from memory regions they don't have permission to access into a cache line, and when the speculation is found to be false, the processor does not undo the change to the cache?<p>The question is, how is the speculative write going to the cache in the first place? Only retired instructions should be able to modify cache lines AFAIK. What am I missing?<p>Edit: Figured it out. The speculatively accessed memory value is used to compute the address of a load from a memory location which the attacker does have access to. Once the mis-speculation is detected, the attacker times accesses to the memory that was speculatively loaded and figures out what the secret value was. Brilliant!
"These vulnerabilities affect many CPUs, including those from AMD, ARM, and Intel, as well as the devices and operating systems running them."<p>Curious. All other reports I've read state that AMD CPUs are not vulnerable.
Does Google have the best security team in the world? It seems like Google security is in a completely different league. I cannot imagine how this impacts companies handling fiat money or cryptocurrencies in the cloud, like Coinbase on AWS.
So, as I gather, one of the main culprits is that the unwinding of speculatively executed instructions is done incompletely. That is something that the people implementing the unwinding must have noticed and known about. Somewhere the decision must have been made to unwind incompletely for some reason (performance/power/cost/time).<p>As for the difference between AMD and Intel (from other posts here, not this one): speculative execution can access arbitrary memory locations on Intel processors, while this is not possible on AMD. This means that on Intel processors you can probe any memory location with only limited privileges.<p>As for the affected AMD and ARM processors, I'm none the wiser. How are they affected? Which models are affected? Does it allow some kind of privilege escalation? The next days will surely stay interesting.
<a href="https://spectreattack.com/" rel="nofollow">https://spectreattack.com/</a><p>Information site with some more information, and links to papers on the two vulnerabilities, called "Meltdown" and "Spectre" (with logos, of course).<p>(<a href="https://meltdownattack.com/" rel="nofollow">https://meltdownattack.com/</a> goes to the same site)
It seems that Richard Stallman is not so paranoid after all:<p>> I am careful in how I use the Internet.<p>> I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see <a href="https://git.savannah.gnu.org/git/womb/hacks.git" rel="nofollow">https://git.savannah.gnu.org/git/womb/hacks.git</a>) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it (using konqueror, which won't fetch from other sites in such a situation).<p>Ref: <a href="https://stallman.org/stallman-computing.html" rel="nofollow">https://stallman.org/stallman-computing.html</a>
Speculative execution seems like something that would be intuitively insecure even to a layperson (relative to the field, of course).<p>I'm wondering, was this vulnerability theorized first and later found to be an actual vulnerability? Or was this something that nobody had any clue about?<p>I'm only saying this because, from a security perspective, I imagine that very early on someone had to have pointed out the potential for something like speculative execution to eventually cause security problems.<p>I just don't understand how chip designers assumed speculative execution wouldn't eventually cause security problems. Is it because chip designers were prioritizing performance above security?
I don't think this is the last we have seen of side channels; it's just a ridiculously hard problem to get right. And for that reason I can't feel too angry at the processor makers.<p>And I certainly expect to see more things like this (but hopefully at least with lower bandwidth).
AMD put out an announcement:<p><a href="https://www.amd.com/en/corporate/speculative-execution" rel="nofollow">https://www.amd.com/en/corporate/speculative-execution</a>
Wow, so Intel comes out and says "what is all the panic about, there is nothing wrong" (despite knowing about this), then Amazon drops the "we are updating everything right now" bomb, and then Google drops the mother of all CPU bugs. In a previous thread someone was asking if it really is all that bad, and at this point I think it's safe to say that yes, it is.
So, is AMD affected or not? This seems fairly important. The Google blog post sort of goes against itself in this regard. AMD itself has said:<p>"The threat and the response to the three variants differ by microprocessor company, and AMD is not susceptible to all three variants. Due to differences in AMD's architecture, we believe there is a near zero risk to AMD processors at this time."<p>So either AMD is lying or Google's blog post is wrong. Granted, AMD's statement is a bit muddled: not sure if they mean they aren't susceptible to all THREE variants (as in only one of the three) or they aren't susceptible to ALL three variants (as in none of them).
Can someone with a little more experience this low-level let me know if this is as bad as I think it is?<p>Because this looks real bad:<p>> Reading host memory from a KVM guest
So is speculative execution just inherently flawed like this, or can we expect chips in 2 years that let operating systems go back to the old TLB behavior?
So, basically CPUs will read instructions inside a branch even if the branch is eventually going to evaluate to false. Does the CPU do this to optimize branch instructions? The results of instructions that are executed ahead of time are stored in a cache. How exactly does this exploit read from the cache? I understand it uses timing somehow but I'm not quite sure exactly how that works. (I mostly do software.)
First implementation I've seen, via Twitter.<p><a href="https://twitter.com/pwnallthethings/status/948693961358667777" rel="nofollow">https://twitter.com/pwnallthethings/status/94869396135866777...</a>
One of the meltdown paper writers evidently has a sense of humor since "hunter2" [0] is one of the passwords they use in their demonstration [1]<p>[0] <a href="http://bash.org/?244321" rel="nofollow">http://bash.org/?244321</a><p>[1] <a href="https://meltdownattack.com/meltdown.pdf" rel="nofollow">https://meltdownattack.com/meltdown.pdf</a> (page 13, figure 6)
So what exactly are they going to do about spectre? Seems pretty unstoppable from what I can see.<p>Can they disable speculative exec completely for sensitive boxes or is this too baked in?
Can someone more knowledgeable than me in regards to this vulnerability tell me:<p>1. How to best protect my local personal data from being subject to this?<p>2. Whether I should seriously consider pulling all my cryptocurrency off of any exchanges?
So how much legal liability are they exposed to due to this security flaw?<p>Since this affects legacy systems that may not be able to be upgraded it seems like this issue will be around for a very long time.
I can't understand this paragraph from [1]:<p>> Cloud providers which use Intel CPUs and Xen PV as virtualization without having patches applied. Furthermore, cloud providers without real hardware virtualization, relying on containers that share one kernel, such as Docker, LXC, or OpenVZ are affected.<p>I take it to imply that hypervisors that use hardware virtualization are not affected. However, the PoC that reads host memory from a KVM guest seems to contradict this.<p>Is it because on Xen HVM, KVM, and similar hypervisors, only kernel pages are mapped in the address space of the VM thread (so a malicious VM cannot read memory of other VMs), but on these other hypervisors, pages from other containers are mapped? Yet the Xen security advisory [2] says:<p>> Xen guests may be able to infer the contents of arbitrary host memory, including memory assigned to other guests.<p>Relatedly, what sensitive information other than passwords could appear in the kernel memory? I'd expect that at the very least buffers containing sensitive data pertaining to other VMs may be leaked.<p>[1] <a href="https://meltdownattack.com/" rel="nofollow">https://meltdownattack.com/</a>
[2] <a href="https://xenbits.xen.org/xsa/advisory-254.html" rel="nofollow">https://xenbits.xen.org/xsa/advisory-254.html</a>
> Meltdown breaks all security assumptions given by address space isolation as well as paravirtualized environments and, thus, every security mechanism building upon this foundation.<p>> On affected systems, Meltdown enables an adversary to read memory of other processes or virtual machines in the cloud without any permissions or privileges, affecting millions of customers and virtually every user of a personal computer.
Reading over this.... it sounds like ultimately the exploit in Linux still only works thanks to being able to run stuff in the kernel context through eBPF?<p>The first section states that even with the branch prediction you still need to be in the same memory context to be able to read other process's memory through this. But eBPF lets you run JIT'd code in the kernel context.<p>I guess this JITing is also the issue with the web browsers, where you end up getting access to the entire browser process memory.<p>But ultimately the dangerous code is still code that got a "privilege upgrade"? the packet filter code for eBPF, and the JIT'd JS in the browser exploit?<p>So if our software _never_ brought user's code into the kernel space, then we would be a bit safer here? For example if eBPF worked in... kernel space, but a different kernel space from the main stuff? And Site Isolation in Chrome?
I should at first point out that I am by no definition an expert on CPU design, operating systems, or infosec.<p>But I just remembered that <i>years</i> ago the FreeBSD developers discovered a vulnerability in Intel's Hyperthreading that could allow a malicious process to read other processes' memory.[1]<p>To the degree that I understand what is going on here, that sounds very similar to the way the current vulnerabilities work.<p>For a while, back then, I was naive enough to think this would be the end of SMT on Intel CPUs, but I was very wrong about that.<p>So I am wondering - is this just a funny coincidence, or could people have seen this coming back then?<p>[1] <a href="http://www.daemonology.net/hyperthreading-considered-harmful/" rel="nofollow">http://www.daemonology.net/hyperthreading-considered-harmful...</a>
The ARM whitepaper is also worth a read in terms of how it affects them and mitigations on that platform: <a href="https://developer.arm.com/support/security-update" rel="nofollow">https://developer.arm.com/support/security-update</a>
I'm really amazed by the simplicity of the meltdown gadget. After the initial blog post I played with a few variants, but always got the zeroed out register in the speculative branch. I guess what people (including me) were looking for here was some other side channel or instruction that did not have this mitigation in place (e.g. I had hoped a cmpxchg would leak whether the target memory address matches the register to compare with). The shl/retry loop makes a lot of sense if you instead assume that the mitigation was implemented improperly and can race subsequent uops. I really can't imagine why this data ever made it to the bypass network to be available to other uops.
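For anyone who hasn't read the paper yet, the core of that gadget really is just a handful of instructions. Reproduced approximately from the paper's listing (see the paper for the exact form; rcx holds the kernel address and rbx the probe array):

```asm
retry:
    mov al, byte [rcx]         ; faulting load of a kernel byte
    shl rax, 0xc               ; multiply by 4096: one probe page per byte value
    jz retry                   ; if the race delivered a zero, try again
    mov rbx, qword [rbx + rax] ; transmit: warm exactly one probe line
```

The shl/retry loop handles exactly the case discussed above: when the zeroing mitigation wins the race, the loaded byte is 0, so the code just retries until a uop race lets the real value through to the dependent load.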
I wonder if the whole thing with enormously complex CPUs requiring deep pipelines which in turn requires complex speculation etc was a design mistake? Is there an alternative history where mainstream CPUs are equally fast with a dumber/simpler design?
Since no one has yet posted Amazon AWS security bulletin:<p><a href="https://aws.amazon.com/security/security-bulletins/AWS-2018-013/" rel="nofollow">https://aws.amazon.com/security/security-bulletins/AWS-2018-...</a>
<a href="https://github.com/IAIK/meltdown" rel="nofollow">https://github.com/IAIK/meltdown</a> 404s. I assume this is intentional? So full disclosure, but missing the code? Or is it somewhere else?
According to the page, Project Zero only tested with AMD Bulldozer CPUs. Why didn't they use something based on Zen/Ryzen? It's not clear if the 3 issues affect Zen/Ryzen or not.
Just an idea that I had:<p>These exploits seem to rely on taking precise timing measurements (on the order of nanoseconds), so could we eliminate or restrict this functionality in user space?<p>The Spectre exploit uses the RDTSC instruction, and this can apparently be restricted to privilege level 0 by setting the TSD flag in CR4.<p>I know it would kind of suck, but it might be better than nothing.<p>I would think that most typical user applications don't require that accurate a time measurement. If they do, then maybe they can be whitelisted?
What is the reason that Intel would allow speculative instructions to bypass the supervisor bit and access arbitrary memory? That seems the root cause for Meltdown.<p>Is it that the current privilege level could be different between what it is now, and what it will be when the speculative instruction retires? If so then that seems a thin justification. CPL should not change often so it doesn't seem worth it to allow speculative execution for instructions where a higher CPL is required.
<i>There are 3 known CVEs related to this issue in combination with Intel, AMD, and ARM architectures. Additional exploits for other architectures are also known to exist. These include IBM System Z, POWER8 (Big Endian and Little Endian), and POWER9 (Little Endian).</i><p><a href="https://access.redhat.com/security/vulnerabilities/speculativeexecution" rel="nofollow">https://access.redhat.com/security/vulnerabilities/speculati...</a>
How come this wasn't discovered sooner?<p>It would seem to me that all the really smart people who designed super-scalar processors and all the nifty tricks that CPUs do today - would have thought that these attacks would be in the realm of possibility. If that's the case - who's to say these attacks haven't been used in the wild by sophisticated players for years now?<p>Seems like the perfect attack. Undetectable. No log traces.
Could somebody please coin a name for this? Wikipedia currently calls it "Intel KPTI flaw", but that is very vague. It's quite difficult to talk about something without a simple easy-to-remember name.<p>Edit: has been settled, it's <a href="https://en.wikipedia.org/wiki/Meltdown_(security_bug)" rel="nofollow">https://en.wikipedia.org/wiki/Meltdown_(security_bug)</a> .
Is there any information available about whether the Linux KPTI patch mitigates the ability to use eBPF to read kernel memory?<p>I'm asking because eBPF seems to execute within the kernel, and KPTI seemed to be about unmapping kernel page table when userspace processes execute.<p>Are there any mitigations to the eBPF attack vector?
Wouldn't it be possible for the kernel to patch all clflush instructions when the software is loaded, keeping a circular list of all evicted addresses that would be evicted again on the interrupt that happens when the protected address is read? This way the timing attack would not be possible.
From <a href="https://meltdownattack.com/meltdown.pdf" rel="nofollow">https://meltdownattack.com/meltdown.pdf</a>, page 12:<p>> Thus, the isolation of containers sharing a kernel can be fully broken using Meltdown.
Looks like the information was somewhat publicly available since the middle of last year on <a href="https://cyber.wtf/2017/07/28/negative-result-reading-kernel-memory-from-user-mode/" rel="nofollow">https://cyber.wtf/2017/07/28/negative-result-reading-kernel-...</a> and <a href="http://www.cs.binghamton.edu/%7Edima/micro16.pdf" rel="nofollow">http://www.cs.binghamton.edu/%7Edima/micro16.pdf</a>. Also similar methods in this 2013 paper: <a href="http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf" rel="nofollow">http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf</a> (timing side channel attacks).<p>Any reason for the panic now? Any known malware using it?
Can someone show me an example of JavaScript code running in a browser that would display a password stored in kernel space?<p>Websites like the Guardian report that this is now the case but I don't understand how that's possible.
Thanks to incidents like these, I'm very happily employed. One of the perks of working in infosec.<p>I hereby nominate 2018's song to be Billy Joel's <i>We Didn't Start the Fire</i>.
Thanks again to the geniuses who arranged things so that almost anyone can write code that I must run just so I can use the internet to find and to read public documents<p>(unless I undergo the tedious process of becoming a noscript user or something similar).
"Testing also showed that an attack running on one virtual machine was able to access the physical memory of the host machine, and through that, gain read-access to the memory of a different virtual machine on the same host."<p>Holy shit.
>running on the host, can read host kernel memory at a rate of around 1500 bytes/second,<p>I kind of get how it works now. They force speculative execution to do something with a protected memory address, and then measure the latency to guess the content. They did not find a way to continue execution after a page fault, as rumors had suggested.<p>The fact that a speculative execution branch can access protected memory, but cannot commit its own computation results to memory, has been known on ia32 since Pentium III times.<p>It was dismissed as a "theoretical only" vulnerability without possible practical application. Intel kept saying that for 20 years, but here it is, voila.<p>The ice broke in 2016, when Dmitry Ponomarev wrote about the first practical exploit scenario for this well-known ia32 branch prediction artifact. Since then, I believe, quite a few people were trying every possible instruction combination for use in a timing attack until somebody finally got one that works, which was shown behind closed doors.<p>Edit: Google finally added a reference to Ponomarev's paper. Here is his page with some other research on the topic <a href="http://www.cs.binghamton.edu/~dima/" rel="nofollow">http://www.cs.binghamton.edu/~dima/</a>
link for details for that from Project Zero:<p><a href="https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html" rel="nofollow">https://googleprojectzero.blogspot.com/2018/01/reading-privi...</a>
In 1-2 words, IMO, the problem is "over-optimisation".<p>It is perhaps beneficial to be using an easily portable OS that can be run on older computers, and a variety of architectures.<p>Sometimes older computers are resilient against some of today's attacks <i>to the extent those attacks make assumptions about the hardware and software in use</i>. (Same is true for software.)<p>When optimization reaches a point where it exposes one to attacks like the ones being discussed here, then maybe the question arises whether the optimization is actually a "design defect".<p>What is the solution?<p>IMO, having choice is at least part of any solution.<p>If <i>every user is effectively "forced" to use the same hardware and the same software</i>, perhaps from a single source or small number of sources, then that is beneficial for those sources but, IMO, counter to a real solution for users. Lack of viable alternatives is not beneficial to users.
More details at <a href="https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html" rel="nofollow">https://googleprojectzero.blogspot.com/2018/01/reading-privi...</a>
I wonder what this sentence in the Google product status page (<a href="https://support.google.com/faqs/answer/7622138" rel="nofollow">https://support.google.com/faqs/answer/7622138</a>) means, particularly what the inter-guest attack refers to:<p>"Compute Engine customers must update their virtual machine operating systems and applications so that their virtual machines are protected from intra-guest attacks and inter-guest attacks that exploit application-level vulnerabilities"
Does anyone know what kind of isolation still works after all the patches? Let's say we want to host users' processes or containers and some of them could be pwned. I see Google claiming that their VMs are isolated from the host kernel and from each other.
Intel has released a statement for the codename Meltdown bug:<p><a href="https://newsroom.intel.com/news/intel-responds-to-security-research-findings/" rel="nofollow">https://newsroom.intel.com/news/intel-responds-to-security-r...</a>
> We have some ideas on possible mitigations and provided some of those ideas to the processor vendors; however, we believe that the processor vendors are in a much better position than we are to design and evaluate mitigations, and we expect them to be the source of authoritative guidance.<p>Intel: "Recent reports that these exploits are caused by a “bug” or a “flaw” [..] are incorrect."<p>So much for "authoritative guidance", fuck these guys.
The papers take a while to get to the point. I nearly fell asleep re-reading the same statements until they got to the point: speculative execution of buffer overflows.<p>Could have been said more concisely. Sadly, this seems to be the norm with academic texts.