This is very similar to a design Apple announced for iCloud Keychain several years ago at Black Hat.<p>iCloud Keychain synchronizes keychains across iOS devices, storing their contents encrypted under the user's passphrase on Apple's cloud servers. In theory, Apple has no access to this data, since they don't know the relevant passphrase. In practice, however, passphrases are weak, even under PBKDF2. An attacker who got access to Apple's cloud environment would simply run a dictionary attack against the encrypted blobs, and would probably succeed a lot of the time.<p>So instead of the obvious naive design, Apple stores enough secret data in an HSM that you can't attempt a decryption without the involvement of the HSM. At the same time, the HSM enforces an attempt counter, preventing brute-force attacks. To scale the design, Apple partitions customers into "clubs" of HSMs, with the attempt counter synchronized among the HSMs of the club using a distributed commit algorithm.<p>(Somewhat infamously, Ivan Krstic detailed how they protected the HSMs themselves from malicious attacks by putting their software update signing keys through a "physical hash function" called "Vitamix blender".)<p>What Signal is doing here is essentially what Apple did, but using SGX instead of an HSM, and Raft as the consensus algorithm to synchronize the counters. You might reasonably prefer the Apple approach to SGX, but at the same time, the data that Signal is storing is a lot less sensitive than the data Apple stores.<p>Probably the biggest end-user takeaway from this announcement is that it's the start of a process where Signal is able to durably and securely store social graph information for its users (without revealing the social graphs directly to Signal itself, unlike virtually every other secure messaging system). Once they can do that, they'll have ended most of their dependence on phone numbers.
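A minimal sketch of the attempt-counter idea described above, in Python. It is a toy model, not Apple's or Signal's implementation: the class name `GuessLimitedVault`, the limit of 5 attempts, and the PBKDF2 parameters are all assumptions for illustration, and a plain Python object provides none of the isolation an HSM or SGX enclave is meant to provide.

```python
import hashlib
import hmac
import secrets

MAX_ATTEMPTS = 5          # hypothetical limit, chosen only for this example
KDF_ITERATIONS = 10_000   # illustrative; not a recommended production value


class GuessLimitedVault:
    """Toy stand-in for the HSM/enclave. Its state is assumed to be hidden from the host."""

    def __init__(self):
        # user_id -> [salt, passphrase verifier, stored secret, attempts left]
        self._records = {}

    def enroll(self, user_id: str, passphrase: str) -> bytes:
        """Create a random secret that is released only for the right passphrase."""
        salt = secrets.token_bytes(16)
        verifier = hashlib.pbkdf2_hmac(
            "sha256", passphrase.encode(), salt, KDF_ITERATIONS
        )
        secret = secrets.token_bytes(32)
        self._records[user_id] = [salt, verifier, secret, MAX_ATTEMPTS]
        return secret

    def recover(self, user_id: str, passphrase: str):
        """Return the secret on a correct guess, else None. Wipes after too many misses."""
        record = self._records.get(user_id)
        if record is None:
            return None
        salt, verifier, secret, attempts_left = record
        # Spend the attempt before checking, so every guess costs one attempt.
        record[3] = attempts_left - 1
        candidate = hashlib.pbkdf2_hmac(
            "sha256", passphrase.encode(), salt, KDF_ITERATIONS
        )
        if hmac.compare_digest(candidate, verifier):
            record[3] = MAX_ATTEMPTS  # a correct guess resets the counter
            return secret
        if record[3] <= 0:
            # Out of attempts: destroy the record so no further guessing is possible.
            del self._records[user_id]
        return None
```

The point of the design is that a weak passphrase no longer has to resist an offline dictionary attack: the only way to test a guess is to ask the vault, and the vault stops answering after a handful of misses. The cost is that anyone who can submit guesses for an account can also burn its attempts.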
I'm very puzzled by the consensus group load balancing section. The article emphasizes that correctness of the Raft algorithm was super important (to the point that they skipped clear optimizations!!11), but then immediately follows up with (as far as I can tell) a load-balancer wrapper approach for rebalancing and scaling. My "this feels like consensus bug city" detectors immediately went off.<p>Consensus algorithms (including Raft and Paxos) are notoriously picky and hard to get right around cluster membership changes. If you try to do an end run around this by sharding to different clusters, with a simple traffic director to choose which cluster, how does the traffic director achieve consensus with the clusters that the traffic is going to the right cluster? You haven't solved any consensus problem, you've just moved it to your load balancers.<p>A solution for this problem (agreeing on which cluster owns the data) is 2-phase commit on top of the consensus clusters. It didn't appear from the diagrams that that's what they did here, so either I missed something, or this wouldn't pass a Jepsen test.<p>Did I miss something?<p>[If you did build 2PC on top of these consensus clusters, you'd have built a significant portion of Spanner's architecture inside of a secure enclave. That's hilarious.]
> <i>These were hardscrabble people, living off of whatever meager storage they could scrounge together. They’d zip things, put them on zip drives, and hope for the best. Then one day almost everyone looked up towards the metaphorical sky and made a lot of compromises.</i><p>I wish every tech blog was written like this. Light-hearted and serious at the same time, almost like a work of fiction.
How exactly does SGX remote attestation work? From the linked document (<a href="https://software.intel.com/en-us/articles/innovative-technology-for-cpu-based-attestation-and-sealing" rel="nofollow">https://software.intel.com/en-us/articles/innovative-technol...</a>) it seems like it hashes execution state, but what's stopping the enclave from emulating the execution while also on the side performing some malicious operation?
I don't get what prevents an attacker from deleting the secrets of everyone by just guessing? Shouldn't the guesses be time limited instead (e.g. once per hour)? Even then you could easily bring the service down...
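A minimal sketch of the time-limited alternative suggested above, in Python, to make the trade-off concrete. Everything here is hypothetical: the class name `TimeLimitedCheck`, the one-guess-per-hour default, and passing the clock in as a `now` argument are assumptions for illustration, not anything from the Signal design.

```python
import hmac


class TimeLimitedCheck:
    """Toy throttle: at most one guess per interval, and nothing is ever deleted."""

    def __init__(self, verifier: bytes, min_interval_s: float = 3600.0):
        self._verifier = verifier
        self._min_interval_s = min_interval_s
        self._last_attempt = None  # timestamp of the most recent accepted attempt

    def try_guess(self, guess: bytes, now: float) -> str:
        """Return "ok", "wrong", or "throttled" for a guess made at time `now` (seconds)."""
        if (
            self._last_attempt is not None
            and now - self._last_attempt < self._min_interval_s
        ):
            return "throttled"
        self._last_attempt = now
        return "ok" if hmac.compare_digest(guess, self._verifier) else "wrong"
```

With this scheme nobody can destroy the secret by guessing, but two problems remain. An attacker who keeps guessing gets 24 tries a day, so a 4-digit PIN (10,000 possibilities) is exhausted in about 417 days. And an attacker who uses up each hour's guess keeps the legitimate user throttled, which is the denial-of-service concern raised in the comment.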
This is inherently good in itself. But I ask myself whether the oft-requested 'can we be people without phone # in Signal' is now actually a deliverable, or whether they only stated it as a hypothesis and still don't have it as a roadmap outcome?<p>I want to have two (or more) non-phone-enabled devices able to be in Signal: my tablet and my computer. I realize there are adjunct methods, but depending on a physically present device to have one thing hooked up isn't actually what we want here. The phone is <i>not</i> a useful second factor; it's a hack I believe they worked out to get beyond the 'must be a phone' state without having to re-engineer the back end.<p>So: do we now get phone-less Signal identity? This feels like a precursor. Does that definitionally say phone-less identity will follow?<p>(Again, only from belief: I believe the secure enclave on the phone is bound into identity along with the IDD, so having a secure enclave backed in the cloud breaks one of the two dependencies out a bit.)
I still don’t know how we are going to be able to validate that the remote attestation comes from a real enclave and not, say, a virtualized one that just logs all the secrets but still attests correctly. Are they going to ship Intel device hardware pubkeys or certs to the clients?
This is the tech that would allow the key management we described in section 3.4 of the MobileCoin whitepaper [0] to exist.<p>[0] <a href="https://www.mobilecoin.com/whitepaper-en.pdf" rel="nofollow">https://www.mobilecoin.com/whitepaper-en.pdf</a>
Is this going to make app backup/restore even more torturous?<p>I already almost got bit by the transition to the current magic number + special in-app export, vs. the previously-working Titanium Backup APK + data snapshot method.
It's not clear to me how the nodes authenticate a new node on node replacement. They say that the nodes check the new node's MRENCLAVE value, so what happens if there is a software update and the MRENCLAVE value has to change? How do the old nodes know what the new MRENCLAVE value should be?
Why does Signal want to make things so complicated?<p>This starting premise about the 'normal approach' is not true:<p><i>> However, you may want to change devices, and accidents sometimes happen. The normal approach to these situations would be to store data remotely in an unencrypted database,</i><p>No, the normal approach would be to <i>let authenticated users export their data from their own secure device to a place of their choosing</i>, then <i>let authenticated users also import that data</i>.<p>For paternalistic control-freaks like the Signal team, the data could even only ever be exported in an encrypted format, for encryption-at-rest. Sure, there'd be some risk that the encryption key is not well-protected, or the data-at-rest is subject to brute-force attacks – but many users can manage those risks themselves.<p>So why this strawman premise that the baseline is "remote" and "unencrypted"? Just give me a local export, and import, and I'll protect my data fairly well, thank you.<p>That would solve a major usability disaster of Signal on iOS devices: that even an orderly, planned device upgrade where <i>both</i> devices are in your sole control – and could conceivably do a direct transfer of all sensitive data! – will still lose all your history and Signal-contacts.<p>The cloud-centric strawmen from Signal continue with this related claim:<p><i>> In the example of a non-phone-number-based addressing system, cloud storage is necessary for recovering the social graph that would otherwise be lost with a device switch or app reinstall.</i><p>No, again, all a user needs is a local backup/transfer method for that list of usernames/identity-endpoints. (And to be no worse than the current Signal approach of re-using the device's native contact list, this list-at-rest or list-in-transit only needs protection as good as the native contact list, a pretty low bar.)