(Disclosure: I work on Kubernetes/EKS Security at AWS)<p>I'm curious why they didn't look into using Kubernetes ProjectedVolumeTokens for authenticating to Vault? The tokens Kubernetes issues are not stored in etcd, and they contain pod-specific metadata so they are invalidated as soon as the pod dies (when using TokenReview). Alternatively, they can be used to directly authenticate with Vault since they're OIDC-valid tokens [1].<p>The semantics around secrets in Kubernetes aren't nearly as robust as Vault, so I was surprised to not see this more clearly called out (ex: list secrets == get all keys and values). Even if you use KMS/AES encryption (which they reference) that doesn't help with access control.<p>[1] <a href="https://www.vaultproject.io/docs/auth/jwt.html" rel="nofollow">https://www.vaultproject.io/docs/auth/jwt.html</a>
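For anyone unfamiliar with projected service account tokens, a minimal sketch of what this looks like in a pod spec (names and image are placeholders, not from the original comment):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vault-client          # hypothetical pod name
spec:
  containers:
  - name: app
    image: example/app:latest  # placeholder image
    volumeMounts:
    - mountPath: /var/run/secrets/tokens
      name: vault-token
  volumes:
  - name: vault-token
    projected:
      sources:
      - serviceAccountToken:
          path: vault-token
          expirationSeconds: 600   # short-lived; kubelet rotates it
          audience: vault          # bound to a specific audience, not the API server
```

The kubelet writes the token to the mounted path and rotates it, and because it's audience- and pod-bound it can be validated via TokenReview or verified directly as an OIDC JWT.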
The intersection of microservices + Vault is something I've long had to deal with - as far back as Vault 0.5.0, so I'm a bit "surprised" this isn't something that is turn-key with Vault+k8s today.<p>I was working with Mesos (before k8s had taken the world by storm), and had a similar issue - how do services get Vault tokens without a workflow that may include storing "secrets" in configuration. What I ended up writing was a tool[1] that a service could query, with its Mesos Task ID, to get a token. The tool would then read its own configuration, as well as the current Mesos state, to determine whether the request was valid.<p>Unlike the k8s solution, as I understand it, you don't need to treat the 'service account token' as a secret (reducing the attack surface when someone steals that token _and_ also has access to Vault). This is accomplished in two ways:<p>1. You can determine if a request is valid by looking at how long the service has been running. If someone steals a Task ID, but the service has already been running for 2 minutes, then the Task ID is useless.<p>2. The Vault token is only issued once per service launch. This means if an attacker steals the Task ID, but the token has already been given out, the Task ID is useless. If the attacker beats the service in asking for that key, then the service should raise the alarm bells about its key potentially being stolen. If you are even more paranoid, you could decide to invalidate all active keys and reduce the amount of time the attacker has a valid Vault token.<p>This was largely designed years ago with some cues I took from one of Vault's lead engineers, so I expected that the k8s integration would work similarly. I'm not too familiar with k8s, however, so there might be other constraints I'm overlooking in why Vault's k8s integration works the way it does.<p>[1] <a href="https://github.com/nemosupremo/vault-gatekeeper" rel="nofollow">https://github.com/nemosupremo/vault-gatekeeper</a>
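The two checks above can be sketched in a few lines of Python. This is my own illustrative sketch, not vault-gatekeeper's code: the real tool reads the Mesos state endpoint and persists issued-token state, whereas here the task start time is passed in and the issuance record is in-memory.

```python
import time

# Hypothetical in-memory state for illustration only.
MAX_TASK_AGE_SECONDS = 120  # check 1: a task this old can no longer redeem its ID
issued = set()              # check 2: task IDs that have already redeemed a token

def may_issue_token(task_id, task_start_time, now=None):
    """Return True if a Vault token may be issued for this Mesos task ID."""
    now = time.time() if now is None else now
    # Check 1: reject requests from tasks that have been running too long.
    if now - task_start_time > MAX_TASK_AGE_SECONDS:
        return False
    # Check 2: each task ID gets a token exactly once per launch.
    if task_id in issued:
        return False
    issued.add(task_id)
    return True
```

A second request with the same task ID (the "attacker lost the race" case) returns False, which is the point at which the legitimate service should raise the alarm.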
> We run about 1,100 microservices written in Go<p>Asking as a university student: is this a common number of microservices to have running in production? It looks like monzo has about 1,351 total employees [0]. If all of them were software engineers, this would be a little less than one microservice per engineer. How do you handle code reuse and reliability among thousands of microservices? It seems like the number of possible failure states would be unthinkable.<p>[0] <a href="https://en.wikipedia.org/wiki/Monzo_(bank)" rel="nofollow">https://en.wikipedia.org/wiki/Monzo_(bank)</a>
The bit where a secret gets pasted into the Very Secure system is a clear problem. Because before it was pasted, and while it was being pasted, it's not in that secure system; it's on some dev's laptop.<p>Most of your secrets will be / should be just random bits, maybe in some particular format that was convenient for a particular application, e.g. a 4-digit PIN, or a sixteen character hexadecimal string, or 10 characters of A-Za-z0-9.<p>So for these cases there's no reason that secret is ever on a developer's laptop. Best case, the developer made a truly random secret; maybe they (like me) keep a set of hexadecimal dice on their desk for random choices. Just as likely it's tainted: the developer ran a randomizer until out popped a string they liked, or they even found one on a web site, or used the same one as in their test environment.<p>Either way, since what you wanted was random bits, it makes sense in most cases (not all cases; obviously a secret key you were sent by somebody else, for example an API key for their system, will have to be copied somehow) to have a feature that just spits the right format of random bits into the secure system without any human seeing them at all.<p>Even better, in cases where it's an option, is not to rely on stored secrets at all. I think Monzo's post is not worrying about this difference, but it can be critical in terms of decisions about debugging to prefer entirely ephemeral secrets. When a pod goes away, the ephemeral secrets that pod had vanish with it, and so you aren't storing them anywhere anyway. If they aren't stored, they can't get stolen by anybody, and you've got one less thing to go wrong.
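Generating those formats server-side is trivial; a sketch with Python's `secrets` module (the function names are mine, just for illustration):

```python
import secrets
import string

# Generate secrets in the required format inside the secure system itself,
# so they never pass through a developer's clipboard or shell history.

def random_pin(digits=4):
    """A numeric PIN, e.g. for a card."""
    return "".join(secrets.choice(string.digits) for _ in range(digits))

def random_hex(chars=16):
    """A hexadecimal string; token_hex yields 2 hex chars per byte."""
    return secrets.token_hex(chars // 2)

def random_alnum(length=10):
    """A string drawn from A-Za-z0-9."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The key property is that `secrets` uses the OS CSPRNG, unlike the `random` module, so there's no "ran the randomizer until a string looked nice" taint.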
I so, so, so, badly want to use Vault everywhere.<p>The one thing keeping me from adopting isn't that it won't be secure enough. I worry _constantly_ that I'll lock myself out of my data, my infrastructure, etc.<p>Have others had that worry? How'd you get over it and just start using Vault? (probably through incremental, low-risk adoption first?)
Vault interests me, but it seems to come with a lot of complexity / requires an army of devops. There's a YC-backed company called Envkey[0] that looks interesting + simple, but there is no option to self-host yet.<p>[0] <a href="https://www.envkey.com/" rel="nofollow">https://www.envkey.com/</a>
One thing bugs me about Vault and I rarely see it being discussed: how come a design where Vault simultaneously stores secrets AND is able to access the other systems those secrets are for is deemed good and secure?<p>Vault is a company-wide "root account": reachable from every part of the internal network, storing all the company's secret data, and with reach into many internal and external systems like databases, where it has full rights to dynamically configure short-lived credentials. Doesn't that put too much trust in a single system?
Ha I love how they explain what "cryptography" means, as if anyone reading this wouldn't know.<p>Also I love the level of openness. No other bank would do this. (If you're in the UK, Monzo is honestly amazing. I've cancelled all my other bank accounts.)
>> We check that a secret exists in our staging environment (s101) when writing to prod, and warn if it doesn't.<p>My first thought ... so you have a script that can see prod and non prod at the same time?<p>I think I may be developing paranoia.
Slightly related, but what’s a good practice for storing secrets that need to be recovered as plain text?<p>I’m thinking of a system where the user can register her/his API keys to other third-party systems.
Nitpick: k8s has supported encryption at rest for secrets for a while.<p><a href="https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/" rel="nofollow">https://kubernetes.io/docs/tasks/administer-cluster/encrypt-...</a>
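For reference, encryption at rest is enabled by pointing the API server at an `EncryptionConfiguration` like the one below (the key name and secret are placeholders; the real key is a base64-encoded 32-byte value):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>   # placeholder
      - identity: {}   # fallback so previously stored plaintext secrets stay readable
```

As the grandparent comment notes, though, this only protects etcd's contents at rest; it does nothing for access control, so `list secrets` still returns every key and value to anyone with that RBAC permission.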
Related: I have seen a lot of people make the mistake of hardcoding secrets in their Android app; please make sure you do not do that. I have a tool to check for these embedded secrets: <a href="https://android.fallible.co/" rel="nofollow">https://android.fallible.co/</a>