Background: I work at Block/Square, on the team that owns (but didn't build) our internal feature flag system, and I also have a lot of experience using LaunchDarkly.

I like the idea of caching locally, although k8s makes that a bit more difficult since containers are typically ephemeral. People *will* use feature flags for things that they shouldn't, so eventually "falling back to default values" will cause production problems. One thing you can do to help with this is run proxies closer to your services; for example, LaunchDarkly has an open source "Relay".

Local evaluation seems to be pretty standard at this point, although I'd argue that delivering flag definitions is the (relatively) easy part. One of the real value-adds of a product like LaunchDarkly is all the things they can do when your applications send evaluation data upstream: unused flags, flags that only ever evaluate to the default, flags that only ever evaluate to one outcome, etc.

One best practice that I'd love to see spread (in our codebases too) is *always* naming the full feature flag directly in code, as a string literal (not a constant), so plain code search finds every evaluation (rough sketch at the end of this comment). I'd argue the same practice should be taken with metric names.

One of the most useful things to know (but seldom communicated clearly near landing pages) is a basic sketch of the architecture. It's necessary to know how things will behave if there is trouble. For instance: our internal system uses ZooKeeper to store (protobuf) flag definitions, and applications set watches to be notified of changes. LaunchDarkly clients download all flags[1] in the project on connection, then stream changes.

If I were going to build a feature flag system, I would ensure that there is a global, incrementing counter that is updated every time any change is made, and make it a fundamental aspect of the design. That way, clients can cache what they've seen and easily fetch only the updates they're missing (also sketched at the end). You could also imagine annotating that generation ID into W3C Baggage and passing it through the microservices call graph to ensure evaluation at a consistent point in time (clients would need to cache a minute or two of history, of course).

One other dimension in which feature flag services vary is the complexity of the rules they allow you to evaluate. Our internal system has a mini expression language (probably overkill). LaunchDarkly's arguably better model gives you an ordered set of rules, within which conditions are ANDed together (sketched at the end as well). Both allow you to pass in an arbitrary context of key/value pairs. Many open source solutions (Unleash, last I checked, some time ago) are more limited: some don't let you vary on inputs at all, some only on a small set of prescribed attributes.

I think the time is ripe for an open standard client API for feature flags. Standardizing the communication mechanisms would be constricting, but there's no reason we couldn't create something analogous to (or even part of) the OpenTelemetry client SDK, but for feature flags. If you are seriously interested in collaborating on that, please get in touch. (I'm "zellyn" just about everywhere.)

[1] Yes, this causes problems if you have too many flags in one project. They have a pretty nice filtering solution that's almost fully ready.
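
Re: naming flags directly as string literals: a tiny sketch of the call-site pattern I mean (Go; the client interface, flag key, and stand-in implementation here are all made up for illustration, not any real SDK):

    package main

    import (
        "context"
        "fmt"
    )

    // Hypothetical client interface; the specific SDK doesn't matter here,
    // only the shape of the call site.
    type FlagClient interface {
        BoolVariation(ctx context.Context, key string, fallback bool) bool
    }

    // staticFlags is a stand-in implementation just to make this runnable.
    type staticFlags map[string]bool

    func (s staticFlags) BoolVariation(_ context.Context, key string, fallback bool) bool {
        if v, ok := s[key]; ok {
            return v
        }
        return fallback
    }

    func checkoutFlow(ctx context.Context, flags FlagClient) string {
        // The full flag key is a string literal at the call site, so plain
        // code search (and any flag-usage tooling) can find every evaluation.
        if flags.BoolVariation(ctx, "checkout.new-payment-flow", false) {
            return "new payment flow"
        }
        return "legacy payment flow"
    }

    func main() {
        flags := staticFlags{"checkout.new-payment-flow": true}
        fmt.Println(checkoutFlow(context.Background(), flags))
    }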
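
Re: the global generation counter: a minimal sketch, under my own assumptions, of what the client side could look like (all names made up). On reconnect the client sends the value from Since() instead of re-downloading everything; that same number is what you'd stamp into W3C Baggage if you wanted consistent evaluation across a call graph.

    package flags

    import "sync"

    type Flag struct {
        Key   string
        Rules []byte // serialized rule definition; format doesn't matter here
    }

    // Delta is the hypothetical wire format: every write to any flag bumps
    // one global generation, and a delta carries everything after some point.
    type Delta struct {
        Generation uint64          // generation after applying this delta
        Upserts    map[string]Flag // changed or added flags
        Deletes    []string        // removed flag keys
    }

    type Client struct {
        mu    sync.RWMutex
        gen   uint64          // highest generation applied locally
        flags map[string]Flag // current local cache
    }

    // Apply merges a delta from the server (or a nearby relay) into the cache.
    func (c *Client) Apply(d Delta) {
        c.mu.Lock()
        defer c.mu.Unlock()
        if d.Generation <= c.gen {
            return // already seen; replayed deltas are no-ops
        }
        if c.flags == nil {
            c.flags = make(map[string]Flag)
        }
        for k, f := range d.Upserts {
            c.flags[k] = f
        }
        for _, k := range d.Deletes {
            delete(c.flags, k)
        }
        c.gen = d.Generation
    }

    // Since is what the client sends when (re)connecting: "give me every
    // change after this generation".
    func (c *Client) Since() uint64 {
        c.mu.RLock()
        defer c.mu.RUnlock()
        return c.gen
    }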
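
Re: rule complexity: roughly the shape I mean by "ordered rules, conditions ANDed together, arbitrary key/value context". This is my own sketch of the general model, not any vendor's actual schema, and only one operator is implemented.

    package rules

    // Context is the arbitrary key/value bag the caller passes in.
    type Context map[string]any

    type Clause struct {
        Attribute string // e.g. "country"
        Op        string // e.g. "in"; real systems have many more operators
        Values    []any  // clause matches if the attribute matches any value
    }

    type Rule struct {
        Clauses   []Clause // ALL clauses must match (AND)
        Variation string   // variation to serve if the rule matches
    }

    type FlagDef struct {
        Key      string
        Rules    []Rule // ordered: first matching rule wins
        Fallback string // variation when no rule matches
    }

    // Evaluate walks the rules in order and returns the first match.
    func Evaluate(f FlagDef, ctx Context) string {
        for _, r := range f.Rules {
            if matchesAll(r.Clauses, ctx) {
                return r.Variation
            }
        }
        return f.Fallback
    }

    func matchesAll(clauses []Clause, ctx Context) bool {
        for _, c := range clauses {
            if !matches(c, ctx) {
                return false
            }
        }
        return true
    }

    func matches(c Clause, ctx Context) bool {
        v, ok := ctx[c.Attribute]
        if !ok {
            return false
        }
        switch c.Op {
        case "in": // attribute equals any of the listed values
            for _, want := range c.Values {
                if v == want {
                    return true
                }
            }
        }
        return false
    }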