Feature Flags: Theory vs. Reality

141 points by benpapillon almost 2 years ago

24 comments

jbreckmckye almost 2 years ago

I've definitely lived with the zombie flags problem. Teams ship experiments that double the size of a piece of code, but never go back to refactor out the unused code branches. In shared codebases this becomes a nightmare of thousands of lines of zombie code and unit tests.

This is a social problem as much as a technical one: even if you have LaunchDarkly, DataDog etc. making very clear that a flag isn't used, getting a team to prioritise cleanup is difficult. Especially if their PM leaned on engineers to make the experiment "quick n dirty" and therefore hard to clean up.

At The Guardian we had a pretty direct way to fix this: experiments were associated with expiry dates, and if your team's experiments expired the build system simply wouldn't process your jobs without outside intervention. Seems harsh, but I've found with many orgs the only way to fix negative externalities in a shared codebase is a tool that says "you broke your promises, now we break your builds".
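
To make the expiry-date idea concrete, here is a minimal sketch of a build-time check. It is not The Guardian's actual tooling: the `Experiment` record, the `REGISTRY` list and the team names are all hypothetical, and a real setup would load the registry from configuration and exit non-zero when the check fails.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Experiment:
    name: str
    owner: str
    expires: date


def expired_experiments(experiments, today):
    """Return experiments whose expiry date is strictly before `today`."""
    return [e for e in experiments if e.expires < today]


def check_build(experiments, today):
    """Return (ok, messages); a build script would exit non-zero when ok is False."""
    expired = expired_experiments(experiments, today)
    messages = [
        f"{e.name} (owner: {e.owner}) expired on {e.expires.isoformat()}"
        for e in expired
    ]
    return (not expired, messages)


# Hypothetical registry; names and dates are made up for illustration.
REGISTRY = [
    Experiment("new-header", "team-web", date(2023, 6, 1)),
    Experiment("fast-checkout", "team-pay", date(2023, 12, 1)),
]

ok, messages = check_build(REGISTRY, today=date(2023, 7, 11))
```

Passing `today` in explicitly keeps the check deterministic and easy to test; whether an expired entry should block the build or only warn is a policy choice.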

jerf almost 2 years ago

This is not a complete solution, but it seems to me an aspect of the solution is similar to the way the programming world over the past 5-10 years has been acknowledging that dependencies carry a certain cost with them that must be accounted for. Feature flags do too. If you account for them as just the in-the-moment costs of adding a flag for something, then you are grotesquely underestimating their costs.

Personally I tend to resist them, for much this reason. I don't mean that I never use them and you can't find any in my code, but I resist them. They need to prove their utility to me before I add them, in much the same way I tend to make dependencies prove their worth beyond some momentary convenience before they are allowed in. There are times they leap that bar, but I think that generalized resistance has helped keep the code bases in better order than they otherwise would be. I've seen other teams who did not resist and they've developed a proliferation problem.

lucas_membrane almost 2 years ago

I worked on a team of hundreds that developed and maintained a vertical-market enterprise app for a few thousand client companies, probably more than 100,000 end-user seats, but fewer than 500,000. My small sample size (1) observation is that the developer organizations least able to manage feature flags are the ones most likely to buy into such a magic pill cargo cult solution.

If your software has accumulated or is built to support numerous independent client organizations, it almost certainly has features that are not used by all users, and thereby the software has implicit feature flags embedded in the data that it is already processing. Regardless of whether those feature flags in data work well or work poorly, why in the world would you want to add a second feature-control subsystem? Because it is meta-programming, I suppose, and we all know that meta-programming just adds another level of power to everything, and your first feature-control system may be a little hard to disentangle, and you can make feature-flags work by having the meta-programming done by a select few who really know what they are doing, and it will be a worthwhile challenge, and even if it doesn't work you will learn a lot, and it will look good on your resume, and give everyone a few good laughs when they realize what they were trying to do.

mvdtnz almost 2 years ago

These are real problems but not insurmountable. I think the author does an excellent job of laying out the problem and has pretty decent solutions in mind.

I caution strongly against the proposed solution to fail CI if zombie flags are detected. CI should ONLY fail if there are changes in the branch that cause the failure. Detecting zombie flags (e.g., this branch contains a flag which has been turned on and untouched for 90 days) is setting a CI time bomb. Find another way to alert developers of the zombie instead of failing good code at CI time.

samtho almost 2 years ago

My biggest problem with 3rd party feature flag setups is that I have high expectations for them and it is technically difficult to meet all of them:

- local/static access: it should not have to call out to a 3rd party server to get basic runtime config
- unused-flag detection: flags should have three reported states: never used, recently used, not recently used. These will be different from the user-controlled states of active, inactive, etc.
- sticky a/b testing: should follow the logged in user until the flag is removed
- integration with logger: I should be able to use it with my logger out of the box to report only relevant feature flags. Alternatively can provide a packed value of all relevant flags, would probably have to do flag state versioning.
- integration with linter: should warn me if flag has not recently been used or I used a flag in the code that is not in our database (alternatively, will upsert the flag automatically if it doesn’t exist)
- hashed flag names on frontend build: prevent the leakage of information, not a perfect solution, but I would want to avoid writing “top-secret-feature” where we can.

I fully acknowledge that a lot of solutions come close, but I haven’t looked at the current state of things in the last few years so it may have improved.
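
Two items on this list, sticky a/b testing and hashed flag names, can both be sketched with a plain hash, with no call to a third-party server. This is one common approach and not any particular vendor's algorithm; the function names, the flag name and the salt below are made up for illustration.

```python
import hashlib


def bucket(flag, user_id, buckets=100):
    """Map (flag, user) to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_enabled(flag, user_id, rollout_percent):
    """Sticky: the same user gets the same answer for a given flag and percentage."""
    return bucket(flag, user_id) < rollout_percent


def public_name(flag, salt):
    """Opaque name for frontend bundles so the readable flag name is not shipped."""
    return hashlib.sha256(f"{salt}:{flag}".encode("utf-8")).hexdigest()[:12]


# Hypothetical usage: evaluated locally, no network round trip.
variant = is_enabled("new-search", "user-42", rollout_percent=50)
opaque = public_name("top-secret-feature", salt="build-salt")
```

Because the bucket depends only on the flag name and user ID, raising the rollout percentage only ever adds users and never reshuffles the existing ones. A hashed name obscures a flag's purpose without hiding it completely, which matches the "not a perfect solution" caveat above.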

MilStdJunkie almost 2 years ago

I've got a few friends that work at LaunchDarkly, and from what I can tell, they've got a very good handle on the challenge. Better than the equivalent vendors in my business, anyway. I've had some great talks with the LD people, even though, strictly speaking, I don't get my paychecks from programming, per se.

What brought me into the talks was that the feature flag problem is similar in scope to the central one faced by CCSs (component content systems). By definition, CCS requires the content equivalent of feature flags, implemented in a variety of ways, depending on ... lots of things. That problem is this: both transclusion and conditionals necessarily couple the content to the business or product architecture. Ergo, when the product architecture goes bananas, so does your content system, and you find yourself with documents that aren't meaningful in a linguistic sense, or which just break the processor. This occurs in the content context because the natural language of a unified document is replaced in a CCS with the product or business architecture; how is a document chunked, what business needs do the conditions satisfy, at what support level are document deliverables composed. In a code context, the constructed syntax of the programming language is getting chopped by the conditionals driven from the business side; there's even more variance here regarding how code interacts with business.

So not the same problem, but the same class of problem: regular rules that have to integrate with non-regular, non-linguistic business rules.

I have a tiny chip on my shoulder regarding CCS systems, because I have seen so many years flushed down the "re-use craze" by businesses that had zero business trying to re-use anything. Feature flags are somewhat in the same bucket - a lot of things that a business wants to use flags for should really, really, really be built into the code or abstracted away - but of course a programming language has far richer ways to deal with bad abstractions than a markup language does. Which of course can be a double edged sword.

rcktmrtn almost 2 years ago

I work more in the firmware space, so my experience with feature toggles is always with half-baked tooling and limited ability to change deployed products. We do use continuous development within the organization, so there is still a lot of applicability, but it's always interesting to see the way similar problems get addressed in a higher-level and more online environment.

That said, I'm surprised this article doesn't mention the two words that always come to my mind when I see toggles: combinatorial explosion. Several times I've worked on projects that went way too toggle-happy and decided that new functionality should be split into indefinite life "features". Just in case the company someday wants to sell a model without that feature. Of course, when an old toggle finally gets turned off a year later, you realize that it crashes the system because several other features kind of half depend on it.
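
The arithmetic behind "combinatorial explosion" is easy to check: n independent on/off toggles give 2^n possible configurations. A small sketch, with invented toggle names:

```python
from itertools import product


def configurations(toggles):
    """Yield every on/off assignment for the given toggle names."""
    for values in product([False, True], repeat=len(toggles)):
        yield dict(zip(toggles, values))


def count_configurations(n):
    """Number of distinct on/off combinations for n independent boolean toggles."""
    return 2 ** n


# Hypothetical firmware toggles, for illustration only.
toggles = ["bluetooth", "logging", "ota_update"]
all_configs = list(configurations(toggles))
```

Three toggles already mean 8 builds to reason about, 10 toggles mean 1,024, and 20 toggles mean 1,048,576. That is why a combination nobody has exercised for a year can fail when one toggle is finally switched off.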

GeorgeMac almost 2 years ago

We're attempting to address some of these problems at https://www.flipt.io/gitops. Having your flags defined as configuration and committed to a repository opens up a range of possibilities in terms of static analysis.

Additionally, we've got a prototype static analysis tool for finding calls to our feature flag clients in both Go and Rust too.

jasdeepg1987 almost 2 years ago

as a pm, there's a whole set of jobs that occur post-rollout that have often been poorly handled at companies i've been at. those include packaging, customer operations like allow-listing long-lived features for certain companies, optimization of bundles, etc.

when we've built our own homegrown system, it's opaque and often neglected. when we've used feature flag tools, we co-opt them to do things they're not meant to support (e.g. persistent toggles in admin panels) so end up with complexity in the code and in operational processes around it.

agree wholeheartedly with points in this article ... there are issues with how we manage flags generally, but we also bias towards assuming that once a feature is live, we can and should move on -- the feature is now persistent, part of a package, and it won't change frequently or ever.

the reality is the feature lifecycle takes on a very different shape, and, at least in my experience, current FM tooling isn't built to accommodate that.

daliwali almost 2 years ago

At my work, I have a somewhat clever (or idiotic) technical solution to the problems of feature flags: they are actually implemented as feature modules that monkey-patch the base application at runtime.

There are a few benefits: removing features is dead simple, just delete the whole feature module, and there's no conditional branching in the base application.

There are some drawbacks too: the base application must have entry points for the feature modules to overwrite. Usually the default values are no-op or some default behavior. Features also must implement setup and teardown, which can take longer to write than a conditional.
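
A rough sketch of what such a feature module could look like, assuming a base class that exposes a method as an entry point. The `App` and `FancyGreetingFeature` names are invented here, and this only illustrates the setup/teardown shape described above, not the commenter's actual code.

```python
class App:
    """Base application with an entry point that feature modules may overwrite."""

    def greeting(self, user):
        return f"Hello, {user}"


class FancyGreetingFeature:
    """Hypothetical feature module; deleting this class removes the feature."""

    def setup(self, app_cls):
        # Remember the original entry point so teardown can restore it.
        self._original = app_cls.greeting
        original = self._original

        def greeting(app, user):
            return original(app, user) + "!"

        app_cls.greeting = greeting

    def teardown(self, app_cls):
        app_cls.greeting = self._original


feature = FancyGreetingFeature()
feature.setup(App)
patched = App().greeting("Ada")
feature.teardown(App)
restored = App().greeting("Ada")
```

The base `App` contains no conditional at all; the cost is that every feature has to restore exactly what it replaced.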

smrtinsert almost 2 years ago

The zombie flags are a huge problem. Management is always pushing for feature completion and it's done - behind a feature flag. The complication is now they never want to allow time to remove all the dead code paths later, which leaves you dependent on all sorts of potential things: imports, libraries, maybe even connections. One day they inevitably find out something is "still in prod" and they get curious and don't understand why it's still there. Well, feature flags require more TCO, period. They don't want to give you more time though.

DenisM almost 2 years ago

Couple of simple ideas for the zombie flag problem:

- When adding a flag immediately file a bug to remove the flag by a certain date. Enforce in code review. The bug count will surface the problem to the management.
- When a flag is past due date start firing non-fatal incidents. The incident count will also surface the problem to the management.

sb8244 almost 2 years ago

My story of when bad feature flag hygiene resulted in a real technical problem is when our Redis kicked over one day. We had good monitoring so it was easy to identify the problem: network was saturated at 1 GB/s.

I traced the problem back to the fact that we had 100+ feature flags that were fully launched, but still loaded into the backend when "all feature flags" were loaded for a team. The way this was implemented returned all team IDs that had the feature flag, and the way this was done had some flags with multiple thousand IDs in them.

So 100+ flags, many with 2000+ int entries.

We ended up quickly shipping some code to mark features GA, so they wouldn't be loaded from Redis. Cut usage by 99% instantly.
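
The effect of a GA marker can be sketched without Redis at all. The code below is an illustration under assumed data shapes: the `Flag` record, its `ga` field and the numbers are all hypothetical and chosen only to show why skipping fully launched flags shrinks the payload.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    name: str
    team_ids: set = field(default_factory=set)
    ga: bool = False  # hypothetical "generally available" marker


def ids_to_load(flags):
    """Count the team IDs fetched by one load-all call, skipping GA flags."""
    return sum(len(f.team_ids) for f in flags if not f.ga)


def is_enabled(flag, team_id):
    """A GA flag is on for every team, so its ID list is never consulted."""
    return flag.ga or team_id in flag.team_ids


# Illustrative numbers only: 100 flags, each listing 2,000 team IDs.
flags = [Flag(f"flag-{i}", set(range(2000))) for i in range(100)]
before = ids_to_load(flags)

# Mark all but one flag as fully launched.
for f in flags[:99]:
    f.ga = True
after = ids_to_load(flags)
```

With these made-up numbers a single load drops from 200,000 IDs to 2,000, since a fully launched flag no longer needs its per-team list.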

stillbourne almost 2 years ago

I work at truckstop.com and I came up with a way of managing feature flags that isn't madness. First I used the feature flags in conjunction with module federation. Then I created 3 flags per product: alpha, beta, rc. They look something like this: mfe-load-search-alpha. The flags are managed by split.io and then tied to a federated endpoint deployment. Which flag gets loaded is determined by a router factory that selects the route with the correct federated endpoint based on the splits. That effectively allows me to decouple a deployment from a release.

withinboredom almost 2 years ago

Probably my best story of “zombie flags” was when this guy accidentally deleted a production table. We disabled the feature flag, disabled some code written after it had been turned on and expected it to be on, then restored the table from a PIT backup. Finally, we reverted the code changes and feature flag. We were back up in a matter of hours (the table was hundreds of GB, so it took a while to delete and restore). Some customers noticed the option missing from their options screen, but 99% of the customers never noticed the feature downtime.

malfist almost 2 years ago

One thing I see missing in this article is another huge cost to these things.

What happens when your homegrown feature flag microservice (because why pay for a hard cost when you can have the soft cost of making your own) goes down, even temporarily?

Sane defaults at code review time, before launch, aren't always the sane defaults after a feature has fully launched, or nearly fully launched.

I've seen more than a few egregious outages due to a feature flagging tool being down and taking the user experience back a year or two.
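
One possible mitigation, not something the comment proposes, is for the client to remember the last value it saw and fall back to the hard-coded default only when it has never reached the service. A minimal sketch with a hypothetical `FlagClient` and a stand-in `fetch` function:

```python
class FlagClient:
    """Hypothetical client: last known value first, coded default second."""

    def __init__(self, fetch, defaults):
        self._fetch = fetch              # callable: flag name -> bool, may raise
        self._defaults = dict(defaults)  # defaults chosen at code review time
        self._last_known = {}

    def is_enabled(self, name):
        try:
            value = self._fetch(name)
        except Exception:
            # Service unreachable: prefer the last value seen, then the default.
            return self._last_known.get(name, self._defaults.get(name, False))
        self._last_known[name] = value
        return value


# Stand-in for a flag service that can be switched off for the demo.
service_up = {"value": True}


def fetch(name):
    if not service_up["value"]:
        raise ConnectionError("flag service unavailable")
    return True


client = FlagClient(fetch, defaults={"new-checkout": False})
first = client.is_enabled("new-checkout")          # service reachable
service_up["value"] = False
during_outage = client.is_enabled("new-checkout")  # served from last known value
cold_start = FlagClient(fetch, defaults={"new-checkout": False}).is_enabled("new-checkout")
```

The cold-start case still hits the stale default, which is the failure described above; persisting the last known values would be needed to cover it.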

morgante almost 2 years ago

It seems like one of the biggest problems is prioritizing the cleanup of old flags. I know some companies have developed tools like Piranha[0] to automate this process and a few of our customers at grit.io have used it for that as well.

Would love to hear if others have had success with automated flag cleanup.

scrubs almost 2 years ago

I worked at Bloomberg for an extended period. Feature flags are seriously used, e.g. >10k flags added per month across all code. Now, they came with ample management systems to enable/disable, rollout, and check for complete rollout. Various techniques (shared memory, caching) were used to drive down lookup time.

Removing them came down to team discipline.

Ideally, a Google-like clang analysis of code would find flags ready for removal and alter code to remove the old code path. Recall Google used tools like this to update or migrate deprecated API calls.

Bbg however never got there. Instead you'd just get various alerts.

zellyn almost 2 years ago

Modern feature flag tooling (e.g. LaunchDarkly) covers most of the uses here. It'll even tell you whether flags are useful or not (if you push evaluation data back upstream).

noelwelsh almost 2 years ago

If you want a repeatable task done properly every time, you give it to a computer. In this case, manipulating feature flags is a task for partial evaluation / staging. It's relatively well known in the programming language research community but hasn't made it into mainstream production languages. No amount of social process will ever be as effective.

esafak almost 2 years ago

Are there any authorization products that handle feature flagging as an application?

mrblampo almost 2 years ago

Yep, everything in this article is right on the money in my experience.

fahad19 almost 2 years ago

Useful post outlining a lot of common pain points I have experienced myself in my career.

One of the reasons I went for an open source solution ( https://featurevisor.com ) that's Git based, and every change is done via Pull Requests.

Building blocks:

- Attributes for conditions: https://featurevisor.com/docs/attributes/
- Segments for targeting users: https://featurevisor.com/docs/segments/
- Features with variations and rules: https://featurevisor.com/docs/features/

Process:

- Merge PRs
- Trigger CI/CD pipeline: https://featurevisor.com/docs/deployment/
- Consume with SDK: https://featurevisor.com/docs/sdks/

Use cases:

- User entitlements: https://featurevisor.com/docs/use-cases/entitlements/
- Testing in production: https://featurevisor.com/docs/use-cases/testing-in-production/
- A/B testing & experimentation: https://featurevisor.com/docs/use-cases/experiments/
- Remote configuration: https://featurevisor.com/docs/use-cases/remote-configuration/

You can also generate types as a package for compile-time safety:

- Code generation: https://featurevisor.com/docs/code-generation/

The post and the comments here give me more ideas on how to improve it with more features now.

nektro almost 2 years ago

good article except for the rag on communism in the first paragraph