Automated detection of risky code can be very useful, but it inherently lacks the capacity to differentiate whether a particular instance of bad practice is actually a flaw. For a simplified example: usage of eval() in many dynamic languages is potentially risky and <i>could</i> result in a total takeover of the application, so these tools will generally flag it as 'critical' - even if, in that particular case, the 'eval' runs on data that can't possibly be affected by any user-controlled input (in one case I saw, it was part of a slightly odd code-generation design intended to save on copy-paste) and poses no security risk whatsoever.<p>Anecdotally, when I have done triage on such analysis, perhaps 5-10% of the findings the tools marked as 'critical' (i.e. <i>potentially</i> critical) were flaws that might have had some actual impact and needed to be fixed. So when a vendor says in an article like this "The study found nearly two-thirds (63%) of the applications scanned had flaws in the first-party code", keep that in mind - they're generally treating every detection as a real flaw, and that's rarely the case.<p>On the other hand, it may well be simpler to have a process that just cleans up every suspicious spot in the code, in the same way that the simplest way to avoid use-after-free errors is to use a garbage-collected language rather than trying to ensure that every single memory allocation and free in C is handled correctly.
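For illustration, here's a minimal Python sketch (names and structure hypothetical, not from the original case) of the kind of eval/exec usage a scanner will typically flag as critical even though no user-controlled data can ever reach it - dynamic code generation from a hardcoded list, done to avoid copy-pasting near-identical methods:

<pre><code>
# Hypothetical "flagged but harmless" pattern: code is generated from a
# hardcoded list of field names, so no user input can reach exec(),
# yet most static analyzers will still mark it as a critical finding.

FIELDS = ["name", "email", "created_at"]  # fixed at development time

class Record:
    def __init__(self, data):
        self._data = data

# Generate one getter per field instead of copy-pasting near-identical methods.
for _field in FIELDS:
    _src = (
        f"def get_{_field}(self):\n"
        f"    return self._data['{_field}']\n"
    )
    _namespace = {}
    exec(_src, _namespace)  # the line a scanner flags as 'critical'
    setattr(Record, f"get_{_field}", _namespace[f"get_{_field}"])

r = Record({"name": "Ada", "email": "ada@example.com", "created_at": "2024-01-01"})
print(r.get_name())  # -> Ada
</code></pre>

Whether this design is a good idea is a separate question, but it's a concrete case where the tool's 'critical' label doesn't correspond to any exploitable flaw.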