Wow, the language here is even more optimistic than the rosiest descriptions you see from young researchers, which prompted me to check whether the author has had much experience deductively verifying interesting "deep" functional properties of non-trivial programs. The answer seems to be no.<p>Like a newcomer to the field, he focuses on "first-day" problems such as language convenience, but the answer to his question of why this hasn't been done before is that language convenience isn't the hard problem, something he'll learn once he gains more experience.<p>One of the biggest issues (indeed, the very problem separation logic tries to address, though it succeeds only for relatively simple properties) is that "correctness" does not feasibly (affordably) compose. Proving the correctness of a program made out of components, each of which has already been proven correct for <i>any</i> desired property, is no easier than proving the correctness of the program from scratch, without concern for its decomposition. I.e., proving the correctness of a program made of ten provably-correct 500-line components is no easier than proving the correctness of all 5000 lines at once. This has been shown to be the case not only in the theoretical worst case, but also in practice.<p>Here's an example to make things concrete. Suppose we have the following (Java) method, calling some unknown, pure `bar` method:<p><pre><code> long foo(long x) {
if (x <= 2 || (x & 1) != 0)
return 0;
for (long i = x; i > 0; i--)
if (bar(i) && bar(x - i))
return i;
throw new Error();
}
</code></pre>
It is very easy to describe exactly under what conditions an error would be thrown. Similarly, it is easy to describe the operation of the following method, and prove that it performs its function correctly:<p><pre><code> boolean bar(long x) {
     if (x < 2) // reject 0 and 1, which the loop below passes vacuously
         return false;
     for (long i = x - 1; i >= 2; i--)
for (long s = x; s >= 0; s -= i)
if (s == 0)
return false;
return true;
}
</code></pre>
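To see what "easy" means here, this is roughly how the two specifications read in a JML-style notation (a loose sketch, not something a real JML checker would accept as written; assume bar is declared pure):<p><pre><code> //@ ensures \result == (x >= 2 &&
 //@     (\forall long d; 2 <= d && d < x; x % d != 0));
 boolean bar(long x) { /* as above */ }

 //@ signals (Error e) x > 2 && (x & 1) == 0 &&
 //@     (\forall long i; 0 < i && i <= x; !(bar(i) && bar(x - i)));
 long foo(long x) { /* as above */ }
 </code></pre>
Each clause can be read off the code almost mechanically, and each can be proven correct in isolation.<p>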
However, it is not easy to determine which inputs, if any, would cause foo to throw an error using that particular bar. In fact, we only happen to know that this particular question is extremely hard because it is one that has interested mathematicians for nearly 300 years and remains unanswered: it is Goldbach's conjecture, i.e. whether every even number greater than 2 is the sum of two primes.<p>While most verification tasks don't neatly correspond to well-known mathematical problems, and most require far less than 300 years to figure out, this kind of difficulty is encountered by anyone who tries to deductively verify non-trivial programs for anything but very specific properties (such as "there's no memory corruption", which separation logic does help with). Various kinds of non-determinism, such as those introduced by concurrency or by any kind of interaction, only make the possible compositions more complex.<p>In short, the effort it takes to verify a program does not scale nicely with its size, even when it is neatly decomposed, and it is this practical unaffordability (which is not a consequence of any inelegance in the tools) that makes this subject so challenging (and interesting), and that demands some humility and a lowering of expectations even where the approach is useful (and it can certainly be useful when wielded properly and at the right scope).<p>Another problem is an incorrect model of how programs are constructed. One might think that if a programmer has written a program, then they must have some informal (but deductive) model of it in their mind, and all that's missing is "just" formally specifying it. But that is not how programs are constructed over time when many people are involved. In practice, programmers often depend on <i>inductive</i> properties in their assumptions, such as "if the software has survived for many years, then local changes are unlikely to have global effects that aren't caught by existing tests." Those assumptions are good enough to provide the (often sufficient) level of correctness we already reach, but insufficient for constructing software that can be formally specified, let alone deductively proven correct.<p>That is why much of the contemporary research focuses on less sound approaches that aren't fully deductive, such as concolic testing (e.g., KLEE), which scale better for both specification and verification at the cost of "perfection" (see the sketch at the end of this comment).<p>The reason neither research nor industry does what is proposed here is that they know that's not where the real problems are. There are bigger issues to tackle before making the languages more beginner-friendly.
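<p>The promised sketch: the crudest version of that trade-off is plain bounded testing, which checks the example exhaustively up to some bound instead of proving anything (mine, not the author's; tools like KLEE explore paths symbolically rather than enumerating inputs, but the bargain is the same):<p><pre><code> // Evidence, not proof: foo throws no Error for any input up to LIMIT.
 class BoundedCheck {
     static final long LIMIT = 1_000; // arbitrary bound

     public static void main(String[] args) {
         // foo returns 0 for odd x and for x <= 2, so only even x >= 4 matter
         for (long x = 4; x <= LIMIT; x += 2)
             foo(x); // throws Error on the first counterexample
         System.out.println("no Error for any x <= " + LIMIT);
     }

     // foo and bar exactly as defined earlier
     static long foo(long x) {
         if (x <= 2 || (x & 1) != 0)
             return 0;
         for (long i = x; i > 0; i--)
             if (bar(i) && bar(x - i))
                 return i;
         throw new Error();
     }

     static boolean bar(long x) {
         if (x < 2) // reject 0 and 1, which the loop below passes vacuously
             return false;
         for (long i = x - 1; i >= 2; i--)
             for (long s = x; s >= 0; s -= i)
                 if (s == 0)
                     return false;
         return true;
     }
 }
 </code></pre>
No specification language, no proof, no universal claim: just a budget-sized answer, which is very often the right trade.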