My unusual hobby

856 点作者 curryhoward超过 7 年前

26 条评论

j2kun超过 7 年前

> What's really amazing to me is that Stephen Kleene probably proved this without the help of a computer, but I was able to verify it with the highest possible scrutiny. It's as if he wrote an impressive program without ever having run it, and it turned out not to have any bugs!This offhand comment (which we can all forgive) makes it seem like mathematics happens in a vacuum. I can understand the temptation to think that way, and I hope I can make it clear that this is not the case.General theorems typically grow out of a set of representative examples of interesting behavior (unit tests, if you will), are judged by their relationship to other mathematical objects and results (integration tests), and are reviewed by other mathematicians (user acceptance tests).Mathematicians don't just write down a proof and cross their fingers. The proof is often one of the last steps. There is a high-level intuition that explains why a theorem or proof sketch is probably true/correct before a formal proof is found. These explanations are usually compact enough to help one re-derive proof ("X doesn't have enough 'space' to fill Y, and Z tracks that gap", or something like that), but aren't formal enough to put into any formal logic system I know of.

评论 #15776727 未加载

评论 #15778098 未加载

评论 #15777572 未加载

评论 #15776959 未加载

mcguire超过 7 年前

And this..."To give you an idea of what an actual proof looks like in Coq, below is a proof of one of the easier lemmas above. The proof is virtually impossible to read without stepping through it interactively, so don’t worry if it doesn’t make any sense to you here."is why Coq is not one of my favorite tools.A proof consists of two things: the "proof state"---a statement of what you know at a given point--and a sequence of "proof operations"---the manipaulations that move from one proof state to the next.Most mathematical proofs you have ever seen have been presented as a list of proof states, with the operations either explained in comments or taken to be obvious.Coq (and most other similar systems) present the proof as a sequence of proof operations, with the state invisible unless you're going through it in a Coq IDE.On one hand, that's the same thing as a program: a Coq proof is a program for creating the proof. Unfortunately, on the other, I've never been able to read Coq proofs the way I can read programs. Probably a lack of experience, I guess.On the other hand, Coq is a damn fun computer game.

评论 #15776824 未加载

评论 #15776840 未加载

评论 #15787827 未加载

评论 #15778324 未加载

评论 #15785166 未加载

david-gpu超过 7 年前

I had to take a semester on formal proofs using Coq, the same tool the article talks about.Putting aside their steep learning curve, formal proof methods do not guarantee that the code you've written is bug free. They only guarantee that the code follows the requirements you defined given the conditions you also set on your inputs. You can think of it as a mathematical proof of your postconditions will hold given that your preconditions are true.And you know what? People do make mistakes establishing those preconditions and postconditions all the time, and the tool is not going to be of any help. You've proven your code, but it may not do what you actually wanted it to do.During a lecture I saw a tenured professor, who was a well-respected mathematician and expert in Coq, make some rather obvious mistakes while establishing those preconditions and postconditions. His proof was valid but worthless.When you add that writing those proofs is actually rather time consuming, you end up with a niche product with little real-world applicability, except possibly for safety-critical environments.

评论 #15776966 未加载

评论 #15776784 未加载

评论 #15777246 未加载

评论 #15776783 未加载

评论 #15777689 未加载

评论 #15779898 未加载

评论 #15776641 未加载

评论 #15776622 未加载

评论 #15778854 未加载

yaseer超过 7 年前

If you're a programmer that's not tried theorem proving before, I highly recommend giving it a go. It gave me the same 'thrill' I got when I first started programming as a 14-year old and realised a new way of thinking. I find it sharpens your mind in a new, and different way - that's really an exciting feeling.I prefer Isabelle myself, as I find its proofs more readable in a traditional mathematical sense. Coq however, is probably easier for most programmers to experiment with.

评论 #15777104 未加载

kelukelugames超过 7 年前

Please use a more informative title and name the hobby. The current title "My unusual hobby" is click bait.

评论 #15776862 未加载

amelius超过 7 年前

Reminds me of Principia Mathematica, where the author (Bertrand Russell) needs 379 pages to prove that 1+1=2. [1]> "From this proposition it will follow, when arithmetical addition has been defined, that 1 + 1 = 2." —Volume I, 1st edition, page 379.[1] <a href="https://en.wikipedia.org/wiki/Principia_Mathematica" rel="nofollow">https://en.wikipedia.org/wiki/Principia_Mathematica</a>

评论 #15779187 未加载

评论 #15777669 未加载

kawyut超过 7 年前

I've read a fair bit about functional programming but the right-associative nesting of -> still bothers me.(first precondition IMPLIES (second precondition IMPLIES (implication))is inherently harder for me to understand than(first precondition AND second precondition) IMPLIES implicationeven though logically they are the same. The first version carries more mental overhead because it seems to indicate that there is some sort of hierarchy/assymetry to the two preconditions.It's the same thing when I try to read a Haskell function type signature:f :: a -> b -> cf is a function taking a and returning (a function taking b and returning (c))vs.f is a function taking two parameters a and b and returning cThere is no hierarchy anyhow because the parameters can be swapped:f' = \b a -> f a bCurious to know if it's just me or there are others who experience the same.

评论 #15778604 未加载

fizixer超过 7 年前

So I'm wondering what is the bridge between "proof-assistant" and "automated (or partially automated) theorem prover".As someone with "journalistic" (but fairly in-depth) knowledge about these topics, my guess is that something like a proof-assistant will have to be paired up with some kind of logic programming system (like Prolog, or a kanren-derived system or something).I guess then the problem of combinatorial explosion in logic search translates directly to the combinatorial explosion of proof-steps in a proof assistant. Hence my mention of partially-automated (more doable than fully-automated).Probably some kind of a "database of facts" would also be involved, making it an expert system (reminds me of metamath [1]).Is it fair to say, partially-automated theorem proving is a fairly advanced version (and not without added complexity) of a proof-assistant, and will help ease some of the grunt work involved in using a proof-assistant by hand?[1] <a href="http://us.metamath.org/" rel="nofollow">http://us.metamath.org/</a>

评论 #15778484 未加载

评论 #15777167 未加载

评论 #15778201 未加载

评论 #15779561 未加载

zaptheimpaler超过 7 年前

Cool post! The stuff about Curry-Howard was really interesting and relates to a question i have been thinking about. But I'm a novice when it comes to the theory of type systems or theorem provers. Maybe someone here can enlighten me?As type systems become more complex (like say in Scala or Haskell), they move towards having specialized syntax and weird hacks to express what are arguably simpler logical theorems/invariants about values in a type.. so why can't we have a programming language that gives you a full blown logic system(Coq? or Microsoft Z3? a SAT Solver?.. i don't understand the differences in detail) rather than part of one? Are there attempts at that, or does it turn out to be unworkable?

评论 #15776980 未加载

评论 #15778364 未加载

评论 #15776981 未加载

评论 #15778033 未加载

ivan_ah超过 7 年前

Here is a paper that shows Coq proofs of two famous theorems by Shannon: the source compression rate of H(X) and the channel capacity of I(X,Y). <a href="https://staff.aist.go.jp/reynald.affeldt/documents/affeldt-itp2012-preprint.pdf" rel="nofollow">https://staff.aist.go.jp/reynald.affeldt/documents/affeldt-i...</a>I wonder if simplified versions of these proofs couldn't be used with students. Right now it seems like too much details, but maybe there can be a common "stdlib" of standard math arguments that we assume to be true, and then focus on the specific machinery for each proof?

j2kun超过 7 年前

My question for the author: does implementing a proof formally in Coq honestly give you a better understanding of the theorem/proof?It would seem to me that, if you're filling in details from an existing proof or making some existing formal (in the math sense) intuition more formal (in the Coq sense), there's not much deep insight to gain from implementing a proof in Coq.

评论 #15776776 未加载

评论 #15776820 未加载

pron超过 7 年前

Terrific!I took the liberty of translating your module to TLA+, my formal tool of choice (recently, I've been learning Lean, which is similar to Coq, but I find it much harder than TLA+). I tried to stick to your naming and style (which deviate somewhat from TLA+'s idiomatic styling), and, as you can see, the result is extremely similar, but I guess that when I try writing the proofs, some changes to the definitions would need to be made.One major difference is that proofs in TLA+ are completely declarative. You just list the target and the required axioms, lemmas and definitions that require expanding. Usually, the proof requires breaking the theorem apart to multiple intermediary steps, but in the case of the proof you listed, TLA+ is able to find the proof completely automatically:<pre><code> LEMMA supremumUniqueness ≜ ∀ P ∈ SUBSET T : ∀ x1, x2 ∈ P : supremum(P, x1) ∧ supremum(P, x2) ⇒ x1 = x2 PROOF BY antisym DEF supremum (* we tell the prover to use the antisym axiom and expand the definition of supremum *) </code></pre> The natDiff lemma doesn't even need to be stated, as it's automatically deduced from the built-in axioms/theorems:<pre><code> LEMMA natDiff ≜ ∀ n1, n2 ∈ Nat : ∃ n3 ∈ Nat : n1 = n2 + n3 ∨ n2 = n1 + n3 PROOF OBVIOUS (* this is automatically verified using just built-in axioms/theorems *) </code></pre> Another difference is that TLA+ is untyped (which makes the notation more similar to ordinary math), but, as you can see, this doesn't make much of a difference. The only things that are different from ordinary math notation is that function application uses square brackets (parentheses are used for operator substitution; operators are different from functions, but that's a subtelty; you can think of operators as polymorphic functions or as macros), set comprehension uses a colon instead of a vertical bar, a colon is also used in lieu of parentheses after quantifiers, and aligned lists of connectives (conjunctions and disjunctions) are read as if there were parentheses surrounding each aligned clause. Also `SUBSET T` means the powerset of T.Here's the module (without proofs, except for the one above):<pre><code> ------------------------------- MODULE Kleene ------------------------------- EXTENDS Naturals (* Assumption: Let (T, leq) be a partially ordered set, or poset. A poset is a set with a binary relation which is reflexive, transitive, and antisymmetric. *) CONSTANT T CONSTANT _ ≼ _ AXIOM refl ≜ ∀ x ∈ T : x ≼ x AXIOM trans ≜ ∀ x, y, z ∈ T : x ≼ y ∧ y ≼ z ⇒ x ≼ z AXIOM antisym ≜ ∀ x, y ∈ T : x ≼ y ∧ y ≼ x ⇒ x = y (* A supremum of a subset of T is a least element of T which is greater than or equal to every element in the subset. This is also called a join or least upper bound. *) supremum(P, x1) ≜ ∧ x1 ∈ P ∧ ∀ x2 ∈ P : x2 ≼ x1 ∧ ∀ x3 ∈ P : ∀ x2 ∈ P : x2 ≼ x3 ⇒ x1 ≼ x3 (* A directed subset of T is a non-empty subset of T such that any two elements in the subset have an upper bound in the subset. *) directed(P) ≜ ∧ P ≠ {} ∧ ∀ x1, x2 ∈ P : ∃ x3 ∈ P : x1 ≼ x3 ∧ x2 ≼ x3 (* Assumption: Let the partial order be directed-complete. That means every directed subset has a supremum. *) AXIOM directedComplete ≜ ∀ P ∈ SUBSET T : directed(P) ⇒ ∃ x : supremum(P, x) (* Assumption: Let T have a least element called bottom. This makes our partial order a pointed directed-complete partial order. *) CONSTANT bottom AXIOM bottomLeast ≜ bottom ∈ T ∧ ∀ x ∈ T : bottom ≼ x (* A monotone function is one which preserves order. We only need to consider functions for which the domain and codomain are identical and have the same order relation, but this need not be the case for monotone functions in general. *) monotone(f) ≜ ∀ x1, x2 ∈ DOMAIN f : x1 ≼ x2 ⇒ f[x1] ≼ f[x2] (* A function is Scott-continuous if it preserves suprema of directed subsets. We only need to consider functions for which the domain and codomain are identical and have the same order relation, but this need not be the case for continuous functions in general. *) Range(f) ≜ { f[x] : x ∈ DOMAIN f } continuous(f) ≜ ∀ P ∈ SUBSET T: ∀ x1 ∈ P : directed(P) ∧ supremum(P, x1) ⇒ supremum(Range(f), f[x1]) (* This function performs iterated application of a function to bottom. *) RECURSIVE approx(_, _) approx(f, n) ≜ IF n = 0 THEN bottom ELSE f[approx(f, n-1)] (* We will need this simple lemma about pairs of natural numbers. *) LEMMA natDiff ≜ ∀ n1, n2 ∈ Nat : ∃ n3 ∈ Nat : n1 = n2 + n3 ∨ n2 = n1 + n3 (* The supremum of a subset of T, if it exists, is unique. *) LEMMA supremumUniqueness ≜ ∀ P ∈ SUBSET T : ∀ x1, x2 ∈ P : supremum(P, x1) ∧ supremum(P, x2) ⇒ x1 = x2 PROOF BY antisym DEF supremum (* Scott-continuity implies monotonicity. *) LEMMA continuousImpliesMonotone ≜ ∀ f : continuous(f) ⇒ monotone(f) (* Iterated applications of a monotone function f to bottom form an ω-chain, which means they are a totally ordered subset of T. This ω-chain is called the ascending Kleene chain of f. *) LEMMA omegaChain ≜ ∀ f : ∀ n, m ∈ Nat : monotone(f) ⇒ approx(f, n) ≼ approx(f, m) ∨ approx(f, m) ≼ approx(f, n) (* The ascending Kleene chain of f is directed. *) LEMMA kleeneChainDirected ≜ ∀ f : monotone(f) ⇒ directed({ approx(f, n) : n ∈ Nat }) (* The Kleene fixed-point theorem states that the least fixed-point of a Scott- continuous function over a pointed directed-complete partial order exists and is the supremum of the ascending Kleene chain. *) THEOREM kleene ≜ ∀ f : continuous(f) ⇒ ∃ x1 ∈ T : ∧ supremum({ approx(f, n) : n ∈ Nat }, x1) ∧ f[x1] = x1 ∧ ∀ x2 : f[x2] = x2 ⇒ x1 ≼ x2 =============================================================================</code></pre>

评论 #15777595 未加载

评论 #15777557 未加载

losvedir超过 7 年前

Whoa, very cool post! It's interesting that "small but novel" languages can be formally proved. Do any mainstream languages have a sound foundation like that? If someone is interested in creating a new programming language, is it worth getting up to speed on proof writing and using that to build the language? It's not really a topic that comes up in the Dragon Book or mainstream general compiler study, AFAIK.

评论 #15778897 未加载

carapace超过 7 年前

I've been working in a CPS Joy variant, deriving recursive functions on trees and such. I've noticed that the code always works the first time. So far my "data" on this phenomenon is still too anecdotal to get excited, but I wanted to mention it (here where people might be interested.) It surprised me at first, but then I realized that this is part of the promise of Backus' Functional Programming: being able to do algebra to derive programs gives you error-free code that doesn't require tests.It leads me to speculate... what if I don't need types to write correct software? I assumed I would implement typing and type inference (like Factor language already has) at some point. But if the code has no bugs without it... YAGNI?My reasoning goes like this: Let's postulate that we have a library of functions, each of which has the quality Q of being error-free. If we derive new functions from the existing ones by a process that preserves Q, so that new functions are all also error-free, then we cannot have errors in our code, just user error. I.e. the user can present bad inputs, but no other source of error exists (other than mechanical faults.) This would be a desirable state of affairs. Now if the Q-preserving new-function-deriving process doesn't require types to work, then we don't need them. Whether or not typing helps prevent user error is another question. Personally, I lean toward defining user error as UI error, but I am ranging far from my original point now.(If you're interested: <a href="https://github.com/calroc/joypy/tree/master/docs" rel="nofollow">https://github.com/calroc/joypy/tree/master/docs</a> Still very DRAFT versions. <a href="https://github.com/calroc/joypy/blob/master/docs/3.%20Developing%20a%20Program.md" rel="nofollow">https://github.com/calroc/joypy/blob/master/docs/3.%20Develo...</a> )

pagnol超过 7 年前

I'm currently working through the material here: <a href="https://github.com/pigworker/CS410-17" rel="nofollow">https://github.com/pigworker/CS410-17</a>

yodsanklai超过 7 年前

I've never managed to have fun with Coq even though I really tried. I did tons of exercises and labs. On the other hand, I do love (functional) programming, logic and maths in general.It's almost like you get the bad side of programming (time-consuming, tedious) without the reward (seeing your program doing cool things). And you don't get the satisfaction of doing maths either. At least I don't. Maybe I didn't try hard enough.What is very interesting though is to learn about type theory and the Curry Howard isomorphism.

phkahler超过 7 年前

Is it possible for one of these "programs" to run forever? If so, does that illustrate a direct correspondence between the halting problem and godels incompleteness theorem?

评论 #15776889 未加载

maxhallinan超过 7 年前

A lot of these comments are focused on the merits of Coq and some of the mathematical details of your post. As someone who has no experience with Coq and little experience with mathematical proofs, I want to add an additional perspective. This is a pretty cool hobby. It's really nice to hear about hobbies that aren't motivated by ambitions of profit or fame. Thanks for sharing!

zitterbewegung超过 7 年前

Anyone else here do recreational mathematics ? I mainly try to answer questions on CSTheory.stackexchange.com

tluyben2超过 7 年前

I have the same hobby; I play with Coq, Idris, Agda, F* and pen and paper. I do not have time to do it for work most of the code, however all critical parts I do. One of my dreams is to do this full time.

lordleft超过 7 年前

Are there mathematical proofs that are incapable of being expressed in Coq? Isn't there a limit to kind of ideas a computer could verify? Curious about the boundaries of this language.

评论 #15777073 未加载

评论 #15777046 未加载

评论 #15778302 未加载

Bromskloss超过 7 年前

> The CI system runs Coq on my code and prevents me from merging it into the master branch if there is a mistake.How does one make Travis CI prevent a merge?

评论 #15779518 未加载

solotronics超过 7 年前

thinking about this same topic there is probably amazing opportunities out there for programmer mathematicians. Don't get me wrong, of course there are quants and such for trading firms but intuitively I would think there are an amazing number of applications for this kind of advanced mathematical programming that seem relatively under utilized.

libeclipse超过 7 年前

This is honestly beautiful.

dclowd9901超过 7 年前

This is how I imagine Flowtype works for Javascript (more or less). What's interesting to me, is that this could be the foundation for AIs taking over programming from developers.

sillysaurus3超过 7 年前

How do you pronounce Coq? I searched it but all I found was <a href="https://m.youtube.com/watch?v=TAHH83n2dmY" rel="nofollow">https://m.youtube.com/watch?v=TAHH83n2dmY</a>

评论 #15777685 未加载

评论 #15778729 未加载

26 条评论

j2kun超过 7 年前

评论 #15776727 未加载

评论 #15778098 未加载

评论 #15777572 未加载

评论 #15776959 未加载

mcguire超过 7 年前

评论 #15776824 未加载

评论 #15776840 未加载

评论 #15787827 未加载

评论 #15778324 未加载

评论 #15785166 未加载

david-gpu超过 7 年前

评论 #15776966 未加载

评论 #15776784 未加载

评论 #15777246 未加载

评论 #15776783 未加载

评论 #15777689 未加载

评论 #15779898 未加载

评论 #15776641 未加载

评论 #15776622 未加载

评论 #15778854 未加载

yaseer超过 7 年前

评论 #15777104 未加载

kelukelugames超过 7 年前

Please use a more informative title and name the hobby. The current title "My unusual hobby" is click bait.

评论 #15776862 未加载

amelius超过 7 年前

评论 #15779187 未加载

评论 #15777669 未加载

kawyut超过 7 年前

评论 #15778604 未加载

fizixer超过 7 年前

评论 #15778484 未加载

评论 #15777167 未加载

评论 #15778201 未加载

评论 #15779561 未加载

zaptheimpaler超过 7 年前

评论 #15776980 未加载

评论 #15778364 未加载

评论 #15776981 未加载

评论 #15778033 未加载

ivan_ah超过 7 年前

j2kun超过 7 年前

评论 #15776776 未加载

评论 #15776820 未加载

pron超过 7 年前

评论 #15777595 未加载

评论 #15777557 未加载

losvedir超过 7 年前

评论 #15778897 未加载

carapace超过 7 年前

pagnol超过 7 年前

I'm currently working through the material here: <a href="https://github.com/pigworker/CS410-17" rel="nofollow">https://github.com/pigworker/CS410-17</a>

yodsanklai超过 7 年前

phkahler超过 7 年前

Is it possible for one of these "programs" to run forever? If so, does that illustrate a direct correspondence between the halting problem and godels incompleteness theorem?

评论 #15776889 未加载

maxhallinan超过 7 年前

zitterbewegung超过 7 年前

Anyone else here do recreational mathematics ? I mainly try to answer questions on CSTheory.stackexchange.com

tluyben2超过 7 年前

lordleft超过 7 年前

Are there mathematical proofs that are incapable of being expressed in Coq? Isn't there a limit to kind of ideas a computer could verify? Curious about the boundaries of this language.

评论 #15777073 未加载

评论 #15777046 未加载

评论 #15778302 未加载

Bromskloss超过 7 年前

> The CI system runs Coq on my code and prevents me from merging it into the master branch if there is a mistake.How does one make Travis CI prevent a merge?

评论 #15779518 未加载

solotronics超过 7 年前

libeclipse超过 7 年前

This is honestly beautiful.

dclowd9901超过 7 年前

This is how I imagine Flowtype works for Javascript (more or less). What's interesting to me, is that this could be the foundation for AIs taking over programming from developers.

sillysaurus3超过 7 年前

How do you pronounce Coq? I searched it but all I found was <a href="https://m.youtube.com/watch?v=TAHH83n2dmY" rel="nofollow">https://m.youtube.com/watch?v=TAHH83n2dmY</a>

评论 #15777685 未加载

评论 #15778729 未加载