"BitC isn't going to work in its current form"

144 points by kumarshantanu about 13 years ago

10 comments

luriel about 13 years ago

I have met Shapiro and he is a very nice and smart guy, but I find it rather shocking that somebody could invest so much time and effort in designing a new language before building a library to do something as fundamental as I/O.

It is interesting to contrast his approach to building BitC with that of Ken Thompson and Rob Pike (both former colleagues of his at Bell Labs) building Go.

Go has been criticized for "not paying attention to PL research", which might be true, but it is already extremely useful, and even before its first 'stable' release people have already used it to build large systems that run in production.

I think what applies to building other kinds of software applies to designing programming languages too: you need something that you can start using to build real systems as soon as possible and then iterate from there based on your experience.

The problem with languages is that changing the language involves throwing away the code you wrote, which discourages one from building real systems with a language that is still changing.

That puts even more value in keeping languages simple. And tools like gofix are also quite nice to help with this.

srean about 13 years ago

Ah! Bummer.

I have had an eye on the BitC project for a long time; the idea seemed very compelling. When Jonathan Shapiro quit Microsoft and resumed at Coyotos I was really hopeful that the pace of development would kick up a notch or two and we would have a usable BitC shortly.

Now it is clear that is not going to happen, but if we get a better language in the process it's perhaps good (though honestly I can't hide the disappointment that it is going to be a longer wait). For those who are interested in similar things, follow the development of decac

http://code.google.com/p/decac/

which has been on HN a few times, and ATS. Snippets from

http://cs.likai.org/ats

  ATS is a bit similar to and can be used like:
  C, which offers you very fine control over memory allocation as well as the management of other kind of resources.
  ML, except you have to annotate types for at least the function arguments and return types
  Dependent ML, using dependent types to help you reason about data structure integrity, from something as simple as requiring two lists to be zipped to have the same length, to preserving tree-size invariant when rotating AVL trees to the left or to the right in order to guarantee well-balanced property of AVL trees.
  ...and it is mighty fast.

http://en.wikipedia.org/wiki/ATS_(programming_language)
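As a rough illustration of the "two lists to be zipped must have the same length" idea in the snippet above, here is a minimal sketch in Rust rather than ATS. It uses const generics on fixed-size arrays, which is far weaker than ATS's dependent types (the lengths have to be compile-time constants), and the function name `zip_same_len` is made up for this example.

```rust
// Hypothetical helper: both inputs share the length parameter N,
// so a call with arrays of different lengths is rejected at compile time.
fn zip_same_len<A: Copy, B: Copy, const N: usize>(a: [A; N], b: [B; N]) -> [(A, B); N] {
    // Build the output array element by element from the two inputs.
    std::array::from_fn(|i| (a[i], b[i]))
}

fn main() {
    let pairs = zip_same_len([1, 2, 3], ['a', 'b', 'c']);
    println!("{:?}", pairs);
    // zip_same_len([1, 2, 3], ['a', 'b']); // would not compile: lengths differ
}
```

The point is only that the length constraint lives in the function's type, so no runtime check is needed.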

VMG about 13 years ago

BitC is a systems programming language developed by researchers at the Johns Hopkins University and The EROS Group.

http://en.wikipedia.org/wiki/BitC

ComputerGuru about 13 years ago

An interesting email. It's a shame, and it seems like we have a long way to go (still) in order to have a normal-ish programming language that supports formal verification, but Shapiro brings up some really insightful points about language design and why things are the way they are.

Sidenote: This article isn't about BitCoin, it's about a programming language called BitC that's focused on making low-level programming simpler and formally verified. (I personally thought it was about BitCoin until I saw the coyotos.org next to the title.)

gruseom about 13 years ago

The following passage is interesting because it points away from the most widely held belief about concurrency and FP. I'd like to know more about what he means.

"The other argument for a pure subset language has to do with advancing concurrency, but as I really started to dig in to concurrency support in BitC, I came increasingly to the view that this approach to concurrency isn't a good match for the type of concurrent problems that people are actually trying to solve, and that the needs and uses for non-mutable state in practice are a lot more nuanced than the pure programming approach can address. Pure subprograms clearly play an important role, but they aren't enough."

pcwalton about 13 years ago

It's interesting to compare the issues that Shapiro found with the issues that we found during the development of Rust. It turns out that we ran into many of the same issues. Since Go is being mentioned here too, I'll compare the way Go dealt with the issues as well.

(0) Objects and inheritance: Rust had support for objects (via prototype-based inheritance) in the first version of the compiler, but we found that they weren't being used. We attempted to be as minimalist and simple as possible regarding objects, and as a result we ended up with a system that didn't have enough features to be useful. It also didn't fit well with the rest of the language, so we scrapped it and added typeclasses instead. Those worked a lot better, and now most of our standard library is using typeclasses for OO-like patterns. Recently, we've found that we really do want objects, but mostly as a way to achieve code reuse, privacy, and a more direct association between a type and its method than typeclasses alone provide. The current system that is being implemented is a nice way, in my opinion, to unify typeclasses and object-oriented programming. There are just a few concepts to learn, and it all meshes together quite nicely.

Go's interface system is quite similar to Rust's typeclass system. The main things that Rust has that Go doesn't are first-class constructors, the trait system (not yet implemented) to facilitate method reuse, and the ability to define multiple implementations of an interface per type. The things that Go has that Rust doesn't are duck typing (which is quite a good idea, and I'd like to add it to Rust as well) and downcasting (which we don't want to support because we have more type-safe mechanisms for the same thing).

(1) The compilation model: Rust uses dynamic linking pervasively, because OSes have good support for it and it helps keep binaries small. It also has strong support for separate compilation, because we want to make compile times fast. So far, so good, but, just like BitC did, we discovered that type abstraction (which you use generics in Rust to achieve) doesn't mix well with separate compilation. We didn't want to have a uniform value representation like the MLs do (supporting only 31-bit ints and boxing everything else doesn't fly in a systems language), so we tried to use dynamic size calculations for all of the values. It resulted in a huge amount of complexity (we never shook out all the bugs), and it also had a large runtime performance penalty. Unlike C#, we couldn't fall back on a JIT, because Rust is an ahead-of-time-compiled language. So we moved to a "monomorphization" scheme for Rust 0.2, which is basically like C++ template instantiation, only without the overhead of reparsing all the code from scratch. Even with this scheme, you only pay for monomorphization when you use generics, you can still dynamically link all non-generic code (which is most of it), and your runtime performance is unaffected by your use of generics.

Go, of course, doesn't have generics. I don't personally believe that buys them much though; the programmer ends up working around it in a way that either involves boxing (via interface{}) or by-hand monomorphization (by duplicating the code for each type). To me, generics are just a way for the compiler to do work the programmer would end up having to do anyway.

(2) Insufficiency of the type system regarding reference and by-reference types. It's spooky to read this, because it's precisely the problem we ran into with Rust. At the moment we have by-value and by-reference modes for parameters, and we've found that this isn't sufficiently expressive. (We also tried making the only difference between by-value and by-immutable-reference internal to the compiler, which didn't work well due to higher-order functions and interoperability with C code.) We also found that parameters really aren't the only place you want by-reference types; you really want to be able to return references and place them within other data structures. Whenever we wanted to do this, we had to fall back onto heap allocation, and that was significantly hurting our performance, especially when unique types were involved (since aliasing a unique type is impossible, you have to copy it). Profiling Rust programs showed an alarming amount of time spent in malloc and free. So we're in the process of bringing up a new regions system that I'm excited about: it's too early to say for sure, but I think we've stumbled upon a way to make regions not require a doctorate in type theory to understand. Regions allow you to have safe pointers into the stack and into the heap and pass them around as first-class values.

Go doesn't have zero-cost reference types at all; it just does simple escape analysis to allocate structures on the stack when it can and falls back on tracing GC for the rest (note that this is what Java does nowadays too). This is one of the most significant differences between Go and Rust; Go's memory model is essentially identical to that of Java, plus the ability to allocate structures inside other structures, while Rust has a much more C++-like memory model (but safe, unlike C++). This decision is based on our experience with Firefox; fine-grained control over memory use is so important that we didn't want to place our bets on pervasive use of GC.

(3) Inheritance and encapsulation: Rust still has no concept of inheritance; it's our hope that a combination of enums (datatypes like in Haskell or case classes from Scala) and traits will allow us to avoid introducing the complexity of inheritance into the language. Time will tell, of course. As for encapsulation, we thought we didn't need it, but it turns out that we really did want private fields. This we're solving with the class system, mentioned above.

Go achieves inheritance through anonymous fields. Anonymous fields are multiple inheritance in all but name, complete with the "dreaded diamond" problems of C++. We were hoping to avoid that. Go has standard privacy through "unexported fields".

(4) Instance coherence. Since you can define multiple typeclass implementations for a given type, and the caller chooses which implementation to use, you have to make sure in many contexts (for example, the hash method of a hash table) that the same implementation gets used for the same data. That's one of the reasons we introduced classes -- they tie implementations of methods to a type.

Go doesn't have this problem, because it only permits one implementation of an interface for each type, and it has to be defined in the same module as the type. Basically, Go's interfaces are much like our classes in that regard. We wanted to allow people to add extra methods to types -- for example, to add extension methods on vectors (think what Rails does to the Ruby standard library, but in a way that doesn't involve monkey patching) -- so we didn't want to force this restriction on users.

I think that one of the most important things to underscore is that we would have never found these things so early unless we had written the Rust compiler in Rust itself. It forces us to use the language constantly, and we quickly find pain points. I highly encourage all languages to do the same; it's a great way to find and shake out design issues early.
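To make the trade-off in point (1) concrete, here is a minimal sketch contrasting a monomorphized generic function with a boxed one. It is written in present-day Rust syntax, which postdates this comment and differs from the Rust 0.2 being described; the function names are invented for illustration, and the comparison to Go's interface{} is only a loose analogy.

```rust
use std::fmt::Display;

// Generic version: the compiler emits a separate copy of this function for
// each concrete T it is called with (monomorphization), so calls are static.
fn describe_generic<T: Display>(value: T) -> String {
    format!("<{}>", value)
}

// Boxed version: one compiled function that reaches the value through a
// heap-allocated trait object and a vtable, loosely the role interface{}
// plays when working around the lack of generics.
fn describe_boxed(value: Box<dyn Display>) -> String {
    format!("<{}>", value)
}

fn main() {
    // Two instantiations of the generic function: one for i32, one for &str.
    println!("{}", describe_generic(42));
    println!("{}", describe_generic("hi"));
    // One shared function body, with an allocation and dynamic dispatch per call.
    println!("{}", describe_boxed(Box::new(42)));
    println!("{}", describe_boxed(Box::new("hi")));
}
```

Both versions produce the same strings; the difference is where the cost lands. The generic one costs code size at compile time, the boxed one costs an allocation and an indirect call at run time.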

mindslight about 13 years ago

It seems to me that it's pretty impossible to actually replace C (although I fully support the people trying to do so). All the things that "make C fast" are really ways of making the "underlying machine" fast through C, and trying to come up with new language constructs that facilitate these techniques seems like a losing game. Any new languages are going to have a hard time catching up to C when it comes to compiler optimizations and can never really catch up when it comes to ubiquity. (Would your 5-year-old compiler target dsPIC and have support for their IO ports and interrupts? Clearly not.)

So, what about a modern safe language that's meant to be used alongside C? For example, the language presents a modern approach to modules and imports, but a function 'bar' in a module 'foo' is compiled down to a straightforward foo___bar() in the generated C. The generated C has straightforward type declarations and readable code, with a minimum of macro abstractions. The source language emphasizes features that allow one to write concise, reasonably performant code (H-M, local inference, sum types, some kind of object polymorphism). It has implicit GC, but mostly relies on linear references and stack allocation, falling back to real GC only for explicit unknown-life allocations (with swappable collectors for different runtime overhead). Do-everything capabilities like unsafe pointer arithmetic are completely left out, as the programmer can drop back to C in a neighboring file or inline. The per-project ratio of this new language to C would vary depending on the type of the project - something like a network server would use a minimum of C, perhaps just in cordoned-off performance critical code.

(I've been mulling on this idea for a few weeks. I ran across Felix the other day, and found myself nodding a lot, but the language seemed quite complicated and the array details in particular left me wondering if memory safety was even a design goal.)
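The naming scheme above (function 'bar' in module 'foo' becoming foo___bar() in the generated C) could be sketched roughly as follows. This is a toy written in Rust, not part of any real compiler; `c_symbol` and `c_prototype` are hypothetical names, and only zero-argument functions are handled.

```rust
// Hypothetical sketch: flatten a module-qualified function name into the
// flat C symbol described above (module "foo", function "bar" -> "foo___bar").
fn c_symbol(module: &str, function: &str) -> String {
    format!("{}___{}", module, function)
}

// Emit a readable C prototype for a function taking no arguments.
// `ret_type` is passed through verbatim as a C type name.
fn c_prototype(module: &str, function: &str, ret_type: &str) -> String {
    format!("{} {}(void);", ret_type, c_symbol(module, function))
}

fn main() {
    println!("{}", c_prototype("foo", "bar", "int"));
}
```

Because the mapping is a plain string concatenation, hand-written C in a neighboring file can call the generated function by its predictable name.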

eli_gottlieb about 13 years ago

My full and proper response to this was sent to the mailing list. Since we appear to have at least one Rust person here and the mailing list may have blocked the email or something, I'm CC'ing here.

Hey, I saw this linked on Hacker News, so I thought I would give it a read. And boy, this is a strange one. It's like watching a painful portion of my life flash before my eyes, and realizing someone else lived it before I did. I can't say how much I and my work owe to you and BitC. Seriously, thank you for all your effort and for having the courage to do a public autopsy of your own language.

I've seen everything you said about the compilation model happen in my own attempts. My only way of slinking around that issue has been to output to a bitcode able to do link-time resolution of opaque/abstracted type representations. This means that I have to allow for some whole-program tracking of abstracted representations. It leaves a bad taste in one's mouth, but it can be... workable. Its downside is that representation sizes have to be specified for abstracted types exported as binary blobs from libraries, because the library client's dynamic linker (to my knowledge) cannot see inside the existential. I need the same whole-program tracking to perform polyinstantiation (although theoretically the typed IR language would admit code to copy it and then resolve an Opaque Type to an instantiation).

Ideally, I think we'd compile a New Systems Language down to a Typed Assembly Language, and the portion of the compiler dealing with polyinstantiation and type abstraction would retain access to generated TAL code. Work on TALs is ongoing.

On the fortunate side, C++ already had an ABI problem with exporting classes from shared libraries. C++ requires both programmers to use C++, and both to use the same C++ ABI. Yick.

I have ended up having to "add back" regions into my own language just this year. Pointers have ended up being just another type constructor, and they do have to be associated with both a region and a mutability. The good news is that once you do associate them, the region can be a universal variable; a function can be polymorphic in the region it touches by taking a pointer as a parameter.

The good news is: inference, polymorphism and subtyping do get along. They get along just fine, if you're willing to toss Top, Bottom and principal typing out the window in the special case where HM unification would end with a polymorphic generalization. Consistently viewing this problem as impossible (either a full semi-unification problem or a failure of the existence of principal types) has, I've noticed, been holding back certain strands of PL thought for a long time.

Once you have subtyping, you can introduce existential types (possibly with a restriction such as "existentials can only be used in the this-pointer place") and recover a fair approximation to objects, interfaces, BitC capsules, etc. Since we're dealing with existentials, there's also a strong resemblance and possibly even a full morphism to first-class modules, thence to type-classes and instances... And then we all get lost in how deep our thoughts are.

The lexical resolution issues of type-classes (especially with respect to their "looking like" object instances) can be dealt with as by Odersky et al in "Type-Classes as Objects and Implicits". That leaves us with the same instance-coherence issues you've already mentioned (was my ordered set built with this partial ordering or that partial ordering?), but no problem of ambient naming or authority. There might be a good hybrid approach in allowing only one truly global type-class instance per type, accessible via a special way of capturing the instance as an implicit parameter/value, but lexically resolve instances when we don't want the One True Instance. Then we could use object identity for testing instance equality to check, when given data built based on multiple instances, that they agree.

Moving on, I've got a question: why did you tackle the problem of inference and mutability together by trying to treat mutability as part of the data type?

Look at it from the C/C++ point of view rather than the ML point of view, and it sort of stops making sense. An integer value can be passed to a parameter expecting an integer, whether or not the procedure will modify the parameter slot in which it received the integer. In an imperative language, if we want to actually modify the parameter, we pass it by reference ("var i: integer" in Pascal) or by passing a pointer ("int *i" in C). In ML or Haskell, I'd say it makes sense to classify mutable slots/references to mutable memory cells as their own data-type because everything other than referenced memory cells is immutable. Those mutable cells are not real values: the reference to one is a real value, and the contents of one is a real value, but the memory cell itself is not a first-class citizen of the language.

I sidestepped the issue for my own work by treating mutability as the property of the "slot" (the memory cell itself), which is a second-class citizen of the language, and then building pointer types that track the mutability of the slot to which they point. This is (once again) much easier to do once you already know you can handle subtyping ("submuting", haha, a vastly incomplete start of a short paper). For a systems language, I admit to having gone "the C way" with a little more formalism rather than "the ML/Miranda/Haskell way".

But why go the ML/Haskell way in the first place? Just because doing things that way makes you more publishable in PL? You seem to be implying that, but I'm fairly sure the OO-PL community has their own literature on constancy and mutability ("Ownership and Immutability in Generic Java" being a paper I've seen referenced fairly often) that you could have dragged in to "shield yourself" against the ML/Haskell-style FP-PL community (and its horrifying armies of conference reviewers!).

Anyway, my condolences for the death of your language by starvation of funding. It really sucks to see BitC go when you've learned so much about how to do BitC right, but you definitely did manage to make a ton of important contributions. Though, since I should ask, did you ever talk to or collaborate with the Cyclone folks? Habit folks? ATS folk(s)?

--Eli
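The idea of pointer types that carry both a region and a mutability, with functions polymorphic in the region, can be sketched in present-day Rust. This is only an analogy chosen for illustration: it is not Eli's language, the function names are made up, and Rust's lifetimes are assumed here to play the role of the regions described above.

```rust
// `&'r mut i32` is a pointer type carrying both a region ('r) and a
// mutability. The function is polymorphic in 'r: it works for whatever
// region the caller's slot lives in, and hands back a pointer in that region.
fn bump<'r>(slot: &'r mut i32) -> &'r mut i32 {
    *slot += 1;
    slot
}

// A shared (immutable) pointer to the same kind of slot: the callee can
// read through it but cannot modify the slot.
fn peek<'r>(slot: &'r i32) -> i32 {
    *slot
}

// Plain by-value parameter: the callee gets its own copy of the integer,
// so changing the local slot `i` never affects the caller's variable.
fn by_value(mut i: i32) -> i32 {
    i += 1;
    i
}

fn main() {
    let mut x = 1;          // mutability is declared on the slot itself
    *bump(&mut x) += 10;    // bump makes x 2, then the returned pointer adds 10
    let y = by_value(x);    // x is copied; only the copy is incremented
    println!("{} {} {}", x, peek(&x), y);
}
```

Note that the value type stays `i32` throughout; mutability shows up only on the slot declaration and on the pointer types, which is the "C way with a little more formalism" split described above.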

Abomonog about 13 years ago

Link to BitC site for the lazy.

TLDR: "BitC is a new systems programming language. It seeks to combine the flexibility, safety, and richness of Standard ML or Haskell with the low-level expressiveness of C."

stuckk about 13 years ago

Am I the only one who thought this was about Bitcoin?

Because I think Bitcoin won't work in its current form.
