C Is Not a Low-Level Language

472 points by jodooshi about 7 years ago

47 comments

mattnewport about 7 years ago

This article makes some valid points but is overall rather misleading, I think. Almost all of the reasons given why C is "not a low-level language" also apply to x86/x64 assembly. Register renaming, cache hierarchies, out-of-order and speculative execution, etc. are not visible at the assembly / machine code level either, on Intel or on other mainstream CPU architectures like ARM or PowerPC. If C is not a low-level language, then a low-level language does not exist for modern CPUs, and since all other languages ultimately compile down to the same instruction sets, they all suffer from some of the same limitations.

It's really backwards compatibility of instruction sets / architectures that imposes most of these limitations. Processors that get around them to some degree, like GPUs, do so by abandoning some amount of backwards compatibility and/or general-purpose functionality, and that is in part why they haven't displaced general-purpose CPUs for general-purpose use.

munificent about 7 years ago

I really really liked this article, and reading the comments here is blowing my mind. Did we read the same thing?

I think it's a strong insight that chip designers and compiler vendors have spent person-millennia maintaining the illusion that we are targeting a PDP-11-like platform even while the platform has grown less and less like that. And, it turns out, with things like Spectre and the performance cost of cache misses, that abstraction layer is quite leaky in potentially disastrous ways.

But, at the same time, they have done such a good job of maintaining that illusion that we forget it isn't actually reality.

I like the title of the article because many programmers today do still think C is a close mapping to how chips work. If you happen to be one of the enlightened minority who know that hasn't been true for a while, that's great, but I don't think it's good to criticize the title based on that.
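
A minimal C sketch of the cache-miss leak, not taken from the article and assuming a square matrix stored in a flat array (the function names and the size parameter `n` are made up for illustration). Both functions compute the same sum and look equally cheap in C's abstract machine, but the column-wise walk strides through memory and, on typical cached hardware, tends to miss far more often once `n` is large.

```c
#include <stddef.h>

/* Row-wise walk: consecutive iterations touch adjacent addresses. */
long sum_row_major(const int *m, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += m[i * n + j];
    return total;
}

/* Column-wise walk: each access jumps n * sizeof(int) bytes ahead. */
long sum_col_major(const int *m, size_t n) {
    long total = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            total += m[i * n + j];
    return total;
}
```

Nothing in the source text distinguishes the two; any difference in running time comes from hardware the language does not describe.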

United857 about 7 years ago

It's worth noting that chips that were designed for high-performance computing (e.g. the Cell) from the outset generally don't have silicon devoted to things like out-of-order execution, register renaming, etc. In this case, the bulk of the optimization logic does shift to the programmer (aided by the compiler).

The reason is that in these domains (e.g. game consoles, supercomputing), you know ahead of time the precise hardware characteristics of your target, you can assume it won't change, and can thus optimize specifically for that ahead of time.

This isn't true for "mass-market" software that needs to run across multiple devices, with many variants of a given architecture.

umanwizard about 7 years ago

The points made in the article are certainly valid, but C is low-level in an abstract sense: it is approximately the intersection of all mainstream languages.

I.e. if a feature exists in C, it probably exists in every language most programmers are familiar with. (I worded this statement carefully to exclude exotic languages like Haskell or Erlang.)

Thus C, while not low-level relative to actual hardware, is low-level relative to programmers' mental model of programming. If this is what we mean, it's still true and useful to think of C as a low-level language.

That said, it's important to keep the distinction in mind -- statements like "C maps to machine operations in a straightforward way" have been categorically wrong for decades.

Rebelgecko about 7 years ago

Going by their definition, I don't think there are any low level languages, at least on modern architectures. Even x86 assembly abstracts out a lot of what is going on within the CPU.

ChuckMcM about 7 years ago

I enjoyed reading this, mostly because it made me angry, then curious, then thoughtful all in one go.

Partly that's because I really like the PDP-11 architecture, and its 'separated at birth' twin the 68K; it greatly influenced me in how I think about computation. I also believe that one of the reasons that the ATMega series of 8-bit micros were so popular was that they were more amenable to a C code generator than either the 8051 or PIC architectures were.

That said, computer languages are similar to spoken languages in that a concept you want to convey can be made more easily or less easily understood by the target by the nature of the vocabulary and structure available to you.

Many useful systems abstractions (queues, processes, memory maps, and schedulers) are pretty easy to express in C; complex string manipulation, not so much.

What has endeared C to its early users was that it was a 'low constraint' language: much like Perl, it historically has had a fairly loose policy about rules in order to allow for a wider variety of expression. I don't know if that makes it 'low' but it certainly helped it be versatile.

dahart about 7 years ago

> A processor designed purely for speed, not for a compromise between speed and C support, would likely support large numbers of threads, have wide vector units, and have a much simpler memory model.

Sounds like a GPU?

> Running C code on such a system would be problematic, so, given the large amount of legacy C code in the world, it would not likely be a commercial success.

It seems like ATI & NVIDIA are doing okay, even with C & C++ kernels. GLSL and HLSL are both C-like. What is problematic?

ovao about 7 years ago

To me the argument's akin to suggesting that Robert Wadlow wasn't tall, because giraffes are taller than Robert Wadlow.

When the spectrum of the context is unambiguous, that's not an argument for finding a way to make it ambiguous.

cryptonector about 7 years ago

> The root cause of the Spectre and Meltdown vulnerabilities was that processor architects were trying to build not just fast processors, but fast processors that expose the same abstract machine as a PDP-11. [...]

This strikes me as a flavor of the VLIW+compilers-could-statically-do-more-of-the-work argument, though TFA does not mention VLIW architectures.

C or not, making compilers do more of the work is not trivial, it is not even simple, not even hard -- it's insanely difficult, at least for VLIW architectures, and it's insanely difficult whether we're using C or, say, Haskell. The only concession to make is that a Haskell compiler would have a lot more freedom than a C compiler, and a much more integrated view of the code to generate, but still, it'd be insanely hard to do all of the scheduling in the compiler. Moreover, the moment you share a CPU and its caches is the moment that static scheduling no longer works, and there is a lot of economic pressure to share resources.

There are reasons that this make-the-compilers-insanely-smart approach has failed.

It might be more likely to be successful now than 15 years ago, and it might be more successful if applied to Rust or Haskell or some such than to C, but, honestly? I just don't believe this will work anytime soon, and it's all academic anyway as long as the CPU architects keep churning out CPUs with hidden caches and speculative execution.

If you want this to be feasible, the first step is to make a CPU where you can turn off speculative execution and where there is no sharing between hardware threads. This could be an extension of existing CPUs.

A much more interesting approach might be to build asynchrony right into the CPUs and their ISAs. Suppose LOADs and STOREs were asynchronous, with an AWAIT-type instruction by which to implement micro event loops... then compilers could effectively do CPS conversion and automatically make your code locally async. This is feasible because CPS conversion is well-understood, but this is a far cry from the VLIW approach. Indeed, this is a lot simpler than the VLIW approach.

TFA mentions CMT and UltraSPARC, and that's certainly a design direction, but note that it's one that makes C less of a problem anyway -- so maybe C isn't the problem...

Still, IMO TFA is right that C is a large part of the problem. Evented programs and libraries written in languages that insist on immutable data structures would help a great deal. Sharing even less across HW/SW threads (not even immutable data) would still be needed in order to eliminate the need for cache coherency, but just having immutable data would help reduce cache snooping overhead in actual programs. But the CPUs will continue to be von Neumann designs at heart.
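
For anyone who hasn't seen CPS conversion, here is a rough hand-written sketch in C of what it does to a load followed by an add. Everything here is hypothetical: `load_async` is only a stand-in for the imagined asynchronous LOAD (it completes immediately), and no real ISA or compiler is being described.

```c
/* A continuation: "the rest of the computation", given the loaded value. */
typedef void (*cont_fn)(int value, void *ctx);

/* Direct style: the caller waits for the load, then adds. */
int load_then_add(const int *addr, int x) {
    int v = *addr;
    return v + x;
}

/* Hypothetical stand-in for an asynchronous LOAD. Here it completes at
   once and invokes the continuation with the loaded value. */
void load_async(const int *addr, cont_fn k, void *ctx) {
    k(*addr, ctx);
}

/* State that has to survive across the "await" point. */
struct add_ctx {
    int x;
    int result;
};

/* The code that followed the load in direct style, now a continuation. */
void add_after_load(int value, void *ctx) {
    struct add_ctx *c = ctx;
    c->result = value + c->x;
}

/* CPS version of load_then_add. */
int load_then_add_cps(const int *addr, int x) {
    struct add_ctx c = { x, 0 };
    load_async(addr, add_after_load, &c);
    /* Reading the result here is valid only because the stand-in above
       runs the continuation synchronously. */
    return c.result;
}
```

With `int v = 40;`, both `load_then_add(&v, 2)` and `load_then_add_cps(&v, 2)` give 42. The point of the transformation is that everything after the load becomes a separate unit that could, in principle, be run whenever the load completes.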

kev009 about 7 years ago

The meta point from the article is that this is as much a hardware problem as it is a language or developer one. An arms race was waged to create CPUs that are very effective in running sequential programs, to the point that what they present to the program is very much a facade and they hide an increasingly great deal of internal implementation detail. By David's postulation, even the native assembly language for the CPU is not low level.

To drive this juxtaposition home, I'd point to PALcode on Alpha processors, in which C (and others) can very much be a low-level language. Very few commercial processors let you code at the microcode level.

The overarching premise is then brought home by GPU programming, which shows that you don't necessarily need to be writing at the ucode level if the ecosystem is built around how the modern hardware functions.

scott_s about 7 years ago

The author, David Chisnall, is a co-author on a related paper from PLDI 2016: "Into the Depths of C: Elaborating the De Facto Standards", https://news.ycombinator.com/item?id=11805377

compiler-guy about 7 years ago

There is an entire junkyard full of processors designed to run other languages well.

LISP machines in the 60s, Java machines in the 90s, many others.

For whatever reason, successful general purpose silicon has almost always followed a C-ish model.

It's also worth noting that Fortran runs quite well on C-ish style processors.

davidw about 7 years ago

"C combines the power and performance of assembly language with the flexibility and ease-of-use of assembly language."

salgernon about 7 years ago

It feels like the author really isn't talking so much about the limitations of C on modern architectures, but the architecture itself.

Possibly relevant is this (short?) discussion [1] from 2011 about a CPU more closely designed for functional programming.

[1] https://news.ycombinator.com/item?id=2645423

angry_octet about 7 years ago

It is instructive to consider GPUs and their compilers. The death of OpenGL in favour of Vulkan has come about because OpenGL is unable to express low-level constructs which are essential to achieving performance. GPU drivers are actually compilers that recompile shaders to efficient machine expressions.

Thus the fundamental limitation is that the processor has only a C ABI. If there were a vectorisation- and parallel-friendly ABI, then it would be possible to write high-level language compilers for that. It should be possible for such an ABI to coexist with the traditional ASM/C ABI, with a mode switch for different processes.

arghwhat about 7 years ago

It is correct that C is not really a low-level language, but the points about how C limits the processor don't make much sense.

It uses UltraSPARC T1 and above processors as an example for a "better" processor "not made for C", but this argument makes no sense at all. The "unique" approach in the UltraSPARC T1 was to aim for many simple cores rather than few large cores.

This is simply about prioritizing silicon. Huge cores, many cores, small/cheap/simple/efficient die. Pick two. I'm sure Sun would have loved to cram huge caches in there, as it would benefit everything, but budgets, deadlines and target prices must be met.

Furthermore, the UltraSPARC T1 was designed to support existing C and Java applications (this was Sun, remember?), despite the claim that this was a processor "not designed for traditional C".

There are very few hardware features that one can add to a conventional CPU (which even includes things like the Mill architecture) that would not benefit C as well, and I cannot possibly imagine a feature that would benefit other languages that would be harmful to C. The example of loop count inference for use of ARM SVE being hard in C is particularly bad. It is certainly no harder in the common use of a for loop than it is to deduce the length of an array on which a map function is applied.

I cannot imagine a single compromise done on a CPU as a result of conventional programming/C. That is, short of replacing the CPU with an entirely different device type, such as a GPU or FPGA.
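
One way to picture the loop-count comparison, as a small C sketch with made-up names (`scale_loop`, `map_floats`, `twice`): in both functions the trip count `n` is fixed before the first iteration, so the information a vectorizer would need about the loop length is equally available. Whether a given compiler actually vectorizes either one also depends on things this sketch ignores, such as aliasing between `dst` and `src` and whether the function pointer can be inlined.

```c
#include <stddef.h>

/* The common for loop: n is known on entry and never changes. */
void scale_loop(float *dst, const float *src, size_t n, float k) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* A map-style helper: applies f to each of the n elements of src. */
void map_floats(float *dst, const float *src, size_t n, float (*f)(float)) {
    for (size_t i = 0; i < n; i++)
        dst[i] = f(src[i]);
}

/* Example function to map over the array. */
float twice(float x) {
    return x * 2.0f;
}
```

Calling `scale_loop(a, src, 4, 2.0f)` and `map_floats(b, src, 4, twice)` on the same input fills `a` and `b` with the same values.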

agumonkey about 7 years ago

The thing is, most of the time you're reasoning at some logical level that will not be the "reality". The problem is that C programmers think that C === reality === performance. C has better (lower) constant factors, but it is by no means better all the time.

DannyB2 about 7 years ago

The sophistication of the compiler does not mean the language is high level.

The meaning of a high-level language is to do with abstraction away from the hardware. C programmers often wince at languages that are highly abstracted away from the hardware. But those are what are "high level" languages. Especially languages that remove more and more of the mechanical bookkeeping of computation. Such as garbage collection (aka automatic memory management). Strong typing or automatic typing. Dynamic arrays and other collection structures. Unlimited length integers and possibly even big-decimal numbers of unlimited precision in principle. Symbols. Pattern matching. Lambda functions. Closures. Immutable data. Object programming. Functional programming. And more.

By comparison C looks pretty low level.

Now I'm not knocking C. If there were a perfect language, everyone would already be using it. Consider the Functional vs Object debate. (Or vi vs emacs, tabs vs spaces, etc.) But all these languages have a place, or they would not have a widespread following. They all must be doing something right for some type of problem.

C is a low-level language. And there is NOTHING wrong with that! It can be something to be proud of!

thinkling about 7 years ago

TLDR: C was close-to-the-metal on the PDP-11 but since then hardware has become more complex while exposing the same abstraction to the C programmer. That means that hardware features such as speculative execution and L1/L2 caching are invisible to the programmer. This was the cause of Spectre and Meltdown and it forces a lot of complexity into the compiler. GPUs achieve high performance in part because their programming model goes beyond C. Processors would be able to evolve if they weren't hamstrung by having to support C.

sytelus about 7 years ago

Interesting tidbits from article:

A modern Intel processor has up to 180 instructions in flight at a time (in stark contrast to a sequential C abstract machine, which expects each operation to complete before the next one begins). A typical heuristic for C code is that there is a branch, on average, every seven instructions. If you wish to keep such a pipeline full from a single thread, then you must guess the targets of the next 25 branches.

The Clang compiler, including the relevant parts of LLVM, is around 2 million lines of code. Even just counting the analysis and transform passes required to make C run quickly adds up to almost 200,000 lines (excluding comments and blank lines).
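
The 25-branch figure follows from the other two numbers in that quote. A tiny C sketch of the arithmetic, using a hypothetical helper name:

```c
/* Rough number of branches whose targets must be predicted to keep an
   instruction window full, given one branch every insns_per_branch
   instructions on average. Integer division rounds down. */
unsigned branches_in_flight(unsigned window, unsigned insns_per_branch) {
    return window / insns_per_branch;
}
```

`branches_in_flight(180, 7)` evaluates to 25, since 180 / 7 is about 25.7.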

anfilt about 7 years ago

I hate the idea of "low-level". There is not really such a thing. You should be using a language suitable for the domain you're working in.

Sadly, too many programming languages try to be the end-all be-all. C is a language that is great for working in the systems domain.

Ideally, we would have small minimalist languages for various problem domains. In reality, maintaining and building high-quality compilers is a lot of work. Moreover, a lot of development will just pile together whatever works.

That aside, you could build a computer transistor by transistor, but it's probably more helpful to think at the logic gate level or even larger units. Heck, even a transistor is just a piece of silicon/germanium that behaves in a certain way.

So there are levels of abstraction, but is an abstraction low-level? I think the term probably came about to refer to the lower layers of abstraction that build whatever system you're using. So unless you're using something that nothing can be built upon, everything, even what people would call high level, can be low-level.

Heck, people call JS a high-level language, but there are compilers that compile to JS. This makes JS a lower-level system that something else is built upon. This just again shows why I would say that low-level is often thrown around with a connotation that is not exactly true.

judge2020 about 7 years ago

Archive.is link, as the page loaded incredibly slowly for me: http://archive.is/E9s70

plpot about 7 years ago

I find this article insightful, but missing the points it tries to deliver.

What the article is very good at delivering is that current CPU ISAs export a model that doesn't exist in reality. Yes, we might call it PDP-11, although I miss that architecture dearly.

C was never meant to be a low-level language. It was a way to map loosely to assembler and provide some higher-level abstractions (functions, structures, unions) to write code that was more readable, and structured, than assembler. And yes, it is far from perfect. And yes, today it is called a low-level language with good reasons.

But this article is all about exposing the insanity that modern CPUs have become, insanity that is the sacrifice on the altar of backward compatibility -- all CPU architectures that tried the path of not being compatible with older CPUs have died.

I am pretty sure that once we have an assembler that maps closely to the microcode, or to the actual architecture of the internals of a modern, parallel, NUMA architecture, we will still need to have a C-like language that will introduce higher-level features to help us ease writing of non-architecture-dependent parts. And it will most probably be C.

rhacker about 7 years ago

The article itself has 4 definitions or "attributes" for low-level languages that can be considered contradictory:

* "A programming language is low level when its programs require attention to the irrelevant."
* Low-level languages are "close to the metal," whereas high-level languages are closer to how humans think.
* One of the common attributes ascribed to low-level languages is that they're fast.
* One of the key attributes of a low-level language is that programmers can easily understand how the language's abstract machine maps to the underlying physical machine.

So basically the entire article's premise (the title) hinges on the last bullet, which can be contested. All the other mentioned attributes can be applied to Java, C, C#, C++. So failing the last bullet point doesn't apply to just C.

Const-me about 7 years ago

One reason for that is for many applications latency is much more critical than bandwidth. For PCs that’s input-to-screen latency, for servers that’s request-to-response. It’s possible to make multicore processors with simpler cores, design OS and language ecosystem around it, etc. Such tradeoffs will surely improve bandwidth, but will harm latency.

Another reason is most IO devices are inherently serial. Ethernet only has 4 pairs, and wifi adapters are usually connected by a single USB or PCIe lane. If a system has limited single-threaded (i.e. serial, PDP-11-like) performance, it's going to be hard to produce or consume these gbits/sec of data.

zwieback about 7 years ago

Great article if you're willing to read past headlines. I would have liked to see a mention of small processors that are still hugely popular (microcontrollers, etc.) where C is still a good fit.

wglb about 7 years ago

The article does not properly distinguish between C as a language and what the C compiler does with the C program. The logic of the article references what the compiler does.

The reasonable way to measure languages is to look at the abstractions present in the language. C has fewer abstractions than the other languages that we are familiar with. That is the reasonable definition of the level of a language.

z3t4 about 7 years ago

I wonder if it's easier for a compiler/CPU to optimize "async" code? And I often find myself having an array in JavaScript and calling the same function on each item in the array; it would be nice if such cases could be made parallel, which I think is possible to do in C++. Is that ever gonna happen in JavaScript!?

Shikadi about 7 years ago

Language evolves. C is certainly lower level than C# or JavaScript, so even if it no longer fits the definition created decades ago, I don't see a problem with the term evolving to match modern times. People say assembly language when they mean assembly language (which others have argued isn't low level any more anyway), so using low level to describe a language closer to the hardware seems valid to me. It's interesting that the author argues C could be considered low level on the PDP-11, because by the old definition used back then it definitely wouldn't be. That tells me the author's definition of low level is already an evolution of the original definition, so there's no reason the term can't evolve some more.

Wiki definition:

"A low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture—commands or functions in the language map closely to processor instructions. Generally this refers to either machine code or assembly language."

hokus about 7 years ago

http://web.archive.org/web/20180502001551/https://queue.acm.org/detail.cfm?id=3212479

qsdf38100 about 7 years ago

"processors wishing to keep their execution units busy running C code" What? This is nonsense, the processor is not running C code! The processor can only run machine code, regardless of the language used to write the source code.

smadge about 7 years ago

Makes me wonder if x86 could be extended to expose the underlying parallelism. How much faster would my Prolog and Haskell programs run if all branches were executed simultaneously and only the successful path down my search tree returned?

sriku about 7 years ago

I couldn't read the article, but based on the comments, would it change the way we use C whether we declared it a "low level language" or not?

justicezyx about 7 years ago

A statement that uses an adjective is almost always about defining the context.

burke about 7 years ago

C is Not a True Scotsman

waynecochran about 7 years ago

Ok, w/o dipping into machine code, show me a low level language. Any snippet of C-code is transparent in that you know roughly how it is going to be translated into machine code.

sigjuice about 7 years ago

Where does it say in the ISO C standard that C must be translated to assembly code or machine code of any sort?

EDIT: Various C interpreters exist

11thEarlOfMar about 7 years ago

> 403 Error - Access Forbidden
> We are sorry ... but we have temporarily restricted your access to the Digital Library. Your activity appears to be coming from some type of automated process. To ensure the availability of the Digital Library we can not allow these types of requests to continue. The restriction will be removed automatically once this activity stops.
> We apologize for this inconvenience.
> Please contact us with any questions or concerns regarding this matter: portal-feedback@hq.acm.org
> The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.

lowken10 about 7 years ago

If the general public & tech community refers to C as a low level language then it is a low level language.

arseraptor about 7 years ago

Ah yes, David Chisnall. Another Cambridge wannabe without a hope of tenure track who thinks he is cleverer than he really is and makes a bunch of trite points over and over hoping to get some attention -- not realising they've been made for over 20 years. Have an original thought David, and stop feeling smug. You're not.

julienfr112 about 7 years ago

Ok. But what are the alternatives that are not decades away?

_pmf_ about 7 years ago

It's low level, but the level is not identical to the machine level.

retrogradeorbit about 7 years ago

It's all relative. Lower level than what? Higher level than what? C is lower level than a huge number of other languages so I would feel comfortable calling it 'low level'.

emilfihlman about 7 years ago

This article is completely clickbait.

C is low level. For example, with AVRs everything you do maps very clearly to what happens as opcodes.

It's like the author wants to blame C for whatever reason and conveniently forgets that C is also portable.

laythea about 7 years ago

The title is not a well-formed statement. It all depends on what you are used to. I.e., if I write Java, C is low level. If I write assembler, C is high level.

skylyrac about 7 years ago

This doesn't make any sense. This would mean that my C code compiled for a Cortex-M0 is low level, but for my x86 laptop is not. Or even more stupid, that the same assembly code running on an old 386 is low level, but on an i7 isn't.

Low level is about how close to talking to the CPU you are, not about how close to the silicon you are. The CPU is a black box and the programmer communicates with it. What that box does inside doesn't matter.

mar77i about 7 years ago

As far as I understand C, most of what are here called C's "quirks" have actually been enablers for much of the portability and performance of modern platforms. Therefore I don't like "undefined behavior" and the like being criticised as such a "hindrance". I hence doubt the author's familiarity with C goes much beyond the basics, which kind of makes the case for why the author also had to namedrop Spectre and Meltdown, which were caused by the fact that later optimizations were unsound, i.e. the Tomasulo algorithm.

The problems with the article somewhat remind me of the problems with LCTHW, whose author admitted to being unable to figure out what the deal was: https://zedshaw.com/2015/01/04/admitting-defeat-on-kr-in-lcthw/

Sorry to repost this article again. I just perceive two variants of the same "smells" in both.
