A lot of the "weird machines" for C are microcontrollers and the like; the 8051 is mentioned, but another big "C-hostile" MCU family is the Microchip PIC, which still has its own C compiler. DSPs are another category where unusual word sizes are common (24 bits for char/short/int is often encountered).<p><i>It’s amazing that, by carefully writing portable ANSI C code and sticking to standard library functions, you can create a program that will compile and work without modification on almost any of these weird systems.</i><p>The big question, which I ask whenever someone harps on about portability, is <i>does it even make sense</i>? Is your average PC/smartphone application realistically ever going to be ported to a 6502 or a Unisys mainframe? Keep in mind that I/O in particular is going to differ significantly, and something like standard input might not even exist (think of a fixed-function device like an oven controller with only a few buttons and LEDs for I/O). I don't think it's particularly amazing, because the "core" of C essentially represents all the functionality of a stored-program digital computer; so if you completely ignore things like I/O, it's not hard to see how the same piece of code can express the same concepts on <i>any</i> such computer.<p>It should also be noted that these "weird" environments are often not "strictly conforming" either, because it's either impossible or doesn't really make sense to do so. Besides the "omissions" they will also have "extensions" that help the programmer to more fully utilise the platform's capabilities.
Compiling C++ to asm.js with Emscripten is another weird architecture you might actually use these days.<p>Unaligned accesses don't trap with a SIGBUS or anything; they just round the pointer value down to an aligned address and read whatever is there.<p>Reading from and writing to NULL will generally succeed (just as on SGI).<p>Function pointers are just indices into tables of functions, one table per "function type" (number of arguments, whether the arguments are int or float, whether it returns a value or not). Thus, two different function pointers may have the same bit pattern.
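A minimal sketch of the unaligned-access difference (illustrative only; the misaligned read is undefined behaviour in ISO C, which is exactly why the results diverge between targets):<p><pre><code>#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t words[2] = { 0x11111111u, 0x22222222u };
    /* Deliberately misaligned pointer, one byte into the array. */
    uint32_t *p = (uint32_t *)((char *)words + 1);
    /* On x86 this load just works (you get a byte-shifted value); under
       asm.js the address is effectively rounded down to the previous
       4-byte boundary, so you get words[0] back instead. */
    printf("unaligned load: 0x%08x\n", (unsigned)*p);
    return 0;
}
</code></pre>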
One thing that they didn't list there that probably deserves a mention is SHARC with its 32-bit word architecture, which they decided to manifest directly in C type sizes:<p><pre><code> CHAR_BIT == 32
sizeof(char) == 1
sizeof(int) == 1
sizeof(short) == 1
sizeof(float) == 1
sizeof(double) == 2
</code></pre>
I suppose the alternative would be to use an addressing scheme that encodes a bit offset in the pointer, like some of the other machines in this story. But that's also much more expensive, and this is a DSP architecture, so they went with something more straightforward. Curiously, this set-up is still fully accommodated by the ISO C standard.
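A tiny example of the kind of assumption this breaks (a sketch, nothing SHARC-specific beyond the numbers above): code that hard-codes 8-bit bytes instead of using CHAR_BIT.<p><pre><code>#include <limits.h>
#include <stdio.h>

/* sizeof() counts chars, and a char holds CHAR_BIT bits. Writing
   "sizeof(x) * 8" silently miscounts on a SHARC-style target where
   CHAR_BIT is 32 and sizeof(int) == 1. */
#define BITS_OF(x) (sizeof(x) * CHAR_BIT)

int main(void) {
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("int is %zu chars, %zu bits\n", sizeof(int), BITS_OF(int));
    return 0;
}
</code></pre>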
Seeing this article reminds me that there's a good litmus test in C (at least in older versions) for determining whether a feature is undefined behavior or merely unspecified or implementation-defined: behavior is undefined when some processor exists that will throw an exception if you do it; if no such processor exists, the behavior is implementation-defined. So signed overflow is undefined because some processors trap on signed overflow, and unsigned overflow is not, because no processor traps on it. The idea that undefined signed overflow is useful for optimizations only came decades later.<p>That said, I'm at a loss to explain why i = i++; is undefined and not merely unspecified.
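A small illustration of the split (assuming a typical 32-bit int; this is a sketch, not a spec quote):<p><pre><code>#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned u = UINT_MAX;
    int s = INT_MAX;

    /* Unsigned overflow is defined: arithmetic wraps modulo 2^N. */
    printf("UINT_MAX + 1 = %u\n", u + 1u);   /* prints 0 */

    /* Signed overflow is undefined: historically some CPUs would trap on
       it, and modern compilers assume it never happens when optimizing.
       The line below is deliberately left commented out. */
    /* printf("%d\n", s + 1); */
    (void)s;
    return 0;
}
</code></pre>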
I remember that back in the early 90s, the DEC Alpha was pretty "weird", as it was one of the first common LP64 unix machines. I fixed so many issues due to sizeof(int) != sizeof(char *) when building open-source packages for DEC OSF/1 in the early 90s.<p>Later, the Alpha was FreeBSD's first 64-bit platform, and when working on the port to alpha, we hit a lot of the same issues in the FreeBSD kernel. As alpha was also FreeBSD's first RISC machine with strict alignment constraints, we also hit a lot of unaligned access issues in the kernel.<p>Ah, those were the days. I now find myself grumbling about having to check to ensure my code is portable to 32-bit platforms.
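The classic failure mode looked something like this (a sketch; the exact warnings you get depend on the compiler):<p><pre><code>#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = malloc(16);

    /* Pre-LP64 habit: stashing a pointer in an int. On ILP32 machines both
       are 32 bits and this "works"; on the Alpha (LP64) the pointer is 64
       bits and the top half of the address is silently lost. */
    int stashed = (int)(long)p;
    char *back = (char *)(long)stashed;   /* may no longer equal p */

    printf("original %p, round-tripped %p\n", (void *)p, (void *)back);
    free(p);
    return 0;
}
</code></pre>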
One machine I came across early in my career was the BTI-8000, designed in the mid to late 70s. Only a few dozen were sold. It was a multiprocessor mainframe, and the user memory space was 512KB. So what does an architecture do with all those extra bits after using up the first 19 as a linear space mapping all 512KB? Why, encode other address modes, of course. On that machine, it was possible to have a pointer to a particular bitfield in memory, something like [16:0] = 32b word of memory, [21:17] = bit offset, [26:22] = bit width, [31:27] = addressing mode. Other modes provided the ability to point to a register, or a bitfield within a register. There were many other encodings, such as register plus immediate offset, base+offset, etc.<p>If the instruction was something like "ADD @R1, @R2, 5", it would fetch the word, register, or bitfield pointed at by R2, add immediate 5, then save it to the word, register, or bitfield pointed to by R1.<p>The machine didn't have shift/rotate instructions, but it could be effected by saving to a bitfield then reading from a bitfield offset by n bits.<p>They had a working (but not polished) C compiler but that project got shut down when they realized the system was not going to take off.<p><a href="http://btihistory.org/bti8000.html#isa" rel="nofollow">http://btihistory.org/bti8000.html#isa</a>
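A little sketch of decoding that pointer layout, with field positions taken straight from the description above (purely illustrative, not from any BTI documentation):<p><pre><code>#include <stdint.h>

typedef struct {
    uint32_t word_addr;   /* bits [16:0]  - 32-bit word of memory    */
    uint32_t bit_offset;  /* bits [21:17] - bit offset into the word */
    uint32_t bit_width;   /* bits [26:22] - width of the bitfield    */
    uint32_t mode;        /* bits [31:27] - addressing mode          */
} bti_ptr;

bti_ptr decode(uint32_t raw) {
    bti_ptr p;
    p.word_addr  =  raw        & 0x1FFFFu;
    p.bit_offset = (raw >> 17) & 0x1Fu;
    p.bit_width  = (raw >> 22) & 0x1Fu;
    p.mode       = (raw >> 27) & 0x1Fu;
    return p;
}
</code></pre>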
The article is wrong in a few ways. One is about the R3000, where it says that integer overflow traps. There are actually two separate addition instructions; they operate the same way except for one difference: one will trap on signed overflow and one will not. They both produce the same result. No C compiler I know of uses the trapping version of the instruction.
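For what it's worth, the difference in a nutshell (the register choices in the comment are illustrative, not taken from any particular compiler's output):<p><pre><code>int sum(int a, int b) {
    /* Compilers emit the non-trapping form:
           addu $v0, $a0, $a1
       The trapping form would be:
           add  $v0, $a0, $a1   # raises an overflow exception on signed overflow
       Both produce the same bits whenever no overflow occurs. */
    return a + b;
}
</code></pre>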
"Accessing any address [on 6502] higher than the zero page (0x0 to 0xFF) causes a performance penalty."<p>I never thought of it that way, but that's true. However, he didn't mention the biggest issue with C on the 6502, i.e., the extremely constrained 256-byte hardware stack. To do anything practical requires some sort of software-maintained stack to have stack frames of any decent size or quantity (in parallel, or replacing the use of the hardware stack completely). "Downright hostile to C compilers," indeed.
"[3B2] Fairly normal architecture, except it is big endianian, unlike most computers nowadays. The char datatype is unsigned by default."<p>That sounds like any SGI, or Sun, and although they're mostly gone there's still the Power series from IBM (runs AIX), and the only reason to use the expression ".. unlike most computers nowadays" is by counting the sheer number of Intel, AMD and ARM chips in use. Of course those numbers are overwhelming - ARM alone sells billions - but it's not like big endian is some obscure old concept in a dusty corner. (The irony is that ARM can be used in both BE and LE modes, by setup).
Anyway, at work I have to write all the code so that it runs on BE as well as LE architectures. BE is alive and well.
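One habit that keeps code working on both (a sketch of the usual pattern, nothing project-specific): assemble multi-byte values explicitly instead of type-punning a byte buffer through an integer pointer.<p><pre><code>#include <stdint.h>

/* Read a 32-bit big-endian field from a buffer; identical result on BE
   and LE hosts, no htonl/ntohl or byte-swap intrinsics needed. */
uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}
</code></pre>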
Sorry to be that guy, but "weird machine" is a defined computer science term. I got the wrong impression from the title, as, I'm guessing, did others. <a href="https://en.wikipedia.org/wiki/Weird_machine" rel="nofollow">https://en.wikipedia.org/wiki/Weird_machine</a>
Weird? The 68000 is hardly weird -- the original Sun machines were 68Ks. The MIPS CPUs were designed to run C from the get-go.<p>And the reference in the manual to the Unisys machine not having byte pointers: the PDP-6 (the original 36-bit machine, as far as I know) had byte pointers that let you specify bytes anywhere from 1 to 63 bits wide. It was common to have six-bit characters (you could pack six to a word) as well as 7-bit ASCII characters (you could pack five to a word).
I do bare-metal C on modern ARM chips and nowadays it is near identical to anything you'd do on a desktop. We don't even have to write specially portable code to compile most of the same libraries on real computers for proper unit testing.<p>It is hilarious to think about the possibility of having to make your code portable to a 9-bit big-endian system too.
Is it weird to support some of these today? We just dropped acorn26, although it was gone for a few years.<p>I think MIPS R3K has a better-behaved add ("addu"), or is that only on some? If your compiler outputs these you don't have to worry about the special behavior.<p>I'd say a bigger concern for VAX is the lack of IEEE 754 - it's noticeable when people pick unsuitable float constants (see the sketch below), or it traps by default. Or the many GCC bugs now.<p>For MIPS R3K, the complete lack of atomics. And the load delays.
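On the VAX float point, the usual culprits look something like this (illustrative constants, not taken from any particular codebase):<p><pre><code>#include <math.h>

/* Constants that quietly assume IEEE 754. VAX floating point has no
   infinities or NaNs and a different exponent range, so code like this
   either fails to build, traps, or misbehaves there. */
double no_value = NAN;        /* no VAX representation                */
double sentinel = INFINITY;   /* likewise                             */
double tiny     = 1e-310;     /* IEEE subnormal; out of range on VAX  */
</code></pre>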
The weird machine my Computer Architecture class had us learn was the PDP-8, which is almost comically constrained. 12-bit, with exactly one register, no hardware stack, no arithmetic beyond addition and negation.
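With only add and negate, everything else gets synthesized; subtraction, for instance, is complement-and-increment followed by an add, which is what PDP-8 code does with CMA/IAC and TAD. A toy C rendering of the idea (ignoring the overflow corner case at INT_MIN):<p><pre><code>int subtract(int a, int b) {
    /* a - b computed as a + (-b), where -b is built from the operations
       the machine actually has: complement, then add one. */
    return a + (~b + 1);
}
</code></pre>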
TOPS-20 on the DECSYSTEM-20 had 7-bit characters, but this was a 36-bit machine... so sizeof(int) needs to return a fraction or something...<p>(There was a C compiler on it, but it cheated and used 9-bit chars).