I don't think there is anything wrong with writing platform-specific code; in certain circles there is this weird fetishization of portability, placing it on the highest pedestal as a metric of quality. This happens in C programming and also, for example, in shell scripting, where people advocate relying only on POSIX-defined behavior. If a platform-specific way of doing something works better for some use case, then there should be no shame in using it. What is important is that the code relies on well-defined behavior, and that the platform assumptions/requirements are documented to some degree.

Of course it is wonderful that you can make programs that are indeed portable across this huge range of computers; it's just that not every program needs to.
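On that last point, a minimal sketch of what documenting (and enforcing) platform assumptions can look like; the specific assumptions here are purely illustrative:

    #include <limits.h>

    /* This module is deliberately platform-specific.  Rather than
       pretending to be portable, state the assumptions up front and
       make the build fail loudly anywhere they don't hold. */
    _Static_assert(CHAR_BIT == 8, "assumes 8-bit bytes");
    _Static_assert(sizeof(int) == 4, "assumes 32-bit int");
    _Static_assert((-1 & 3) == 3, "assumes two's complement");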
On a slightly related note, chances are good that anyone reading this has an 8051 within a few meters of them - they're close to omnipresent in USB chips, particularly hubs, storage bridges, and keyboards/mice. The architecture is every bit as atrocious as the 6502's.

btw: a good indicator is GCC support - AVR, also an 8-bit µC, is perfectly well supported by GCC. For the 8051 and 6502, you need something like SDCC [http://sdcc.sourceforge.net/]
I recall discussing C with a person using a processor where chars, shorts, and ints were all 32 bits. He stressed that writing portable code was necessary.

I pointed out that any programs that needed to manipulate byte data would require very special coding to work. Special enough to make it pointless to try to write portable code between it and a regular machine.

It's unreasonable to contort the standard to support such machines. It is reasonable for the compiler author on such machines to make some adjustments to the language.

For example, one can say C++ is technically portable to 16 bit machines. But it is not in practice, because:

1. segmented 16 bit machines require near/far pointer extensions (see the sketch below)

2. exceptions will not work on 16 bit machines, because supporting them consumes way too much memory

3. ditto for RTTI
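For anyone who never had the pleasure, the near/far extensions looked roughly like this; a sketch that only compiles with 16-bit DOS-era compilers (Borland/Microsoft), since none of it is standard C:

    #include <dos.h>   /* Borland's MK_FP macro */

    char near *np;   /* 16-bit offset into the current data segment */
    char far  *fp;   /* full 32-bit segment:offset pair */

    void demo(void)
    {
        /* e.g. write directly into text-mode video memory, which
           lives in a different segment entirely */
        fp = (char far *)MK_FP(0xB800, 0);
        *fp = 'A';
    }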
C23 just dropped support for any non-twos-complement architectures. No more C on Unisys for you! http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf
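One concrete payoff (my example, not from the paper): identities that bit-twiddling code has always quietly relied on are now actually guaranteed by the standard:

    #include <assert.h>

    int main(void)
    {
        int x = -42;
        /* With two's complement guaranteed, negation really is
           bitwise-NOT plus one.  On the ones'-complement and
           sign-magnitude machines C23 just dropped, this failed. */
        assert(-x == ~x + 1);
        return 0;
    }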
> Everyone who writes about programming the Intel 286 says what a pain its segmented memory architecture was

Actually this applies more to pre-80286 processors, since the 80286 introduced virtual memory, and the segment registers were less prominent in "protected mode".
Moreover, I wouldn't say it was a pain, at least at the assembly level, once you understood the trick. C had no concept of segmented memory, so you had to tell the compiler which "memory model" it should use.

> One significant quirk is that the machine is very sensitive to data alignment.

I remember from school something about a "barrel shifter" that removed this limitation, but it was only introduced with the 68020.

On the topic itself, I like to say that a program is portable if it has been ported once (likewise, a module is reusable if it has been reused once). I remember porting a program from a 68K descendant to ARM; the only non-obvious portability issue was that in C, the standard doesn't mandate whether the *char* type is signed or unsigned (it's implementation-defined).
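A minimal sketch of how that bites, assuming the common defaults (char signed on x86, unsigned on traditional ARM ABIs):

    #include <stdio.h>

    int main(void)
    {
        char c = '\xFF';  /* -1 where char is signed, 255 where unsigned */
        if (c < 0)
            printf("char is signed here (typical x86 default)\n");
        else
            printf("char is unsigned here (typical ARM default)\n");
        return 0;
    }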
I wrote a fair amount of code for TI's TMS320C4x DSPs. They had 32-bit char, short, int, long, float, and double, and a 40-bit long double.

Took a bit to get used to, but really the only way to get to the good stuff was by writing assembly code and hand-tuning all the pipeline stuff.
*the MIPS R3000 processor ... raises an exception for signed integer overflow, unlike many other processors which silently wrap to negative values.*

Too bad programmer laziness won and most current hardware doesn't support this.

As a teenager I remember getting hit by this all the time in assembly language programming for the IBM S/360. (FORTRAN turned it off.)

    S0C8  Fixed-point overflow exception

When you're a kid you just do things quickly. This was the machine's way of slapping you upside your head and saying: "are you sure about that?"
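Since current hardware mostly won't trap for you, the nearest modern equivalent is explicit checked arithmetic; a sketch using the GCC/Clang builtins (not standard C):

    #include <stdio.h>

    int main(void)
    {
        int a = 2000000000, b = 2000000000, sum;
        /* Returns nonzero on overflow instead of hitting the undefined
           behavior a plain a + b would invoke. */
        if (__builtin_add_overflow(a, b, &sum))
            fprintf(stderr, "are you sure about that?\n");
        else
            printf("%d\n", sum);
        return 0;
    }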
It still amazes me how the PDP-11 has the NUXI problem [1] at the byte-within-word level, and how the PDP-11 was bytesexual [2].

[1] http://catb.org/jargon/html/N/NUXI-problem.html

[2] http://catb.org/jargon/html/B/bytesexual.html
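For the curious, a sketch of that middle-endian layout: a 32-bit value was stored as two little-endian 16-bit words, most significant word first.

    #include <stdio.h>

    /* Decode a 32-bit value from PDP-11 middle-endian byte order:
       0x0A0B0C0D lives in memory as 0B 0A 0D 0C. */
    unsigned long from_pdp11(const unsigned char b[4])
    {
        unsigned long high = (unsigned long)b[1] << 8 | b[0];
        unsigned long low  = (unsigned long)b[3] << 8 | b[2];
        return high << 16 | low;
    }

    int main(void)
    {
        unsigned char mem[4] = { 0x0B, 0x0A, 0x0D, 0x0C };
        printf("0x%08lX\n", from_pdp11(mem));  /* prints 0x0A0B0C0D */
        return 0;
    }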
Good article.

I've done C on half of those platforms. The 8051 is actually more complex than described. There are two different kinds of RAM with two different addressing modes. IIRC there are 128 bytes of "zero page" RAM and, depending on the specific variant, somewhere between 256 bytes and a few kilobytes of "normal" RAM. Both RAM types can be present, and an address in one can have the same numeric value as an address in the other while pointing to different memory, so the context of the RAM type is critical. The variants usually have a lot more ROM than RAM, so coding styles may need to be adjusted, such as using a lot of (ROM-initialized data) constants instead of computing things at run time.

The 6502 has a similar "zero page" addressing mode to the 8051.

I never encountered any alignment exceptions on 68k (Aztec C). Either I was oblivious and lucky, or just naturally wrote good code.

I do remember something about the PDP-11 where there was a maximum code segment size (32k words?).

C on the VAX (where I first learned it) was a superset of the (not yet) ANSI standard. I vaguely remember some cases where the compiler/environment would allow some lazy notation with regard to initialized data structures.

They left out some interesting platforms (such as 6809/OS-9, TI MSP430, and PPC), which have their own quirks.
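With SDCC, that context is spelled out via (non-standard) address-space qualifiers; a sketch, with names and sizes purely illustrative:

    /* SDCC 8051 storage classes: the same numeric address can mean
       different physical memory depending on the qualifier. */
    __data  unsigned char counter;            /* directly addressed internal RAM (the "zero page") */
    __idata unsigned char scratch[64];        /* indirectly addressed upper internal RAM */
    __xdata unsigned char big_buffer[1024];   /* external RAM, reached via MOVX */
    __code  const unsigned char crc_table[256] = { 0x00 };  /* constants in the plentiful ROM */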
The author forgot to mention that the 8051 has a bit-addressable lower part of RAM.

The PDP-11 had a weird RAM overlay scheme for squeezing 256KB of RAM into a 64KB 16-bit address space.

The IBM System/360 also had a weird addressing scheme with a base register and up to 4KB offsets.

https://en.wikipedia.org/wiki/IBM_System/360_architecture#Addressing
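SDCC exposes that bit-addressable region (internal RAM 0x20-0x2F) directly; a sketch:

    /* Single-bit variables allocated in the 8051's bit-addressable
       RAM; accesses compile to single SETB/CLR/JB instructions. */
    __bit motor_on;
    __bit fault;

    void update(void)
    {
        motor_on = 1;
        if (fault)
            motor_on = 0;
    }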
I scored 7. (I have written C code on seven of the architectures mentioned: PDP-11, i86, VAX, 68K, IBM 360, AT&T 3B2, and DG Eclipse.) I have also written C code on the DEC KL-10 (a 36-bit machine), which isn't covered. And while I have a PDP-8, I only have FOCAL and FORTRAN for it rather than C. I'm sure there is a C compiler out there somewhere :-).

With the imminent C23 spec, I'm really amazed at how well C has held up over the last half century. A lot of things in computers are very 'temporal' (in that there are a lot of things that all have to be available at a certain point in time for the system to work), but C has managed to dodge much of that.
Given C’s origin on the PDP-11, it’s amazing it ended up so portable to all these crazy architectures. Even as an old-timer, I said “WTF” at the 8051 section!
I love these weird machines. I'll give another example.

Texas Instruments' C40 was a DSP made in the late '90s. It had 32-bit words but inefficient byte manipulation. The compiler/ABI writer's solution was simple: make char 32 bits. So sizeof(char)==sizeof(short)==sizeof(int)==sizeof(long).

I remember writing routines for "packing" and "unpacking" native strings to byte strings and back.
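From memory, they looked something like this; a sketch using a 32-bit unsigned for the native word, which on the C40 is the same size as char:

    /* Pack four octets of a "byte string" into each 32-bit word of a
       native string, and back.  n is the octet count, a multiple of 4. */
    void pack(unsigned *native, const unsigned char *bytes, int n)
    {
        for (int i = 0; i < n; i += 4)
            native[i / 4] = (unsigned)bytes[i]     << 24 |
                            (unsigned)bytes[i + 1] << 16 |
                            (unsigned)bytes[i + 2] <<  8 |
                            (unsigned)bytes[i + 3];
    }

    void unpack(unsigned char *bytes, const unsigned *native, int n)
    {
        for (int i = 0; i < n; i++)
            bytes[i] = (native[i / 4] >> (24 - 8 * (i % 4))) & 0xFF;
    }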
Theoretical portability is not very useful. If you've not *tested* on a Unisys with 36-bit integers and 8-word function pointers, the theoretical port is garbage.

It may be easier to fix the code than if you hadn't thought about the Unisys, but that effort has to be weighed against the vanishingly small odds of it ever being required.
Maybe I missed it while skimming, but is there an example of code that is portable between most of these systems? I'd love to see that, because I'm having a very hard time believing it's possible. Or maybe you can make something that will technically compile and function on any of them, but if it doesn't perform reasonably, it's really hard to call the portability aspect a success.

Also, you're very limited in what you can actually do in ANSI C. You're going to need to start poking directly at the hardware, which is not going to be portable. Hell, even stuff like checking whether a letter is within a certain range of the English alphabet might not work between machines. Letters might not be contiguous in the machine's native character encoding.
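Concretely (my example): in EBCDIC the lowercase letters have gaps ('i' is 0x89 but 'j' is 0x91, with non-letters in between), so the classic range check misfires, and <ctype.h> is the portable route:

    #include <ctype.h>

    /* Wrong on EBCDIC: the range a..z contains non-letter code
       points, so this accepts garbage characters there. */
    int is_lower_naive(char c)    { return c >= 'a' && c <= 'z'; }

    /* Portable: islower() knows the execution character set. */
    int is_lower_portable(int c)  { return islower(c); }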
I thought there'd be mention of CHERI as an up-to-the-minute architecture (as Arm Morello). I don't remember whether it requires modifications to standard C, but there's a C programming guide [1].

People who say everything is little endian presumably don't maintain packages for the major GNU/Linux systems which support s390x. I don't remember how many big endian targets I could count in GCC's list when I looked fairly recently. The place to look for "weird" features there is presumably the embedded systems processors.

1. https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf
Haven't read the article yet, but I have noticed that the tab keeps loading even after 10 minutes. Aborting the loading process leads to broken media.<p>I am no expert in HTML video delivery and haven't tried it out, but maybe setting the preload attribute to "none" or "metadata" might help?