In some cases you can benefit by "struct of arrays" instead of "array of structs". This benefits packing, enables SIMD, and usually improves persistence speed too. Database people call this a "column store". So if you abstract it a little you can do per-column compression that is really hard to do with vanilla structs.<p>One thing I learned with Picasa (which is basically an in-memory database): if you're storing a lot of data in RAM, sometimes you should think of the data structures more as a database (with copy in/out, rather than reference). In multithreaded cases, this kind of approach unexpectedly gets you more parallelism and speed.
This was the cause of my very first "hard" C Bug, in 1991 or so. I had written a program in our Novell Netware labs to read in the printer control code definitions for our printer-release consoles - being "clever", and wanting to save a few lines of code, I read them into a memory structure, and overlaid a C-Struct on top of them that mapped precisely to the fields that I wanted. Everything worked fine, code compiled, and we were able to read all the job definitions until I handed it off to the team responsible for the rest of the release console - at which point the code just started breaking. No longer worked. For the life of me I couldn't figure out what was going, until our team leader took a glance, and made it clear that I needed to inform the compiler to byte align the C Structs (which was likely the default behavior in Turbo-C, but not Watcom C).<p>Really opened my eyes to the many, many things that I didn't know.
This was discussed like a year ago in HN <a href="https://news.ycombinator.com/item?id=6995568" rel="nofollow">https://news.ycombinator.com/item?id=6995568</a>
Decent article full of good information. One exception -- the C standard is vague on how bitfields are implemented and the compiler can rearrange them in any way it pleases. You are not guaranteed efficient packing, placement order, or size.
Sometimes this sort of stuff can really bite you: <a href="http://blog.jgc.org/2007/04/debugging-solaris-bus-error-caused-by.html" rel="nofollow">http://blog.jgc.org/2007/04/debugging-solaris-bus-error-caus...</a>
The article goes into more detail on the "why", but the "art" is not really that complicated or lost: sort structure elements by size/alignment, biggest first.
Most commonly used compilers have options to help find padding that is possibly not needed.<p>VC++ has a couple of undocumented (but well known and discussed) options /d1reportAllClassLayout and /d1reportSingleClassLayout.<p>GCC has -fdump-class-hierarchy<p>CLANG has -cc1 -fdump-record-layouts
If anyone wants to take a look at the struct offsets inside their own code, the article mentions pahole as being a useful tool but I found that it chokes on recent C++ - the dwarf libraries it relies on can't cope with lambdas and various other things IIRC.<p>However, helpful people have written an extension to gdb which does something roughly equivalent & lets gdb do all the heavy lifting of parsing the struct data from the binary debugging information.<p>pahole.py ships with Fedora, but since Debian doesn’t include it for some reason I’ve kept my own versions around based on something I grabbed from the gdb mailing lists some time ago:<p><a href="https://github.com/PhilArmstrong/pahole-gdb" rel="nofollow">https://github.com/PhilArmstrong/pahole-gdb</a>
In Pascal, you could just declare a structure as "packed", and this packing happened automatically. "packed array [0..n] of Boolean" is an array of bits. That feature has not been copied in more recent languages.
Interesting to discover that this is a "lost art".<p>I didn't graduate, but I did attend PUC-Rio for a while and I remember professor L. F. Bessa Seibel's lecture about that, as part of the INF1008 class (Introduction to Computers' Architecture), which is considered a basic first or second semester course for undergraduate CS students.<p>That was just 5 years ago.<p>Great class, great lecturer, btw
I am going to suggest that 'Structure Packing' isn't the lost art. I think the lost art is 'Variable Sized Structs', the emphasis on the last 's' of 'Structs'.<p>This is the situation where your solution calls for an array of structs that are contiguous in memory but aren't the same size. This happens oftenish in database, os, and compression tech. C99 and gcc have legalized the often used convention:<p><pre><code> struct varb {
char *foo;
unsigned char age;
int number_of_bars;
char var[]; // it's something like var[0] in GCC, or var[1] in versions < c99
};
</code></pre>
where the last element is actually the start of the variable length piece. In all the C supported implementations of var length structs, we have limitations:<p>- the variable part must be declared last<p>- the struct may not appear in other structs or arrays<p>which totally makes sense, but doesn't help us implement a contiguous array of variable length structs. Also, the restriction of having the variable part of the struct come last may waste a lot of space if we have a structure where we would rather optimize the order of the struct ourselves, but that's another post!<p>Assume we are OK with the limitations of our <i>struct varb</i>, we now need to declare and <i>malloc</i> (or <i>mmap</i>) a <i>char</i>* (to hold all of our data) and lay out our <i>varb structs</i> fairly manually. Once we have found the start of a particular <i>varb struct</i>, we can cast it's <i>char</i>* address to a <i>struct varb</i>* and the compiler then can help us populate or interrogate our data. Note that we will need to keep track of detailed memory usage 'so far' when we are populating as well track padding, alignment of adjacent <i>structs</i>, as well as any reallocation when our <i>struct</i> array needs to grow.
Rust does structure packing automatically.<p>This is great for the general programmer (someone who is not knowledgeable or caring about details like this.)<p>Unfortunately, this blows for people who do (someone like me.)
The article doesn't appear to mention it, but in the past I had a need for this type of information when interfacing C data structures from other programming languages.<p>If the structure you are connecting with is packed you have to be aware of that or else you'll end up with data soup :)
One of the many reasons that Rust is interesting is you can play these games if you are interested in performance or size: <a href="http://doc.rust-lang.org/book/ffi.html#interoperability-with-foreign-code" rel="nofollow">http://doc.rust-lang.org/book/ffi.html#interoperability-with...</a>
Is it really a lost art? I've recently been learning C and C++, and all three of the textbooks I've been studying have mentioned in, and two go very in-depth on it (one of those is a games programming textbook). This is a great resource though, definitely going to add it to my studies.
This was rather readable, more so than I've come to expect from that particular source.<p>Still, I must point out that using a signed 1-bit bitfield (there are <i>lots</i> of those in that article), is generally a bad idea. Consider for instance something like:<p><pre><code> struct foo6 {
int bigfield:31; /* 32-bit word 1 begins */
int littlefield:1;
};
</code></pre>
That "littlefield" member is signed but has a size of just one bit, which means it is limited to representing the two values 0 and -1 (in two's complement). This is very seldom useful. The general rule of thumb is to always make "boolean"-type bitfields have an unsigned type, for this reason.
A good portion of PHP 7's performance improvements came from better structure packing:<p><a href="https://wiki.php.net/phpng-int" rel="nofollow">https://wiki.php.net/phpng-int</a><p><a href="http://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html" rel="nofollow">http://nikic.github.io/2014/12/22/PHPs-new-hashtable-impleme...</a>
i definately agree with the 'lost art' part. i've been stunned to find programmers who don't understand this and make arguments like "the compiler optimises that for you"... so much so that i've written my own piece on this, and more than once (and in more or less detail in various places - i.e. bitfields):<p><a href="http://jheriko-rtw.blogspot.co.uk/2011/02/know-your-cc-struct-layout-rules.html" rel="nofollow">http://jheriko-rtw.blogspot.co.uk/2011/02/know-your-cc-struc...</a><p><a href="http://jheriko-rtw.blogspot.co.uk/2012/01/c-struct-layout-optimisation-revisited.html" rel="nofollow">http://jheriko-rtw.blogspot.co.uk/2012/01/c-struct-layout-op...</a>
How about the lost art of COBOL record packing where COBOL records are like a combination of C's union and structs. It was great you could sometimes squeeze a few extra bits out in one case to use better in another case.
Years ago I worked on a few projects that used complicated structure ordering, but even then it was my understanding that (with the right options) compilers could optimize these things better than most people could by hand.
Don't elements of structures are sorted before even packing starts? For example
char c;
int i;
char c2;<p>I thought in reality c2 starts after c. Instead of adding padding between each element, you will only need in boundaries
I've actually had a question about this on my programming interview for a games development company. And yes, I have had to use it several times since. So no, the art is definitely not lost :-)
This is no lost art, this is pretty well known for proficient c developers. There is even an Wikipedia article. <a href="http://en.wikipedia.org/wiki/Data_structure_alignment" rel="nofollow">http://en.wikipedia.org/wiki/Data_structure_alignment</a>