Declaring C String Constants

66 points by eklitzke about 8 years ago

13 comments

iainmerrick about 8 years ago

There's a big mistake here:
<pre><code>
// Bogus function, just to see how arguments are passed.
void bogus();

// Invoke bogus using ptr.
void do_ptr() { bogus(&ptr, ptr); }

// Invoke bogus using arr.
void do_arr() { bogus(&arr, arr); }
</code></pre>
"&ptr" and "&arr" are not the same. &ptr gives you a pointer to a pointer (char * *), but &arr just gives you a pointer to the array itself, which has the same address as the array data. That's why the code is different.

This would normally show up as a compilation error, since &ptr and &arr have different types. It's disguised here by the effectively variadic bogus() declaration.

(I didn't know that functions declared with an empty parameter list in C implicitly accept any arguments! That's bizarre. I would have expected modern compilers to disable that by default, but I can't get Clang to warn me about it even with -Wall.)
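
The address difference described above can be checked directly. The following is a minimal sketch, assuming the article's declarations were `const char *ptr` and `const char arr[]` holding "Lorem ipsum"; the two helper function names are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumed stand-ins for the article's two declarations. */
const char *ptr = "Lorem ipsum";
const char arr[] = "Lorem ipsum";

/* &arr has type const char (*)[12]: dereferencing it yields the whole array. */
static_assert(sizeof *&arr == 12, "&arr points at the 12-byte array");
/* &ptr has type const char **: dereferencing it yields a pointer object. */
static_assert(sizeof *&ptr == sizeof(const char *), "&ptr points at a pointer");

/* Hypothetical helper: 1 if &arr and arr hold the same address. */
int arr_address_matches_data(void) {
    return (const void *)&arr == (const void *)arr;
}

/* Hypothetical helper: 1 if &ptr and ptr hold the same address. */
int ptr_address_matches_data(void) {
    return (const void *)&ptr == (const void *)ptr;
}
```

For the array, both expressions name the same location, so no load is needed to pass either one. For the pointer, &ptr is the address of the pointer variable and ptr is the address stored in it, so passing ptr requires reading memory.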

tonyg about 8 years ago

Interestingly, we get
<pre><code>
.LC0:
        .string "Lorem ipsum"
[...]
        movl    $.LC0, %esi
        movl    $ptr2, %edi
        xorl    %eax, %eax
        jmp     bogus
</code></pre>
for the same experiment done with a const pointer to const data,
<pre><code>
const char * const ptr2 = "Lorem ipsum";
</code></pre>
(Incidentally, this kind of thing is why I prefer to write 'char const STAR const' rather than 'const char STAR const', and 'char const STAR' rather than 'const char STAR'; it systematically places the 'const's.)

(PS. HN really doesn't work well for trying to write either unicode or ascii star characters.)

eridius about 8 years ago

All this talk about do_ptr being slower because it copies from main memory is misleading. Yes, it's slower, because it's doing something different than what do_arr does. It's not just a slower version of doing the same thing. So trying to compare do_ptr and do_arr makes no sense. And the whole thesis of this piece, that declaring string constants as arrays is better, has no supporting evidence. You know what's better? Don't dereference a pointer if you don't need to. That's it.

swolchok about 8 years ago

Or you could just #define STR "Lorem Ipsum" and be sure that it's going to be efficient.
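
A minimal sketch of the macro approach, assuming only the `STR` definition from the comment above; `greeting` and `greeting_length` are hypothetical names added for illustration. Because the macro expands to a string literal at each use, `sizeof` sees an array and adjacent literals are concatenated at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR "Lorem Ipsum"

/* The macro expands to a literal of type char[12]: 11 characters plus NUL. */
static_assert(sizeof(STR) == 12, "sizeof sees the array, not a pointer");

/* Adjacent string literals are merged by the compiler. */
const char greeting[] = STR ", dolor";

/* Hypothetical helper: length of the concatenated string. */
size_t greeting_length(void) {
    return strlen(greeting);
}
```

One caveat: each expansion of STR is a separate literal, and whether identical literals share storage is left to the compiler.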

mnarayan01 about 8 years ago

As others have noted, this is a little bizarre as the two functions are doing different things; one being less performant than the other is not really surprising. That said, there can be advantages to using array syntax rather than pointers for strings (even when also declaring the pointer to be constant), e.g. the compiler "knows" the array "pointer" is non-null:
<pre><code>
extern const char arr[];
void dummy(void);  // declared so the snippet compiles on its own

void do_arr() {
    if (arr) {
        dummy();
    }
}
</code></pre>
allows the compiler to optimize out the conditional: https://godbolt.org/g/FwBeWx.

russdill about 8 years ago

Other comments have already deconstructed much of the article, but I'll add this. The primary factor will be your CPU architecture. Can you put the string in the same cacheline as the code or an adjacent cacheline? If so, awesome. However, if you have a microcontroller with a Harvard architecture, such as a Cortex-M3, it may be better to put your strings in data memory rather than code memory. The processor can simultaneously load your string and the next instructions via the two memory ports.

petters about 8 years ago

The author took the time to write a blog post, but did not try the code with anything other than that bogus() function. Had they done that, they would have realized their mistake.

droithomme about 8 years ago

Arrays and pointers are not the same in C, nor have they ever been. The difference is more than that of mere syntax choices for equivalent concepts.

In the pointer version of the code, the pointer is reassignable. Only the data it references can't be modified; the pointer can. To prevent the pointer from being modified, the declaration should have been:

const char * const ptr = "Lorem ipsum";

With the array version, there is no pointer to be modified and so no need or ability to have a second const.
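
The three declarations can be put side by side. This is a sketch under the assumption that the article used `const char *ptr`; the names `fixed` and `rebind_ptr` are hypothetical. The rejected assignments are left as comments because they do not compile.

```c
#include <assert.h>
#include <string.h>

const char *ptr = "Lorem ipsum";          /* data is const, pointer is writable */
const char *const fixed = "Lorem ipsum";  /* data and pointer are both const */
const char arr[] = "Lorem ipsum";         /* no pointer object exists at all */

/* Hypothetical helper: rebinds ptr and returns its new target. */
const char *rebind_ptr(void) {
    ptr = "dolor sit amet";      /* legal: only the pointed-to data is const */
    /* fixed = "dolor";             rejected: fixed is a read-only variable */
    /* arr = "dolor";               rejected: arrays are not assignable */

    /* The other two names still read the original text. */
    assert(strcmp(fixed, arr) == 0);
    return ptr;
}
```

Because ptr can change at run time, the compiler generally has to load its current value before using it, whereas the address of arr is a link-time constant.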

kccqzy about 8 years ago

Another difference: if you use sizeof, the pointer version gives you the size of a pointer, but the array version gives you the actual number of bytes in the string.
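
A short sketch of that sizeof difference, assuming the same two "Lorem ipsum" declarations as in the article; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

const char *ptr = "Lorem ipsum";
const char arr[] = "Lorem ipsum";

/* 11 visible characters plus the terminating NUL. */
static_assert(sizeof arr == 12, "array form reports the full byte count");

/* Hypothetical helper: size of the pointer object (commonly 8 on 64-bit targets). */
size_t size_of_ptr(void) {
    return sizeof ptr;
}

/* Hypothetical helper: size of the array, known at compile time. */
size_t size_of_arr(void) {
    return sizeof arr;
}
```

With the pointer form, the string length has to come from strlen() at run time or from a separately maintained constant.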

carlsonmark about 8 years ago

While I do enjoy this sort of analysis, I feel the decision to choose one method over another should be based on benchmarks (preferably with more than one size of string). After benchmarking, then do an analysis of the assembly code.

Guessing about pipelining, memory access times, and the effect of generated code size is much less valuable than real measurements.

bsder about 8 years ago

What I find funny is that through all this he never actually mentioned the fact that sizeof() of a string constant includes the NUL character at the end.

This has bitten me more times than I care to admit.

Nowadays, I almost always use the array form and actually declare my strings character by character so that I see the NUL if I actually meant to use it.
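
The NUL accounting can be made visible with a few compile-time checks. This is an illustrative sketch; all names in it (`with_nul`, `spelled`, `no_nul`, `LITERAL_LEN`, and the helpers) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

const char with_nul[] = "abc";                  /* 'a' 'b' 'c' '\0' */
const char spelled[]  = {'a', 'b', 'c', '\0'};  /* same bytes, NUL written out */
const char no_nul[]   = {'a', 'b', 'c'};        /* 3 bytes, NOT a C string */

static_assert(sizeof with_nul == 4, "literal initializer adds a NUL");
static_assert(sizeof spelled == 4, "explicit NUL is counted too");
static_assert(sizeof no_nul == 3, "no NUL unless it is written");

/* Length without the terminator, computed at compile time. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/* Hypothetical helper: compile-time length of with_nul. */
size_t visible_length(void) {
    return LITERAL_LEN(with_nul);
}

/* Hypothetical helper: run-time length, for comparison. */
size_t runtime_length(void) {
    return strlen(with_nul);
}
```

Passing no_nul to strlen() or printf("%s") would read past the end of the array, which is the price of the character-by-character form when the NUL is left out.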

mark-r about 8 years ago

I wonder how the results would have differed if those were automatic variables instead of static? That's a much more real-world situation. You wouldn't be able to ignore the overhead of re-initializing the array, since it would be occurring at run time instead of compile time.
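
A sketch of the automatic-variable case, with hypothetical function names. It does not measure anything; it only shows that a local array is a fresh copy on every call, while a local pointer merely refers to the one static literal.

```c
#include <assert.h>

/* Automatic array: the 12 bytes are copied into the stack frame on each call. */
char edit_local_array(void) {
    char local[] = "Lorem ipsum";
    assert(local[0] == 'L');  /* holds on every call: the copy is re-initialized */
    local[0] = 'X';           /* changes only this call's private copy */
    return local[0];
}

/* Automatic pointer: only the pointer is initialized; the bytes stay put. */
const char *local_pointer(void) {
    const char *p = "Lorem ipsum";
    return p;                 /* safe: the literal has static storage duration */
}
```

Calling edit_local_array() repeatedly never trips the assertion, which is the run-time re-initialization the comment refers to. An optimizer may remove the copy when it can prove the array is never modified, so the actual cost depends on the compiler and flags.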

Analemma_ about 8 years ago

Any programming language that needs an entire blog post about how to declare string constants "the right way" is a programming language that needs to disappear.
