What does the ??!??! operator do in C?

636 points by isomorph over 2 years ago

24 comments

susam over 2 years ago

I learnt C, more than 20 years ago, from the book The C Programming Language written by Brian W. Kernighan and Dennis M. Ritchie, also known as K&R. I read the book almost cover to cover, all the way from the preface at the beginning to its three appendices at the end, while solving all the exercises that each chapter presented. As someone who knew very little about programming languages back then, this book was formative in my journey of becoming a programmer.

Appendix A (Reference Manual) of the book broadened my outlook on programming languages by providing me a glimpse of what goes into formally specifying a programming language. Section A.12 (Preprocessing) of this appendix specifies trigraph sequences. Quoting from the section:

> Preprocessing itself takes place in several logically successive phases that may, in a particular implementation, be condensed.

> 1. First, trigraph sequences as described in Par.A.12.1 are replaced by their equivalents. Should the operating system environment require it, newline characters are introduced between the lines of the source file.

Then section A.12.1 (Trigraph Sequences) describes trigraph sequences in more detail. Quoting this section below:

> The character set of C source programs is contained within seven-bit ASCII, but is a superset of the ISO 646-1983 Invariant Code Set. In order to enable programs to be represented in the reduced set, all occurrences of the following trigraph sequences are replaced by the corresponding single character. This replacement occurs before any other processing.

<pre><code>
??=  #     ??(  [     ??<  {
??/  \     ??)  ]     ??>  }
??'  ^     ??!  |     ??-  ~
</code></pre>

> No other such replacements occur.

> Trigraph sequences are new with the ANSI standard.


bradford over 2 years ago

Trigraphs make this obfuscated C submission possible: (<a href="https://gist.github.com/Property404/e31b99deb3527159e183" rel="nofollow">https://gist.github.com/Property404/e31b99deb3527159e183</a>)

I've pasted it here for convenience (formatting fixed, thanks child comment!):

<pre><code>
// Are you there god??/
??=define _(please, help)
??=define _____(i,m, v,e,r,y) r%:%:m
??=define ____ _____(a,f,r,a,i,d)
main(__)<%____(!_(-~-??-((-~-??-!__<<-
??-!!__)<<-??-(!!__<<!!__))+-~-~-??--~-~
-~-~-~-~-??-(-~-~-~-~-??-!!__<<-~!!__),-
??-!__))<%??>%>_(__,___)??<____
(printf("please let me die??/r%d bottle%s"
" of bee%s""""??/n",(!(___
%-~-~!!___))?--__+!___++:__+!___++,!(__-!!___)
&&___%-~-~!!___??!??!!(___%-~-~!!___??!??!__
-(-~!!___))?"":"s",___%-~-??-!!___<-??-!!___?
"r on the wall":"eeeeeeer! Take one down,pass ??/
it around")&&__&&_(__,___),"mercy I'm in pain")??<??>??>
</code></pre>


rdlw over 2 years ago

See also: "What is the "-->" operator in C++?" <a href="https://stackoverflow.com/q/1642028" rel="nofollow">https://stackoverflow.com/q/1642028</a>


layer8 over 2 years ago

From the ASCII Wikipedia page (<a href="https://en.wikipedia.org/wiki/ASCII#7-bit_codes" rel="nofollow">https://en.wikipedia.org/wiki/ASCII#7-bit_codes</a>):

> Almost every country needed an adapted version of ASCII, since ASCII suited the needs of only the US and a few other countries. For example, Canada had its own version that supported French characters.

> Many other countries developed variants of ASCII to include non-English letters (e.g. é, ñ, ß, Ł), currency symbols (e.g. £, ¥), etc. See also YUSCII (Yugoslavia).

> It would share most characters in common, but assign other locally useful characters to several code points reserved for "national use". […]

> Because the bracket and brace characters of ASCII were assigned to "national use" code points that were used for accented letters in other national variants of ISO/IEC 646, a German, French, or Swedish, etc. programmer using their national variant of ISO/IEC 646, rather than ASCII, had to write, and, thus, read, something such as

<pre><code>
ä aÄiÜ = 'Ön'; ü
</code></pre>

> instead of

<pre><code>
{ a[i] = '\n'; }
</code></pre>

> C trigraphs were created to solve this problem for ANSI C, although their late introduction and inconsistent implementation in compilers limited their use. Many programmers kept their computers on US-ASCII, so plain-text in Swedish, German etc. (for example, in e-mail or Usenet) contained "{, }" and similar variants in the middle of words, something those programmers got used to. For example, a Swedish programmer mailing another programmer asking if they should go for lunch, could get "N{ jag har sm|rg}sar" as the answer, which should be "Nä jag har smörgåsar" meaning "No I've got sandwiches".

dhosek over 2 years ago

One of the challenges of | is that it was never entirely clear whether the ASCII | should be equivalent to EBCDIC’s | or ¦. As I recall, Waterloo C wanted ¦ as its vertical bar character, although I could be wrong. On the IBM system that I used back in the 80s, we had ASCII terminals which were run through a muxer to the actual system (which was part of the magic that allowed it to have thousands of concurrent users all getting real-time access—a lot of UI was offloaded to these systems which were essentially minicomputers on their own).


NegativeLatency over 2 years ago

There's also iso646.h, which allows you to write some particularly Python-looking stuff:

<pre><code>
#include <iso646.h>
#include <stdbool.h>
#include <stdio.h>

#define is ==

bool is_whitespace(int c) {
    if (c is ' ' or c is '\n' or c is '\t') {
        return true;
    }
    return false;
}

int main(void) {
    int current;
    int previous = ' ';  /* treat the start of input as whitespace */

    while ((current = getchar()) not_eq EOF) {
        if (is_whitespace(current) and not is_whitespace(previous)) {
            putchar('\n');
        } else {
            putchar(current);
        }
        previous = current;
    }
    return 0;
}
</code></pre>


chromatin over 2 years ago

Wow, and I thought I knew C pretty well. Great post.

Edited to add: I really like "Modern C" and just re-checked -- no mention of the preprocessor feature!

<a href="https://hal.inria.fr/hal-02383654/file/ModernC.pdf" rel="nofollow">https://hal.inria.fr/hal-02383654/file/ModernC.pdf</a>


billpg over 2 years ago

"There's a problem. Some machines don't have some braces and vertical bars and such. We'll have to add keywords like OR and BEGIN and END."

"Are question marks fine?"

"Yes."

"I'll come up with something."


cl3misch over 2 years ago

This reminds me of a comment from a Python discussion more than 2 years ago that I often think about:

"Whether it's computer languages or human ones, as soon as you get into a discussion about the correct parsing of a statement, you've lost and need to rewrite in a way that's unambiguous. Too many people pride themselves on knowing more or less obscure rules and, honestly, no one else cares."

<a href="https://news.ycombinator.com/item?id=23051202" rel="nofollow">https://news.ycombinator.com/item?id=23051202</a>


kbob over 2 years ago

I'd say, "Congratulations! You're one of today's lucky 10,000!", but trigraphs aren't really much fun. Just another reminder that C is old, and computing is even older.

I've used uppercase-only terminals, and I've used ancient C, but not at the same time.


kenniskrag over 2 years ago

Trigraphs were removed in C++17.

<a href="https://en.m.wikipedia.org/wiki/C%2B%2B17#Removed_features" rel="nofollow">https://en.m.wikipedia.org/wiki/C%2B%2B17#Removed_features</a>


DonHopkins over 2 years ago

Years ago I wrote a perfectly reasonable comment like /* WTF??!?!!?!???? */ and the old C compiler complained about "invalid trigraph". A syntax error in the middle of a comment!

Took me a while to figure out that "trigraph" was referring to some part of "??!?!!?!????" and not "WTF".


Agentlien over 2 years ago

Every time I hear about trigraphs I think of this horror: <a href="http://stackoverflow.com/questions/53315710/ddg#53315821" rel="nofollow">http://stackoverflow.com/questions/53315710/ddg#53315821</a>

FabHK over 2 years ago

There are two aspects to this: the trigraph, and using the short-circuiting behaviour of the binary logic operator for control flow.

The latter is a very common idiom in Julia code, which I found obscure and puerile at first (“look how smart I am”), but have come to appreciate as concise and natural by now.

For example:

<pre><code>
function fact(n::Int)
    n >= 0 || error("n must be non-negative")
    n == 0 && return 1
    n * fact(n-1)
end
</code></pre>

<a href="https://docs.julialang.org/en/v1/manual/control-flow/#Short-Circuit-Evaluation" rel="nofollow">https://docs.julialang.org/en/v1/manual/control-flow/#Short-...</a>

divbzero over 2 years ago

In addition to trigraphs, there is apparently a set of C alternative tokens defined as follows:

<pre><code>
#define and    &&
#define and_eq &=
#define bitand &
#define bitor  |
#define compl  ~
#define not    !
#define not_eq !=
#define or     ||
#define or_eq  |=
#define xor    ^
#define xor_eq ^=
</code></pre>

I suppose that allows for code like this:

<pre><code>
if (x or not y or not z) {
    return 1;
}
</code></pre>

<a href="https://en.wikipedia.org/wiki/C_alternative_tokens" rel="nofollow">https://en.wikipedia.org/wiki/C_alternative_tokens</a>


curling_grad over 2 years ago

Anecdote: An online judge website (which is pretty well known in Korea) has an easy problem [0] asking to write a program which adds "??!" to input. A lot of beginners' C/C++ submissions got a "Wrong Answer" verdict because of trigraphs.

[0]: <a href="https://www.acmicpc.net/problem/10926" rel="nofollow">https://www.acmicpc.net/problem/10926</a>

hgs3 over 2 years ago

Reminds me of the "goes to" operator [1]

[1] <a href="https://stackoverflow.com/questions/1642028/what-is-the-operator-in-c" rel="nofollow">https://stackoverflow.com/questions/1642028/what-is-the-oper...</a>

cesaref over 2 years ago

This sort of practice goes back to BCPL, which Wikipedia says is the first braced programming language. Because { and } weren't universally available, compilers also supported the sequences $( and $) to represent these, which were typeable and printable on just about anything.

<a href="https://en.wikipedia.org/wiki/BCPL" rel="nofollow">https://en.wikipedia.org/wiki/BCPL</a>

This is the earliest example of this sort of thing I'm aware of - is there an earlier example?

Also, BCPL supported // for comments, again, probably the first use of this sequence.

virtualritz over 2 years ago

> Has Microsoft Windows finally been open-sourced or where did this come from?

This comment on the SO post made my day. :D

anfractuosity over 2 years ago

In gcc I got:

<pre><code>
1.c:1:11: warning: trigraph ??< ignored, use -trigraphs to enable [-Wtrigraphs]
</code></pre>

Out of curiosity, is there a preprocessor directive to enable support?

sargstuff over 2 years ago

From [1], trigraphs or not:

<pre><code>
int main() { [](){}() }
</code></pre>

is still weird.

Wonder if there will be a request for an emacs macro to handle the replaced cpp trigraphs? [2]

[1] <a href="https://zygoloid.github.io/cppcontest2018.html" rel="nofollow">https://zygoloid.github.io/cppcontest2018.html</a>

[2] <a href="https://www.emacswiki.org/emacs/CppTemplate" rel="nofollow">https://www.emacswiki.org/emacs/CppTemplate</a>


Waterluvian over 2 years ago

If we deprecated trigraphs and removed that step from the compiler would it speed compilation up much? I’m going to guess maybe by milliseconds?


chris_wot over 2 years ago

C++17 removed trigraphs. Sadly, this will no longer work.


olliej over 2 years ago

Oh trigraphs may you never die