
Ask HN: Why not extended grapheme cluster as character type?

2 points by red2awn over 4 years ago
Among the most popular programming languages with native Unicode support (Go, Java, Rust, .NET, etc.), why do most of them define a character as a Unicode codepoint/scalar value instead of an extended grapheme cluster? Swift is the only exception I know of.

My assumption is that most programs would want to deal with user-perceived characters rather than individual codepoints. Is this decision made for performance reasons, or something else?
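
For concreteness, a minimal Swift sketch of the distinction the question draws; the string literal and counts are illustrative and not from the post:

    // "cafe" followed by U+0301 COMBINING ACUTE ACCENT renders as "café".
    let cafe = "cafe\u{301}"

    print(cafe.count)                 // 4 – extended grapheme clusters (Swift's Character)
    print(cafe.unicodeScalars.count)  // 5 – Unicode scalar values ("characters" in most other languages)
    print(cafe.utf8.count)            // 6 – bytes in the UTF-8 encoding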

2 comments

brudgers over 4 years ago
The reason is computer science. In computer science, a character is simply part of an arbitrary alphabet. Zero or more characters form a string. A string can be input into one of several types of automata, e.g. finite state, push-down, Turing machine, etc. Automata accept or reject strings, or loop endlessly (in the case of Turing machines).

None of this has anything to do with human-readable text, except that it is accidentally possible to represent human-readable text with a single byte in the case of English, and since C was written by English speakers who were familiar with ASCII encoding as part of their job, "char" became synonymous with "byte." And so strings of bytes were used to encode English text, and "string" was a useful but very leaky abstraction for English text. How leaky? C strings are NULL-terminated because with a Turing machine there are no guarantees that the input terminates, and this is useful for switching telephone networks, since the switches have to work continuously even though individual calls end, i.e. the end of a call is not the end of the stream of input for a network switch.

Conflating "string" with "text" is a source of endless clusterfucks. The most notable is Python 2 strings versus Python 3 strings and all the code that had to be rewritten because of a dumb design decision that could not be questioned under the governing model of dictatorship. Contrast Perl, where the difference between strings and text was understood: handling text with regexes meant that parsing an evilly constructed regex might take a really long time, an engineering tradeoff against the finite running time of the Unix regex, which lacked backtracking and lookahead, because the Unix "regex" was a regular expression in the automata-theory sense and as such equivalent to a finite state machine.

All of which probably expresses an opinion I might hold about Swift if I looked at it closely, but I haven't, because I know enough about it to know that it does things that are hard to reason about in the way I tend to apply engineering reasoning.

But that's me, and if it gets the job done for you, then that's great. I just don't want to try to figure out something that was designed that way.
Someone over 4 years ago
History, dominance of English in computing, performance, plus the fact that even extended grapheme clusters don't fully solve the problem. In Swift,

    "ﬃ".count

returns 1.
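
That result holds because "ﬃ" (U+FB03, LATIN SMALL LIGATURE FFI) is a single Unicode scalar and a single extended grapheme cluster, even though a reader perceives three letters. A small sketch of the point, assuming Foundation is available for the compatibility decomposition:

    import Foundation

    let ligature = "\u{FB03}"             // "ﬃ"
    print(ligature.count)                 // 1 – one extended grapheme cluster
    print(ligature.unicodeScalars.count)  // 1 – one Unicode scalar value

    // NFKD compatibility decomposition expands the ligature to plain "ffi",
    // but that is a normalization choice, separate from grapheme segmentation.
    let decomposed = ligature.decomposedStringWithCompatibilityMapping
    print(decomposed)        // "ffi"
    print(decomposed.count)  // 3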