Is there any research on the memory efficiency of *spoken languages* conducted collaboratively by neuroscientists and computer scientists?

This may sound like a weird (even stupid, since I'm no expert in either field) question, but it occupies my mind from time to time. I tend to think languages should have different "memory footprints" - for example, a language with phonemic orthography[0] having a much smaller memory footprint than a non-phonemic one.

[0] https://en.wikipedia.org/wiki/Phonemic_orthography
This appears to be a two-part question. Part one is how space-efficient the language's representation is (words, sounds), and part two is how much mental RAM is needed to extract meaning from it. In a traditional CS setting this would be a time-space tradeoff (decompression), but given the hefty evidence for specialized structures in the brain adapted to language processing, the processing side has effectively been abstracted away by "special hardware", which places a lower bound on how dense the representation can be while still exploiting that hardware in a real-time streaming context. I find it hard to imagine any popular language evolving to make itself harder to process by bypassing that embedded hardware, so I'll spend the rest of this answer on representational efficiency.

If you look at language efficiency in the information-theoretic sense (i.e. the most meaning conveyed with the fewest "bits" of data), you can approximate a language's efficiency by looking at the branching factor and word probabilities of its language model. The more branches there are, the more meaning a single word can carry in a stream of text (especially a low-probability word). However, the more words a language has, the more bits/letters/memory it takes to represent each word, even with a probability-informed encoding like Huffman coding.

If you think of a Markov language model as simply a bitstream-guided state machine, you can approximate the expected output length for equal-length inputs and get an information-density estimate for the language model itself.

Traditional Chinese, which is not phonemic, has a much larger "word" space but a much smaller branching factor: each symbol conveys more meaning, but tightly constrains the symbols that can follow it. Where it lands on the tradeoff curve relative to, say, English is not obvious ex ante, but this framework gives a way to test it.

This answer might be a little dense, but it's meant to give some intuition and a framework for thinking about your question.
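To make the branching-factor/Huffman point a bit more concrete, here's a minimal Python sketch (mine, not part of the framework above): for a made-up word-frequency distribution, the expected length of a Huffman code tracks the distribution's entropy, so a larger vocabulary only costs extra bits to the extent it adds entropy. The `huffman_code_lengths` helper and the toy frequencies are assumptions for illustration only.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over `probs`."""
    # Heap items: (subtree probability, unique tiebreaker, [(symbol, depth), ...]).
    heap = [(p, i, [(sym, 0)]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; their leaves get one bit deeper.
        p1, _, leaves1 = heapq.heappop(heap)
        p2, _, leaves2 = heapq.heappop(heap)
        merged = [(sym, d + 1) for sym, d in leaves1 + leaves2]
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return dict(heap[0][2])

# Toy word-frequency distribution (invented numbers).
words = {"the": 0.35, "of": 0.2, "cat": 0.15, "sat": 0.15, "zymurgy": 0.15}

lengths = huffman_code_lengths(words)
expected_len = sum(p * lengths[w] for w, p in words.items())
entropy = -sum(p * math.log2(p) for p in words.values())
print(f"entropy          = {entropy:.3f} bits/word")
print(f"expected Huffman = {expected_len:.3f} bits/word")
```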
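And a similarly hedged sketch of the Markov-model idea: the entropy rate of a bigram model (expected bits per emitted symbol) is one way to quantify the density of that "bitstream-guided state machine". The two toy transition tables below, one with a higher branching factor and one where each symbol strongly constrains the next, are invented purely for illustration.

```python
import math
from collections import defaultdict

def entropy_rate(transitions):
    """Entropy rate of a Markov chain in bits/symbol.

    `transitions` maps state -> {next_state: probability}.
    The stationary distribution is found by simple power iteration.
    """
    states = list(transitions)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(1000):
        nxt = defaultdict(float)
        for s, p in pi.items():
            for t, q in transitions[s].items():
                nxt[t] += p * q
        pi = dict(nxt)
    # H = sum_s pi(s) * H(next symbol | current symbol = s)
    return sum(
        p * sum(-q * math.log2(q) for q in transitions[s].values() if q > 0)
        for s, p in pi.items()
    )

# Toy language A: fewer constraints, higher branching factor.
lang_a = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"the": 0.3, "sat": 0.7},
    "dog": {"the": 0.3, "sat": 0.7},
    "sat": {"the": 1.0},
}

# Toy language B: each symbol strongly constrains the next, so fewer bits/symbol.
lang_b = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"C": 0.9, "A": 0.1},
    "C": {"A": 0.9, "B": 0.1},
}

for name, lm in [("lang_a", lang_a), ("lang_b", lang_b)]:
    print(f"{name}: {entropy_rate(lm):.3f} bits/symbol")
```

Of course bits per symbol is only half the story: to compare languages you'd also have to normalize by how many symbols each needs to express the same message, which is exactly the tradeoff described above.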