I've been living in Vietnam and trying to learn the language for a while now and this confirms my impression that Vietnamese is very information dense. Most words are monosyllabic and the same syllable pronounced with different tones has completely different meanings. Also, a lot of things we state explicitly in English are left implicit in Vietnamese.<p>You'd expect that a language with greater information density would lead to higher rates of transmission error but people here don't seem to have any more trouble understanding each other on the phone or in noisy environments than we do in English.
This article pretty much confirms a suspicion I've had for a while: strongly syllabic languages like Spanish, Japanese and Tagalog, which make little use of consonant clusters, sound faster because (1) with few consonant clusters and no tones, words need more syllables to be uniquely identifiable, and (2) sounds flow more easily when there is a vowel between every consonant. Unlike in a language like German, you're not constantly interrupting your speech to enunciate adjacent consonants. Therefore, longer words + easier to pronounce = fast speech.
Original paper: <a href="http://www.lsadc.org/info/documents/2011/press-releases/pellegrino-et-al.pdf" rel="nofollow">http://www.lsadc.org/info/documents/2011/press-releases/pell...</a>
From what I can tell, Chinese speakers will often drop tones, just as English speakers will drop vowels (replacing most vowels with a schwa - a kind of "e" or "uh" sound) if the meaning is not too ambiguous.<p>People who learn Chinese often fret about getting the tones right. The tones just aren't that important - Chinese speakers can generally guess the meaning, though they will think you sound like a 4-year-old if you don't pronounce tones correctly. IMO, getting the vowels and consonants right is harder (and more important).<p>Seriously, here are the pairs you will confuse:<p>d / t - d is unaspirated<p>j / zh - j is a "cjsch" sound (a bit like "A<i>s</i>ia") while zh is a "j" sound<p>q / ch - q is a "bright" (slightly whistled?) ch; ch is a "dark" ch<p>x / sh - x is a "bright" sh and sh is a "dark" sh<p>c / s - c is a "ts", s is just s<p>b / p - b can sound a little closer to p than in English and p is more aspirated<p>g / k - g sounds a little closer to k, while k is more aspirated<p>Then there are the vowels, which are really hard. Learning four tones is comparatively easy.<p>If you don't get the consonants almost 100% correct, people will simply not be able to tell what you are saying. If you don't use tones, they can usually understand, as long as you use simple words (which, let's face it, you will).
Hm, .91 'information per syllable' for English at an average rate of 6.19 syllables per second works out to 5.6 'information' per second vs. .49 * 7.84 = 3.8 for Spanish.<p>How is 5.6 a "more or less identical amount of information" to 3.8? That's a 47% difference!
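For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch in Python. It uses the figures exactly as quoted in that comment (they have not been re-checked against the paper), and the variable and function names are made up for illustration.

```python
# Figures exactly as quoted in the comment above; not re-checked against the paper.
english_density, english_rate = 0.91, 6.19  # information per syllable, syllables per second
spanish_density, spanish_rate = 0.49, 7.84  # same units, as attributed to Spanish in the comment


def information_rate(density, syllables_per_second):
    """Information per second = information per syllable * syllables per second."""
    return density * syllables_per_second


english_info = information_rate(english_density, english_rate)
spanish_info = information_rate(spanish_density, spanish_rate)

# Relative gap, taking the lower (Spanish) rate as the baseline.
percent_gap = (english_info / spanish_info - 1) * 100

print(f"English: {english_info:.1f}, Spanish: {spanish_info:.1f}, gap: {percent_gap:.0f}%")
```

The 47% figure comes from taking the lower rate as the baseline; measured against the English rate instead, the gap would come out smaller.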
Actually English has very few syllables per word compared to Filipino or Spanish. Nearly a third of the words are one syllable, and most of the multi-syllable words are of Latin, French, or Greek origin. I think early humans tried to give words as few syllables as possible to simplify communication. Then, as our thoughts became more complex, they either changed the tone of the syllable (as in Chinese) or added prefixes and suffixes to alter the meaning of the root word. Those language families that added prefixes and suffixes became known as "inflectional" languages. Those that remained monosyllabic have a variety of tones per syllable.
The interesting thing this article seems to suggest is that languages and speaking styles naturally seem to self-correct to provide the same amount of information, i.e., as density goes down, speed goes up, so the overall information rate stays essentially the same.<p>I'd be interested to see a larger sample of languages. Is there a language in which a decrease in either density or speed isn't matched by an increase in the other (or vice versa)? Or are humans all naturally predisposed to generate/accept information at a similar rate?
I would have thought it is because, when we hear familiar patterns of syllables, we tend to concentrate more on context and meaning, while unfamiliar patterns make you concentrate more on the syllables themselves.
Here's a regression graph of Information Density vs Syllable Speed based on data from the preprint. <a href="https://twitter.com/#!/willf/media/slideshow?url=pic.twitter.com%2FdvfurXu" rel="nofollow">https://twitter.com/#!/willf/media/slideshow?url=pic.twitter...</a>
I'd also like to know about reading speed; I'm <i>convinced</i> that subtitles in Polish are displayed for mere microseconds, and the same rapid-fire display is true for various dot-matrix-type signs.