Compare sentence length distributions of famous authors

39 points by thesephist over 4 years ago

11 comments

doubleunplussed over 4 years ago
Tried it on Brave New World, which contains this massive 250-worder near the start (but it turns out the rest is pretty typical):

> Still leaning against the incubators he gave them, while the pencils scurried illegibly across the pages, a brief description of the modern fertilizing process; spoke first, of course, of its surgical introduction–"the operation undergone voluntarily for the good of Society, not to mention the fact that it carries a bonus amounting to six months' salary"; continued with some account of the technique for preserving the excised ovary alive and actively developing; passed on to a consideration of optimum temperature, salinity, viscosity; referred to the liquor in which the detached and ripened eggs were kept; and, leading his charges to the work tables, actually showed them how this liquor was drawn off from the test-tubes; how it was let out drop by drop onto the specially warmed slides of the microscopes; how the eggs which it contained were inspected for abnormalities, counted and transferred to a porous receptacle; how (and he now took them to watch the operation) this receptacle was immersed in a warm bouillon containing free-swimming spermatozoa–at a minimum concentration of one hundred thousand per cubic centimetre, he insisted; and how, after ten minutes, the container was lifted out of the liquor and its contents re-examined; how, if any of the eggs remained unfertilized, it was again immersed, and, if necessary, yet again; how the fertilized ova went back to the incubators; where the Alphas and Betas remained until definitely bottled; while the Gammas, Deltas and Epsilons were brought out again, after only thirty-six hours, to undergo Bokanovsky's Process.
wattenberger over 4 years ago
I probably need to add more context to the page, but I want to specify that my intention with this tool wasn't to say that long sentence == bad. I was inspired by this image:

https://twitter.com/misscrisp/status/1202792895448662016

It's really interesting how different authors play with the rhythm of their sentences - one of the best ones on here is The Raven, with very short sentences interspersed with longer ones.
mykowebhn over 4 years ago
Is it a valid comparison if they're comparing sentence length of translations of works originally written in other languages? For example, Hermann Hesse's Siddhartha and Plato's Republic.
mkay313 over 4 years ago
This reminds me of an app I wrote a few years ago when learning R: https://mzgw.shinyapps.io/book-recommendation/

My friend needed a way to determine the Flesch-Kincaid readability score for a selected paragraph, and was using Microsoft Word for that. I wrote a script to automatically compute the scores for all txt files provided, then started to wonder how different books fared in that respect, how readability scores map to language proficiency levels, and what can be told about a book in general using simple analysis tools. It was a fun project, but the readability scores were far from what I expected. E.g., I hate Joseph Conrad's Heart of Darkness and always find it frustrating to read, but its readability score is actually not that low.
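
For readers unfamiliar with the score mentioned above, here is a minimal Python sketch of the Flesch-Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. It is not the commenter's R script: the function names are hypothetical, sentences are cut at every run of `.`, `!` or `?`, and syllables are estimated with a crude vowel-group heuristic, so results will differ somewhat from Microsoft Word or other tools.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Return the Flesch-Kincaid grade level of `text`."""
    # Naive sentence split: every run of terminal punctuation ends a sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    print(round(flesch_kincaid_grade("The cat sat. The dog ran."), 2))
```

Because the score depends only on sentence length and syllables per word, a text with short words and moderate sentences can score as "easy" even when its syntax or vocabulary makes it hard going, which may be part of why the Conrad result looked surprising.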
RcouF1uZ4gsC over 4 years ago
I was disappointed not to see Paradise Lost by John Milton. His sentences are some of the longest I have ever come across.
hexane360 over 4 years ago
It looks like this fails on initialisms. For example, it counts this as three sentences:

> The agency responsible for it, the U.S. Bureau of Reclamation, would build the highest and largest dams in the world on rivers few believed could be controlled.
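
The initialism problem can be illustrated with a minimal Python sketch. It assumes, as a guess and not as a description of the actual tool, that sentences are cut at every period; `naive_split`, `split_sentences`, and the tiny `ABBREVIATIONS` set are hypothetical names introduced only for this example.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; real text needs far more entries.
ABBREVIATIONS = {"u.s.", "mr.", "mrs.", "dr.", "e.g.", "i.e.", "etc."}


def naive_split(text):
    """Cut at every run of '.', '!' or '?', so 'U.S.' yields two extra pieces."""
    return [p.strip() for p in re.split(r"[.!?]+", text) if p.strip()]


def split_sentences(text):
    """End a sentence only at a token that ends in terminal punctuation
    and is not a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        is_terminal = token.endswith((".", "!", "?"))
        if is_terminal and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep trailing text that lacks final punctuation
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    example = (
        "The agency responsible for it, the U.S. Bureau of Reclamation, "
        "would build the highest and largest dams in the world on rivers "
        "few believed could be controlled."
    )
    print(len(naive_split(example)))      # pieces from the every-period split
    print(len(split_sentences(example)))  # pieces from the abbreviation-aware split
```

The trade-off in this sketch is that a sentence which genuinely ends in "U.S." will be merged with the one after it; handling that case needs more context than a fixed list provides.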
croissants over 4 years ago
Heh. For fun I put in a bit of "Within a Budding Grove" (one of the Proust books [1]) and it's almost entirely in the yellow region, even with my bad copy-paste job that chops up sentences by accident. Proust is fun to read, but it took me a while to get the way he talks.

[1] https://www.planetebook.com/free-ebooks/within-a-budding-grove.pdf
canjobear over 4 years ago
Sentence length is pretty arbitrary considering that in many cases the choice of whether to insert a sentence-ending full stop is arbitrary. Almost any period can be replaced with a semicolon. And what should you do with a clause that starts with "however"? Should it be a separate sentence, or attached to the previous one with a comma or a semicolon? So I've never considered sentence length to be very meaningful.
8bitsrule over 4 years ago
A lot of the average sentence lengths in the examples were fairly similar. I ran the texts through an ARI readability test: the Twain was 'grade school' while the Hesse was 'college level'.
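
For context, the Automated Readability Index (ARI) is 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43, so it weighs word length in characters as well as sentence length. The Python sketch below is only an illustration with a hypothetical function name and naive tokenization (alphanumeric runs as words, every run of `.`, `!` or `?` as a sentence break); it is not the test the commenter used.

```python
import re


def automated_readability_index(text):
    """Return the ARI of `text` using naive word and sentence splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and digits; punctuation is ignored.
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    characters = sum(len(w) for w in words)
    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )


if __name__ == "__main__":
    print(round(automated_readability_index("The cat sat. The dog ran."), 2))
```

Since the character term carries a much larger coefficient than the sentence term, two texts with similar average sentence lengths can still land at very different grade levels when one uses longer words.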
JoeAltmaier over 4 years ago
Seemed pretty similar to me. Dialog was mostly short interjections, while scene exposition and monologues were longer.
NotStalin over 4 years ago
While the viz is nice, some numbers would be mucho mas nicer.

Also, everyone should read Pride and Prejudice because it's funny as hell.