TechEcho

Boring Problems Need Attention Too

57 points by simplegeek almost 4 years ago

17 comments

version_five almost 4 years ago
What this highlights most to me is the false precision people have come to expect. This is a real problem I've seen as an engineer / physicist working with business people. Things that I would consider "the same" because they are well within the noise of one another get all kinds of "hold on now, I added up the numbers in the columns and got 22.1 but the columns total says 22" from people that don't understand rounding, significant figures, experimental error, etc. I know it's important to hide this from people because they end up getting distracted by it (or to occasionally let slip so someone can feel like they have done a thorough review because they found a column didn't add up to the rounded total). But I think it's more important overall to educate people about what kind of variation is important and what isn't, vs. getting hung up on the definition of a word.
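
The column mismatch described above is easy to reproduce. This is a minimal sketch with invented figures, assuming cells are displayed to one decimal place while the total is displayed as a whole number rounded from the exact sum:

```python
from decimal import Decimal, ROUND_HALF_UP

def shown(value: Decimal, places: str) -> Decimal:
    """Round a value the way a report might display it (half-up)."""
    return value.quantize(Decimal(places), rounding=ROUND_HALF_UP)

# Hypothetical measurements, invented for this illustration.
cells = [Decimal("7.38"), Decimal("7.36"), Decimal("7.31")]

cells_shown = [shown(c, "0.1") for c in cells]  # what the reader sees per cell
total_shown = shown(sum(cells), "1")            # total rounded from the exact sum
hand_total = sum(cells_shown)                   # what the reader adds up by hand

print(hand_total, total_shown)
```

Both numbers are correct for what they claim to be; they differ only because rounding happens at different points.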
6gvONxR4sf7o almost 4 years ago
I'm sad that the author didn't do a manual word count. If it's only 1660 words or so, it would surely take less time than it took to write this post. It would probably make the author confront the difficulty in precisely defining word count.

It turns out that a lot of things that initially seem trivial to precisely define aren't actually that precisely defined, like the length of the California coastline. This is, in my mind--and as a complete tangent--a great argument for wide programming and math education. When you're forced to be so goddamned precise all the time, it's very clear when an idea isn't fully defined.
SethTro almost 4 years ago
I diffed down to understand why Open Office (1663, consistent with Microsoft Office) differs from Google Docs (1655).

"ago--never": G docs 1 word vs OpenOffice 2 words.

"tiger-lillies--what": G docs 1 word vs OpenOffice 2 words (IDK what this should "really" be)

"Wanting?--Water": GD 3 words vs OO 2 words

In this case the disagreement springs, exclusively, from whether the docs engine believes that double hyphens make a compound word (and potentially from how it handles punctuation in the middle of such a compound word).
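
Here is a minimal sketch of the double-hyphen disagreement diffed above. The two functions are hypothetical rules, not the actual algorithms used by Google Docs or OpenOffice, and the sample sentence is invented around two of the quoted fragments:

```python
import re

def count_whitespace_only(text: str) -> int:
    """Count runs of non-whitespace, so 'ago--never' is one word."""
    return len(text.split())

def count_double_hyphen_breaks(text: str) -> int:
    """Also treat '--' as a separator, so 'ago--never' is two words."""
    return len([tok for tok in re.split(r"\s+|--", text) if tok])

# Invented sample; only the hyphenated fragments come from the thread.
sample = "long ago--never mind the tiger-lillies--what now"

print(count_whitespace_only(sample), count_double_hyphen_breaks(sample))
```

Neither rule accounts for the "Wanting?--Water" case, where the counts go the other way, so the real engines presumably also differ in how they treat punctuation next to the hyphens.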
thehappypm almost 4 years ago
I mean.. are numbers words? Is a floating ! or - a word? Are hyphenated words 2 or 1? How about when a word is broken over two lines with a hyphen? Is ... a word? Is an unrecognized string a word? Does a figure caption count towards a word count? How about chapter titles? Or page numbers?
PaulHoule almost 4 years ago
There is no such thing as a "word".

Look at the waveform of speech and you will see long silent gaps inside "words" as well as there frequently being no gap between "words".

There are phrases like "Skinny Puppy" that can do the same job as a word, there are also structures smaller than words that people smush together to make words. The two even work together:

    missile anti-missile missile anti-anti-missile missile missile

If you see "words" as the molecules of text there will always be an asymptote you can't overcome because segmenting text into words will sometimes introduce errors that you might not be able to recover from.
Pet_Ant almost 4 years ago
I remember, during my PhD, a casual discussion about how figuring out the end of a sentence on its own is surprisingly more complex than you'd think:

"I work at the F.B.I. I like it there." "I work at the F.B.I., I like it there." "I work at the F.B.I. I like it there!"

It's not as simple as counting periods.

So that counting words can have corner cases is definitely understandable. Is "&" a word? It is literally just 'e' + 't' superimposed and "et" is definitely a word.
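
A minimal sketch of why counting periods fails on the first example above, next to a slightly less naive splitter that still breaks. The `naive_sentences` helper and the "Dr. Smith" sentence are invented for illustration:

```python
import re

def naive_sentences(text: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace. Deliberately naive."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

first = "I work at the F.B.I. I like it there."
print(first.count("."))        # number of periods, not number of sentences
print(naive_sentences(first))  # splits after the abbreviation, which happens to be right

# Invented counterexample: one sentence containing two abbreviations.
tricky = "Dr. Smith works at the F.B.I. in Washington."
print(naive_sentences(tricky))
```

The splitter gets the first example right only because a new sentence happens to follow the abbreviation. On the counterexample it cuts a single sentence into three pieces.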
LarryMade2 almost 4 years ago
So they just leave us hanging there, no analysis, no reasoning of the results...?
asciimov almost 4 years ago
I'd rather see an explanation behind the issue rather than a complaint about people not working on boring stuff because a bunch of text editors can't agree on a word count.

I remember taking a typing class some 25 years ago and being told that a word count is typically every 5 characters. That way someone doesn't pad out their word count by using lots of small words.
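
A minimal sketch of the five-characters-per-word convention mentioned above, compared with a whitespace count. Whether spaces count toward the five characters is an assumption here (this sketch counts every character), and both sample strings are invented:

```python
def typing_words(text: str, chars_per_word: int = 5) -> float:
    """Length-based count: every `chars_per_word` characters is one 'word'."""
    return len(text) / chars_per_word

def whitespace_words(text: str) -> int:
    """Token-based count: runs of non-whitespace characters."""
    return len(text.split())

padded = "a I a I a I a I"        # many tiny tokens, few characters
dense = "internationalization"   # one token, many characters

for text in (padded, dense):
    print(repr(text), whitespace_words(text), typing_words(text))
```

Under the length-based rule the padded string is worth fewer "words" than the single long one, which is the anti-padding property the comment describes.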
jakub_g almost 4 years ago
IMO the issue here is that everything that is not carefully specified as a precise algorithm *will* be implemented differently.

This is why each browser used to parse HTML differently.

This is why you'd have compat or even security issues because some software split lines on \r\n while other software used \n.

Luckily the browser vendors formed WHATWG, which created pretty precise specs which are maybe convoluted but at least everyone parses HTML in the same way, and each browser pretends to be every other browser for compatibility.

2021 is really great for web compat, maybe not all browsers implement every API, but existing APIs are accompanied by very thorough test suites (Web Platform Tests): https://github.com/web-platform-tests/wpt

Live results from nightly builds: https://wpt.fyi/results/?label=experimental&label=master&aligned

Having said that, I don't see vendors aligning on a definition of word count any time soon due to corporate inertia, lack of incentives and lack of a "Word editors consortium" (or is there one?)

A truly good function for word count might be pretty complex, and perhaps different for every language.
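
The newline mismatch mentioned above can be sketched in a few lines. The payload is invented, and the three splitting rules stand in for three hypothetical programs reading the same bytes:

```python
# Invented input that mixes both line-ending conventions.
payload = "first\r\nsecond\nthird"

by_lf = payload.split("\n")       # leaves a stray '\r' on the first line
by_crlf = payload.split("\r\n")   # never splits at the bare '\n'
universal = payload.splitlines()  # recognizes both (and other line boundaries)

print(by_lf)
print(by_crlf)
print(universal)
```

Each rule is internally consistent, yet they disagree on both the number of lines and their contents, which is the kind of gap a precise shared spec closes.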
dcolkitt almost 4 years ago
I think a complement to this article is the essay "Reality has a Surprising Amount of Detail"[1]. The point is that things which superficially seem easy often have a lot of hidden complexity when you drill down to all the low-level details and corner cases.

[1] http://johnsalvatier.org/blog/2017/reality-has-a-surprising-amount-of-detail
AbrahamParangi almost 4 years ago
The words of someone who's never tried to write a parser that operates on natural language.

"I've seen things, you people wouldn't believe"
Zababa almost 4 years ago
There's a 2% variation between the lowest and the highest. Is there a need to know a precise word count, where a 2% margin of error is not acceptable? If there's not, there's no problem. The author doesn't attempt to define what a word is, so maybe all of these are correct for their own definition of word? Maybe the definition of what counts as a word is the boring problem that needs a solution, but it doesn't look boring or easy to me. You can't ask people to count correctly something that you haven't even defined.

Also, believing that complex problems are easy is not something only programmers do. I currently work a lot with Excel automation, and most people have no idea of what can be automated easily and what can't. I have some people coming that ask for automating a task they've never done manually and don't really know how to do precisely. I think that's the same mechanism of "overabstraction" that leads people to say "WET instead of DRY" (Write Everything Twice instead of Don't Repeat Yourself).
spoonjim almost 4 years ago
The "boring problem" that always needs more attention than it gets is performance. Plenty of companies ship features that only reduce performance by 10, 20, 50 milliseconds but it all adds up and applications end up feeling very sluggish. Apple is one of the few where I feel like the perf doesn't regress over time (likely because the features don't really expand much either).
lbriner almost 4 years ago
Also boring businesses need attention. Loads of people would love to run a gym but you could make 10x more by selling toilet paper or paper clips.
bilater almost 4 years ago
The variance is probably due to a difference in what counts as a word + ambiguity around white spaces etc. in each text editor. So it's not that the word counting problem is unsolved; it's a problem of different companies adopting different definitions.
sgtnoodle almost 4 years ago
At some FAANG companies, fixing bugs and other "boring" projects don't get you promoted.

I can tell when there's a perf/promo cycle coming up at Google because all the core apps on my phone change their UIs and get buggier and slower.
juancn almost 4 years ago
Counting words is easy, defining what a word is, is not.