A Modern App Developer and an Old-Timer System Developer Walk into a Bar

148 points by zhenjl over 9 years ago

33 comments

tyre over 9 years ago
I suppose I'm a modern app developer, so here's what I would actually do:

Postgres.

1) I'm not going to invent my own system. I am not the first person who wants to store addresses and boolean values. If I'm looking at this occasionally over many months, I don't want to keep relearning my elegant bit-structure.

2) SQL is great, Postgres is even better with built-in network operators[1] for the more specific queries (e.g. subnets.)

3) Bitfields are all fun and games until you want to change things. I'm already having flashbacks to rails `has_bitfield` columns. Also, fewer people can confidently twiddle bits than can understand boolean SQL columns.

4) The author mentions possible compression methods. I'd rather a tested project like Postgres think about everything to do with storing my data.

I understand the criticism, but with cloud storage so cheap it feels like optimization just to show off. I'd rather save time than bits. If there is a satire that involves me using Postgres when I shouldn't, that would be welcome.

[1] http://www.postgresql.org/docs/current/static/functions-net.html
opticalfiber over 9 years ago
I'm not really sure what the point of this article is. If you're clever, you can represent any data as a bit array. And once you're there, counting bits or XORing them together is easy. But is that system easy to understand? Is the code easy to read for the rest of the developers who work on the project? Or for future hires? What happens if the data set becomes too large to fit in memory on a single machine? Storage is cheap and getting cheaper every day. The whole data set described in this problem (done the "inefficient" way) fits on a flash drive at today's capacities.

There are always tradeoffs when it comes to architecture design. Speed and storage space are only two of the many factors that require consideration.
smoyer over 9 years ago
"I can perform a simple AND operation on the 3 monthly bit arrays, and then count the number of “1” bits."

I think an old-timer would tell you that he'd be using an "OR" operation to count the number of hosts seen in the last three months.

As a long-time embedded systems engineer, you learn to guard every CPU cycle and memory location jealously. That behavior, along with avoiding needless abstractions that complicate the system AND cost CPU and memory, has served me well as I've switched to higher-level software.

As an aside, memory resident databases are now practical even for problems of this size. I've used the CERN Colt library to provide sparse arrays stored in memory in place of databases several times now. My current pet project is processing a mouse genome for my daughter's lab. With the amount of data being read, I'd always be I/O bound if I wasn't packing a lot of working data into memory. This is the programmer's equivalent of the CPU's cache, but one we can easily control. Give it a try!
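To make smoyer's AND-versus-OR point concrete, here is a minimal Python sketch. The three monthly values and the 8-host network are made up for illustration, and plain Python integers stand in for the article's bit arrays.

```python
# Hypothetical 8-host network: bit i is 1 if host i answered during that month.
jan = 0b10110010
feb = 0b10010110
mar = 0b00110011


def popcount(bits: int) -> int:
    """Count the 1 bits in a non-negative integer."""
    return bin(bits).count("1")


# OR keeps a bit if the host was up in at least one of the three months.
seen_any_month = jan | feb | mar

# AND keeps a bit only if the host was up in every one of the three months.
seen_every_month = jan & feb & mar

print(popcount(seen_any_month), popcount(seen_every_month))
```

Which operator is right depends on the question being asked: OR answers "seen at all in the last three months", AND answers "up in all three".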
kogepathic over 9 years ago
Why wouldn't you want to use an SQL database with an index to store this data? Just have a column for the ports and a boolean flag for the status of the port?

Honestly, I'd probably use bash/nmap/ping and psql to insert data. Want to query? psql and grep.

These examples seem like a great way to re-invent the wheel with modern buzzwords.
olalonde over 9 years ago
Business Guy shakes his head and gets a subscription on https://www.shodan.io/
tyingq over 9 years ago
Lost me at "Old-Timer Developer: I will use Go."

The old-timers in your vicinity must be different than the ones in mine.
andybak over 9 years ago
Ah. Thank god the imaginary old school guy is better than the imaginary straw man modern guy! I'd better go learn Go immediately.

Personally I'd start off by looking into using Cython or Numpy and maybe pickle to disk for storage.

Does that make me the old guy or the new guy or is it a false dichotomy?
VLM over 9 years ago
Funny quote from the article: "This is a big data problem." Hmm maybe on a raspberry pi, or some kind of retrocomputing challenge (Do this on a simulated IBM 1130!).

I don't know how an app developer thinks, but I know the article's characterization of the old developer is wrong, we'd chuck it into the existing relational DB, maybe spin up a new cloud instance and reserve some NAS space if necessary but this is a pretty small data set by modern standards so probably nothing special is required. All the reporting devolves into a silly SQL one-liner competition. The first question is a COUNT(*) and GROUP BY. The second question is a ridiculously simple SELECT. The third question is another simple select if you stored your ip addr bytes in separate columns, even if you allow non /24 addrs. The fourth is another GROUP BY.

The article does fit the stereotype I've seen that new programmers prioritize ease of storing data over ease of reporting, and old timers vice versa. Like the difference between coming up with the fastest next move in checkers vs "solving" checkers in the game theory sense.
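As a rough illustration of the COUNT(*) and GROUP BY style of report VLM mentions, here is a sketch using Python's built-in sqlite3 module. The `scans` table, its columns, and the sample rows are all hypothetical; the thread does not specify a schema, and a real deployment would more likely use an existing relational database.

```python
import sqlite3

# In-memory database with a hypothetical schema: one row per host per month.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (month TEXT, ip INTEGER, up INTEGER)")
conn.executemany(
    "INSERT INTO scans VALUES (?, ?, ?)",
    [
        ("2016-01", 1, 1),
        ("2016-01", 2, 0),
        ("2016-01", 3, 1),
        ("2016-02", 1, 1),
        ("2016-02", 2, 1),
        ("2016-02", 3, 1),
    ],
)

# Number of hosts that were up in each month.
per_month = conn.execute(
    "SELECT month, COUNT(*) FROM scans WHERE up = 1 "
    "GROUP BY month ORDER BY month"
).fetchall()

print(per_month)
```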
bottled_poe over 9 years ago
Maybe this is a stupid question, but why not just use a relational database?
nobullet over 9 years ago
Is this post Go biased? Why not C++ or Erlang? :) You can do the same with Python
szc over 9 years ago
A couple of errors near the end for "How Many Hosts Were “Up” Last Month But Now It’s “Down”" and vice versa. The error is that using xor on the same item twice is a no-op...

(A xor B) xor A == B.

The correct equations would be,

!this_month and last_month = "up" last month, down now.

and

this_month and !last_month = "down" last month, up now.
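szc's correction can be checked on a toy example. The sketch below assumes a hypothetical 8-host network, with Python integers standing in for the monthly bit arrays; `MASK` limits the result of the NOT to those 8 bits.

```python
MASK = 0xFF  # 8 hypothetical hosts, one bit each

last_month = 0b11001010
this_month = 0b10011001

# XOR alone only says which hosts changed state, not in which direction.
changed = last_month ^ this_month

# XORing with last_month a second time just gives back this_month,
# which is the identity szc points out: (A xor B) xor A == B.
xor_twice = changed ^ last_month

# Up last month, down now: last_month AND NOT this_month.
went_down = last_month & ~this_month & MASK

# Down last month, up now: this_month AND NOT last_month.
came_up = this_month & ~last_month & MASK

print(f"{went_down:08b} {came_up:08b}")
```

The two directional results are disjoint and together make up exactly the set of changed hosts.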
Manishearth over 9 years ago
This just compares a decent programmer with a terrible one. Not a modern app developer with an old timer. This is especially evident where the systems programmer is aware enough to use Go (which is a debatable choice, but certainly displays good awareness), but the modern app dev doesn't know when (not) to use Big Data and when to try optimizing things (e.g. the JSON choice).

The only "mistake" I see a modern app developer making might be using Python, but hey, computers are fast (so python should still be fine). Most would probably come up with a similar bit-twiddly solution or a simple DB-based solution. No biggie.

There's no dearth of horrible systems code out there either. It might be less likely for a systems programmer (even a horrible one) to mess up this exercise, but you could probably choose an exercise which would have the reverse effect. In the end, it's a toy exercise, not one where you actually design a significant piece of software.

There is a point buried in all of this; which is that learning systems programming will probably make you a better modern app developer since you get the correct mindset to tackle problems like this (also, vice versa?). But there's too much hyperbole obscuring it.

There's also the other point about *how* systems programming is done vs modern app dev. It's a valid one, but there are benefits to both approaches, and it boils down to each being useful in its own domain.
hellcow over 9 years ago
As VCs often look for keywords, I suppose the trick is to build the old-timer system but describe it as the modern web developer.
ryandrake over 9 years ago
Couple of defensive, bruised egos here in the responses. Lighten up, it's funny! We all know that one guy for whom the answer to every question is HADOOP MAP REDUCE!! I thought the writing was fun.
swinglock over 9 years ago
Unfortunately our hero the array developer got a bit too clever, pun intended, and produced an incorrect program. The last two of the XOR tricks won't work the way he thought.
slightlycuban over 9 years ago
Point of this article is not the point of this article. The clever bit was:

> ...the IPv4 address will convert into a number...

Many fancy, post-modern app developers might insist, "You won't do math with an address; it's not a number." But some things *are* numbers, with a scheme and pattern you can exploit.
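The address-as-number idea can be sketched with Python's standard ipaddress module. The /24 network, the sample addresses, and the helper names `set_up` and `is_up` are assumptions for illustration; the article's own code is not reproduced here.

```python
import ipaddress

# An IPv4 address is a 32-bit integer in dotted-quad clothing.
addr = ipaddress.IPv4Address("192.168.1.10")
as_number = int(addr)

# Hypothetical /24 network tracked with one bit per host (256 bits = 32 bytes).
network = ipaddress.IPv4Network("192.168.1.0/24")
base = int(network.network_address)
bits = bytearray(network.num_addresses // 8)


def set_up(ip: str) -> None:
    """Mark a host as up by setting the bit at its offset in the network."""
    offset = int(ipaddress.IPv4Address(ip)) - base
    bits[offset // 8] |= 1 << (offset % 8)


def is_up(ip: str) -> bool:
    """Return True if the bit for this host is set."""
    offset = int(ipaddress.IPv4Address(ip)) - base
    return bool(bits[offset // 8] & (1 << (offset % 8)))


set_up("192.168.1.10")
print(as_number, is_up("192.168.1.10"), is_up("192.168.1.11"))
```

Because the numeric value doubles as an array index, no lookup table or string parsing is needed at query time.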
smarx007 over 9 years ago
Old-time *system* developers use C.
hendler over 9 years ago
There's no simple solution without a lot of context. http://mockingeye.com/a-classic-of-soviet-engineering/
Silhouette over 9 years ago
It's fascinating how many people here are defending the "modern app developer" approach, mostly with arguments about flexibility, maintainability, ability to pass on the code to junior developers, and the like. If you think about what the "old-timer's" code here would *actually be*, these kinds of objections make no sense at all.

Assuming the code to run nmap itself would be equivalent either way and we're interested in the data storage and analysis functions here, the old-timer would write that entire part of the program in about one screen of any decent programming language in perhaps 10 minutes. The functions would be short and simple, needing only basic iteration and bitwise arithmetic. Any junior programmer who's going to get anywhere in this industry would be able to understand that code in moments with no specialist knowledge or additional training. Nothing about the code would get in the way of any reasonable commenting or testing policy either.

If in the future someone didn't find the compact data structure appropriate for some new application, they could easily convert the data to a more suitable alternative format, because the current format would be well-specified, simple, efficient, and without external dependencies.

Hypothetical arguments about scalability are silly. The problem is fundamentally built around *IPv4 addresses*, which have been 32 bits wide since they were devised and will still be 32 bits wide tomorrow and next year. Designing for something more scalable is some horrible combination of scope creep, YAGNI violation, and worst of all, not giving even cursory thought to what the requirements actually mean. (I await the seemingly inevitable unintentionally amusing response about IPv6...)

The only thing I really quibble with here is the characterisation of the two types of developer. I don't think this is really about modern vs. old-timer. It's just about a good programmer -- who looks at each problem on its merits, chooses suitable tools for the job, and leaves their options open -- and the bad programmer, who does not.

Well, that and the fact that no self-respecting old-timer would misuse the word "performant" so heinously, but I digress. :-)
ComSubVie over 9 years ago
That's actually a good comparison between different mindsets. Even if it reads a bit biased pro-go (or pro-old-time-system-developer) it doesn't tell what's the best way to implement this - and in my opinion it's neither one.

So what's actually the best way to implement this? What are the motivations for choosing one way?

"Modern App Developer Way": + adoptable, scalable, readable data formats - storage, computation

"Old-Time System Developer Way": + storage, computation - "locked-in" data formats

"Database Developer Way": + ease of implementation ? storage, computation

Any other options/ideas?
hosay123 over 9 years ago
> Old-Timer Developer:

> I will use [a language with mandatory garbage collection]

I lol'd
asragab over 9 years ago
As a relatively junior developer, I have been told by some of my senior colleagues that I sometimes miss the forest for the trees, and so this may be another instance of that.

But it doesn't seem "trivial" to me how one gets the output of nmap into whatever database or data structure one chooses here. I know nmap can produce XML, presumably there is a csv format, but it would seem like the XML/CSV -> JSON conversion (following our intrepid Modern App Developer) would be an easier, more maintainable way to go, versus XML -> bit array (memory map file). Also, is managing the nmap or masscan and whatever other ancillary processes required to execute this plan equally as onerous in either paradigm? Finally, and this is likely controversial, this particular problem "feels" like it's stacked against the Modern App Developer, given that it isn't trying to solve a problem most Modern Apps try to solve (or try to solve as an end rather than a means to an end)
raesene2 over 9 years ago
Pentester: I'll use masscan, someone else already solved this problem. https://github.com/robertdavidgraham/masscan
cellis over 9 years ago
I'm betting on the modern app developer's system being much easier to maintain and understand (and pass off to junior programmers), at the cost of speed.
doomrobo over 9 years ago
If anybody is curious, the zmap project from the University of Michigan does this and more.

https://zmap.io
yyin over 9 years ago
One could differentiate the two proposed solutions by the level of involvement of third parties (e.g., cloud hosting providers, authors of garbage collected or scripting languages, etc.). Some of these third parties have been involved since "old times", others have not.
na85 over 9 years ago
Based on my recent experiences, the modern app developer would write a slow, bloated web app that takes ages to produce a result but has a slick UI with awesome animations.
return0 over 9 years ago
Did this all happen while they were in the bar? Seriously though, both stereotypes are off-base (Go? really?)
z3t4 over 9 years ago
I think this is unfair because it's basically in the old-school systems domain of problems.
the_cat_kittles over 9 years ago
basically just an example of how choosing a bad data structure can make your life a headache
iofj over 9 years ago
Am I the only one who thinks that there are far more efficient data structures than a bit array to represent this data?

How about an RLE encoded list of hosts, for instance? (since they're consecutive 32-bit integers). There'd be way less data than in a bitfield which will make most of these queries far faster than iterative bitfield lookups. Also, much more of the data would fit and stay in memory, which means that all queries that iterate over it will be 10x faster or maybe more.

Of course experimenting with data structures is not something you can do very efficiently in Go, as it'll be painfully verbose code. C++ would be far more useful.

But this is basically the old argument for/against optimizing code. The real problem with the "old timer" programmer is that there are quick ways to break his program. When new queries present themselves, the "old timer" will quickly find his data structures not optimized for the queries, or that they require complex calculations, which means that for data analysis the modern app developer will probably "win".

When it comes to putting a product in production, it needs to be fast and cheap. Any company that doesn't hire an old-timer developer for that will quickly find their costs exploding. This may be acceptable for a few weeks when trying to find product/market fit but it won't last long.
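One possible reading of iofj's RLE suggestion is to store runs of consecutive "up" hosts as (start, length) pairs. The sketch below is an assumption about what such a structure could look like, with hypothetical function names and sample host numbers; whether it actually beats a bit array depends on how clustered the live hosts are.

```python
from bisect import bisect_right


def encode_runs(hosts):
    """Collapse host numbers into sorted (start, length) runs of consecutive values."""
    runs = []
    for host in sorted(set(hosts)):
        if runs and host == runs[-1][0] + runs[-1][1]:
            # Host extends the current run by one.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((host, 1))
    return runs


def count_up(runs):
    """Total number of hosts covered by all runs."""
    return sum(length for _, length in runs)


def contains(runs, host):
    """Binary-search for the run that could hold this host."""
    i = bisect_right(runs, (host, float("inf"))) - 1
    return i >= 0 and host < runs[i][0] + runs[i][1]


runs = encode_runs([10, 11, 12, 20, 21, 40])
print(runs, count_up(runs), contains(runs, 12), contains(runs, 13))
```

Counting hosts touches one tuple per run rather than one bit per address, which is where the claimed speedup would come from when runs are long.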
GFK_of_xmaspast over 9 years ago
My background is much closer to the "old timer" but I think I'm definitely on the side of modernity here, along with everybody shouting "sql".

Also I don't see why the old timer needs both a bit array and a uint64 port array, they can put the up/down bit in the high bit of each uint64.
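The high-bit idea can be sketched as follows. It assumes, hypothetically, that the low 63 bits of each word are a bitmask of open ports; the actual layout of the article's uint64 port array is not given in this thread, and the names `pack`, `is_up`, and `open_ports` are made up.

```python
UP_FLAG = 1 << 63        # highest bit of a 64-bit word holds the up/down status
PORT_MASK = UP_FLAG - 1  # remaining 63 bits hold the (hypothetical) port bitmask


def pack(up: bool, ports: int) -> int:
    """Combine the up/down flag and the port bitmask into one 64-bit value."""
    return (UP_FLAG if up else 0) | (ports & PORT_MASK)


def is_up(word: int) -> bool:
    """Read the up/down flag from the high bit."""
    return bool(word & UP_FLAG)


def open_ports(word: int) -> int:
    """Read the port bitmask from the low 63 bits."""
    return word & PORT_MASK


word = pack(True, 0b101)
print(is_up(word), bin(open_ports(word)))
```

The trade-off is that counting live hosts now means scanning every 64-bit word instead of a compact bit array, so the separate bit array may still be worth keeping for that query.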
dzhiurgis over 9 years ago
Old timer spends rest of 2016 to implement his solution that saves $100 in computing resources.