Totally useless commentary:

It makes me deeply happy to hear success stories like this from a project that's moving in the right direction, which is the opposite of the one the rest of the world is headed in.

Engildification. Of which there should be more!

My soul was also satisfied by the Sleeping At Night post which, along with the recent "Lie Still in Bed" article, offers very simple options for attempting to fix sleep (discipline) issues.
By the way, Kagi, the paid search engine you might've seen on HackerNews as well, uses Marginalia as one of its data sources:

https://help.kagi.com/kagi/search-details/search-sources.html

If you use the "non-commercial" lens, those results, along with results from Kagi's own index and a few other independent sources, will be prioritized.
On a side note inspired by this blog post.

I'm wondering if humans are mostly incapable of producing great things without (artificial) restrictions.

In this case, Marginalia is (ridiculously) efficient because Viktor (the creator) is intentionally restricting what hardware it runs on and how much RAM it has.

If he just caved in and added another 32 GiB it would work for a while, but the inefficient design would persist, the problem would just rear its head later, and by then there would be more complexity built around that design and it might not be as easy to fix.

If the original thesis is correct, then I think it explains why most software is so bad (bloated, slow, buggy) nowadays. Very few individual pieces of software are hitting any limits (in isolation), so each individual piece is terribly inefficient, but with the latest M2 Pro and a gigabit connection you can just keep ahead of the curve where it becomes a problem.

Anyway, this turned into a rant; but the conclusion might be to limit yourself, and you (and everyone else) will be better off long term.
Oh thank you. I have been doing a hobby project on search engines, and I kept searching for variations of "Magnolia" for some reason; "Marginalia", at least for me, is hard to remember. Currently I am trying to find my way around Searx.

Does Marginalia support "time filters" for search like past day, past week, etc.? According to the special keywords page, the only search params accepted are based on years:

    year>2005 (beta)   The document was ostensibly published in or after 2005
    year=2005 (beta)   The document was ostensibly published in 2005
    year<2005 (beta)   The document was ostensibly published in or before 2005
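For what it's worth, a quick way to play with those filters is to stick them straight into the query string. Below is a minimal C sketch that builds such a URL; the "query" parameter matches the search URL shown elsewhere in this thread, but the exact encoding Marginalia expects is an assumption on my part, not documented behaviour.

    #include <stdio.h>

    /* Hypothetical sketch: append the year>NNNN filter to a Marginalia
     * search URL. "%%3E" prints "%3E", the URL-encoded ">" character. */
    int main(void) {
        const char *terms = "vintage+computing";
        int year = 2005;
        char url[256];

        snprintf(url, sizeof url,
                 "https://search.marginalia.nu/search?query=%s+year%%3E%d",
                 terms, year);
        puts(url);   /* paste the printed URL into a browser */
        return 0;
    }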
> In brief, every time an SSD updates a single byte anywhere on disk, it needs to erase and re-write that entire page.

Is that actually true for SSDs? For raw flash it’s not, provided you are overwriting “empty” all-ones values or otherwise only changing 1s to 0s. Writing is orders of magnitude slower than reading, but still a couple orders of magnitude faster than erasing (resetting back to “empty”), and only erases count against your wear budget. It sounds like an own goal for an SSD controller not to take advantage of that, although if the actual guts of it are log-structured then I could imagine it not being able to.
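To make the distinction concrete, here's a toy model in C of the raw-flash behaviour described above: a program operation can only clear bits (1 to 0), while returning bits to 1 requires erasing the whole block. This is a conceptual sketch of raw flash, not a claim about how any particular SSD controller behaves.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 16

    static uint8_t block[BLOCK_SIZE];

    /* Erase resets every bit in the block to 1 (0xFF) -- the slow,
     * wear-counting operation. */
    static void erase_block(void) {
        memset(block, 0xFF, sizeof block);
    }

    /* Programming can only AND new data in: bits already at 0 stay 0. */
    static void program_byte(size_t off, uint8_t value) {
        block[off] &= value;
    }

    int main(void) {
        erase_block();
        program_byte(0, 0xF0);   /* fine: clears low bits of an erased byte */
        program_byte(0, 0xCC);   /* result is 0xF0 & 0xCC = 0xC0, not 0xCC  */
        printf("byte 0 = 0x%02X\n", block[0]);
        return 0;
    }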
Just a shout out to my boss at Mojeek, who has presumably walked a very similar path to this one (the post resonates a lot with past conversations). Mojeek started back in 2004 and has for the most part been a single developer who built the bones of it, and with that, pretty much all of the IR and infrastructure.

Limitations of finance and hardware, decisions about 32- vs 64-bit ids, sharding, and the speed of updating all sound very familiar.

Reminds me of Google way back when and their 'Google dance' that updated results once a month; nowadays it's a daily flux. It's all an evolution, and it's great to see Marginalia offering another viewpoint into the web beyond big tech.
Great to read this!

Lots of people treat optimization as some deep-black-magic thing[1], but most of the time it's actually easier than fixing a typical bug; all you have to do is treat excessive resource usage the same way you would treat a bug.

I'm going to make an assertion: most bugs that you can easily reproduce don't require wizardry to fix. If you can poke at a bug, then you can usually categorize it. Even the rare bugs that reveal a design flaw tend to do so readily once you can reproduce them.

Software that nobody has taken a critical eye to performance on is like software with hundreds of easily reproducible bugs that nobody has ever debugged. You can chip away at them for quite a while before you run into anything that is hard.

1: I think this attitude is a bit of a holdover from when people would do things like set their branch targets so that the drum head would reach the target at the same time the CPU wanted the instruction, and when resources were so constrained that everything was hand-written assembly, with global memory locations having different semantics depending on the stage the program was in. In that case, really smart people had already taken a critical eye to performance, so you needed to find things they hadn't found yet. This is rarely true of modern code.
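A minimal sketch of what "treat it like a bug" can look like in practice: isolate the slow path in a reproducer you can run on demand, then measure it before and after each change. The function name below is a hypothetical stand-in for whatever code path is under suspicion.

    #include <stdio.h>
    #include <time.h>

    /* Stand-in for the suspect code path (hypothetical). */
    static void do_suspect_work(void) {
        volatile long sink = 0;
        for (long i = 0; i < 10 * 1000 * 1000; i++)
            sink += i;
    }

    int main(void) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        do_suspect_work();
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        printf("reproducer took %.1f ms\n", ms);
        return 0;
    }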
This baffled me:

"I wish I knew what happened, or how to replicate it. It’s involved more upfront design than I normally do, by necessity. I like feeling my way forward in general, but there are problems where the approach just doesn’t work"

Yes, immediate (or soon enough) gratification feels good... To me, and maybe it's because I am an old fart, this is the difference between programming and engineering.
I took a start script from 90 seconds to 30 seconds yesterday by finding a poorly named timeout value. Now I'm working on a graceful fallback from itimer to alarm instead of outdated C directives.
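In case it helps anyone doing something similar, here's a hedged C sketch of that kind of fallback: prefer setitimer() for the timeout and drop back to the coarser, whole-second alarm() if it fails. This is illustrative only; the constraints of the actual script aren't shown in the comment above.

    #include <signal.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* Timeout handler: must only use async-signal-safe calls. */
    static void on_timeout(int sig) {
        (void)sig;
        write(STDERR_FILENO, "timed out\n", 10);
        _exit(1);
    }

    int main(void) {
        struct itimerval tv = { .it_value = { .tv_sec = 30 } };  /* 30 s */

        signal(SIGALRM, on_timeout);

        if (setitimer(ITIMER_REAL, &tv, NULL) != 0) {
            /* Fallback: alarm() only offers whole-second resolution. */
            alarm(30);
        }

        /* ... do the real startup work here ... */
        pause();
        return 0;
    }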
If I search for encoding/json, I get some interesting stuff:

https://search.marginalia.nu/search?query=encoding%2Fjson

but NOT what I am looking for. If I try again with Google:

https://google.com/search?q=encoding%2Fjson

the first result is exactly what I want.
I enjoyed reading this, but I also fundamentally don't get it at a basic level. Why re-implement stuff that has already been done by entire teams? There are so many bigger and productionised search and retrieval systems. Why invest the human capital in doing it all again yourself? I just don't get it.