A simple search engine from scratch

292 points by bertman 6 days ago

9 comments

franczesko 6 days ago
On the topic of search engines, I really liked the classes by David Evans. The task there was also to build a simple search engine from scratch. It's really for beginners, as the emphasis is on coding in general, but I've found it to be very approachable.

https://www.cs.virginia.edu/~evans/courses/
ktallett 6 days ago
I always wonder if the days of search engines for specific topics could return. With LLMs providing less than accurate results in some areas, and Google, Bing, etc. being taken over by adverts or well-organised SEO, there feels like a place for accurate, specialised search.
snowstormsun 6 days ago
Nice idea, but this approach does not handle out-of-vocabulary words well, which is one major motivation for using a vector-based search. It might not perform significantly better than lexical matching like tf-idf or BM25, and it is slower because every query is compared against every post (linear complexity). But cool regardless.
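
For comparison, the lexical baseline mentioned here can be sketched in a few lines. This is a minimal BM25 scorer, assuming whitespace tokenization and the commonly used defaults `k1=1.5` and `b=0.75`; the function name `bm25_scores` and the sample documents are illustrative and do not come from the article.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a term absent from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical sample corpus, for illustration only.
DOCS = [
    "compiling a lisp to x86",
    "growing tomatoes in the garden",
    "a lisp interpreter in lisp",
]
SCORES = bm25_scores("lisp", DOCS)
```

Like the summed-embedding approach, this sketch scans every document per query; a production lexical engine would typically use an inverted index so that only documents containing a query term are visited.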
leumassuehtam 5 days ago
The author has a nice series on compiling a Lisp [0], but unfortunately his search engine fails to find it when queried with "lisp" or "Lisp".

[0] https://bernsteinbear.com/blog/compiling-a-lisp-0/
sp0rk 6 days ago
The SVG equation is very difficult to read if you're using a dark OS theme, because the blog follows the OS preference for dark/light theme (and doesn't seem to give an option to change it manually, either).
kaycebasques 5 days ago
> The idea behind the search engine is to embed each of my posts into this domain by adding up the embeddings for the words in the post.

Ah, OK! I never really grokked how to use word-level embeddings. Makes more sense now.
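
The quoted idea can be sketched roughly as follows: sum the word vectors of a post, do the same for the query, and rank posts by cosine similarity. This is a minimal sketch under stated assumptions, not the author's code: the 3-dimensional `EMBEDDINGS` table, the `POSTS` dictionary, and the function names are all made up for illustration, whereas a real setup would load pretrained word2vec vectors.

```python
import math

# Hypothetical toy embeddings; real word2vec vectors have hundreds of dimensions.
EMBEDDINGS = {
    "compiler": [0.9, 0.1, 0.0],
    "lisp": [0.8, 0.2, 0.1],
    "garden": [0.0, 0.1, 0.9],
    "tomato": [0.1, 0.0, 0.8],
}
DIM = 3


def embed(text):
    """Sum the vectors of all known words; unknown words are skipped."""
    total = [0.0] * DIM
    for word in text.lower().split():
        vec = EMBEDDINGS.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def search(query, posts):
    """Score every post against the query (a linear scan), best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(body)), title) for title, body in posts.items()]
    return sorted(scored, reverse=True)


# Hypothetical posts, for illustration only.
POSTS = {
    "compiling-a-lisp": "lisp compiler compiler",
    "growing-tomatoes": "tomato garden garden",
}
RESULTS = search("lisp", POSTS)
```

In this toy setup the query "lisp" ranks the "compiling-a-lisp" post first, while a query made up only of words missing from the table embeds to the zero vector and scores 0.0 against every post, which is the out-of-vocabulary weakness raised elsewhere in the thread.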
cosmicgadget 6 days ago
This was a really nice read. Now I have no excuse not to upgrade my blog search. I do feel that I'll have a ton of long-tail words like 'prank'.
vojtechrichter 5 days ago
I really like people playing around with technology that many take for granted without understanding its core, underlying principles.
swyx 6 days ago
this embeds words with word2vec, which is like 10 years old. at least use BERT or sentence-transformers :)