Building a Search Engine for Programmers

51 points by vdthatte almost 5 years ago
Hey HN, I've recently started working on a side-project. It's basically a vertical search engine for programmers. You'll be able to quickly search through documentation, GitHub repos and Stack Overflow. It'll know what language you're using and what project you're working on and tailor results accordingly.

What other features would you want to see in this tool?

18 comments

phaus almost 5 years ago
Here's a use case you may or may not be interested in. Security Research.

As a security analyst, when I'm trying to figure out what a malicious script / executable does, it often involves searching for weird strings I find in files that seem like they would be pretty unique. Or even specific sets of strings, even if it isn't very easy.

Google used to be awesome for this. Now it is still the best that I'm aware of, but it has gotten gradually worse over the years. So basically the best tool for a job is one I would describe as infuriatingly bad.

I think the problem is that it does too much to try to protect a person from malicious stuff on the web. It also does too much to guess what you might actually want instead of giving you what you asked for.

Probably the biggest single thing that Google has done to screw it up is that it no longer respects quotes. Maybe there's a workaround but I haven't figured it out. 10-15 years ago if I wrapped something in quotes Google would give me exactly what I wanted. Now it's very finicky and 70% of the time it gives me what it thinks I want. These guesses are almost always wrong.

That being said, there is also a usefulness to being able to search for a code snippet or another string and get things that are very similar, even when they aren't an exact match. I think having multiple modes would be useful.

I think there might be some overlap between what I described above being useful for security researchers and what might be useful for programmers.
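
A rough sketch of the two modes described above, assuming a small in-memory corpus. The `search` function, the sample strings, and the 0.6 threshold are all made up for illustration; similarity here is Python's standard `difflib` ratio, which is only one of many possible measures.

```python
from difflib import SequenceMatcher


def search(query, documents, mode="exact", threshold=0.6):
    """Return documents matching `query`.

    mode="exact":   keep only documents that contain the query verbatim.
    mode="similar": keep documents with a line whose similarity ratio to
                    the query is at least `threshold`, best match first.
    """
    if mode == "exact":
        return [doc for doc in documents if query in doc]
    if mode != "similar":
        raise ValueError(f"unknown mode: {mode!r}")

    scored = []
    for doc in documents:
        # Score each document by its most similar line.
        best = max(
            (SequenceMatcher(None, query, line.strip()).ratio()
             for line in doc.splitlines()),
            default=0.0,
        )
        if best >= threshold:
            scored.append((best, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Hypothetical corpus: two near-identical obfuscated snippets and one
# unrelated line.
corpus = [
    'eval(atob("ZG9jdW1lbnQud3JpdGU="))',
    'eval(atob("ZG9jdW1lbnQud3JpdGVsbg=="))',
    "print('hello world')",
]

exact_hits = search(corpus[0], corpus, mode="exact")
similar_hits = search(corpus[0], corpus, mode="similar")
```

In exact mode only the verbatim snippet comes back; in similar mode the near-duplicate is returned as well, ranked after the exact match.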
mlthoughts2018 almost 5 years ago
My request would be semantic reverse code lookup within a language and across languages.

For example if I search something related to numpy mmap in Python, in some cases only core numpy mmap answers make sense as results. In other situations, the question is about mmap generally and info from other languages or the core mmap definition could answer my question.

I don’t want to be checking boxes or toggles, or retrying the query with different text over & over to get the engine to understand these differences or when one class of answers is appropriate vs the other.
shakkhar almost 5 years ago
Should be "Ask HN" or "Tell HN".
Syzygies almost 5 years ago
Literal search, including punctuation and spaces. Ideally regex searches. Mainstream engines are case-insensitive-on-steroids. Technical searches are literal.
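
To make the distinction concrete, here is a minimal sketch of literal and regex matching over a handful of lines. The function names and sample lines are hypothetical; a real engine would use an index rather than a linear scan.

```python
import re


def literal_search(needle, lines):
    """Case-sensitive substring match; punctuation and spaces count."""
    return [line for line in lines if needle in line]


def regex_search(pattern, lines):
    """Match a regular expression against each line, case-sensitively."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


lines = [
    "x = a->b;",
    "x = a - > b;",
    "X = A->B;",
    "if (a > b) return;",
]

print(literal_search("a->b", lines))          # ['x = a->b;']
print(regex_search(r"a\s*-\s*>\s*b", lines))  # ['x = a->b;', 'x = a - > b;']
```

The literal query distinguishes `a->b` from `A->B` and from `a > b`, which is exactly what a tokenizing, case-folding engine tends to lose.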
hyperpape almost 5 years ago
Tools surrounding search history. Things like selecting results as your personal answers to a query, trying to determine when you're asking something you've asked before.

This is based on something Hillel Wayne (@hillelogram) wrote on Twitter not that long ago. The gist of it was it's ok that we're all using Google as part of our programming workflows, but why on Earth should we ever need to ask the same question twice? If you can find those threads, there might be more there than what I just said.

Obviously privacy is a concern here, but while I'm leery of Google knowing everything I do, I'm a lot happier with a technical search engine knowing which pieces of syntax I can't remember.
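
One possible shape for this, as a sketch only: a local history that lets you pin a result as your answer and returns it when the same question comes up again. The `SearchHistory` class and the example URL are hypothetical, and "same question" is assumed to mean equal after lowercasing and collapsing whitespace, which is far cruder than a real tool would need.

```python
class SearchHistory:
    """Remembers queries and the result the user picked as the answer."""

    def __init__(self):
        self._answers = {}

    @staticmethod
    def _normalize(query):
        # Treat queries as the same question if they differ only in
        # letter case or whitespace. Real matching would need to be fuzzier.
        return " ".join(query.lower().split())

    def pin(self, query, url):
        """Record `url` as the personal answer for `query`."""
        self._answers[self._normalize(query)] = url

    def lookup(self, query):
        """Return the pinned answer for a repeated query, else None."""
        return self._answers.get(self._normalize(query))


history = SearchHistory()
history.pin("python  reverse a list", "https://example.com/answer")

repeat = history.lookup("Python reverse a list")
unseen = history.lookup("python sort a dict")
```

Keeping the store local also speaks to the privacy concern: nothing about which syntax you keep forgetting has to leave your machine.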
gitgud almost 5 years ago
- It would be good to detect the system you're using instead of writing "Ubuntu 19.04 64bit" before queries.

- Would be even cooler to detect the IDE, or even the error message itself (of course be careful not to leak sensitive information)
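
As a sketch of the first point, a client-side helper could read coarse system details from Python's standard `platform` module and attach them to the query. The `environment_tags` and `contextualize` names are made up here, and `platform` reports the OS family and architecture rather than a distribution name such as "Ubuntu 19.04", so this covers only part of the idea.

```python
import platform


def environment_tags():
    """Collect coarse system details that could be attached to a query."""
    return {
        "os": platform.system(),          # e.g. "Linux", "Darwin", "Windows"
        "os_release": platform.release(),
        "arch": platform.machine(),       # e.g. "x86_64", "arm64"
        "python": platform.python_version(),
    }


def contextualize(query, tags):
    """Append the OS name and architecture to the raw query text."""
    suffix = " ".join(v for v in (tags["os"], tags["arch"]) if v)
    return f"{query} {suffix}".strip()


tags = environment_tags()
print(contextualize("install docker", tags))
```

Only the fields listed in `environment_tags` are sent along, which keeps the leak risk mentioned in the second point easy to audit.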
O_H_E almost 5 years ago
I sometimes find that obscure blog posts often have more comprehensive answers to very niche problems than SO.

Also, the market definitely exists. A lot of the time, if I am not sure how to formulate my question yet, Google really sucks. It also regularly fails to find projects/tricks that I know exist and am able to find through my GitHub stars or browser history after some tedious browsing.
miccah almost 5 years ago
If including code snippets, it would be great to easily export them to an online playground or other sandbox.

Oftentimes I find an example or documentation, and I copy it to a playground to tweak it / experiment with whatever feature I am implementing.
claxo almost 5 years ago
Besides date ranking, which has been requested, maybe ranking code samples and GH issues on some proxy for code quality? Meaning, I would prefer to see a snippet of django or numpy before some aadwark repo
riedel almost 5 years ago
Best possible preview snippets would be essential when looking for trivial stuff: I can't remember exactly how to do something but knew it once (Google does that for the best SO match, I think).

Exception messages could be an important thing to focus on. That is the second case where search engines often matter to me (support fuzzy search here: abstract away the too-concrete stuff but keep the actual message).

It would be great if you would understand versioning of documentation: it always takes me a while to understand whether the docs apply to the version I am actually using.
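
The exception-message idea above could look roughly like this: replace the machine-specific parts of a message with placeholders before searching, so two people hitting the same error produce the same query. The patterns and placeholder names are illustrative assumptions, not a complete list.

```python
import re

# Order matters: hex addresses and paths are replaced before bare numbers
# so that their digits are not rewritten piecemeal.
_PATTERNS = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"(?:[A-Za-z]:)?(?:[\\/][\w.\-]+)+"), "<path>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def normalize_error(message):
    """Abstract away concrete values but keep the wording of the message."""
    for pattern, placeholder in _PATTERNS:
        message = pattern.sub(placeholder, message)
    return message


a = normalize_error(
    "FileNotFoundError: [Errno 2] No such file or directory: "
    "'/home/alice/app/config.yml'"
)
b = normalize_error(
    "FileNotFoundError: [Errno 2] No such file or directory: "
    "'/srv/www/settings.yml'"
)
print(a)
# FileNotFoundError: [Errno <num>] No such file or directory: '<path>'
```

Both messages normalize to the same string, so a search on the normalized form would match reports from other machines. Whether to keep values such as the errno number is a judgment call; this sketch abstracts them all.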
wizzerking almost 5 years ago
Sort by Date Freshest->Oldest and vice versa. Patterns ??
andersco almost 5 years ago
I would want to be able to right click on an error in my IDE (I use vscode) and then run a search on that (filtered for the current language, env etc)
sneeuwpopsneeuw almost 5 years ago
Some information is very hard to find. So on my local machine I have many books about C and C++. I convert those to readable text using a shell command so that I can search them. So maybe you can help with the trend that some programmers seem to notice, that certain information is disappearing.

So a combination of easy search of the Wayback Machine or a search in all online books.
_zllx almost 5 years ago
A VS Code extension that contributes a command to search it which opens the search results in the right pane. I assume you may already have a VS Code extension in the works, as that would be the best way to find out what project/language is being worked on.
andersco almost 5 years ago
I would want Google searches that have been filtered for the current language, tools etc.
vdthatte almost 5 years ago
I think programmers spend a lot of time searching for solutions online and making that process more intuitive will be a huge win for everybody haha.
maps7 almost 5 years ago
I could see this as being very useful. If you're looking for help let me know - I would like to spend time on something like this.
asicsp almost 5 years ago
See also quickref.dev [0] [1]

[0] https://news.ycombinator.com/item?id=23263918

[1] https://lobste.rs/s/dji0it/experimental_search_engine_for