Towards Natural Language Semantic Code Search at GitHub

159 points by Chris911 over 6 years ago

13 comments

dgreensp over 6 years ago
It is a real cultural problem how engineers get more excited about machine learning than basic usability.

GitHub search can't even search for a literal string, let alone a regex. It can't search a subdirectory. Ranking is indistinguishable from random. It's been this way for years. How about building an actual, usable, basic code search and then getting all fancy with your machine learning?

I almost built my own "online git grep for GitHub" last year.
rococode over 6 years ago
This might just be me, but does anyone else feel that GitHub's code search has other points that could be improved first?

My biggest gripe is that the order results show in seems to be totally random. For example, if I have a Java class called A and I search "class A" in code search, the actual A.java doesn't tend to show up anywhere near the front. I just tried this in a repo and the actual A.java file was on the last page of results when I searched "class A". The vast majority of the results before it didn't even have the words "class" and "A" next to each other, which A.java does...

Maybe I'm doing something wrong (I'd welcome any input on how to use code search correctly!), but it just feels like they're jumping the gun on trying to make their code search more advanced when the basic functionality doesn't work that well.
finnh over 6 years ago
I would settle for the ability to use logical OR when searching issues/pull requests, or to combine multiple negated searches.

"is:pr is:open ( author:bob OR author:jim )"

The lack of this pretty basic functionality makes issue & PR search much less useful than it could be.
sam0x17 over 6 years ago
It is awesome that they are working on this, but can I just say there are a lot of basic search features they need to add before "doing the hard thing". Here are some things that I should be able to do easily but can't (or can't very easily or well) using GitHub's search mechanism:

1. exact or close string searches for code that involves characters such as ![]{}_-*()

2. searches across past commits (e.g. find a line that used to be in the code)

3. search across pull requests + comments (not just issues and commit messages)

4. advanced search operators: there should be a full filtering UI with ANDs, ORs, etc.

Because of this I often find myself grepping locally, or (more often) totally out of luck.
aaaaaaaaaab over 6 years ago
Now that’s what I call a misfeature!

GitHub is used by programmers. Surprisingly, they tend to be very good at telling computers *precisely* what they want, in the computers’ own language.

Natural language search is the exact opposite of this, invented for mom & pops who start their search phrase with “Dear Google, I’d like to search for ...”.
KenanSulayman over 6 years ago
GitHub is building some amazing stuff recently. I guess now that Microsoft is going to acquire them, there's far less pressure on making GitHub Enterprise profitable.
DannyBee over 6 years ago
I saw this created in another thread and it seems to accurately sum up the comments here: https://imgflip.com/i/2i90x2
paintstripper over 6 years ago
They should add regex search support first before this stuff.
nraynaud over 6 years ago
Wait, they can't search through forks or collate identical results, and they are going into natural language processing?
manigandham over 6 years ago
Devs don't search code repositories using natural language queries, and any scenarios of searching for code examples that way are already extremely well handled by StackOverflow and Google.

This is an incredible waste of time and resources that could be spent making the existing search far better with very minor tweaks. A perfect example of big company project management where nobody seems to know what their users actually want.
tyingq over 6 years ago
I'd settle for GitHub search that's case sensitive and recognizes things like dollar signs, semicolons, commas, braces, and such.
HereBeBeasties over 6 years ago
Dear GitHub,

Please build search that lets me actually find a given file by name.

You are busy building a space rocket when all we want is a bicycle. Impressive, but useless for just popping down to the shops.

Love,

The rest of the world's developers
mullikine over 6 years ago
I want to work at GitHub. They're making cool things.