I'd like to see something that could do this, handling the awfulness of real-world tabular data. "What country has the highest GDP? Okay, which table has GDP? Is it the country_gdp table? No, that's an old one that hasn't been written to in 3 years. Ah, here it is, but you need to join against `geopolitics`, but first dedup the Crimea data, since it's showing up in two places; we can't remember why it got written twice there. Also, you need to exclude June 21 because we had an outage on the Brazil data that day. What do you mean some of the country_id rows are NULL?" And so on. I dream that someday there's a solution for that. That's a looooong ways away, I'd bet.
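For what it's worth, the cleanup the comment describes is mechanical once a human has figured it out; the hard part is the figuring-out. A rough pandas sketch of those steps, with entirely hypothetical table and column names:

```python
import pandas as pd

# Hypothetical tables standing in for the messy warehouse described above.
country_gdp = pd.DataFrame({
    "country_id": [1, 2, 2, 3, None],
    "gdp":        [21.4, 0.2, 0.2, 1.8, 5.0],
    "date": ["2020-06-20", "2020-06-20", "2020-06-20",
             "2020-06-21", "2020-06-20"],
})
geopolitics = pd.DataFrame({
    "country_id": [1, 2, 3],
    "name": ["USA", "Crimea", "Brazil"],
})

clean = (
    country_gdp
    .dropna(subset=["country_id"])                    # "some country_id rows are NULL"
    .astype({"country_id": int})
    .drop_duplicates(subset=["country_id", "date"])   # dedup the double-written rows
    .query("date != '2020-06-21'")                    # exclude the outage day
    .merge(geopolitics, on="country_id")              # join against geopolitics
)
top = clean.loc[clean["gdp"].idxmax(), "name"]
```

The pipeline itself is trivial; encoding the tribal knowledge of *which* rows to drop is what no model gets for free.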
Does anyone know how it relates/compares to Google's TaPaS? [1]
I notice this paper doesn't refer to it.<p>[1] <a href="https://ai.googleblog.com/2020/04/using-neural-networks-to-find-answers.html" rel="nofollow">https://ai.googleblog.com/2020/04/using-neural-networks-to-f...</a>
Git repo or it doesn't exist ;-)<p>Seriously, if this is not available, what are the alternatives?<p>I've seen some NLP + storage projects in the past but I can't recall them. (Even remotely connected, there was something to convert PDFs into machine-readable data.)<p>Is this AwesomeNLP <a href="https://github.com/keon/awesome-nlp" rel="nofollow">https://github.com/keon/awesome-nlp</a> a good starting point there?
Seems similar to this work out of Salesforce a few years ago: <a href="https://www.salesforce.com/blog/2017/08/salesforce-research-ai-talk-to-data.html" rel="nofollow">https://www.salesforce.com/blog/2017/08/salesforce-research-...</a>
Is TaBERT no longer on the Spider leaderboard? - <a href="https://yale-lily.github.io/spider" rel="nofollow">https://yale-lily.github.io/spider</a> . The top is "RATSQL v2 + BERT", testing at 65.6 for exact matches.
NLP has come pretty far: "Released by Symantec in 1985 for MS-DOS computers, Q&A's flat-file database and integrated word processing application is cited as a significant step towards making computers less intimidating and more user friendly. Among its features was a natural language search function based on a 600 word internal vocabulary." <a href="https://en.wikipedia.org/wiki/Q%26A_(Symantec)" rel="nofollow">https://en.wikipedia.org/wiki/Q%26A_(Symantec)</a>
Does the following mean that one can map/train to runtimes that give proper results based on the underlying data _results_?<p>"A representative example is semantic parsing over databases, where a natural language question (e.g., “Which country has the highest GDP?”) is mapped to a program executable over database (DB) tables."<p>Could it be thought of in the same fashion as Resolvers in GraphQL integrated into BERT?
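As I read it, yes: the model is trained to emit a program whose *execution result* over the tables answers the question. A minimal illustration of the target the parser is aiming for (the hardcoded `program` below is a stand-in for the model's output; the schema is made up):

```python
import sqlite3

# Toy database standing in for the DB tables in the paper's example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gdp (country TEXT, gdp REAL)")
conn.executemany("INSERT INTO gdp VALUES (?, ?)",
                 [("USA", 21.4), ("China", 14.3), ("Brazil", 1.8)])

question = "Which country has the highest GDP?"

# A semantic parser would map `question` to an executable program;
# this literal SQL string is what such a mapping might produce.
program = "SELECT country FROM gdp ORDER BY gdp DESC LIMIT 1"

(answer,) = conn.execute(program).fetchone()
```

In that sense it's loosely like a GraphQL resolver in that the model's output is only an intermediate query and the real answer comes from executing it against the store, though TaBERT itself just produces representations for the parser rather than resolving anything.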
Honest version:<p>> Why it matters:<p>> Improving NLP allows us to create better, more seamless human-to-machine interactions for tasks ranging from identifying dissidents to querying for desperate laid-off software engineers. TaBERT enables business development executives to improve their accuracy in answering questions like “Which hot app should we buy next?” and “Which politicians will take our bribes?” where the answer can be found in different databases or tables.<p>> Someday, TaBERT could also be applied toward identifying illegal immigrants and automated fact checking. Third parties often check claims by relying on statistical data from existing knowledge bases. In the future, TaBERT could be used to map Facebook posts to relevant databases, thus not only verifying whether a claim is true, but also rejecting false, divisive and defamatory information before it's shared.