MS has always shown that they can do research; HoloLens is one such product, and the recent ML enablement spree is another example. It is a good thing that they are going beyond their comfort zone of milking Windows. What will be interesting is to see whether their efforts make any difference to the status quo. I say this after using AzureML, which I liked: there is nothing else on the internet that lets you build an ML model without knowing a programming language! It is a web app that asks you to put data into it, click a few buttons, and it generates Python or R code for you. Just brilliant.
More interesting info on Maluuba's two actual, recently released datasets: <a href="http://datasets.maluuba.com/" rel="nofollow">http://datasets.maluuba.com/</a><p>Their "NewsQA dataset" contains 120k Q&As collected from CNN articles:<p><i>Documents are CNN news articles. Questions are written by human users in natural language. Answers may be multiword passages of the source text. Questions may be unanswerable.<p>NewsQA is collected using a 3-stage, siloed process. Questioners see only an article's headline and highlights. Answerers see the question and the full article, then select an answer passage. Validators see the article, the question, and a set of answers that they rank. NewsQA is more natural and more challenging than previous datasets.</i><p>Their "Frames" dataset contains 1,369 dialogues for vacation scheduling:<p><i>With this dataset, we also present a new task: frame tracking. Our main observation is that decision-making is tightly linked to memory. In effect, to choose a trip, users and wizards talked about different possibilities, compared them and went back-and-forth between cities, dates, or vacation packages.<p>Current systems are memory-less. They implement slot-filling for search as a sequential process where the user is asked for constraints one after the other until a database query can be formulated. Only one set of constraints is kept in memory. For instance, in the illustration below, on the left, when the user mentions Montreal, it overwrites Toronto as destination city. However, behaviours observed in Frames imply that slot values should not be overwritten. One use-case is comparisons: it is common that users ask to compare different items and in this case, different sets of constraints are involved (for instance, different destinations). Frame tracking consists of keeping in memory all the different sets of constraints mentioned by the user.
It is a generalization of the state tracking task to a setting where not only the current frame is memorized.<p>Adding this kind of conversational memory is key to building agents which do not simply serve as a natural language interface for searching a database but instead accompany users in their exploration and help them find the best item.</i><p>---<p>Can anyone with experience in ML/AI comment on how novel/complex these projects are, and how expensive it would be to build out these datasets? Would be interesting to see what it takes to publish a few datasets built from 20 days of conversations between real people, and get acquired by Microsoft/Apple/Google.
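The memory-less slot-filling vs. frame-tracking distinction quoted above can be made concrete with a minimal sketch. The class and slot names here are hypothetical, invented for illustration, and are not Maluuba's actual implementation; the core idea is just that a conflicting slot value forks a new frame instead of overwriting the old one:

```python
# Minimal sketch: memory-less slot-filling vs. frame tracking.
# Class and slot names are hypothetical, for illustration only.

class SlotFillingTracker:
    """Memory-less: a new value for a slot overwrites the old one."""
    def __init__(self):
        self.slots = {}

    def update(self, slot, value):
        self.slots[slot] = value  # "Montreal" overwrites "Toronto"


class FrameTracker:
    """Keeps every set of constraints the user has mentioned."""
    def __init__(self):
        self.frames = []     # all constraint sets (frames) seen so far
        self.active = None   # index of the frame currently under discussion

    def update(self, slot, value):
        if self.active is None:
            self.frames.append({})
            self.active = 0
        current = self.frames[self.active]
        if slot in current and current[slot] != value:
            # Conflicting value: fork a new frame instead of overwriting,
            # so earlier options stay available for comparison.
            self.frames.append(dict(current))
            self.active = len(self.frames) - 1
        self.frames[self.active][slot] = value


# The Toronto/Montreal example from the dataset description:
slots = SlotFillingTracker()
slots.update("destination", "Toronto")
slots.update("destination", "Montreal")
print(slots.slots)    # {'destination': 'Montreal'} -- Toronto is lost

frames = FrameTracker()
frames.update("destination", "Toronto")
frames.update("budget", "1500 USD")
frames.update("destination", "Montreal")
print(frames.frames)  # both frames kept, so the user can still compare trips
```

With the frame tracker, mentioning Montreal forks a second frame that inherits the shared budget constraint, while the Toronto frame survives for comparison; the slot-filling tracker simply forgets Toronto.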
Here's a presentation from one of their researchers, Harm van Seijen[0], to give you an idea of what sort of work they do.<p>Applying reinforcement learning to dialogue systems seems incredibly difficult, but if Maluuba (or others) can get a handle on the problem, it would not be unreasonable to expect another revolution in the vein of applying convolutional nets to vision.<p>[0]: <a href="https://www.youtube.com/watch?v=s-8WkKhHYqA" rel="nofollow">https://www.youtube.com/watch?v=s-8WkKhHYqA</a>
Not sure this was a successful exit. LinkedIn shows the employee count being cut in half over the last little while. Also, two of the other cofounders left?