TechEcho

4 comments

sreanover 13 years ago

What you broached open, intentionally or unintentionally, is a deep and a fundamental problem in machine learning. One could attempt to write a program that will guess (near) correctly. If the total number of hypothesis possible is finite the optimal guesser should take no more than log of the the size of the hypothesis class. The optimal strategy is to ask that example to be labeled that would cut down the hypothesis class in half regardless of whether the answer was true or not.Things get more interesting when the hypothesis class is infinite and the guesser is no longer in control of which examples to choose. What is known is that if the examples are chosen randomly from the space of all possible examples, more specifically if the random samples are drawn independently from the same distribution, it is still possible to guess the correct hypothesis even if the hypothesis class is infinite. There of course are a couple of catches.(i) The hypothesis class must have a finite VC dimension. What that means is given any n sample there can be atmost a n^d possible different labelings that can be induced on them.(ii) the algorithm has to be allowed to be wrong on a certain non-zero percentage of the sample space with some non-zero probability.These two hedges can be made arbitrarily small, just that the minimum number of training examples required will depend on how small they are. The reverse is also true, if the hypothesis class does not have a finite VC dimension the guessing problem is not learnable in this set up. This means one can come up with a scheme in which the learner will almost always fail to stay within the number of mistakes it was allowed. This learning problem set up is known as PAC learning. It stands for probably approximately correct.The question you ask is different because unlike in the PAC set up the guesser has control over which example he/she wants labeled. Your setup is called active learning. If I remember correctly the case for the infinite hypothesis space is still open.So in all fairness you do have to tell the interviewers what class the secret hypothesis lies in. Or to turn tables I can easily construct a hypothesis that you will not be able to guess.An interesting case would be you tell the interviewee the token length of the little schemer'ish lisp code that encodes the secret hypothesis.Quite unbelievably enough, there is another framework of machine learning problems where the choice of the example can be adversarial. Here of course it is not possible to arrive at the correct hypothesis. What one tries here is to do as best as the best of the "experts" you have access to. You of course do not know a-priori who is the best. This framework is known as online learning and is one of the most demanding, yet has simple algorithms that can solve it for certain hypothesis classes.

FaceKickerover 13 years ago

I like the spirit of the question, but it kind of bothers me. The fact is, no matter what finite number of hypotheses you test before making your one guess about the rule, there will always be an infinite number of possible "rules" that could fit all the information you have so far (where a "rule" is a function mapping triplets to booleans).So in the example he gave, let's say I decide to try his "falsification" technique as well. I ask about all the same 6 triplets he gives as examples, and then decide that the rule is "ascending numbers". So I guess that. Nope, that's wrong in this particular case - the rule was actually "ascending numbers where the second one is at least as close to the first number as it is to the third". As you can imagine, you could play that game forever. I don't think his guess of "even increasing numbers" is any less reasonable than the correct answer of "increasing numbers". Maybe the rule is "ascending integers" - he didn't try falsifying that by guessing 3.14 in one of his triplets. There are unlimited "patterns" he'd have to try falsifying.I think the ideas behind the problem are good but you'd need to constrain it in some way for it to be reasonable.

teaspoonover 13 years ago

You're claiming that you can infer, just from a triple that I guess, whether I am attempting to confirm or refute the "ascending even numbers" hypothesis. But if I guess (2,4,6) and learn that it doesn't adhere, that's as much a refutation as if I guess (1,2,3) and learn that it does adhere. (2,4,6) looks like an attempted confirmation to you only because you have prior knowledge that the rule is not more specific than "ascending even numbers".

评论 #3441850 未加载

xxqsover 13 years ago

here we discussed some technology interviewing topics: <a href="http://news.ycombinator.com/item?id=3437831" rel="nofollow">http://news.ycombinator.com/item?id=3437831</a>your approach looks interesting, although I would prefer some practical micro-assignments from the candidate's field of technology. If the questions require some thinking and solving micro-problems, you immediately learn how the person analyzes and solves them.

4 comments

sreanover 13 years ago

FaceKickerover 13 years ago

teaspoonover 13 years ago

评论 #3441850 未加载

xxqsover 13 years ago

The only brain teaser I use in interviews

4 comments

The only brain teaser I use in interviews

4 comments