
The "Soft Maximum" function

105 points by epe, over 15 years ago

6 comments

mcantor, over 15 years ago
This is my problem with mathematics. The concept of a "soft maximum," the principles behind calculating it, and its startling similarity to a hard maximum are all fascinating and exciting to me. Look at that! Two completely separate functions, and such magical results. According to this post, it's useful for "convex optimization." I clicked through to the "related post," which was merely a comment about someone else's opinions on "convex optimization," so I looked it up on Wikipedia:

http://en.wikipedia.org/wiki/Convex_optimization

Ah! A technique used to "minimize convex functions." Maybe some of this notation will make sense to me if I understand the underlying concept of whatever a "convex function" is.

http://en.wikipedia.org/wiki/Convex_functions

Great. An entire article that is completely and utterly meaningless to me. I mean, absolutely nothing in that article--oh! "Convex sets?" That looks promising.

http://en.wikipedia.org/wiki/Convex_set

Jackpot! The pretty pictures make the idea of a convex set clear to me. Unfortunately, by now I'm four clicks away, and my actual understanding of the subject is clearly just scratching the surface. Connecting my newfound--and obviously still naive--understanding of convexity* to "soft maximums," which initially inspired this search, feels dumbfoundingly impossible.

Am I approaching this all wrong? Am I expecting too much? Thinking too little? I would love to understand more about this subject, and I have tried to learn it the same way I learned how to program: by Googling and working on my own problems. However, the resources simply don't seem to be there in the same way. What's the deal here?

* Would it be more accurate to say "Euclidean convexity"? What would that mean, exactly?
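
For context on the "startling similarity" mentioned above: the soft maximum of two numbers is softmax(x, y) = log(exp(x) + exp(y)). A minimal Python sketch, written here for illustration rather than taken from the post, shows how closely it tracks the hard maximum:

    import math

    def soft_maximum(x, y):
        # Soft maximum of x and y: log(exp(x) + exp(y))
        return math.log(math.exp(x) + math.exp(y))

    # The two functions agree to within log(2) at worst (at x == y),
    # and the gap shrinks rapidly as |x - y| grows.
    for x, y in [(1.0, 2.0), (3.0, 3.1), (5.0, 0.0)]:
        print(f"max = {max(x, y):.4f}   soft max = {soft_maximum(x, y):.4f}")
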
imurray, over 15 years ago
The soft maximum often comes up when dealing with probabilistic models and in neural networks. I'm surprised that a blog that has been highlighting numerical issues hasn't pointed out how the soft maximum should be *computed*.

If x is a vector of values, then the naive code is:

    log(sum(exp(x)))

where exp operates elementwise. However, if even a single item of x is large (1000, say), this will return Inf. If x are log probabilities, where you want the log of the sum of the probabilities (common), the elements of x might all be less than -1000 (also common), and then the function will return -Inf.

Many people have a function called logsumexp() kicking about to compute the softmax robustly. It will do something like:

    y = max(x);
    return y + log(sum(exp(x-y)));

One example, slightly more elaborate, Matlab implementation is in: http://research.microsoft.com/en-us/um/people/minka/software/lightspeed/
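
For reference, the same shift-by-the-max trick as runnable Python (a minimal sketch; NumPy is assumed, and the name logsumexp simply follows the comment's convention):

    import numpy as np

    def logsumexp(x):
        # Shift by the max so every exponent is <= 0: exp() can then
        # never overflow, and the max is added back outside the log.
        x = np.asarray(x, dtype=float)
        y = np.max(x)
        if np.isinf(y):  # all -inf, or an inf dominates: the answer is y itself
            return y
        return y + np.log(np.sum(np.exp(x - y)))

    print(logsumexp([1000.0, 1000.0]))    # ~1000.6931; the naive form returns inf
    print(logsumexp([-1000.0, -1000.0]))  # ~-999.3069; the naive form returns -inf
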
tjic, over 15 years ago
The huge flaw with this article: it never explains WHY I would ever want to use a soft maximum.

Yes, it "sands off the corners."

Why do I want that?
ramanujan, over 15 years ago
This is useful analytically, but one tradeoff here is that if this is in an inner loop, the exp/sum/log process will be a lot more expensive than a simple max.

For proofs and stuff it can be invaluable to have a differentiable objective function, but in actual computation it's not always necessary.
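
A rough micro-benchmark sketch of that gap (illustrative only; exact ratios depend on hardware and array size):

    import timeit

    import numpy as np

    x = np.random.randn(1_000_000)

    def hard_max():
        return np.max(x)

    def soft_max():
        m = np.max(x)  # stabilizing shift, as in the logsumexp above
        return m + np.log(np.sum(np.exp(x - m)))

    # The soft version makes several passes (max, subtract, exp, sum, log)
    # versus a single comparison pass for the hard max.
    print("hard:", timeit.timeit(hard_max, number=100))
    print("soft:", timeit.timeit(soft_max, number=100))
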
mitko, over 15 years ago
Also known as "smoothed" max/min, which in my opinion is a better name, since it is mathematically more meaningful: the max function, which is NOT smooth, is approximated with a smooth function.

There is no such thing as a "soft" function in mathematics! Please be consistent when naming...
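
The smoothness is easy to see in the derivative: d/dx log(exp(x) + exp(y)) = exp(x) / (exp(x) + exp(y)), a logistic curve defined everywhere, while max(x, y) has a kink at x = y. A small sketch comparing the two near the kink (written for illustration, not taken from the thread):

    import math

    def smooth_grad(x, y):
        # d/dx log(exp(x) + exp(y)), in the numerically stable logistic form
        return 1.0 / (1.0 + math.exp(y - x))

    def hard_grad(x, y):
        # Derivative of max(x, y) in x: jumps from 0 to 1 at x == y
        return 1.0 if x > y else 0.0

    for x in [0.5, 0.9, 1.0, 1.1, 1.5]:
        print(f"x={x}: smooth {smooth_grad(x, 1.0):.3f}, hard {hard_grad(x, 1.0)}")
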
shalmanese, over 15 years ago
Is there an accepted modification to the function to make it scale invariant? I'm kind of weirded out by something that would spit out a different result if I did a unit change.
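
One standard generalization, offered here as an aside rather than an answer from the thread: add a sharpness parameter k, giving softmax_k(x, y) = log(exp(k*x) + exp(k*y)) / k. A unit change in the inputs is then exactly offset by rescaling k, and k → ∞ recovers the hard max. A sketch:

    import math

    def soft_maximum(x, y, k=1.0):
        # log(exp(k*x) + exp(k*y)) / k, shifted by the max for stability.
        # Scaling both inputs by c is undone by using sharpness k/c.
        m = max(x, y)
        return m + math.log(math.exp(k * (x - m)) + math.exp(k * (y - m))) / k

    print(soft_maximum(1.0, 2.0, k=1.0))          # ~2.3133
    print(soft_maximum(1000.0, 2000.0, k=1.0))    # ~2000.0: same inputs in "milli-units"
    print(soft_maximum(1000.0, 2000.0, k=0.001))  # ~2313.3: k rescaled, shape restored
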