A differentiable approximation to the max operator
3 points by aidanrocke, almost 7 years ago
1 comment
aidanrocke
almost 7 years ago
For context, this occurred to me when I was trying to find a way to apply policy gradients to tic-tac-toe. I haven't compared it to the Gumbel-Softmax trick yet but empirically I can say that it works.
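For readers unfamiliar with the idea, here is a minimal sketch of one common differentiable approximation to max, the temperature-scaled log-sum-exp. The comment above does not say which approximation the linked post uses, so this is an assumption and not necessarily the author's method. The names `smooth_max`, `smooth_max_grad`, and `temperature` are hypothetical, chosen for illustration.

```python
import numpy as np

def smooth_max(x, temperature=0.1):
    """Log-sum-exp approximation to max(x) (illustrative, not the post's method).

    Satisfies max(x) <= smooth_max(x) <= max(x) + temperature * log(len(x)),
    so it approaches the true max as temperature goes to 0.
    """
    x = np.asarray(x, dtype=float)
    m = x.max()  # subtract the max before exponentiating for numerical stability
    return m + temperature * np.log(np.sum(np.exp((x - m) / temperature)))

def smooth_max_grad(x, temperature=0.1):
    """Gradient of smooth_max with respect to x, i.e. softmax(x / temperature)."""
    x = np.asarray(x, dtype=float)
    z = np.exp((x - x.max()) / temperature)
    return z / z.sum()
```

Unlike the hard max, whose gradient is nonzero only at the winning entry, the gradient here is a softmax that spreads weight over all entries. That is the property that makes it usable with gradient-based methods such as policy gradients. A smaller `temperature` gives a tighter approximation but a sharper gradient, closer to one-hot.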