
Outcome-Based Reinforcement Learning to Predict the Future

99 points | by bturtel | 3 days ago

5 comments

ctoth | 3 days ago
Do you want paperclips? Because this is how you get paperclips!

Eliminate all agents, all sources of change, all complexity - anything that could introduce unpredictability, and it suddenly becomes far easier to predict the future, no?
valine | 3 days ago
So instead of next-token prediction it's next-event prediction. At some point this just loops around and we're back to teaching models to predict the next token in the sequence.
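A minimal sketch of the point above, under assumptions: the event space, shapes, and names here are illustrative, not anything from the paper. Structurally, next-event prediction is the same cross-entropy objective as next-token prediction, just over a different discrete vocabulary.

    # Illustrative only: both objectives are cross-entropy over a discrete
    # space; only the vocabulary attached to the output head changes.
    import torch
    import torch.nn.functional as F

    vocab_size = 50_000  # token vocabulary of a language model
    event_size = 3       # hypothetical event space: yes / no / unresolved

    def next_item_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """Cross-entropy over whatever discrete space the logits range over."""
        return F.cross_entropy(logits, target)

    # Same loss, different output head: the training objective is identical.
    token_loss = next_item_loss(torch.randn(8, vocab_size),
                                torch.randint(0, vocab_size, (8,)))
    event_loss = next_item_loss(torch.randn(8, event_size),
                                torch.randint(0, event_size, (8,)))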
jldugger | 3 days ago
From the abstract:

> A simple trading rule turns this calibration edge into $127 of hypothetical profit versus $92 for o1 (p = 0.037).

I'm lazy: is this hypothetical shooting fish in a barrel, or is it a real edge?
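The thread doesn't spell out the paper's actual trading rule, but a generic threshold rule on a prediction market shows how a calibration edge becomes hypothetical profit; the margin, stake, and payoff accounting below are assumptions, not the authors' setup.

    # Hedged sketch: buy YES when the model's probability exceeds the market
    # price by a margin, buy NO in the mirror case. Margin/stake are assumed.
    def trade_pnl(model_p: float, market_price: float, outcome: int,
                  margin: float = 0.05, stake: float = 1.0) -> float:
        """P&L of one $1-payout contract under a simple threshold rule."""
        if model_p > market_price + margin:        # buy YES at market_price
            return stake * ((1.0 if outcome else 0.0) - market_price)
        if model_p < market_price - margin:        # buy NO at 1 - market_price
            return stake * ((0.0 if outcome else 1.0) - (1.0 - market_price))
        return 0.0                                 # no edge, no trade

    # Model says 0.70, market prices it at 0.60, event resolves YES: +0.40
    print(trade_pnl(0.70, 0.60, outcome=1))

Under this framing, the question above is whether model_p systematically beats market_price out of sample, rather than on a lucky batch of resolved events.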
amelius | 3 days ago
Why would you use RL if you're not going to control the environment, but just predict it?
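One way RL fits pure prediction, sketched under assumptions (a generic outcome-based reward, not necessarily the paper's exact formulation): the stated probability is the "action", and a proper scoring rule applied once the event resolves is the reward, so the environment is only scored, never controlled.

    # Generic outcome-based reward for a forecast: negative Brier score.
    # A proper scoring rule is maximized in expectation by reporting the
    # true probability, so the policy is pushed toward calibration.
    def brier_reward(predicted_p: float, outcome: int) -> float:
        """Reward for stating predicted_p once the 0/1 outcome is known."""
        return -(predicted_p - outcome) ** 2

    print(brier_reward(0.7, 1))  # -0.09: near miss, small penalty
    print(brier_reward(0.7, 0))  # -0.49: confident miss, larger penalty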
garbagecoder | 1 day ago
"a couple of wavy lines"

bzzzzz "sorry, this isn't your lucky day"