I was very impressed with Anthropic's paper on concept mapping.<p>Post: <a href="https://www.anthropic.com/news/mapping-mind-language-model" rel="nofollow">https://www.anthropic.com/news/mapping-mind-language-model</a><p>Paper: <a href="https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html" rel="nofollow">https://transformer-circuits.pub/2024/scaling-monosemanticit...</a><p>This seems like a very good starting point for alignment. One could almost see a pathway from here to something like the laws of robotics. There's a long way to go, but it's a good first step.
These superaligners.<p>"I am breaking out on my own! Together we will do bigger and better things!!!"<p>"OK, I'll join the other guys."<p>I think it's pretty clear that the capital markets have next to no interest in alignment pursuits, and only the most-funded labs apply even a token amount of investment toward it.
"Automated alignment research" suggests he's still interested in following the superalignment blueprint from OpenAI. So what do you do while you're waiting for the AI that's capable of doing alignment research for you to arrive? If you believe this is a viable path, what's the point of putzing around doing your own research when you'll allegedly have an army of AI researchers at your command in the near future?
I keep mixing up the names Anthropic and Extropic (Guillaume Verdon / Beff Jezos). Anthropic makes Claude; Extropic is building thermodynamic hardware claimed to be many orders of magnitude faster and more energy-efficient than CPUs/GPUs.*<p>* Parameterized stochastic analog circuits that implement energy-based models (EBMs). Stochastic computing is a computing paradigm that represents numbers using the probability of ones in a bitstream.
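To make the footnote concrete, here is a minimal sketch of the classic stochastic-computing idea it describes: a value in [0, 1] is encoded as the fraction of ones in a random bitstream, and multiplying two values reduces to a bitwise AND of two independent streams. This is a generic textbook illustration, not a description of Extropic's actual analog circuits.

```python
import random

def to_stream(p, n, rng):
    # Encode probability p as a bitstream of length n:
    # each bit is independently 1 with probability p.
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    # Decode: the represented value is the fraction of ones.
    return sum(bits) / len(bits)

rng = random.Random(0)
n = 100_000
a = to_stream(0.5, n, rng)
b = to_stream(0.4, n, rng)

# AND of two independent streams multiplies their probabilities:
# P(a_i & b_i = 1) = 0.5 * 0.4 = 0.2, so the product needs only one gate per bit.
prod = [x & y for x, y in zip(a, b)]
print(from_stream(prod))  # close to 0.2, within sampling noise
```

The appeal is that arithmetic becomes trivially cheap hardware (one AND gate per bit for multiplication) at the cost of precision, which improves only as the square root of the stream length.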