I run a small distributed research community, and we have recently been considering building a model equivalent to Gato (https://openreview.net/forum?id=1ikK0kHjvj).

This would be extremely challenging, but I think it could be a very interesting project, if only for the tacit knowledge you gain by trying to build a SOTA foundation model.

Details in the paper are extremely limited, but the grand challenges I see are:

- Building an equivalent dataset across so many different modalities
- Designing the model given the limited architecture details. It is a seq-to-seq foundation model with a high-level embedding function that maps the different modalities into sequence space.
- Actually training such a model

Our research group Discord: https://discord.gg/yyqqhwXE

I just wanted to share this and ask the HN community for feedback on the feasibility of such an effort, and on how to scope and organize it.
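
To make the model-design challenge a bit more concrete, here is a minimal sketch of what I mean by mapping different modalities into one token sequence. It is not the paper's implementation: the byte-level text tokenizer, the bin count, the mu-law parameters, and all function names are assumptions I picked for illustration, and images are left out entirely. The idea is only that text and continuous values (e.g. robot proprioception) end up as integer ids in disjoint ranges of one shared vocabulary.

```python
import math

# All constants below are illustrative assumptions, not confirmed paper details.
TEXT_VOCAB_SIZE = 256   # assumption: byte-level text tokens
NUM_BINS = 1024         # assumption: number of bins for continuous values
MU, M = 100.0, 256.0    # assumption: mu-law companding parameters


def tokenize_text(text: str) -> list[int]:
    """Toy text tokenizer: one token per UTF-8 byte, ids in [0, 256)."""
    return list(text.encode("utf-8"))


def mu_law(x: float) -> float:
    """Compress a real value with mu-law companding, then clip to [-1, 1]."""
    y = math.copysign(math.log(abs(x) * MU + 1.0) / math.log(M * MU + 1.0), x)
    return max(-1.0, min(1.0, y))


def tokenize_continuous(values: list[float]) -> list[int]:
    """Map floats to bin ids placed right after the text vocabulary."""
    tokens = []
    for v in values:
        y = mu_law(v)
        # Uniformly bin [-1, 1] into NUM_BINS buckets; y == 1.0 goes in the last one.
        bin_id = min(NUM_BINS - 1, int((y + 1.0) / 2.0 * NUM_BINS))
        tokens.append(TEXT_VOCAB_SIZE + bin_id)
    return tokens


def build_sequence(instruction: str, observation: list[float]) -> list[int]:
    """Concatenate both modalities into one flat sequence of token ids."""
    return tokenize_text(instruction) + tokenize_continuous(observation)


if __name__ == "__main__":
    print(build_sequence("go", [0.0, 0.5, -3.0]))
```

A single embedding table over the combined vocabulary (here 256 + 1024 ids) could then feed a standard transformer. How image patches get embedded and how episodes are ordered within a sequence are separate design questions that this sketch does not address.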