The "Sora" name in this case really has nothing to do with OpenAI's model. OpenAI hasn't released any specific details about it; we only know that it's a diffusion model using a transformer architecture and that it was trained on videos of different dimensions and lengths. This is just an open-source text-to-video project.