It's cool to see few-shot learning work so well. I'm skeptical about using these models in production, though. Would the smallest GPT-3 model outperform GPT-2? Is there a version of GPT-3 that I can use in production where the accuracy vs. cost trade-off makes sense?