HumanEval Benchmark: 95.1 @ GPT-3.5<p>I wonder if it can be combined with projects like SWE-agent to build powerful yet open-source coding agents.<p>- <a href="https://paperswithcode.com/sota/code-generation-on-humaneval" rel="nofollow">https://paperswithcode.com/sota/code-generation-on-humaneval</a><p>- <a href="https://github.com/princeton-nlp/SWE-agent">https://github.com/princeton-nlp/SWE-agent</a>