Submissions by zone411

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

1

Public Goods Game Benchmark: Contribute and Punish, a Multi-Agent Benchmark

7 points by zone411 about 2 months ago

2

Elimination Game: Multi-Agent LLM Social Reasoning, Strategy, and Deception

5 points by zone411 3 months ago

3

SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork

111 points by zone411 3 months ago

4

LLM Hallucination Benchmark: R1, o1, o3-mini, Gemini 2.0 Flash Think Exp 01-21

17 points by zone411 3 months ago

5

Multi-Agent Step Race Benchmark: LLM Collaboration and Deception Under Pressure

7 points by zone411 4 months ago

6

Show HN: LLM Thematic Generalization Benchmark

6 points by zone411 4 months ago

7

Show HN: LLM Creative Story-Writing Benchmark

5 points by zone411 4 months ago

8

Show HN: LLM Divergent Thinking Creativity Benchmark

8 points by zone411 5 months ago
