UOJ-Bench

Beyond Problem Solving: UOJ-Bench for Evaluating Code Generation, Hacking, and Repair in Competitive Programming

A benchmark built on the Universal Online Judge (UOJ) for three competitive-programming tasks: code generation, code hacking, and code repair. All three are evaluated through UOJ’s native judging infrastructure.

Leaderboard

# Model Metric