UOJ-Bench

Beyond Problem Solving: UOJ-Bench for Evaluating Code Generation, Hacking, and Repair in Competitive Programming

A benchmark built on the Universal Online Judge (UOJ) for code generation, code hacking, and code repair, with all submissions evaluated through UOJ’s native judging infrastructure.

Leaderboard

#	Model	Metric

#	Model	Easy	Hard

#	Model	Easy	Hard