LuxMuse AI

community

Recent Activity

Chiung-Yi authored a paper about 2 months ago

When Judgment Becomes Noise: How Design Failures in LLM Judge Benchmarks Silently Undermine Validity

Chiung-Yi authored a paper about 2 months ago

Is GPT-OSS Good? A Comprehensive Evaluation of OpenAI's Latest Open Source Models

Chiung-Yi authored a paper about 2 months ago

StreetMath: Study of LLMs' Approximation Behaviors

LuxMuseAI's datasets (1)

LuxMuseAI/StreetMathDataset

Viewer • Updated Sep 14 • 1k • 29 • 3