Mini Reasoning

university

https://joshuaongg21.github.io/

Recent Activity

rohitsaxena authored a paper 9 days ago

VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

rohitsaxena authored a paper 9 days ago

Do Composed Image Retrieval Benchmarks Require Multimodal Composition?

Jforeverss authored a paper 9 days ago

OpenSIR: Open-Ended Self-Improving Reasoner

authored 2 papers 9 days ago

VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

Paper • 2603.06148 • Published Mar 6 • 2

Do Composed Image Retrieval Benchmarks Require Multimodal Composition?

Paper • 2605.14787 • Published May 15

authored 4 papers 9 days ago

OpenSIR: Open-Ended Self-Improving Reasoner

Paper • 2511.00602 • Published Nov 1, 2025 • 21

Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Paper • 2602.12586 • Published Feb 13 • 2

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

Paper • 2605.31433 • Published 27 days ago • 28

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

Paper • 2606.12594 • Published 15 days ago • 17

submitted a paper to Daily Papers 9 days ago

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

Paper • 2606.12594 • Published 15 days ago • 17

authored a paper 24 days ago

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

Paper • 2605.31433 • Published 27 days ago • 28

submitted a paper to Daily Papers 4 months ago

Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Paper • 2602.12586 • Published Feb 13 • 2

authored 2 papers 5 months ago

Beyond Data Filtering: Knowledge Localization for Capability Removal in LLMs

Paper • 2512.05648 • Published Dec 5, 2025

The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?

Paper • 2601.23045 • Published Jan 30

authored a paper 9 months ago

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Paper • 2509.21552 • Published Sep 25, 2025 • 11

authored 2 papers 10 months ago

Theorem Prover as a Judge for Synthetic Data Generation

Paper • 2502.13137 • Published Feb 18, 2025 • 1

PiCSAR: Probabilistic Confidence Selection And Ranking

Paper • 2508.21787 • Published Aug 29, 2025 • 4

authored 4 papers 11 months ago

Self-Training Large Language Models for Tool-Use Without Demonstrations

Paper • 2502.05867 • Published Feb 9, 2025

Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain

Paper • 2307.03042 • Published Jul 6, 2023

Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them

Paper • 2507.10616 • Published Jul 13, 2025 • 1

Inverse Scaling in Test-Time Compute

Paper • 2507.14417 • Published Jul 19, 2025 • 28

authored a paper about 1 year ago

What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations

Paper • 2502.08279 • Published Feb 12, 2025 • 1