Zeyi Zhang

illusence

https://lumen-ze.github.io/

Lumen-Ze

AI & ML interests

None yet

Organizations

None yet

upvoted a paper about 1 month ago

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published Nov 12 • 201

upvoted a paper about 2 months ago

Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents

Paper • 2510.23691 • Published Oct 27 • 52

upvoted 11 papers 2 months ago

Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

Paper • 2510.19600 • Published Oct 22 • 68

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Paper • 2510.19779 • Published Oct 22 • 60

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published Oct 22 • 29

BLIP3o-NEXT: Next Frontier of Native Image Generation

Paper • 2510.15857 • Published Oct 17 • 24

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published Oct 17 • 50

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20 • 72

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23 • 18

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40

Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall

Paper • 2510.19304 • Published Oct 22 • 23

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Paper • 2510.20579 • Published Oct 23 • 55

FinSight: Towards Real-World Financial Deep Research

Paper • 2510.16844 • Published Oct 19 • 8

upvoted a paper 3 months ago

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Paper • 2510.05684 • Published Oct 7 • 141

authored 3 papers 3 months ago

Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

Paper • 2405.09814 • Published May 16, 2024

GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents

Paper • 2303.14613 • Published Mar 26, 2023

Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents

Paper • 2510.04637 • Published Oct 6

liked 2 models 5 months ago

nvidia/audio-flamingo-3

Audio-Text-to-Text • Updated 29 days ago • 783 • 137

fixie-ai/ultravox-v0_5-llama-3_2-1b

Audio-Text-to-Text • 0.7B • Updated about 1 month ago • 381k • 67

updated a model 11 months ago

illusence/Semantic_Gesture_Retrieval_Model

8B • Updated Feb 3 • 15