4 33 9

Lingdong Kong

ldkong

https://ldkong.com

AI & ML interests

3D Perception, Domain Adaptation, Semi-Supervised Learning, Self-Supervised Learning

Recent Activity

liked a Space 9 days ago

worldbench/vla4ad

submitted a paper 9 days ago

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

upvoted a paper 9 days ago

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

View all activity

Organizations

submitted a paper to Daily Papers 9 days ago

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Paper • 2512.16760 • Published 9 days ago • 12

authored 8 papers 16 days ago

Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

Paper • 2405.05258 • Published May 8, 2024

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Paper • 2507.05260 • Published Jul 7

SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining

Paper • 2503.19912 • Published Mar 25

Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation

Paper • 2407.15282 • Published Jul 21, 2024

SEE4D: Pose-Free 4D Generation via Auto-Regressive Video Inpainting

Paper • 2510.26796 • Published Oct 30

authored a paper about 2 months ago

3EED: Ground Everything Everywhere in 3D

Paper • 2511.01755 • Published Nov 3 • 11

authored 3 papers 2 months ago

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Paper • 2510.02240 • Published Oct 2 • 17

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Paper • 2510.20579 • Published Oct 23 • 55

VideoLucy: Deep Memory Backtracking for Long Video Understanding

Paper • 2510.12422 • Published Oct 14 • 1

authored 5 papers 4 months ago

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

Paper • 2505.23727 • Published May 29 • 5

Zero-Shot 3D Visual Grounding from Vision-Language Models

Paper • 2505.22429 • Published May 28 • 1

MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query

Paper • 2506.03144 • Published Jun 3 • 7

Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras

Paper • 2507.17664 • Published Jul 23 • 1

3D and 4D World Modeling: A Survey

Paper • 2509.07996 • Published Sep 4 • 58

authored a paper 7 months ago

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

Paper • 2505.18675 • Published May 24 • 26

authored a paper 12 months ago

FRNet: Frustum-Range Networks for Scalable LiDAR Segmentation

Paper • 2312.04484 • Published Dec 7, 2023 • 1

Lingdong Kong

AI & ML interests

Recent Activity

Organizations

ldkong's activity