Tile-AI

https://github.com/tile-ai/

AI & ML interests

Enabling Lightning-Fast AI Workloads Development via Tiling

xysmlx

updated a collection 2 months ago

TileRT

Tile-Based Runtime for Ultra-Low Latency LLM Inference • 1 item • Updated Nov 20, 2025

yuqxia

authored a paper about 1 year ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published Jan 23, 2025 • 48

yuqxia

authored a paper over 1 year ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 180

xysmlx

authored 2 papers over 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Paper • 2407.00088 • Published Jun 25, 2024 • 12

LeiWang1999

authored 2 papers over 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Paper • 2407.00088 • Published Jun 25, 2024 • 12

yuqxia

authored a paper over 2 years ago

Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 172