

Kaushal

kd-tensor

·

kd303

AI & ML interests

model inferencing, synthetic data generation, model fine-tuning

Organizations

None yet

kd-tensor's collections (5)

Drowning in Documents: Consequences of Scaling Reranker Inference

Paper • 2411.11767 • Published Nov 18, 2024 • 19

FLM-101B: An Open LLM and How to Train It with $100K Budget

Paper • 2309.03852 • Published Sep 7, 2023 • 44

Synthetic Data Generation

General collection for making stuff up! :)

instruction-pretrain/instruction-synthesizer

Text Generation • 7B • Updated Mar 1, 2025 • 19 • 79

safety-alignment

Constitutional AI: Harmlessness from AI Feedback

Paper • 2212.08073 • Published Dec 15, 2022 • 4

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 111

The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 82

Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16, 2024 • 22

Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning

Paper • 2402.04833 • Published Feb 7, 2024 • 5

