🎼 ORCH Next.js 350M v2 (287M)
Orchestrated Recursive Code Hierarchy: the second-generation 287M-parameter ORCH small model, trained from scratch with a Next.js-focused 16k-vocab tokenizer.
TL;DR
| Property | Value |
|---|---|
| Parameters | 286,966,784 (~287M) |
| Architecture | Custom LLaMA-style decoder-only transformer |
| Training | From scratch: no base model, no fine-tuning |
| Vocabulary | 16,000 (custom tokenizer, Next.js / React / TS focused) |
| Context length | 4,096 tokens |
| Hardware | NVIDIA RTX 3060 12GB |
| Format | Custom PyTorch (.pt) |
| License | MIT |
What this is
V2 of the small ORCH architecture. It keeps the same backbone as ORCH Fusion (24 layers, 1,024 hidden size, GQA with 16 query heads and 4 KV heads) but is trained with a larger 16k-vocab custom tokenizer focused on the Next.js / React / TypeScript ecosystem, so that common tokens for JSX, hooks, App Router file paths, Tailwind class names, and Prisma schema fragments compress efficiently.
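One simple way to check a compression claim like this is to compare characters per token for different tokenizers on the same snippet. The sketch below is only an illustration and is not part of the ORCH codebase: `chars_per_token` and the two stand-in encoders are hypothetical helpers, chosen so the example runs without any tokenizer file.

```python
from typing import Callable, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence]) -> float:
    """Average characters covered by one token (higher means better compression)."""
    tokens = encode(text)
    if not tokens:
        raise ValueError("encoder returned no tokens")
    return len(text) / len(tokens)


# Stand-in encoders (assumptions for illustration, not real tokenizers).
def char_level(text: str) -> list:
    return list(text)      # one token per character


def whitespace_level(text: str) -> list:
    return text.split()    # one token per whitespace-separated chunk


snippet = 'import { useState } from "react";'
print(chars_per_token(snippet, char_level))
print(chars_per_token(snippet, whitespace_level))
```

With the `tokenizers` library, a real tokenizer could be plugged in as `chars_per_token(code, lambda s: tokenizer.encode(s).ids)`, which would allow a side-by-side comparison of this model's tokenizer against another one on the same Next.js files.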
Architecture
Model name: orch
Size: 350m (287M actual)
Layers: 24
Hidden size: 1,024
Intermediate size: 2,816
Attention heads: 16
KV heads (GQA): 4
Max position: 4,096
RoPE theta: 10,000
Sliding window: disabled
Activation: SwiGLU
Normalization: RMSNorm
Tied embeddings: yes
Vocab size: 16,000
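The reported parameter count can be reproduced from these values. The sketch below assumes the usual LLaMA-style layout: bias-free projections, a three-matrix SwiGLU MLP, two RMSNorm weights per layer plus a final norm, and a single shared embedding matrix. The card does not spell out these details, but the total matching 286,966,784 exactly suggests the assumptions are consistent with the model.

```python
# Values copied from the architecture listing.
vocab_size = 16_000
hidden = 1_024
intermediate = 2_816
layers = 24
heads = 16
kv_heads = 4

head_dim = hidden // heads        # 64
kv_dim = kv_heads * head_dim      # 256: GQA shrinks the K and V projections

# Assumption: all linear layers are bias-free, as in LLaMA.
attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q_proj, o_proj + k_proj, v_proj
mlp = 3 * hidden * intermediate                        # SwiGLU: gate, up, down
norms = 2 * hidden                                     # two RMSNorm weights per layer
per_layer = attention + mlp + norms

embeddings = vocab_size * hidden  # counted once because embeddings are tied
final_norm = hidden

total = embeddings + layers * per_layer + final_norm
print(f"{total:,}")               # 286,966,784
```

The embedding matrix accounts for about 16.4M of the total, so the "350m" size label and the 287M actual count differ mainly because of the small vocabulary and tied embeddings.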
Training
- Data: synthetic code data emphasizing Next.js project structures
- Framework: custom PyTorch implementation
- Hardware: NVIDIA RTX 3060 12GB
- Precision: mixed
- From scratch: no warm-start from any pretrained model
Usage
Custom format: running the model requires the ORCH inference code.

```python
import torch
from tokenizers import Tokenizer
from orch import OrchForCausalLM  # provided by the ORCH inference code

# Load the weights and the custom 16k-vocab tokenizer
model = OrchForCausalLM.from_pretrained("raihan-js/orch-nextjs-350m-v2")
tokenizer = Tokenizer.from_file("orch-tokenizer.json")

# Encode the prompt as a batch containing one sequence
prompt = 'export default function HomePage() {'
ids = tokenizer.encode(prompt).ids
input_ids = torch.tensor([ids])

# Sample a completion and decode it back to text
output = model.generate(input_ids, max_new_tokens=512, temperature=0.7)
print(tokenizer.decode(output[0].tolist()))
```
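For a rough sense of inference memory at the full 4,096-token context, the key/value cache size follows from the architecture values. This is a back-of-the-envelope sketch under two assumptions the card does not confirm: that the inference code caches keys and values at all, and that cache entries are stored in 2 bytes (fp16 or bf16). The 16-KV-head figure is a hypothetical multi-head baseline included only for comparison.

```python
layers = 24
hidden = 1_024
heads = 16
kv_heads = 4
context = 4_096
bytes_per_value = 2  # assumption: fp16/bf16 cache entries

head_dim = hidden // heads  # 64


def kv_cache_bytes(n_kv_heads: int) -> int:
    # Factor of 2 covers keys and values, stored per layer and per cached position.
    return 2 * layers * n_kv_heads * head_dim * context * bytes_per_value


gqa = kv_cache_bytes(kv_heads)  # grouped-query attention with 4 KV heads
mha = kv_cache_bytes(heads)     # hypothetical baseline with 16 KV heads

print(f"GQA: {gqa / 2**20:.0f} MiB")  # GQA: 96 MiB
print(f"MHA: {mha / 2**20:.0f} MiB")  # MHA: 384 MiB
```

Under these assumptions, the 16/4 grouping cuts the per-sequence cache to a quarter of the multi-head figure.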
Intended use
- Next.js / React / TypeScript code completion and small file generation
- Research baseline for SLM training on consumer hardware
- Comparing performance gains from vocabulary specialization (vs. ORCH Fusion's 2,103-token vocabulary)
Limitations
- Domain-specialized vocabulary: efficient on Next.js code, less efficient on Python, Rust, etc.
- Synthetic training data: see ORCH Fusion model card for caveats on synthetic vs real-world distribution
- No safety alignment
Related models
- raihan-js/orch-fusion: predecessor (272M, 2,103-vocab, project benchmark scores)
- raihan-js/orch-nextjs-3b: 3B sibling, custom 32k vocab
- raihan-js/orch-7b: QLoRA fine-tune of DeepSeek Coder 6.7B Instruct
Author
Akteruzzaman Raihan Sikder, AI/ML engineer and CTO at ClarioScope AI. Trains SLMs from scratch and applies QLoRA fine-tuning on larger bases. Portfolio · GitHub.
Citation
@misc{sikder2025orchnextjs350m,
title = {ORCH Next.js 350M v2: A From-Scratch SLM with a Next.js-Focused Tokenizer},
author = {Sikder, Akteruzzaman Raihan},
year = {2025},
url = {https://huggingface.co/raihan-js/orch-nextjs-350m-v2}
}