🎼 ORCH Next.js 350M v2 (287M)

Orchestrated Recursive Code Hierarchy — the second-generation 287M-parameter ORCH small model, trained from scratch with a Next.js-focused 16k-vocab tokenizer.

TL;DR


Parameters	286,966,784 (~287M)
Architecture	Custom LLaMA-style decoder-only transformer
Training	From scratch — no base model, no fine-tuning
Vocabulary	16,000 (custom tokenizer, Next.js / React / TS focused)
Context length	4,096 tokens
Hardware	NVIDIA RTX 3060 12GB
Format	Custom PyTorch (`.pt`)
License	MIT

What this is

This is version 2 of the small ORCH architecture. It keeps the ORCH Fusion backbone (24 layers, hidden size 1,024, grouped-query attention with 16 query heads and 4 KV heads) but is trained with a larger custom 16k-vocab tokenizer focused on the Next.js / React / TypeScript ecosystem. The vocabulary includes common tokens for JSX, hooks, App Router file paths, Tailwind class names, and Prisma schema fragments, so code in this ecosystem encodes into fewer tokens.
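
One way to check the compression claim is to measure characters per token on a representative snippet. The sketch below is not part of the ORCH codebase: `chars_per_token` is a hypothetical helper that accepts any function mapping text to a token sequence, and the two encoders shown are stand-in baselines rather than the ORCH tokenizer. With the `tokenizers` library, `lambda s: tokenizer.encode(s).ids` should slot in the same way once the tokenizer file from the usage section is loaded.

```python
from typing import Callable, Sequence


def chars_per_token(encode: Callable[[str], Sequence], text: str) -> float:
    """Average number of characters covered by one token (higher means better compression)."""
    tokens = encode(text)
    if len(tokens) == 0:
        raise ValueError("encoder produced no tokens")
    return len(text) / len(tokens)


# Stand-in encoders for illustration only; neither is the ORCH tokenizer.
def char_level(text: str) -> list:
    """Worst-case baseline: one token per character."""
    return list(text)


def whitespace_split(text: str) -> list:
    """Crude baseline: one token per whitespace-separated chunk."""
    return text.split()


snippet = 'export default function HomePage() {'
for name, encoder in [("char-level", char_level), ("whitespace", whitespace_split)]:
    print(f"{name}: {chars_per_token(encoder, snippet):.2f} chars/token")
```

Running the same measurement over a Next.js file and a Python file with the real tokenizer would show how much of the efficiency is domain-specific.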

Architecture

Model name:               orch
Size:                     350m (287M actual)
Layers:                   24
Hidden size:              1,024
Intermediate size:        2,816
Attention heads:          16
KV heads (GQA):           4
Max position:             4,096
RoPE theta:               10,000
Sliding window:           disabled
Activation:               SwiGLU
Normalization:            RMSNorm
Tied embeddings:          yes
Vocab size:               16,000
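
The listing above is enough to reproduce the stated parameter count. The sketch below is a back-of-the-envelope calculation, not code from the ORCH repository: `OrchConfig`, `count_parameters`, and `kv_cache_bytes` are hypothetical names, and the count assumes a standard LLaMA-style layout with bias-free projections, two RMSNorm weight vectors per layer, and one final norm. Under those assumptions the total matches the 286,966,784 reported in the TL;DR. The cache estimate further assumes that the inference code caches keys and values at 2 bytes per value, which the card does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchConfig:
    # Values copied from the architecture listing above.
    vocab_size: int = 16_000
    hidden_size: int = 1_024
    intermediate_size: int = 2_816
    num_layers: int = 24
    num_heads: int = 16
    num_kv_heads: int = 4
    max_position: int = 4_096


def count_parameters(cfg: OrchConfig) -> int:
    """Parameter count, assuming bias-free linear layers (an assumption)."""
    head_dim = cfg.hidden_size // cfg.num_heads
    kv_dim = cfg.num_kv_heads * head_dim
    # Attention: Q and O are hidden x hidden; K and V are hidden x kv_dim under GQA.
    attention = 2 * cfg.hidden_size * cfg.hidden_size + 2 * cfg.hidden_size * kv_dim
    # SwiGLU MLP: gate, up, and down projections.
    mlp = 3 * cfg.hidden_size * cfg.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * cfg.hidden_size
    per_layer = attention + mlp + norms
    # Tied embeddings: the token embedding doubles as the output head, so count it once.
    embedding = cfg.vocab_size * cfg.hidden_size
    final_norm = cfg.hidden_size
    return embedding + cfg.num_layers * per_layer + final_norm


def kv_cache_bytes(cfg: OrchConfig, seq_len: int, bytes_per_value: int = 2) -> int:
    """Size of cached keys and values for one sequence (assumes a 16-bit cache by default)."""
    head_dim = cfg.hidden_size // cfg.num_heads
    # Factor 2 covers keys and values; only the KV heads are cached under GQA.
    return 2 * cfg.num_layers * cfg.num_kv_heads * head_dim * seq_len * bytes_per_value


cfg = OrchConfig()
print(f"{count_parameters(cfg):,}")                    # 286,966,784
print(kv_cache_bytes(cfg, cfg.max_position) / 2**20)   # 96.0 (MiB at full context)
```

With 16 KV heads instead of 4, the same cache would be four times larger (384 MiB under the same assumptions), which is the main inference-memory benefit of the 16/4 GQA layout.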

Training

Data: synthetic code data emphasizing Next.js project structures
Framework: custom PyTorch implementation
Hardware: NVIDIA RTX 3060 12GB
Precision: mixed precision
Initialization: from scratch, with no warm start from any pretrained model

Usage

The checkpoint is in a custom format, so loading it requires the ORCH inference code:

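import torch
from tokenizers import Tokenizer
from orch import OrchForCausalLM  # ORCH inference code

# Load the custom-format checkpoint and the 16k-vocab tokenizer.
model = OrchForCausalLM.from_pretrained("raihan-js/orch-nextjs-350m-v2")
tokenizer = Tokenizer.from_file("orch-tokenizer.json")

# Encode the prompt and add a batch dimension: shape (1, seq_len).
prompt = 'export default function HomePage() {'
ids = tokenizer.encode(prompt).ids
input_ids = torch.tensor([ids])

# Sample up to 512 new tokens; gradients are not needed at inference time.
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=512, temperature=0.7)
print(tokenizer.decode(output[0].tolist()))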
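
The example requests 512 new tokens, which fits the 4,096-token context only while the prompt stays at or below 3,584 tokens. The following is a minimal sketch of a guard, assuming the model cannot attend past its maximum position; `clamp_new_tokens` is a hypothetical helper and not part of the `orch` package.

```python
MAX_POSITION = 4_096  # context length from the model card


def clamp_new_tokens(prompt_len: int, requested: int, max_position: int = MAX_POSITION) -> int:
    """Return how many tokens can be generated without exceeding the context window."""
    if prompt_len >= max_position:
        raise ValueError(
            f"prompt of {prompt_len} tokens leaves no room in a {max_position}-token context"
        )
    # Never generate more than the space left after the prompt.
    return min(requested, max_position - prompt_len)
```

In the usage snippet, this could be called as `clamp_new_tokens(len(ids), 512)` and the result passed as `max_new_tokens`.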

Intended use

Next.js / React / TypeScript code completion and small file generation
Research baseline for SLM training on consumer hardware
Comparing the performance gains from vocabulary specialization against ORCH Fusion's 2,103-token vocabulary

Limitations

Domain-specialized vocabulary: efficient on Next.js code, less efficient on Python, Rust, etc.
Synthetic training data: see ORCH Fusion model card for caveats on synthetic vs real-world distribution
No safety alignment

Related models

raihan-js/orch-fusion: predecessor (272M, 2,103-token vocabulary; card includes project benchmark scores)
raihan-js/orch-nextjs-3b: 3B sibling with a custom 32k vocabulary
raihan-js/orch-7b: QLoRA fine-tune of DeepSeek Coder 6.7B Instruct

Author

Akteruzzaman Raihan Sikder, AI/ML engineer and CTO at ClarioScope AI. Trains SLMs from scratch and applies QLoRA fine-tuning on larger bases.

Citation

@misc{sikder2025orchnextjs350m,
  title  = {ORCH Next.js 350M v2: A From-Scratch SLM with a Next.js-Focused Tokenizer},
  author = {Sikder, Akteruzzaman Raihan},
  year   = {2025},
  url    = {https://huggingface.co/raihan-js/orch-nextjs-350m-v2}
}
