🎼 ORCH Next.js 350M v2 (287M)

Orchestrated Recursive Code Hierarchy: the second-generation 287M-parameter ORCH small model, trained from scratch with a Next.js-focused 16k-vocab tokenizer.


TL;DR

Parameters:               286,966,784 (~287M)
Architecture:             Custom LLaMA-style decoder-only transformer
Training:                 From scratch (no base model, no fine-tuning)
Vocabulary:               16,000 (custom tokenizer, Next.js / React / TS focused)
Context length:           4,096 tokens
Hardware:                 NVIDIA RTX 3060 12GB
Format:                   Custom PyTorch (.pt)
License:                  MIT

What this is

This is v2 of the small ORCH architecture. It keeps the same backbone as ORCH Fusion (24 layers, hidden size 1,024, GQA with 16 query heads and 4 KV heads) but is trained with a larger 16k-vocab custom tokenizer focused on the Next.js / React / TypeScript ecosystem, so common strings in JSX, hooks, App Router file paths, Tailwind class names, and Prisma schema fragments compress into few tokens.
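One way to check how well a tokenizer compresses a snippet is to measure characters per token. The sketch below is illustrative only: `chars_per_token` and the toy `whitespace_encode` are hypothetical names, not part of the ORCH code. The helper accepts any encode function, so the real tokenizer could presumably be plugged in as `lambda s: tokenizer.encode(s).ids`.

```python
from typing import Callable, List, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence]) -> float:
    """Average number of characters covered by one token (higher means better compression)."""
    tokens = encode(text)
    if not tokens:
        raise ValueError("encoder produced no tokens")
    return len(text) / len(tokens)


def whitespace_encode(text: str) -> List[str]:
    # Toy stand-in for a real tokenizer: one token per whitespace-separated word.
    return text.split()


snippet = 'import { useState } from "react"'
ratio = chars_per_token(snippet, whitespace_encode)
print(f"{ratio:.2f} characters per token")
```

Running the same helper with two different tokenizers on the same Next.js file gives a direct comparison of how much each vocabulary shortens the sequence.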

Architecture

Model name:               orch
Size:                     350m (287M actual)
Layers:                   24
Hidden size:              1,024
Intermediate size:        2,816
Attention heads:          16
KV heads (GQA):           4
Max position:             4,096
RoPE theta:               10,000
Sliding window:           disabled
Activation:               SwiGLU
Normalization:            RMSNorm
Tied embeddings:          yes
Vocab size:               16,000
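
As a sanity check, the parameter count can be recomputed from the values above. This sketch assumes a standard LLaMA-style layout that the card does not spell out: bias-free linear projections, two RMSNorm weight vectors per layer (one before attention, one before the MLP), a final RMSNorm, and a single embedding matrix shared with the output head. `count_params` is a hypothetical helper, not part of the ORCH code.

```python
def count_params(vocab: int, hidden: int, inter: int, layers: int,
                 heads: int, kv_heads: int) -> int:
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim
    # Attention: Q and O are hidden x hidden; K and V are hidden x kv_dim (GQA).
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim
    # SwiGLU MLP: gate, up, and down projections (assumed bias-free).
    mlp = 3 * hidden * inter
    # Two RMSNorm weight vectors per layer (assumed).
    norms = 2 * hidden
    per_layer = attn + mlp + norms
    # Tied embeddings are counted once; the last term is the final RMSNorm.
    return vocab * hidden + layers * per_layer + hidden


total = count_params(vocab=16_000, hidden=1_024, inter=2_816,
                     layers=24, heads=16, kv_heads=4)
print(f"{total:,}")  # 286,966,784
```

Under these assumptions the result matches the stated 286,966,784 parameters, of which 16,384,000 belong to the tied embedding matrix.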

Training

  • Data: synthetic code data emphasizing Next.js project structures
  • Framework: custom PyTorch implementation
  • Hardware: NVIDIA RTX 3060 12GB
  • Precision: mixed
  • From scratch: no warm start from any pretrained model
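
To get a rough sense of why a model this size can be trained on a 12GB card, the memory taken by weights, gradients, and optimizer state can be estimated. The card does not name the optimizer or the exact mixed-precision recipe, so the 16 bytes per parameter used below is an assumption (fp32 weights, fp32 gradients, and two fp32 Adam moments), and activation memory is left out entirely. `training_state_gib` is a hypothetical helper.

```python
PARAMS = 286_966_784  # parameter count stated in this card


def training_state_gib(params: int, bytes_per_param: int = 16) -> float:
    """Memory for weights, gradients, and optimizer state, in GiB.

    The default of 16 bytes per parameter is an ASSUMPTION: fp32 weights (4)
    + fp32 gradients (4) + two fp32 Adam moments (8). Activations are excluded.
    """
    return params * bytes_per_param / 2**30


state = training_state_gib(PARAMS)
print(f"~{state:.1f} GiB before activations")
```

The remainder of the card's memory would have to cover activations, which depend on batch size and sequence length and are not modeled here.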

Usage

The checkpoint uses a custom format and needs the ORCH inference code:

import torch
from tokenizers import Tokenizer
from orch import OrchForCausalLM  # provided by the ORCH inference code

# Load the model weights and the 16k-vocab tokenizer
model = OrchForCausalLM.from_pretrained("raihan-js/orch-nextjs-350m-v2")
tokenizer = Tokenizer.from_file("orch-tokenizer.json")

# Encode the prompt into a batch containing one sequence
prompt = 'export default function HomePage() {'
ids = tokenizer.encode(prompt).ids
input_ids = torch.tensor([ids])

# Sample up to 512 new tokens, then decode the returned token ids
output = model.generate(input_ids, max_new_tokens=512, temperature=0.7)
print(tokenizer.decode(output[0].tolist()))
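
The `temperature=0.7` argument presumably rescales the logits before sampling, as in most generation loops. The ORCH implementation is not shown in this card, so the following is a generic sketch of temperature scaling, not the library's code; `softmax_with_temperature` is an illustrative name.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
neutral = softmax_with_temperature(logits, 1.0)
sharper = softmax_with_temperature(logits, 0.7)
# A temperature below 1 moves probability mass toward the top-scoring token.
print(neutral[0] < sharper[0])
```

For code completion, a value below 1 such as 0.7 trades some variety for more predictable output.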

Intended use

  • Next.js / React / TypeScript code completion and small file generation
  • Research baseline for SLM training on consumer hardware
  • Measuring the effect of vocabulary specialization (compared with ORCH Fusion's 2,103-token vocabulary)

Limitations

  • Domain-specialized vocabulary: efficient on Next.js code, less efficient on Python, Rust, etc.
  • Synthetic training data: see ORCH Fusion model card for caveats on synthetic vs real-world distribution
  • No safety alignment

Related models

Author

Akteruzzaman Raihan Sikder, AI/ML engineer and CTO at ClarioScope AI. Trains SLMs from scratch and applies QLoRA fine-tuning on larger bases. Portfolio · GitHub.

Citation

@misc{sikder2025orchnextjs350m,
  title  = {ORCH Next.js 350M v2: A From-Scratch SLM with a Next.js-Focused Tokenizer},
  author = {Sikder, Akteruzzaman Raihan},
  year   = {2025},
  url    = {https://huggingface.co/raihan-js/orch-nextjs-350m-v2}
}