T5-Tiny Grammar Error Correction (GEC) - Conversational Specialized
This model is a compact T5-tiny checkpoint (approx. 60M parameters) fine-tuned for Grammar Error Correction (GEC) in conversational text. It is optimized for low-latency CPU inference and for in-browser deployment via Transformers.js.
Model Description
- Architecture: T5-tiny (Encoder-Decoder)
- Specialization: Informal English, slang correction (e.g., 'sus' -> 'suspicious', 'no cap' -> 'no lie'), and numeric shorthand.
- Deployment: Split into two dedicated repositories for architectural purity:
  - gec-t5-tiny: Standard PyTorch/Safetensors weights.
  - gec-t5-tiny-onnx: Web-optimized ONNX weights only.
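As a rough illustration of how the PyTorch/Safetensors repository might be used from Python, the sketch below loads the weights with the Hugging Face `transformers` library and runs one correction. The `MODEL_ID` value is an assumption: the card gives only the repository name, so the owner namespace has to be added. The helper names `load_gec` and `correct` are hypothetical, and no task prefix is prepended because the card does not document one.

```python
# Assumption: only the repository name is given in this card; prepend the
# owner's namespace (e.g. "<owner>/gec-t5-tiny") before loading from the Hub.
MODEL_ID = "gec-t5-tiny"


def load_gec(model_id=MODEL_ID):
    """Load tokenizer and model (hypothetical helper, not part of the repo)."""
    # Imported lazily so this file can be imported without the heavy dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()
    return tokenizer, model


def correct(text, tokenizer, model, max_new_tokens=64):
    """Run one correction pass over a short conversational snippet."""
    # Assumption: the raw text is passed as-is; if the checkpoint was trained
    # with a task prefix, it would have to be prepended here.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would look like `tokenizer, model = load_gec()` followed by `correct("he dont know nothing", tokenizer, model)`. Loading once and reusing the objects avoids paying the load cost on every call.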
Performance (CPU Benchmarks)
- Average Latency: ~426 ms
- Target Platform: Web browsers (via Transformers.js) and mobile devices.
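The card does not describe how the latency figure was measured. A minimal sketch of one way to measure average latency on your own hardware is shown below. The function name `average_latency_ms` and the `warmup` and `runs` defaults are assumptions chosen for illustration, not the original benchmark setup.

```python
import statistics
import time


def average_latency_ms(fn, sample, warmup=3, runs=20):
    """Return the mean wall-clock time of fn(sample) in milliseconds."""
    # Warm-up calls are excluded so one-off costs (lazy init, caches)
    # do not skew the average.
    for _ in range(warmup):
        fn(sample)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings)
```

Here `fn` can be any callable that takes one input string, for example a small wrapper around the model's generate call. Results will vary with CPU, input length, and generation settings.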
Intended Use
This model is intended for real-time typing assistance in chat applications. It is tuned to handle the nuances of modern digital communication without rewriting natural slang into overly formal language.
Training Data
The model was trained on a mixture of:
- JFLEG: Fluency-based corrections.
- WI-LOCNESS: Authentic learner errors.
- Custom Conversational Set: Targeted mappings for internet slang and shorthand.
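The format of the custom conversational set is not published. As a hedged sketch of what "targeted mappings" could look like in practice, the snippet below turns a slang dictionary into (source, target) training pairs. The two mappings are the examples quoted in this card. The names `SLANG_MAP` and `make_pair` and the substitution approach are assumptions, not the actual data pipeline.

```python
import re

# The two entries below are the examples given in this card; a real set
# would presumably contain many more mappings.
SLANG_MAP = {
    "sus": "suspicious",
    "no cap": "no lie",
}


def make_pair(sentence, mapping=SLANG_MAP):
    """Build a (source, target) pair by expanding slang in the target side."""
    target = sentence
    # Replace longer phrases first so multi-word slang is handled before
    # any shorter entry that might overlap with it.
    for slang in sorted(mapping, key=len, reverse=True):
        pattern = r"\b" + re.escape(slang) + r"\b"
        target = re.sub(pattern, mapping[slang], target, flags=re.IGNORECASE)
    return sentence, target
```

For example, `make_pair("that guy is sus, no cap")` returns the original sentence as the source and `"that guy is suspicious, no lie"` as the target.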
Limitations
Because of its small size, the model may struggle with very long or structurally complex formal text, such as legal or academic documents. It is primarily tuned for short to medium-length conversational snippets.
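One possible workaround for longer inputs is to split the text into sentence-sized snippets and correct each one separately. The sketch below shows a simple splitter. The function name `split_into_snippets`, the punctuation-based sentence boundary, and the `max_chars` default are assumptions for illustration, not a limit documented for this model.

```python
import re


def split_into_snippets(text, max_chars=200):
    """Group sentences into snippets of at most max_chars characters."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    snippets, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the
            # current snippet and start a new one.
            snippets.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        snippets.append(current)
    # Note: a single sentence longer than max_chars is kept whole.
    return snippets
```

Each snippet can then be passed to the model on its own and the outputs joined with spaces. Context that spans snippet boundaries is lost with this approach.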