---
library_name: transformers
tags: [translation, hinglish, LoRA, NLP]
---

# Model Card for English to Hinglish Translation Model

## Model Details

### Model Description

This is a fine-tuned **T5-small** model for translating English sentences into Hinglish (a mix of Hindi and English written in Latin script). The model was trained using **LoRA (Low-Rank Adaptation)** to optimize training efficiency.

- **Developed by:** Team AI-Pradarshan (Rashmi Rai, Ayesha, Bitasta)
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [Your Hugging Face Username]
- **Model type:** Sequence-to-Sequence Language Model
- **Language(s) (NLP):** English, Hinglish
- **License:** MIT
- **Finetuned from model [optional]:** google-t5/t5-small

### Model Sources [optional]

- **Repository:** [https://huggingface.co/rairashmi/hinglish_translation_lora](https://huggingface.co/rairashmi/hinglish_translation_lora)
- **Dataset:** [rairashmi/en-to-hinglish-dataset](https://huggingface.co/datasets/rairashmi/en-to-hinglish-dataset)

## Uses

### Direct Use

This model can be used to translate English sentences into Hinglish text directly via Hugging Face Transformers.

### Downstream Use [optional]

The model can be fine-tuned further or integrated into conversational AI systems and chatbots.

### Out-of-Scope Use

- This model is not designed for real-time conversational applications.
- It may not perform well on non-standard or highly domain-specific English text.

## Bias, Risks, and Limitations

- The dataset used may contain inherent biases in Hinglish translation styles.
- Accuracy may vary across dialects and sentence structures.

### Recommendations

Users should be aware of translation inconsistencies and verify translations for critical applications.
## How to Get Started with the Model

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "rairashmi/hinglish_translation_lora"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def translate_english_to_hinglish(text):
    # T5 expects a task prefix before the input sentence
    inputs = tokenizer(
        f"translate English to Hinglish: {text}",
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    # Cap generation at the 128-token length used during training
    outputs = model.generate(**inputs, max_length=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

sentence = "How are you?"
translation = translate_english_to_hinglish(sentence)
print(f"🔹 English: {sentence}")
print(f"🟢 Hinglish: {translation}")
```

## Training Details

### Training Data

The model was trained on the **rairashmi/en-to-hinglish-dataset**, a parallel corpus of English-Hinglish text pairs.

### Training Procedure

#### Preprocessing [optional]

- Tokenized using the **T5 tokenizer**
- Padding and truncation applied with a max length of 128

#### Training Hyperparameters

- **Learning Rate:** 2e-5
- **Batch Size:** 8
- **Epochs:** 2
- **Mixed Precision:** FP16

#### Speeds, Sizes, Times [optional]

- Training took approximately **X hours** on an **A100 GPU**
- Model size: **T5-small with LoRA adapters**

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Evaluated on a held-out validation split of the dataset.

#### Factors

- Evaluated across different sentence lengths and complexities.

#### Metrics

- **BLEU Score:** X.XX (evaluated using `sacrebleu`)

### Results

- The model achieves an **X.XX BLEU score** on the test set.
## Model Examination [optional]

[More Information Needed]

## Environmental Impact

- **Hardware Type:** A100 GPU
- **Hours used:** X
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

- The model is based on the **T5-small** architecture, fine-tuned for machine translation.

### Compute Infrastructure

#### Hardware

- Training was performed on a **single A100 GPU**

#### Software

- Transformers, Datasets, PEFT, Accelerate, Evaluate, Torch

## Citation [optional]

**BibTeX:**

```bibtex
@misc{hinglish_translation,
  author = {Your Name},
  title  = {English to Hinglish Translation Model},
  year   = {2025},
  url    = {https://huggingface.co/rairashmi/hinglish_translation_lora}
}
```

## Glossary [optional]

- **Hinglish**: A mix of Hindi and English written in Latin script.

## More Information [optional]

For further details, check out the **[Hugging Face Model Page](https://huggingface.co/rairashmi/hinglish_translation_lora)**.

## Model Card Authors [optional]

- [Your Name or Organization]

## Model Card Contact

For any issues or questions, contact **[Your Contact Information]**.