PawanEmbd-68M

A 68M-parameter sentence embedding model distilled from IBM Granite-278M.

Model Details

  • Model Type: Sentence Embedding Model
  • Architecture: Transformer-based encoder with projection layer
  • Parameters: ~68 million (see the quick check after this list)
  • Teacher Model: IBM Granite-278M Multilingual Embedding
  • Training Method: Knowledge Distillation
  • Output Dimensions: 768
  • Max Sequence Length: 512 tokens
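A quick way to confirm the parameter count and output dimensionality locally. This is a minimal sketch that assumes the model exposes pooler_output, as in the Usage section below.

from transformers import AutoModel, AutoTokenizer
import torch

model = AutoModel.from_pretrained("dmedhi/PawanEmbd-68M")
tokenizer = AutoTokenizer.from_pretrained("dmedhi/PawanEmbd-68M")

# Count parameters (~68 million expected)
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e6:.1f}M")

# Check the embedding dimensionality (768 expected)
encoded = tokenizer("dimension check", return_tensors="pt")
with torch.no_grad():
    out = model(**encoded)
print(f"Embedding dimension: {out.pooler_output.shape[-1]}")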

Training Details

This model was trained using knowledge distillation from the IBM Granite-278M teacher model on the All-NLI dataset (SNLI + MultiNLI).

Training Hyperparameters

  • Dataset: sentence-transformers/all-nli (100K samples)
  • Epochs: 20
  • Batch Size: 32
  • Learning Rate: 5e-4 with OneCycleLR scheduler
  • Loss Function: Combined MSE + Cosine Similarity (α=0.5, β=0.5); a sketch of this objective follows the list
  • Mixed Precision: FP16 (AMP)
  • Hardware: NVIDIA T4 GPU
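For reference, a minimal sketch of the combined distillation objective described above. The function below is an illustration based on the listed hyperparameters (an MSE term plus a cosine-similarity term, weighted α=0.5 and β=0.5), not the original training code.

import torch.nn.functional as F

def distillation_loss(student_emb, teacher_emb, alpha=0.5, beta=0.5):
    # MSE term: match the teacher embeddings coordinate-wise
    mse = F.mse_loss(student_emb, teacher_emb)
    # Cosine term: 1 - cos goes to 0 as the embedding directions align
    cos = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
    return alpha * mse + beta * cos

During training, student_emb would come from PawanEmbd-68M and teacher_emb from the frozen Granite-278M teacher on the same batch of All-NLI sentences.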

Usage

Using Transformers

from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn.functional as F

# Load model and tokenizer
model = AutoModel.from_pretrained("dmedhi/PawanEmbd-68M")
tokenizer = AutoTokenizer.from_pretrained("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Get embeddings
with torch.no_grad():
    outputs = model(**encoded)
    embeddings = outputs.pooler_output # Already normalized

# Compute similarity
similarity = F.cosine_similarity(embeddings[0:1], embeddings[1:2])
print(f"Similarity: {similarity.item():.4f}")

Using Sentence-Transformers

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
embeddings = model.encode(sentences)

print(f"Embeddings shape: {embeddings.shape}")

# Compute similarity
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")

Performance

Comparison with Teacher Model

| Metric                 | Teacher (Granite-278M) | Student (PawanEmbd-68M)  |
|------------------------|------------------------|--------------------------|
| Parameters             | 278M                   | 68M (4.1x smaller)       |
| Model Size             | ~1.1 GB                | ~258.7 MB                |
| Inference Speed (CPU)  | 269.57 ms              | 11.57 ms (23.3x faster)  |
| Inference Speed (GPU)  | 17.94 ms               | 2.75 ms (6.5x faster)    |
| Cosine Similarity      | 1.000                  | 0.943                    |
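The latency figures above depend on hardware, batch size, and sequence length. A simple timing loop along the lines below can approximate them; the batch size, warm-up count, and test sentence here are illustrative assumptions, not the original benchmark setup.

import time
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("dmedhi/PawanEmbd-68M").eval()
tokenizer = AutoTokenizer.from_pretrained("dmedhi/PawanEmbd-68M")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

batch = tokenizer(["benchmark sentence"] * 32, padding=True, return_tensors="pt").to(device)

with torch.no_grad():
    # Warm-up passes so lazy initialization does not skew the timing
    for _ in range(5):
        model(**batch)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(50):
        model(**batch)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed_ms = (time.perf_counter() - start) / 50 * 1000

print(f"Mean latency on {device}: {elapsed_ms:.2f} ms")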

Intended Uses

This model is suitable for:

✅ Semantic Search: Find similar documents or passages (a sketch follows this list)
✅ Clustering: Group similar texts together
✅ Duplicate Detection: Identify near-duplicate content
✅ Recommendation Systems: Find similar items
✅ Question Answering: Retrieve relevant passages
✅ Sentence Similarity: Measure semantic similarity between texts
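As an example of the semantic-search use case noted above, the sketch below ranks a small corpus against a query with sentence-transformers. The corpus and query are illustrative placeholders.

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("dmedhi/PawanEmbd-68M")

corpus = [
    "The cat sits on the mat.",
    "Quantum computers use qubits instead of bits.",
    "A classic recipe for Italian lasagna.",
]
query = "How do quantum computers work?"

# Embed the corpus and the query, then rank by cosine similarity
corpus_emb = model.encode(corpus)
query_emb = model.encode(query)
scores = cos_sim(query_emb, corpus_emb)[0]

for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.3f}  {corpus[idx]}")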

Training Code

The model was trained using PyTorch with knowledge distillation. Training code available at: TODO

Citation

@misc{pawanembdmodel2025,
  author = {Dipankar Medhi},
  title = {PawanEmbd: A Lightweight Embedding Model via Knowledge Distillation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/dmedhi/PawanEmbd-68M}}
}

Acknowledgments

License

Apache 2.0

Contact

For questions or feedback, please open an issue on GitHub.
