---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---

# FunctionGemma 270M - E-Commerce Multi-Agent Router

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) that routes customer queries across 7 specialized agents in e-commerce customer support systems.

## Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It routes natural-language customer queries to the appropriate specialized agent with **89.4% accuracy**.

**Key Achievement:** It replaces brittle rule-based routing (52-58% accuracy) with a learned router that trains only 1.47M parameters (0.55% of the model).

### Architecture

- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
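
The 0.55% figure is easy to verify: applying the same LoRA configuration to the base model with PEFT reports the trainable-parameter count directly (a minimal sketch; nothing is trained here):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("google/functiongemma-270m-it")

# Same LoRA settings as listed above
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base, config)
peft_model.print_trainable_parameters()
# Expected output (approximately):
# trainable params: 1,474,560 || all params: ~270M || trainable%: 0.55
```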

### Training Details

- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on a Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)

## Intended Use

### Primary Use Case

**Multi-agent customer support routing** for e-commerce platforms:

- Route queries to order management, product search, product details, returns, payments, account, and technical support agents
- Maintain conversation context across multi-turn interactions (see the prompt sketch after this list)
- Enable intelligent task switching
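
Context carry-over happens at the prompt level rather than inside the model. One possible approach (a sketch, not part of this repository; the "Conversation so far" framing is an assumption to validate against your own traffic) is to prepend recent turns to the routing prompt:

```python
def build_routing_prompt(agent_declarations: str, history: list[str], query: str) -> str:
    """Build a FunctionGemma routing prompt that includes recent conversation turns.

    `history` holds prior utterances and is managed by the caller; how many
    turns to keep is an application decision (here: the last four).
    """
    context = "\n".join(history[-4:])
    return (
        "<start_of_turn>user\n"
        f"{agent_declarations}\n\n"
        f"Conversation so far:\n{context}\n\n"
        f"User query: {query}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
```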

### Supported Agents

The model routes queries to 7 specialized agents (a dispatch sketch follows the list):

1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems
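
Once the router emits a function name, dispatch is an ordinary table lookup. A minimal sketch (the handlers are hypothetical placeholders, not part of this model):

```python
def handle_order(query: str) -> str:
    # Hypothetical placeholder: call your order-management backend here
    return f"[order agent] handling: {query}"

# One entry per routing function the model can emit
AGENT_HANDLERS = {
    "route_to_order_agent": handle_order,
    "route_to_search_agent": lambda q: f"[search agent] handling: {q}",
    "route_to_details_agent": lambda q: f"[details agent] handling: {q}",
    "route_to_returns_agent": lambda q: f"[returns agent] handling: {q}",
    "route_to_account_agent": lambda q: f"[account agent] handling: {q}",
    "route_to_payment_agent": lambda q: f"[payment agent] handling: {q}",
    "route_to_technical_agent": lambda q: f"[technical agent] handling: {q}",
}

def dispatch(function_name: str, query: str) -> str:
    # Fall back to a human/default queue when the router output is unrecognized
    handler = AGENT_HANDLERS.get(function_name)
    return handler(query) if handler else "[fallback] escalate to human agent"
```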

### Out-of-Scope Use

- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning

## Performance

### Test Set Results

```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
order_management     92.3% (251/272)
product_search       91.1% (257/282)
product_details      94.7% (233/246)
returns_refunds      88.2% (238/270)
account_management   85.1% (229/269)
payment_support      89.5% (241/269)
technical_support    87.0% (234/269)
```
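
Results like these can be reproduced with a plain evaluation loop over a labeled test set. A sketch, assuming a hypothetical `route_query(query)` helper that wraps the generate-and-parse steps shown in the Quick Start below:

```python
from collections import Counter

def evaluate(test_set):
    """Compute overall and per-agent routing accuracy.

    `test_set` is an iterable of (query, expected_function) pairs;
    `route_query` is assumed to return the predicted function name.
    """
    correct, total = Counter(), Counter()
    for query, expected in test_set:
        predicted = route_query(query)  # hypothetical generate + parse wrapper
        total[expected] += 1
        correct[expected] += int(predicted == expected)

    overall = sum(correct.values()) / sum(total.values())
    print(f"Overall accuracy: {overall:.1%}")
    for agent in sorted(total):
        print(f"{agent:24s} {correct[agent] / total[agent]:.1%} "
              f"({correct[agent]}/{total[agent]})")
```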

### Comparison to Baselines

| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |

### Latency Breakdown (T4 GPU)

- **Routing Decision:** 127ms average (see the timing sketch after this list)
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average
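
Numbers like these can be measured with a timing loop such as the sketch below (CUDA synchronization matters for honest GPU timings; absolute values depend on hardware and drivers):

```python
import time
import torch

def time_routing(model, tokenizer, prompt: str, runs: int = 50) -> float:
    """Return the mean per-call routing latency in milliseconds."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Warm-up so one-time setup costs don't skew the measurement
    with torch.no_grad():
        model.generate(**inputs, max_new_tokens=30, do_sample=False)

    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        with torch.no_grad():
            model.generate(**inputs, max_new_tokens=30, do_sample=False)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1000
```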

## How to Use

### Installation

```bash
pip install transformers peft torch accelerate bitsandbytes
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""

# Route a query
query = "Where is my order?"

prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, keeping special tokens
# so the <function_call> wrapper stays visible
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```
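
For serving, the LoRA deltas can optionally be merged into the base weights so inference runs without the PEFT wrapper. A minimal sketch for the bf16 loading path above (merging into a 4-bit quantized base requires lossy dequantization and is best avoided):

```python
# Merge the adapters into the base weights and drop the PEFT wrapper
merged_model = model.merge_and_unload()

# Persist the merged model for adapter-free deployment
merged_model.save_pretrained("./functiongemma-ecommerce-router-merged")
tokenizer.save_pretrained("./functiongemma-ecommerce-router-merged")
```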

### Production Deployment (4-bit Quantization)

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

# Result: 180 MB model, 132ms latency, 89.1% accuracy
```

### Parsing Function Calls

```python
import re

def extract_agent_function(response: str) -> str:
    """Extract the routed function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"

# Usage (`response` comes from the Quick Start above)
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```

## Training Procedure

### Dataset Preparation

Generated 12,550 synthetic examples with linguistic variations:

```python
# Example training format
{
    "query": "Please track my package ASAP",
    "function": "route_to_order_agent",
    "agent": "order_management"
}
```

Variations included (a generation sketch follows this list):

- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
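
A simplified sketch of how such variations can be composed (illustrative only; `BASE_QUERIES` is a hypothetical seed table, and the published dataset linked under Related Resources is the authoritative source):

```python
import random

# Hypothetical seed table: a few base intents per routing function
BASE_QUERIES = {
    "route_to_order_agent": ["track my package", "cancel my order"],
    "route_to_returns_agent": ["return this item", "get a refund for my purchase"],
    # ... remaining agents follow the same pattern
}

POLITE = ["", "Please ", "Could you ", "Can you "]
CASUAL = ["", "Hey, ", "Hi, ", "Um, "]
URGENT = ["", " ASAP", " urgently", " immediately"]

def make_example(function: str) -> dict:
    """Compose one synthetic example from a base intent plus surface variations."""
    base = random.choice(BASE_QUERIES[function])
    casual, polite = random.choice(CASUAL), random.choice(POLITE)
    if casual and polite:
        polite = polite.lower()  # "Hey, please ..." rather than "Hey, Please ..."
    query = (casual + polite + base + random.choice(URGENT)).strip()
    return {"query": query, "function": function}

# make_example("route_to_order_agent")
# -> {"query": "Hey, please track my package ASAP", "function": "route_to_order_agent"}
```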

### Training Configuration

```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)

# Wire everything into TRL's SFTTrainer. `model`, `train_ds`, and `eval_ds`
# are assumed to be the quantized base model and the formatted datasets
# loaded earlier (see the training notebook for the full pipeline).
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=lora_config
)
trainer.train()
```

### Training Results

- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB

## Limitations and Biases

### Known Limitations

1. **Ambiguous Queries:** The 10.6% error rate is concentrated in genuinely ambiguous queries
   - Example: "I need help" (could belong to any agent)
   - Mitigation: Ask a clarifying question when routing confidence falls below 0.7 (see the sketch after this list)

2. **Context Dependency:** Multi-turn interactions require external conversation state management
   - Solution: Use durable workflow orchestrators (Temporal, Cadence)

3. **Agent Confusion:** The most common misclassifications:
   - Returns ↔ Order Management (12 cases)
   - Account ↔ Payment (8 cases)
   - Technical ↔ Product Details (6 cases)

4. **Language:** Trained only on English queries
   - For multilingual support, fine-tune on translated datasets
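
The confidence-based clarification mentioned in item 1 can be built from generation scores. A sketch using the `output_scores`/`return_dict_in_generate` options of `generate` (the mean token probability used here is one rough confidence proxy, not the only choice):

```python
import torch

def route_with_confidence(model, tokenizer, prompt: str, threshold: float = 0.7):
    """Return (response_text, confidence); signal clarification below threshold."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=30,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Mean probability of the chosen tokens as a rough confidence proxy
    new_tokens = out.sequences[0][inputs["input_ids"].shape[1]:]
    probs = [
        torch.softmax(step, dim=-1)[0, tok].item()
        for step, tok in zip(out.scores, new_tokens)
    ]
    confidence = sum(probs) / len(probs) if probs else 0.0

    text = tokenizer.decode(new_tokens, skip_special_tokens=False)
    if confidence < threshold:
        return None, confidence  # caller should ask a clarifying question
    return text, confidence
```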

### Biases

- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training data may not reflect real query distributions

## Ethical Considerations

- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement a fallback to human agents for low-confidence predictions
- **Privacy:** The model does not store user data; conversation state is managed externally
- **Fairness:** Ensure equal routing performance across user demographics

## Citation

If you use this model in your research or production systems, please cite:

```bibtex
@misc{functiongemma-ecommerce-router,
  author = {Sai Kumar Yava},
  title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}}
}

@misc{functiongemma2025,
  title = {FunctionGemma: Bringing bespoke function calling to the edge},
  author = {{Google DeepMind}},
  year = {2025},
  url = {https://blog.google/technology/developers/functiongemma/}
}
```

## Acknowledgments

- Google DeepMind for the FunctionGemma base model
- Hugging Face for the PEFT and Transformers libraries
- The open-source AI community

## License

This model inherits the Gemma license from the base model. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

**Commercial Use:** Permitted under the Gemma license terms.

## Related Resources

- **Blog Post:** [Full implementation guide](https://medium.com/@saikumaryava/beyond-mobile-actions-exploring-functiongemma-for-intelligent-multi-agent-orchestration-242dc0273f93)
- **funcroute Python Package:** [funcroute](https://github.com/scionoftech/funcroute)
- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Updates

- **2025-12-25:** Initial release with 89.4% routing accuracy on e-commerce customer support

---

**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues).