It gives me the error below:

OSError: You are trying to access a gated repo. Make sure to have access to it at https://huggingface.co/google/gemma-7b-it. 401 Client Error. (Request ID: Root=1-67057062-38b576a8416d6a5720107b86;9d47d074-89e2-45ba-86a9-809d9cd8250e)
Step 1: Request Access and Create a Token
The 401 error means your request was not authenticated, or your account has not been granted access to the gated repo. First, visit https://huggingface.co/google/gemma-7b-it while logged in and accept the model's license. Then go to https://huggingface.co/settings/tokens and create a new access token:
- Give your token a descriptive name (e.g., "HF_TOKEN").
- We recommend keeping the default "Read" access.
- Click "Generate a token" and copy the token to your clipboard.
Step 2: Set Up Authentication in Your Server-Side or Local-Machine Code
You'll need to set the HF_TOKEN environment variable within your server-side environment. How you do this depends on your specific setup, but here's a general example:
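On Linux or macOS, one common way is to export the variable in the shell before launching Python. This is just a sketch; "YOUR_TOKEN_HERE" is a placeholder for the token you copied in Step 1:

```shell
# Set the Hugging Face token for the current shell session (placeholder value).
export HF_TOKEN="YOUR_TOKEN_HERE"

# Confirm it is visible to child processes (e.g., your Python script).
echo "$HF_TOKEN"
```

Note that `export` only affects the current shell session; add the line to your shell profile (or your service's environment configuration) to make it persistent.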
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import os

# Set the access token as an environment variable (replace with your own token)
os.environ["HF_TOKEN"] = "YOUR_TOKEN_HERE"

tokenizer = AutoTokenizer.from_pretrained(
    "google/gemma-7b-it",
    token=os.environ["HF_TOKEN"],
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    torch_dtype=torch.bfloat16,
    # use_auth_token is deprecated in recent transformers versions; pass token instead
    token=os.environ["HF_TOKEN"],
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")
# Without max_new_tokens, generate() stops after a short default length
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
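Rather than hard-coding the token in source (which risks leaking it via version control), you can read it from the environment and fail early with a clear message when it is missing. This is a small sketch; the helper name `get_hf_token` is my own, and the placeholder value is set only for demonstration:

```python
import os

def get_hf_token() -> str:
    """Return the Hugging Face token from the environment, or raise a clear error."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Create a token at "
            "https://huggingface.co/settings/tokens and export it before running."
        )
    return token

# Placeholder for demonstration only; in real use, export HF_TOKEN beforehand.
os.environ.setdefault("HF_TOKEN", "YOUR_TOKEN_HERE")
print(get_hf_token())
```

You would then pass `token=get_hf_token()` to `from_pretrained` instead of hard-coding the string.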
If you encounter any further issues or have specific questions about your local setup, feel free to ask!