---
name: hf-jobs
description: >-
  This skill should be used when users want to run any workload on Hugging Face
  Jobs infrastructure. Covers UV scripts, Docker-based jobs, hardware selection,
  cost estimation, authentication with tokens, secrets management, timeout
  configuration, and result persistence. Designed for general-purpose compute
  workloads including data processing, inference, experiments, batch jobs, and
  any Python-based tasks. Should be invoked for tasks involving cloud compute,
  GPU workloads, or when users mention running jobs on Hugging Face
  infrastructure without local setup.
license: Complete terms in LICENSE.txt
---
# Running Workloads on Hugging Face Jobs

## Overview
Run any workload on fully managed Hugging Face infrastructure. No local setup required—jobs run on cloud CPUs, GPUs, or TPUs and can persist results to the Hugging Face Hub.
Common use cases:
- Data Processing - Transform, filter, or analyze large datasets
- Batch Inference - Run inference on thousands of samples
- Experiments & Benchmarks - Reproducible ML experiments
- Model Training - Fine-tune models (see the `model-trainer` skill for TRL-specific training)
- Synthetic Data Generation - Generate datasets using LLMs
- Development & Testing - Test code without local GPU setup
- Scheduled Jobs - Automate recurring tasks
For model training specifically: See the `model-trainer` skill for TRL-based training workflows.

## When to Use This Skill
Use this skill when users want to:
- Run Python workloads on cloud infrastructure
- Execute jobs without local GPU/TPU setup
- Process data at scale
- Run batch inference or experiments
- Schedule recurring tasks
- Use GPUs/TPUs for any workload
- Persist results to the Hugging Face Hub
## Key Directives

When assisting with jobs:

1. ALWAYS use the `hf_jobs()` MCP tool - Submit jobs using `hf_jobs("uv", {...})` or `hf_jobs("run", {...})`. The `script` parameter accepts Python code directly; pass the script content as a string. Do NOT save to local files unless the user explicitly requests it.
2. Always handle authentication - Jobs that interact with the Hub require `HF_TOKEN` via secrets. See the Token Usage section below.
3. Provide job details after submission - After submitting, provide the job ID, monitoring URL, and estimated time, and note that the user can request status checks later.
4. Set appropriate timeouts - The default 30 minutes may be insufficient for long-running tasks.
## Prerequisites Checklist

Before starting any job, verify:

✅ Account & Authentication

- Hugging Face Account with Pro, Team, or Enterprise plan (Jobs require a paid plan)
- Authenticated login: Check with `hf_whoami()`
- HF_TOKEN for Hub Access ⚠️ CRITICAL - Required for any Hub operations (push models/datasets, download private repos, etc.)
- Token must have appropriate permissions (read for downloads, write for uploads)

✅ Token Usage (See Token Usage section for details)
When tokens are required:
- Pushing models/datasets to Hub
- Accessing private repositories
- Using Hub APIs in scripts
- Any authenticated Hub operations
How to provide tokens:

```python
{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Recommended: automatic token
}
```

⚠️ CRITICAL: The `$HF_TOKEN` placeholder is automatically replaced with your logged-in token. Never hardcode tokens in scripts.
## Token Usage Guide

### Understanding Tokens
What are HF Tokens?
- Authentication credentials for Hugging Face Hub
- Required for authenticated operations (push, private repos, API access)
- Stored securely on your machine after `hf auth login`
Token Types:
- Read Token - Can download models/datasets, read private repos
- Write Token - Can push models/datasets, create repos, modify content
- Organization Token - Can act on behalf of an organization
### When Tokens Are Required
Always Required:
- Pushing models/datasets to Hub
- Accessing private repositories
- Creating new repositories
- Modifying existing repositories
- Using Hub APIs programmatically
Not Required:
- Downloading public models/datasets
- Running jobs that don't interact with Hub
- Reading public repository information
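For example, a job that only reads public data needs no token at all. A minimal sketch (the dataset choice is illustrative):

```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["datasets"]
# ///
from datasets import load_dataset

# Public datasets download without any authentication
ds = load_dataset("imdb", split="train[:100]")
print(ds)
""",
    "flavor": "cpu-basic",
    "timeout": "15m"
    # No "secrets" needed: nothing here touches authenticated Hub operations
})
```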
### How to Provide Tokens to Jobs

#### Method 1: Automatic Token (Recommended)

```python
hf_jobs("uv", {
    "script": "...",  # inline Python code or a URL (see Working with Scripts)
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Automatic replacement
})
```
How it works:
- `$HF_TOKEN` is a placeholder that gets replaced with your actual token
- Uses the token from your logged-in session (`hf auth login`)
- Most secure and convenient method
- Token is encrypted server-side when passed as a secret
Benefits:
- No token exposure in code
- Uses your current login session
- Automatically updated if you re-login
- Works seamlessly with MCP tools
#### Method 2: Explicit Token (Not Recommended)

```python
hf_jobs("uv", {
    "script": "...",
    "secrets": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Hardcoded token
})
```
When to use:
- Only if automatic token doesn't work
- Testing with a specific token
- Organization tokens (use with caution)
Security concerns:
- Token visible in code/logs
- Must manually update if token rotates
- Risk of token exposure
#### Method 3: Environment Variable (Less Secure)

```python
hf_jobs("uv", {
    "script": "...",
    "env": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Less secure than secrets
})
```

Difference from secrets:
- `env` variables are visible in job logs
- `secrets` are encrypted server-side
- Always prefer `secrets` for tokens
### Using Tokens in Scripts

In your Python script, tokens are available as environment variables:

```python
# /// script
# dependencies = ["huggingface-hub"]
# ///
import os
from huggingface_hub import HfApi

# Token is automatically available if passed via secrets
token = os.environ.get("HF_TOKEN")

# Use with Hub API
api = HfApi(token=token)

# Or let huggingface_hub auto-detect
api = HfApi()  # Automatically uses the HF_TOKEN env var
```
Best practices:
- Don't hardcode tokens in scripts
- Use `os.environ.get("HF_TOKEN")` to access the token
- Let `huggingface_hub` auto-detect when possible
- Verify the token exists before Hub operations
### Token Verification

Check if you're logged in:

```python
from huggingface_hub import whoami

user_info = whoami()  # Returns your user info if authenticated
```

Verify token in job:

```python
import os

assert "HF_TOKEN" in os.environ, "HF_TOKEN not found!"
token = os.environ["HF_TOKEN"]
print(f"Token starts with: {token[:7]}...")  # Should start with "hf_"
```
### Common Token Issues

Error: 401 Unauthorized
- Cause: Token missing or invalid
- Fix: Add `secrets={"HF_TOKEN": "$HF_TOKEN"}` to the job config
- Verify: Check that `hf_whoami()` works locally
Error: 403 Forbidden
- Cause: Token lacks required permissions
- Fix: Ensure token has write permissions for push operations
- Check: Token type at https://huggingface.co/settings/tokens
Error: Token not found in environment
- Cause: `secrets` not passed or wrong key name
- Fix: Use `secrets={"HF_TOKEN": "$HF_TOKEN"}` (not `env`)
- Verify: Script checks `os.environ.get("HF_TOKEN")`
Error: Repository access denied
- Cause: Token doesn't have access to private repo
- Fix: Use token from account with access
- Check: Verify repo visibility and your permissions
### Token Security Best Practices

- Never commit tokens - Use the `$HF_TOKEN` placeholder or environment variables
- Use secrets, not env - Secrets are encrypted server-side
- Rotate tokens regularly - Generate new tokens periodically
- Use minimal permissions - Create tokens with only needed permissions
- Don't share tokens - Each user should use their own token
- Monitor token usage - Check token activity in Hub settings
### Complete Token Example

```python
# Example: Push results to Hub
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["huggingface-hub", "datasets"]
# ///
import os
from huggingface_hub import HfApi
from datasets import Dataset

# Verify token is available
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Use token for Hub operations
api = HfApi(token=os.environ["HF_TOKEN"])

# Create and push dataset
data = {"text": ["Hello", "World"]}
dataset = Dataset.from_dict(data)
dataset.push_to_hub("username/my-dataset", token=os.environ["HF_TOKEN"])
print("✅ Dataset pushed successfully!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Token provided securely
})
```
## Quick Start: Two Approaches

### Approach 1: UV Scripts (Recommended)

UV scripts use PEP 723 inline dependencies for clean, self-contained workloads.

```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///
from transformers import pipeline
import torch

# Your workload here
classifier = pipeline("sentiment-analysis")
result = classifier("I love Hugging Face!")
print(result)
""",
    "flavor": "cpu-basic",
    "timeout": "30m"
})
```
Benefits: Direct MCP tool usage, clean code, dependencies declared inline, no file saving required
When to use: Default choice for all workloads, custom logic, any scenario requiring hf_jobs()
### Working with Scripts

⚠️ Important: There are two "script path" stories depending on how you run Jobs:

- Using the `hf_jobs()` MCP tool (recommended in this repo): the `script` value must be inline code (a string) or a URL. A local filesystem path (like `"./scripts/foo.py"`) won't exist inside the remote container.
- Using the `hf jobs uv run` CLI: local file paths do work (the CLI uploads your script).

Common mistake with the `hf_jobs()` MCP tool:

```python
# ❌ Will fail (remote container can't see your local path)
hf_jobs("uv", {"script": "./scripts/foo.py"})
```

Correct patterns with the `hf_jobs()` MCP tool:

```python
# ✅ Inline: read the local script file and pass its *contents*
from pathlib import Path
script = Path("hf-jobs/scripts/foo.py").read_text()
hf_jobs("uv", {"script": script})

# ✅ URL: host the script somewhere reachable
hf_jobs("uv", {"script": "https://huggingface.co/datasets/uv-scripts/.../raw/main/foo.py"})
```

CLI equivalent (local paths supported):

```bash
hf jobs uv run ./scripts/foo.py -- --your --args
```
### Approach 2: Docker-Based Jobs

Run jobs with custom Docker images and commands.

```python
hf_jobs("run", {
    "image": "python:3.12",
    "command": ["python", "-c", "print('Hello from HF Jobs!')"],
    "flavor": "cpu-basic",
    "timeout": "30m"
})
```

Benefits: Full Docker control, use pre-built images, run any command

When to use: Need specific Docker images, non-Python workloads, complex environments

Example with GPU:

```python
hf_jobs("run", {
    "image": "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel",
    "command": ["python", "-c", "import torch; print(torch.cuda.get_device_name())"],
    "flavor": "a10g-small",
    "timeout": "1h"
})
```
## Finding More UV Scripts on Hub

The uv-scripts organization provides ready-to-use UV scripts stored as datasets on the Hugging Face Hub:

```python
# Discover available UV script collections
dataset_search({"author": "uv-scripts", "sort": "downloads", "limit": 20})

# Explore a specific collection
hub_repo_details(["uv-scripts/classification"], repo_type="dataset", include_readme=True)
```
Popular collections: OCR, classification, synthetic-data, vLLM, dataset-creation
## Hardware Selection

| Workload Type | Recommended Hardware | Cost (approx./hr) | Use Case |
|---|---|---|---|
| Data processing, testing | cpu-basic, cpu-upgrade | ~$0.10-0.50 | Lightweight tasks |
| Small models, demos | t4-small | ~$0.75 | <1B models, quick tests |
| Medium models | t4-medium, l4x1 | ~$1.50-2.50 | 1-7B models |
| Large models, production | a10g-small, a10g-large | ~$3.50-5.00 | 7-13B models |
| Very large models | a100-large | ~$8-12 | 13B+ models |
| Batch inference | a10g-large, a100-large | ~$5-10 | High-throughput |
| Data processing | cpu-upgrade, l4x1 | ~$0.50-2.50 | Parallel workloads |
GPU Flavors: cpu-basic/upgrade, t4-small/medium, l4x1/x4, a10g-small/large/largex2/largex4, a100-large, h100/h100x8
TPU Flavors: v5e-1x1, v5e-2x2, v5e-2x4
Guidelines:
- Start with smaller hardware for testing
- Scale up based on actual needs
- Use multi-GPU for parallel workloads
- See `references/hardware_guide.md` for detailed specifications
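As a rough rule of thumb for GPU sizing, fp16/bf16 inference needs about 2 bytes per parameter plus headroom for activations and KV cache. A back-of-envelope sketch (the 1.2 overhead factor is an assumption, not an official figure):

```python
def rough_vram_gb(params_billions: float, bytes_per_param: int = 2, overhead: float = 1.2) -> float:
    """Weights-only VRAM estimate with a fudge factor for activations/KV cache.
    Real usage varies with batch size and context length."""
    return params_billions * bytes_per_param * overhead

print(rough_vram_gb(7))   # ~16.8 GB for a 7B model in fp16
print(rough_vram_gb(13))  # ~31.2 GB -> likely a100-large territory at fp16
```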
## Critical: Saving Results
⚠️ EPHEMERAL ENVIRONMENT—MUST PERSIST RESULTS
The Jobs environment is temporary. All files are deleted when the job ends. If results aren't persisted, ALL WORK IS LOST.
### Persistence Options

#### 1. Push to Hugging Face Hub (Recommended)
```python
# Push models
model.push_to_hub("username/model-name", token=os.environ["HF_TOKEN"])

# Push datasets
dataset.push_to_hub("username/dataset-name", token=os.environ["HF_TOKEN"])

# Push artifacts
api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="results.json",
    repo_id="username/results",
    token=os.environ["HF_TOKEN"]
)
```
#### 2. Use External Storage

```python
# Upload to S3, GCS, etc.
import boto3

s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')
```
#### 3. Send Results via API

```python
# POST results to your API
import requests

requests.post("https://your-api.com/results", json=results)
```
### Required Configuration for Hub Push

In job submission:

```python
{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Enables authentication
}
```

In script:

```python
import os
from huggingface_hub import HfApi

# Token automatically available from secrets
api = HfApi(token=os.environ.get("HF_TOKEN"))

# Push your results
api.upload_file(...)
```
### Verification Checklist
Before submitting:
- Results persistence method chosen
- `secrets={"HF_TOKEN": "$HF_TOKEN"}` set if using the Hub
- Script handles missing token gracefully (see the sketch below)
- Test persistence path works
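A minimal sketch of the "handles missing token gracefully" item: fail fast with an actionable message instead of a cryptic 401 halfway through the job:

```python
import os
import sys

token = os.environ.get("HF_TOKEN")
if not token:
    # Exit early with a clear message rather than failing mid-run
    sys.exit("HF_TOKEN not set: pass secrets={'HF_TOKEN': '$HF_TOKEN'} in the job config")

from huggingface_hub import HfApi
api = HfApi(token=token)
```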
See: `references/hub_saving.md` for the detailed Hub persistence guide
## Timeout Management

⚠️ DEFAULT: 30 MINUTES

### Setting Timeouts

```python
{
    "timeout": "2h"  # 2 hours (formats: "90m", "2h", "1.5h", or seconds as an integer)
}
```
### Timeout Guidelines
| Scenario | Recommended | Notes |
|---|---|---|
| Quick test | 10-30 min | Verify setup |
| Data processing | 1-2 hours | Depends on data size |
| Batch inference | 2-4 hours | Large batches |
| Experiments | 4-8 hours | Multiple runs |
| Long-running | 8-24 hours | Production workloads |
Always add 20-30% buffer for setup, network delays, and cleanup.
On timeout: Job killed immediately, all unsaved progress lost
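The buffer rule is easy to apply in code. A tiny sketch (the 25% factor is illustrative, within the 20-30% range above):

```python
import math

estimated_minutes = 90  # measured or expected runtime
timeout = f"{math.ceil(estimated_minutes * 1.25)}m"  # adds ~25% buffer
print(timeout)  # "113m"
```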
## Cost Estimation

General guidelines:

```
Total Cost = (Hours of runtime) × (Cost per hour)
```
Example calculations:
Quick test:
- Hardware: cpu-basic ($0.10/hour)
- Time: 15 minutes (0.25 hours)
- Cost: $0.03
Data processing:
- Hardware: l4x1 ($2.50/hour)
- Time: 2 hours
- Cost: $5.00
Batch inference:
- Hardware: a10g-large ($5/hour)
- Time: 4 hours
- Cost: $20.00
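These calculations are simple enough to script. A minimal sketch (the rates mirror the approximate figures above; they are not an official price list):

```python
RATES_PER_HOUR = {"cpu-basic": 0.10, "l4x1": 2.50, "a10g-large": 5.00}  # approximate

def estimate_cost(flavor: str, hours: float) -> float:
    """Back-of-envelope job cost: runtime hours x hourly rate."""
    return RATES_PER_HOUR[flavor] * hours

print(f"${estimate_cost('cpu-basic', 0.25):.2f}")   # quick test -> $0.03
print(f"${estimate_cost('l4x1', 2):.2f}")           # data processing -> $5.00
print(f"${estimate_cost('a10g-large', 4):.2f}")     # batch inference -> $20.00
```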
Cost optimization tips:
- Start small - Test on cpu-basic or t4-small
- Monitor runtime - Set appropriate timeouts
- Use checkpoints - Resume if job fails
- Optimize code - Reduce unnecessary compute
- Choose right hardware - Don't over-provision
## Monitoring and Tracking

### Check Job Status

```python
# List all jobs
hf_jobs("ps")

# Inspect specific job
hf_jobs("inspect", {"job_id": "your-job-id"})

# View logs
hf_jobs("logs", {"job_id": "your-job-id"})

# Cancel a job
hf_jobs("cancel", {"job_id": "your-job-id"})
```
Remember: Wait for user to request status checks. Avoid polling repeatedly.
### Job URLs

After submission, jobs have monitoring URLs:

```
https://huggingface.co/jobs/username/job-id
```
View logs, status, and details in the browser.
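If you need the URL programmatically, a sketch that rebuilds it from your username and the job ID returned at submission (URL shape as shown above):

```python
from huggingface_hub import whoami

job_id = "your-job-id"  # returned when the job was submitted
print(f"https://huggingface.co/jobs/{whoami()['name']}/{job_id}")
```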
## Scheduled Jobs

Run jobs on a schedule using CRON expressions or predefined schedules.

```python
# Schedule a job that runs every hour
hf_jobs("scheduled uv", {
    "script": "...",  # inline Python code or a URL
    "schedule": "@hourly",
    "flavor": "cpu-basic"
})

# Use CRON syntax
hf_jobs("scheduled uv", {
    "script": "...",
    "schedule": "0 9 * * 1",  # 9 AM every Monday
    "flavor": "cpu-basic"
})
```
Available schedules:
- `@annually`, `@yearly` - Once per year
- `@monthly` - Once per month
- `@weekly` - Once per week
- `@daily` - Once per day
- `@hourly` - Once per hour
- CRON expression - Custom schedule (e.g., `"0 9 * * 1"`)
Manage scheduled jobs:

```python
hf_jobs("scheduled ps")                           # List scheduled jobs
hf_jobs("scheduled suspend", {"job_id": "..."})   # Pause
hf_jobs("scheduled resume", {"job_id": "..."})    # Resume
hf_jobs("scheduled delete", {"job_id": "..."})    # Delete
```
## Common Workload Patterns

This repository ships ready-to-run UV scripts in `hf-jobs/scripts/`. Prefer using them instead of inventing new templates.

### Pattern 1: Dataset → Model Responses (vLLM) — `scripts/generate-responses.py`
What it does: loads a Hub dataset (chat messages or a prompt column), applies a model chat template, generates responses with vLLM, and pushes the output dataset + dataset card back to the Hub.
Requires: GPU + write token (it pushes a dataset).
```python
from pathlib import Path

script = Path("hf-jobs/scripts/generate-responses.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "username/input-dataset",
        "username/output-dataset",
        "--messages-column", "messages",
        "--model-id", "Qwen/Qwen3-30B-A3B-Instruct-2507",
        "--temperature", "0.7",
        "--top-p", "0.8",
        "--max-tokens", "2048",
    ],
    "flavor": "a10g-large",
    "timeout": "4h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```
### Pattern 2: CoT Self-Instruct Synthetic Data — `scripts/cot-self-instruct.py`
What it does: generates synthetic prompts/answers via CoT Self-Instruct, optionally filters outputs (answer-consistency / RIP), then pushes the generated dataset + dataset card to the Hub.
Requires: GPU + write token (it pushes a dataset).
```python
from pathlib import Path

script = Path("hf-jobs/scripts/cot-self-instruct.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "--seed-dataset", "davanstrien/s1k-reasoning",
        "--output-dataset", "username/synthetic-math",
        "--task-type", "reasoning",
        "--num-samples", "5000",
        "--filter-method", "answer-consistency",
    ],
    "flavor": "l4x4",
    "timeout": "8h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```
### Pattern 3: Streaming Dataset Stats (Polars + HF Hub) — `scripts/finepdfs-stats.py`
What it does: scans parquet directly from Hub (no 300GB download), computes temporal stats, and (optionally) uploads results to a Hub dataset repo.
Requires: CPU is often enough; token needed only if you pass --output-repo (upload).
```python
from pathlib import Path

script = Path("hf-jobs/scripts/finepdfs-stats.py").read_text()
hf_jobs("uv", {
    "script": script,
    "script_args": [
        "--limit", "10000",
        "--show-plan",
        "--output-repo", "username/finepdfs-temporal-stats",
    ],
    "flavor": "cpu-upgrade",
    "timeout": "2h",
    "env": {"HF_XET_HIGH_PERFORMANCE": "1"},
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
```
## Common Failure Modes

### Out of Memory (OOM)

Fix:
- Reduce batch size or data chunk size; process data in smaller batches (see the sketch below)
- Upgrade hardware: cpu → t4 → a10g → a100
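A minimal chunked-processing sketch (the batch size and dataset are illustrative):

```python
from datasets import load_dataset

ds = load_dataset("imdb", split="train")
batch_size = 256  # tune down if memory is still tight

for start in range(0, len(ds), batch_size):
    batch = ds[start : start + batch_size]  # only one slice materialized at a time
    # process(batch) ...
```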
### Job Timeout

Fix:
- Check logs for actual runtime
- Increase timeout with buffer: `"timeout": "3h"`
- Process data in chunks
### Hub Push Failures

Fix:
- Add to job: `secrets={"HF_TOKEN": "$HF_TOKEN"}`
- Verify token in script: `assert "HF_TOKEN" in os.environ`
- Check token permissions
- Verify the repo exists or can be created (see the sketch below)
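A sketch of the "repo exists or can be created" check: `create_repo` with `exist_ok=True` is idempotent (the repo ID is illustrative):

```python
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])

# Creates the repo if missing; no-op if it already exists
api.create_repo("username/results", repo_type="dataset", exist_ok=True)

api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="results.json",
    repo_id="username/results",
    repo_type="dataset",
)
```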
### Missing Dependencies

Fix: Add to the PEP 723 header:

```python
# /// script
# dependencies = ["package1", "package2>=1.0.0"]
# ///
```
### Authentication Errors

Fix:
- Check `hf_whoami()` works locally
- Verify `secrets={"HF_TOKEN": "$HF_TOKEN"}` in the job config
- Re-login: `hf auth login`
- Check the token has required permissions
## Troubleshooting
Common issues:
- Job times out → Increase timeout, optimize code
- Results not saved → Check persistence method, verify HF_TOKEN
- Out of Memory → Reduce batch size, upgrade hardware
- Import errors → Add dependencies to PEP 723 header
- Authentication errors → Check token, verify secrets parameter
See: `references/troubleshooting.md` for the complete troubleshooting guide
## Resources

### References (In This Skill)

- `references/token_usage.md` - Complete token usage guide
- `references/hardware_guide.md` - Hardware specs and selection
- `references/hub_saving.md` - Hub persistence guide
- `references/troubleshooting.md` - Common issues and solutions

### Scripts (In This Skill)

- `scripts/generate-responses.py` - vLLM batch generation: dataset → responses → push to Hub
- `scripts/cot-self-instruct.py` - CoT Self-Instruct synthetic data generation + filtering → push to Hub
- `scripts/finepdfs-stats.py` - Polars streaming stats over `finepdfs-edu` parquet on the Hub (optional push)
## Key Takeaways

- Submit scripts inline - The `script` parameter accepts Python code directly; no file saving required unless the user requests it
- Jobs are asynchronous - Don't wait/poll; let the user check when ready
- Always set timeout - Default 30 min may be insufficient; set appropriate timeout
- Always persist results - Environment is ephemeral; without persistence, all work is lost
- Use tokens securely - Always use `secrets={"HF_TOKEN": "$HF_TOKEN"}` for Hub operations
- Choose appropriate hardware - Start small, scale up based on needs
- Use UV scripts - Default to `hf_jobs("uv", {...})` with inline scripts for Python workloads
- Handle authentication - Verify tokens are available before Hub operations
- Monitor jobs - Provide job URLs and status check commands
- Optimize costs - Choose right hardware, set appropriate timeouts