I'm trying to run inference through a hosted service and it fails every time, across all the different services. Any thoughts?
Endpoint encountered an error.
You can try restarting it using the "pause" button above. Check logs for more details.
[Server message]Endpoint failed to start
See details
0/dist-packages/starlette/routing.py", line 732, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 608, in __aenter__
await self._router.startup()
File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 709, in startup
await handler()
File "/app/webservice_starlette.py", line 60, in some_startup_task
inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
File "/app/huggingface_inference_toolkit/handler.py", line 50, in get_inference_handler_either_custom_or_default_handler
custom_pipeline = check_and_register_custom_pipeline_from_directory(model_dir)
File "/app/huggingface_inference_toolkit/utils.py", line 186, in check_and_register_custom_pipeline_from_directory
custom_pipeline = handler.EndpointHandler(model_dir)
File "/repository/handler.py", line 10, in __init__
self.model = AutoModelForCausalLM.from_pretrained(self.model_id, trust_remote_code=True)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 548, in from_pretrained
model_class = get_class_from_dynamic_module(
File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 488, in get_class_from_dynamic_module
final_module = get_cached_module_file(
File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 352, in get_cached_module_file
get_cached_module_file(
File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 314, in get_cached_module_file
modules_needed = check_imports(resolved_module_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 180, in check_imports
raise ImportError(
ImportError: This modeling file requires the following packages that were not found in your environment: einops. Run pip install einops
Application startup failed. Exiting.
This is the kind of error I see when I try to create an endpoint. @Thomas Simonini, it doesn't seem to work for any endpoint. Any advice?
Or is there anyone else I could tag?
ImportError: This modeling file requires the following packages that were not found in your environment: einops. Run pip install einops
I don't really get how this is the error in the logs.
Would love to deploy this for a game-engine use case.
The error message literally says the einops package isn't installed in the environment. Can you retry after installing it?
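If it helps, here is a minimal sketch for confirming the package is importable before the model's remote code gets loaded. The helper name `missing_packages` is made up for illustration; only `importlib.util.find_spec` is a real standard-library call, and this is not how Transformers itself performs the check.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be imported in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# einops is the package named in the ImportError above.
missing = missing_packages(["einops"])
if missing:
    print("Install before loading the model: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```

Running this in the same environment as the handler should tell you whether the `from_pretrained(..., trust_remote_code=True)` call is likely to hit the same ImportError.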
Whoops, sorry about that. I actually implemented this on Replicate instead and it worked with just a click, but if I go the HF route again, I'll install this package.
But how can I install the package in the Inference API? It is serverless.
Facing the same issue. Would cloning the repository and adding einops to requirements.txt help?
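A rough sketch of that idea, assuming the endpoint installs whatever is listed in a `requirements.txt` at the root of the cloned repository (worth verifying against the Inference Endpoints documentation for custom handlers). The helper `ensure_requirement` is hypothetical, and the demo runs against a throwaway temporary directory standing in for the clone.

```python
import re
import tempfile
from pathlib import Path


def ensure_requirement(requirements_path, package):
    """Append `package` to a requirements file unless it is already listed.

    Returns True if the file was changed, False if the package was present.
    """
    path = Path(requirements_path)
    lines = path.read_text().splitlines() if path.exists() else []
    # Compare bare package names, ignoring version pins, extras and comments.
    listed = {
        re.split(r"[<>=!~\[; ]", line.strip(), maxsplit=1)[0].lower()
        for line in lines
        if line.strip() and not line.strip().startswith("#")
    }
    if package.lower() in listed:
        return False
    lines.append(package)
    path.write_text("\n".join(lines) + "\n")
    return True


# Demo on a temporary directory standing in for the cloned repository.
with tempfile.TemporaryDirectory() as repo_dir:
    req = Path(repo_dir) / "requirements.txt"
    first = ensure_requirement(req, "einops")   # file is created with einops
    second = ensure_requirement(req, "einops")  # already listed, no change
    contents = req.read_text()
```

After committing the updated file to your own copy of the repository, you would point the endpoint at that copy instead of the original.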