Instructions to use codefuse-ai/CodeFuse-DeepSeek-33B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use codefuse-ai/CodeFuse-DeepSeek-33B with Transformers:
Use a pipeline as a high-level helper:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="codefuse-ai/CodeFuse-DeepSeek-33B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

Or load the model directly:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codefuse-ai/CodeFuse-DeepSeek-33B")
model = AutoModelForCausalLM.from_pretrained("codefuse-ai/CodeFuse-DeepSeek-33B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use codefuse-ai/CodeFuse-DeepSeek-33B with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "codefuse-ai/CodeFuse-DeepSeek-33B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codefuse-ai/CodeFuse-DeepSeek-33B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker:
```shell
docker model run hf.co/codefuse-ai/CodeFuse-DeepSeek-33B
```
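Because the vLLM server above exposes an OpenAI-compatible API, the same request can also be made from Python with only the standard library. This is a minimal sketch: the endpoint URL and the running server are assumed from the steps above, and the prompt and `build_chat_request` helper are illustrative, not part of the vLLM API.

```python
import json
import urllib.request

# Build an OpenAI-compatible chat-completions payload
# (hypothetical helper, for illustration only).
def build_chat_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request(
    "codefuse-ai/CodeFuse-DeepSeek-33B",
    "What is the capital of France?",
)

# Same endpoint as the curl example above (assumes the server is running).
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     body = json.loads(resp.read())
#     print(body["choices"][0]["message"]["content"])
```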
- SGLang
How to use codefuse-ai/CodeFuse-DeepSeek-33B with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "codefuse-ai/CodeFuse-DeepSeek-33B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codefuse-ai/CodeFuse-DeepSeek-33B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images:
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "codefuse-ai/CodeFuse-DeepSeek-33B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "codefuse-ai/CodeFuse-DeepSeek-33B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use codefuse-ai/CodeFuse-DeepSeek-33B with Docker Model Runner:
```shell
docker model run hf.co/codefuse-ai/CodeFuse-DeepSeek-33B
```
Hi, just converting this model to GGUF format now and have a couple of questions.
From: https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard
- humaneval-python = 76.83
- java = 60.76
- javascript = 66.46
- cpp = 65.22
Is there anything about the training data that makes this specifically better at Java and C++? This seems to be the first recently fine-tuned coding model I've seen that isn't massively biased towards Python (to game the humaneval-python benchmarks, etc). The recent WizardCoder-33B-V1.1, which is also fine-tuned from Deepseek-Coder-33B, is so over-trained on Python that it tries to convert any C++ or Java it's given into Python, and is basically unusable for anything else!
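For reference, a quick calculation of how far each language trails the Python score, using the leaderboard numbers quoted above:

```python
# Scores copied from the bigcode-models-leaderboard figures above.
scores = {
    "humaneval-python": 76.83,
    "java": 60.76,
    "javascript": 66.46,
    "cpp": 65.22,
}

python_score = scores["humaneval-python"]

# Gap between the Python score and each other language, in points.
gaps = {
    lang: round(python_score - score, 2)
    for lang, score in scores.items()
    if lang != "humaneval-python"
}
print(gaps)
```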
I will give it a try and report back on how I get on.
Sadly I don't have enough upload bandwidth to upload the GGUF(s), but hopefully @TheBloke or @LoneStriker will convert it soon as a non-Python targeted fine-tune could be very useful to a lot of people.
Sorry for the delayed response.
In training this model, we used unit-test generation data covering Java/C++, as well as code practice exercises (also covering Java/C++) that we constructed ourselves, referencing the Phi textbook work. We have published an article with more details on our WeChat official account; apologies that it is written in Chinese: https://mp.weixin.qq.com/s/2Ddm7-aUJuEnsESSxkmkGg
