Instructions for using stratosphere/qwen2.5-1.5b-slips-immune-summarization with libraries, inference servers, and local apps.

Libraries

How to use stratosphere/qwen2.5-1.5b-slips-immune-summarization with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stratosphere/qwen2.5-1.5b-slips-immune-summarization")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stratosphere/qwen2.5-1.5b-slips-immune-summarization")
model = AutoModelForCausalLM.from_pretrained("stratosphere/qwen2.5-1.5b-slips-immune-summarization")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use stratosphere/qwen2.5-1.5b-slips-immune-summarization with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "stratosphere/qwen2.5-1.5b-slips-immune-summarization"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stratosphere/qwen2.5-1.5b-slips-immune-summarization",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/stratosphere/qwen2.5-1.5b-slips-immune-summarization

SGLang

How to use stratosphere/qwen2.5-1.5b-slips-immune-summarization with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "stratosphere/qwen2.5-1.5b-slips-immune-summarization" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stratosphere/qwen2.5-1.5b-slips-immune-summarization",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "stratosphere/qwen2.5-1.5b-slips-immune-summarization" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stratosphere/qwen2.5-1.5b-slips-immune-summarization",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use stratosphere/qwen2.5-1.5b-slips-immune-summarization with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for stratosphere/qwen2.5-1.5b-slips-immune-summarization to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for stratosphere/qwen2.5-1.5b-slips-immune-summarization to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for stratosphere/qwen2.5-1.5b-slips-immune-summarization to start chatting

Load model with FastModel

# Install first (shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="stratosphere/qwen2.5-1.5b-slips-immune-summarization",
    max_seq_length=2048,
)

Docker Model Runner
How to use stratosphere/qwen2.5-1.5b-slips-immune-summarization with Docker Model Runner:
```
docker model run hf.co/stratosphere/qwen2.5-1.5b-slips-immune-summarization
```

Qwen2.5-1.5B — Slips IDS Security Summarization

Model Description

A fine-tuned version of Qwen2.5-1.5B-Instruct specialized for translating technical network security events from Slips IDS into clear, human-readable incident summaries with severity assessments.

Slips is a network intrusion detection system that generates DAG-structured alert logs — chains of related security events per source IP per time window. Raw Slips output is highly technical and difficult to interpret quickly. This model translates those logs into structured, concise summaries grouped by event type, with per-event severity labels (CRITICAL / HIGH / MEDIUM / LOW / INFO) and an overall severity breakdown.

The model was fine-tuned using SFT (Supervised Fine-Tuning) with best-of-N response selection: for each training incident, an LLM-as-judge scored four candidate responses (GPT-4o, GPT-4o-mini, Qwen2.5 3B, and Qwen2.5 1B), and the highest-scoring one was used as the training target.

Quick Start

Ollama (recommended for local deployment)

ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization
# or a specific quantization:
ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization:q5_k_m
ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization:q8_0

Python (Transformers)

This model uses a merged prompt format: instructions and the DAG are combined into a single user message with no system prompt.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "stratosphere/qwen2.5-1.5b-slips-immune-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

dag_input = """
============================================================
Incident: abc123
Source IP: 192.168.1.100 | Timewindow: 5
Timeline: 2024-01-15 14:00:00 to 2024-01-15 15:00:00
Threat Level: 8.5 | Events: 42
...
"""  # paste your Slips DAG analysis here

user_message = f"""You are a security analyst. Your task is to translate technical security events into clear, concise, human-readable summaries and assess their severity.

INCIDENT METADATA:
- Incident ID: abc123
- Source IP: 192.168.1.100
- Timewindow: 5
- Accumulated Threat Level: 8.5
- Time Range: 2024-01-15 14:00:00 to 2024-01-15 15:00:00
- Total Events: 42

RAW EVENTS (Time | Description):
{dag_input}

YOUR TASK:
1. Transform the technical event descriptions into clear, readable summaries using plain language
2. Group identical or very similar events (e.g., 24 identical connections → one summary line)
3. Assess the severity of each event/group based on security impact:
   - CRITICAL: Active exploitation, data exfiltration, confirmed malware C2
   - HIGH: Scanning, suspicious connections, potential threats
   - MEDIUM: Anomalous but potentially benign behavior
   - LOW: Minor issues, likely false positives
   - INFO: Informational events, normal network behavior
4. Calculate the overall severity breakdown based on your assessments

OUTPUT FORMAT (match this structure exactly):

============================================================
Incident: <incident_id>
Source IP: <source_ip> | Timewindow: <timewindow>
Timeline: <start> to <end>
Threat Level: <threat_level> | Events: <count>

• HH:MM-HH:MM - [Your clear grouped summary] [YOUR_ASSESSED_SEVERITY]
• HH:MM - [Your clear summary] [YOUR_ASSESSED_SEVERITY]

Total Evidence: <count> events
Severity breakdown: [Your calculated breakdown, e.g., "High: 5, Medium: 3, Info: 2"]

RULES:
- Group identical events into ONE line
- Use time ranges (HH:MM-HH:MM) when showing grouped events
- Assess severity based on security impact, not just event type
- Keep descriptions clear and concise
- Just output the structured summary - no explanations or meta-commentary"""

messages = [{"role": "user", "content": user_message}]

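# Request a dict so that generate() receives the attention mask along with the input IDs
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))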
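
Because the prompt asks for a severity label on every bullet plus an overall breakdown, a small post-processing step can check that the two agree. The sketch below is illustrative only: `tally_severities` and the sample summary are hypothetical, and it assumes the model follows the requested bullet format with the severity label (optionally in brackets) at the end of each bullet line.

```python
import re
from collections import Counter

SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO")
# Matches bullet lines that end in a severity label, e.g. "• 14:05 - ... [HIGH]"
BULLET_RE = re.compile(
    r"^\s*•\s.*?\[?\b(" + "|".join(SEVERITIES) + r")\]?\s*$", re.IGNORECASE
)

def tally_severities(summary: str) -> Counter:
    """Count the severity labels found at the end of bullet lines."""
    counts = Counter()
    for line in summary.splitlines():
        match = BULLET_RE.match(line)
        if match:
            counts[match.group(1).upper()] += 1
    return counts

# Hypothetical model output, used only to exercise the parser
sample = """\
• 14:00-14:30 - Repeated connections to 24 hosts on port 445 [HIGH]
• 14:05 - DNS query for a known-bad domain [CRITICAL]
• 14:40 - Connection without prior DNS resolution [HIGH]

Total Evidence: 3 events
Severity breakdown: High: 2, Critical: 1
"""
print(dict(tally_severities(sample)))
```

Comparing this tally against the "Severity breakdown" line is a cheap way to flag summaries where the model miscounted.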

Training Details

Dataset

The training data is publicly available at stratosphere/immune-summary-sft-dataset.

Source: 532 incidents from real Slips IDS network captures
Responses: 4 model responses per incident (GPT-4o, GPT-4o-mini, Qwen2.5 3B, Qwen2.5 1B) used as candidate labels, scored by an LLM-as-judge
Selection: Best-of-N — highest-scoring response per incident used as training target
Filtering: Responses with judge score < 4 or summary token length outside [50, 400] discarded
Split: 90% train / 10% eval (stratified, seed=42)
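
The selection and filtering rules above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Candidate` class, the field names, and the choice to filter before selecting are all assumptions, and the example scores are made up.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    model: str          # which model produced the summary
    summary: str
    judge_score: float  # score assigned by the LLM judge
    n_tokens: int       # token length of the summary

MIN_SCORE = 4                     # responses scoring below this are discarded
MIN_TOKENS, MAX_TOKENS = 50, 400  # allowed summary length, inclusive

def select_target(candidates):
    """Return the highest-scoring candidate that survives filtering, or None."""
    kept = [
        c for c in candidates
        if c.judge_score >= MIN_SCORE and MIN_TOKENS <= c.n_tokens <= MAX_TOKENS
    ]
    return max(kept, key=lambda c: c.judge_score, default=None)

# Hypothetical candidates for one incident
candidates = [
    Candidate("gpt-4o", "...", 7, 180),
    Candidate("gpt-4o-mini", "...", 9, 520),  # too long, discarded
    Candidate("qwen2.5-3b", "...", 3, 120),   # judge score below 4, discarded
    Candidate("qwen2.5-1b", "...", 5, 90),
]
best = select_target(candidates)
print(best.model)  # gpt-4o
```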

Training Procedure

Parameter	Value
Base model	`unsloth/Qwen2.5-1.5B-Instruct`
Training method	SFT (Supervised Fine-Tuning)
Framework	Unsloth + TRL SFTTrainer
LoRA rank (`r`)	64
LoRA alpha	64
LoRA dropout	0.0
RSLoRA	enabled (required at r=64)
LoRA targets	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Sequence length	4096
Batch size	1 (effective: 16 via gradient accumulation)
Gradient accumulation steps	16
Learning rate	2e-5
LR scheduler	cosine
Warmup steps	20
Weight decay	0.01
Epochs	3
Optimizer	adamw_8bit
Precision	BF16
Quantization (training)	4bit
Hardware	A100 80GB, MIG 20GB slice (e-infra.cz cloud)
Response masking	`train_on_responses_only` — loss computed on assistant turns only
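
Two of the values in this table follow from simple arithmetic. The sketch below assumes a single device and uses the standard definitions of the LoRA and rank-stabilized LoRA (rsLoRA) scaling factors; it is not taken from the training script.

```python
import math

r = 64            # LoRA rank
lora_alpha = 64   # LoRA alpha
per_device_batch = 1
grad_accum_steps = 16

standard_scale = lora_alpha / r           # classic LoRA: alpha / r
rslora_scale = lora_alpha / math.sqrt(r)  # rsLoRA: alpha / sqrt(r)
effective_batch = per_device_batch * grad_accum_steps

print(standard_scale)   # 1.0
print(rslora_scale)     # 8.0
print(effective_batch)  # 16
```

With rsLoRA the adapter update is scaled by 8 rather than 1 at this rank, which is the usual motivation for enabling it when `r` is large.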

Framework Versions

Unsloth: 2026.3.18
Transformers: (auto-detected)
PyTorch: (auto-detected)

Evaluation

Evaluated on 47 held-out Slips IDS incidents using gpt-oss-120b as an independent LLM-as-judge. For each incident the judge saw all 5 model responses at once, scored each on a 1–10 scale, and ranked them; model labels were randomized per incident to prevent position bias. Inference used the merged prompt format (instructions and DAG combined in a single user message, no system prompt) with inputs capped at 4096 tokens.
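
Per-incident label randomization can be sketched as below. The card does not describe the exact procedure, so the function name, the "Response A/B/..." labels, and the seed are hypothetical; the point is only that the judge sees neutral labels in a shuffled order while a separate key maps them back to models.

```python
import random

def anonymize_responses(responses, rng):
    """Shuffle responses and assign neutral labels.

    responses: dict mapping model name -> response text.
    Returns (labeled, key): labeled maps label -> text for the judge prompt,
    key maps label -> model name for decoding the judge's verdict.
    """
    models = list(responses)
    rng.shuffle(models)
    labels = [f"Response {chr(ord('A') + i)}" for i in range(len(models))]
    labeled = {label: responses[m] for label, m in zip(labels, models)}
    key = dict(zip(labels, models))
    return labeled, key

# Placeholder texts; a fresh shuffle would be drawn for every incident
responses = {
    "gpt-4o-mini": "summary 1",
    "gpt-4o": "summary 2",
    "finetuned-1.5b": "summary 3",
}
labeled, key = anonymize_responses(responses, random.Random(0))
for label in labeled:
    print(label, "->", key[label])
```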

Overall Results

Rank	Model	Avg Score	Avg Position	Win Rate	Wins
1	GPT-4o-mini	6.89/10	1.81	42.6%	20
2	GPT-4o	5.87/10	2.38	29.8%	14
3	Qwen2.5-1.5B (finetuned)	4.70/10	3.21	19.1%	9
4	Qwen2.5 3B (baseline)	4.57/10	3.40	8.5%	4
5	Qwen2.5 1B (baseline)	3.36/10	4.19	0.0%	0

The finetuned 1.5B model beats both untuned baselines (+1.34 avg score vs Qwen2.5 1B, +0.13 vs Qwen2.5 3B) and achieves a 19.1% win rate — higher than the 3B baseline (8.5%).
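
The win rates in the table can be reproduced from the win counts, assuming exactly one winner per incident (the counts sum to the 47 evaluated incidents):

```python
wins = {
    "GPT-4o-mini": 20,
    "GPT-4o": 14,
    "Qwen2.5-1.5B (finetuned)": 9,
    "Qwen2.5 3B (baseline)": 4,
    "Qwen2.5 1B (baseline)": 0,
}
n_incidents = sum(wins.values())  # 47 if each incident has a single winner

# Win rate = wins / incidents, as a percentage rounded to one decimal place
win_rates = {model: round(100 * w / n_incidents, 1) for model, w in wins.items()}
for model, rate in win_rates.items():
    print(f"{model}: {rate}%")
```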

By Complexity

Complexity	Events	Incidents	Finetuned Score	GPT-4o-mini Score	GPT-4o Score
Simple	< 500	31	5.45/10	6.74/10	5.61/10
Medium	500–1999	7	3.43/10	6.71/10	5.71/10
Complex	≥ 2000	9	3.11/10	7.56/10	6.89/10

On simple incidents the finetuned model is competitive with GPT-4o (5.45 vs 5.61). Medium and complex incidents are the primary weakness, consistent with context length limitations at 4096 tokens.
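
As a quick consistency check, the tier boundaries and the incident-weighted average of the finetuned scores can be written out as below. The helper name `complexity_tier` is hypothetical; the thresholds and scores are taken from the complexity table.

```python
def complexity_tier(n_events: int) -> str:
    """Map an incident's event count to its complexity tier."""
    if n_events < 500:
        return "Simple"
    if n_events < 2000:
        return "Medium"
    return "Complex"

# (incident count, finetuned average score) per tier
tiers = {"Simple": (31, 5.45), "Medium": (7, 3.43), "Complex": (9, 3.11)}
total = sum(n for n, _ in tiers.values())
weighted = sum(n * score for n, score in tiers.values()) / total
print(total, round(weighted, 2))  # 47 4.7
```

The weighted average matches the finetuned model's overall score of 4.70/10, so the per-tier figures are consistent with the overall results.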

By Category

Category	Incidents	Finetuned Score	Finetuned Win Rate
Malware	45	4.82/10	20.0%
Normal	2	2.00/10	0.0%

Readability

An automated readability analysis on the 47 held-out incidents shows the model achieves a compression ratio of 0.26 with 373 abstracted bullets, 256 verbatim lines, and 44 markdown fences — indicating the model learned to paraphrase and summarize rather than echo the input DAG.

Known Limitation: Complex Incident Performance

The model struggles on medium and complex incidents (≥ 500 events), scoring 3.43/10 (medium) and 3.11/10 (complex) with 0 wins in both tiers. Large DAGs exceed the effective 4096-token context window, resulting in inference errors on the largest inputs. Reducing the input token limit to match the training sequence length mitigates but does not fully resolve this.
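
One way to keep large inputs within a token budget is to truncate the event list before building the prompt. The sketch below is an assumption, not the authors' method: `truncate_events` is a hypothetical helper, and the whitespace-based counter stands in for a real tokenizer call such as `len(tokenizer(line)["input_ids"])`. Dropping events loses information, so this only avoids overflow; it does not recover summary quality.

```python
def truncate_events(event_lines, max_tokens, count_tokens):
    """Keep the earliest event lines whose combined token count fits the budget.

    Returns (kept_lines, number_of_dropped_lines).
    """
    kept, used = [], 0
    for line in event_lines:
        cost = count_tokens(line)
        if used + cost > max_tokens:
            break
        kept.append(line)
        used += cost
    return kept, len(event_lines) - len(kept)

def whitespace_tokens(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())

# Hypothetical event lines, four whitespace tokens each
events = [
    "14:00 scan port 445",
    "14:01 scan port 445",
    "14:02 dns query bad.example",
]
kept, dropped = truncate_events(events, max_tokens=9, count_tokens=whitespace_tokens)
print(len(kept), dropped)  # 2 1
```

In practice the budget would also need to leave room for the instruction text and the generated summary.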

Intended Use

Automated triage of Slips IDS alerts for security analysts
First-pass summarization of network incident logs
Input to downstream reporting or ticketing workflows

Out-of-Scope Use

General-purpose chat or instruction following
Security domains outside network IDS (malware analysis, vulnerability scanning, etc.)
Non-English inputs

Citation

@misc{qwen2.5-1.5b-slips-immune,
  title        = {Qwen2.5-1.5B fine-tuned for Slips IDS security summarization},
  author       = {Stratosphere Laboratory, CTU Prague},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/stratosphere/qwen2.5-1.5b-slips-immune-summarization}}
}

Model Details

Model size: 1.5B params
Tensor type: FP16
License: Apache-2.0
Tags: Text Generation, Transformers, Safetensors, Network Security, IDS, SLIPS, Summarization, Cybersecurity, LoRA, SFT, TRL, Unsloth

Safetensors

Model size: 2B params
Tensor type: BF16

Model tree for stratosphere/qwen2.5-1.5b-slips-immune-summarization

Base model: Qwen/Qwen2.5-1.5B
Finetuned: Qwen/Qwen2.5-1.5B-Instruct