Qwen2.5-1.5B — Slips IDS Security Summarization
Model Description
A fine-tuned version of Qwen2.5-1.5B-Instruct specialized for translating technical network security events from Slips IDS into clear, human-readable incident summaries with severity assessments.
Slips is a network intrusion detection system that generates DAG-structured alert logs — chains of related security events per source IP per time window. Raw Slips output is highly technical and difficult to interpret quickly. This model translates those logs into structured, concise summaries grouped by event type, with per-event severity labels (CRITICAL / HIGH / MEDIUM / LOW / INFO) and an overall severity breakdown.
The model was fine-tuned using SFT (Supervised Fine-Tuning) with best-of-N response selection: for each training incident, candidate responses from GPT-4o, GPT-4o-mini, Qwen2.5 3B, and Qwen2.5 1B were scored by an LLM-as-judge, and the highest-scoring response was used as the training target.
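As a rough illustration of the best-of-N selection described above, the sketch below picks the highest-scoring candidate per incident. The `Candidate` record, its field names, and the tie-breaking rule are assumptions made for this example, not the project's actual data schema.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record layout; the real dataset schema may differ.
    incident_id: str
    model: str
    summary: str
    judge_score: float  # score assigned by the LLM judge


def select_best_of_n(candidates):
    """Return {incident_id: highest-scoring Candidate}."""
    best = {}
    for cand in candidates:
        current = best.get(cand.incident_id)
        # Strict '>' keeps the first candidate seen on ties (an assumption).
        if current is None or cand.judge_score > current.judge_score:
            best[cand.incident_id] = cand
    return best


candidates = [
    Candidate("inc-1", "gpt-4o", "summary A", 7.0),
    Candidate("inc-1", "gpt-4o-mini", "summary B", 8.0),
    Candidate("inc-2", "qwen2.5-3b", "summary C", 5.0),
    Candidate("inc-2", "gpt-4o", "summary D", 5.0),
]
targets = select_best_of_n(candidates)
for incident_id, cand in targets.items():
    print(incident_id, cand.model, cand.judge_score)
```

Each selected summary would then serve as the assistant turn for that incident during SFT.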
Quick Start
Ollama (recommended for local deployment)
ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization
# or a specific quantization:
ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization:q5_k_m
ollama run stratosphere/qwen2.5-1.5b-slips-immune-summarization:q8_0
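Once the model has been pulled, it can also be queried programmatically. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model tag matches the `ollama run` commands above; the helper names `build_chat_payload` and `summarize` are made up for this example.

```python
import json
import urllib.request

# Assumptions: default local Ollama address and the model tag used above.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "stratosphere/qwen2.5-1.5b-slips-immune-summarization"


def build_chat_payload(prompt, model=MODEL):
    """Build a non-streaming chat request with a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def summarize(prompt, url=OLLAMA_URL):
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


# Example (requires a running server):
# print(summarize("<instructions + Slips DAG in one user message>"))
```

The prompt should follow the merged format (instructions and DAG in one user message, no system prompt), the same as for the Transformers usage.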
Python (Transformers)
This model uses a merged prompt format: instructions and the DAG are combined into a single user message with no system prompt.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "stratosphere/qwen2.5-1.5b-slips-immune-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
dag_input = """
============================================================
Incident: abc123
Source IP: 192.168.1.100 | Timewindow: 5
Timeline: 2024-01-15 14:00:00 to 2024-01-15 15:00:00
Threat Level: 8.5 | Events: 42
...
""" # paste your Slips DAG analysis here
user_message = f"""You are a security analyst. Your task is to translate technical security events into clear, concise, human-readable summaries and assess their severity.
INCIDENT METADATA:
- Incident ID: abc123
- Source IP: 192.168.1.100
- Timewindow: 5
- Accumulated Threat Level: 8.5
- Time Range: 2024-01-15 14:00:00 to 2024-01-15 15:00:00
- Total Events: 42
RAW EVENTS (Time | Description):
{dag_input}
YOUR TASK:
1. Transform the technical event descriptions into clear, readable summaries using plain language
2. Group identical or very similar events (e.g., 24 identical connections → one summary line)
3. Assess the severity of each event/group based on security impact:
- CRITICAL: Active exploitation, data exfiltration, confirmed malware C2
- HIGH: Scanning, suspicious connections, potential threats
- MEDIUM: Anomalous but potentially benign behavior
- LOW: Minor issues, likely false positives
- INFO: Informational events, normal network behavior
4. Calculate the overall severity breakdown based on your assessments
OUTPUT FORMAT (match this structure exactly):
============================================================
Incident: <incident_id>
Source IP: <source_ip> | Timewindow: <timewindow>
Timeline: <start> to <end>
Threat Level: <threat_level> | Events: <count>
• HH:MM-HH:MM - [Your clear grouped summary] [YOUR_ASSESSED_SEVERITY]
• HH:MM - [Your clear summary] [YOUR_ASSESSED_SEVERITY]
Total Evidence: <count> events
Severity breakdown: [Your calculated breakdown, e.g., "High: 5, Medium: 3, Info: 2"]
RULES:
- Group identical events into ONE line
- Use time ranges (HH:MM-HH:MM) when showing grouped events
- Assess severity based on security impact, not just event type
- Keep descriptions clear and concise
- Just output the structured summary - no explanations or meta-commentary"""
messages = [{"role": "user", "content": user_message}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
Training Details
Dataset
The training data is publicly available at stratosphere/immune-summary-sft-dataset.
- Source: 532 incidents from real Slips IDS network captures
- Responses: 4 model responses per incident (GPT-4o, GPT-4o-mini, Qwen2.5 3B, Qwen2.5 1B) used as candidate labels, scored by an LLM-as-judge
- Selection: Best-of-N — highest-scoring response per incident used as training target
- Filtering: Responses with judge score < 4 or summary token length outside [50, 400] discarded
- Split: 90% train / 10% eval (stratified, seed=42)
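The filtering and split steps listed above could look roughly like the sketch below. The field names (`judge_score`, `summary_tokens`, `category`), the inclusive token bounds, and the choice of `category` as the stratification key are assumptions; the card does not state which attribute the split was stratified on.

```python
import random
from collections import defaultdict


def keep(example, min_score=4, min_tokens=50, max_tokens=400):
    """Discard low-scoring responses and summaries outside the token range."""
    return (
        example["judge_score"] >= min_score
        and min_tokens <= example["summary_tokens"] <= max_tokens
    )


def stratified_split(examples, key, eval_fraction=0.1, seed=42):
    """Split each stratum separately so both sets keep similar proportions."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for example in examples:
        groups[example[key]].append(example)
    train, eval_set = [], []
    for _, items in sorted(groups.items()):
        rng.shuffle(items)
        n_eval = round(len(items) * eval_fraction)
        eval_set.extend(items[:n_eval])
        train.extend(items[n_eval:])
    return train, eval_set


# Toy data: every tenth example has a judge score below the threshold.
examples = [
    {
        "id": i,
        "judge_score": 3 if i % 10 == 0 else 8,
        "summary_tokens": 120,
        "category": "malware" if i < 30 else "normal",
    }
    for i in range(40)
]
kept = [ex for ex in examples if keep(ex)]
train, eval_set = stratified_split(kept, key="category")
print(len(kept), len(train), len(eval_set))
```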
Training Procedure
| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen2.5-1.5B-Instruct |
| Training method | SFT (Supervised Fine-Tuning) |
| Framework | Unsloth + TRL SFTTrainer |
| LoRA rank (r) | 64 |
| LoRA alpha | 64 |
| LoRA dropout | 0.0 |
| RSLoRA | enabled (required at r=64) |
| LoRA targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Sequence length | 4096 |
| Batch size | 1 (effective: 16 via gradient accumulation) |
| Gradient accumulation steps | 16 |
| Learning rate | 2e-5 |
| LR scheduler | cosine |
| Warmup steps | 20 |
| Weight decay | 0.01 |
| Epochs | 3 |
| Optimizer | adamw_8bit |
| Precision | BF16 |
| Quantization (training) | 4bit |
| Hardware | A100 80GB, 20GB MIG slice (e-infra.cz cloud) |
| Response masking | train_on_responses_only — loss computed on assistant turns only |
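Two values in the table interact in ways that are easy to miss: the LoRA scaling factor under RSLoRA and the effective batch size. The short calculation below works them out from the table, assuming a single training device. Rank-stabilized LoRA scales the adapter update by alpha / sqrt(r) rather than alpha / r.

```python
import math

# Values taken from the training table above.
r = 64
lora_alpha = 64
per_device_batch_size = 1
gradient_accumulation_steps = 16

standard_scale = lora_alpha / r            # classic LoRA scaling: alpha / r
rslora_scale = lora_alpha / math.sqrt(r)   # rank-stabilized scaling: alpha / sqrt(r)

# Assumes one device; with more devices this would be multiplied accordingly.
effective_batch = per_device_batch_size * gradient_accumulation_steps

print(f"standard LoRA scale: {standard_scale}")
print(f"RSLoRA scale:        {rslora_scale}")
print(f"effective batch:     {effective_batch}")
```

With r = alpha = 64, the adapter update is scaled by 8 under RSLoRA instead of 1 under the classic rule.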
Framework Versions
- Unsloth: 2026.3.18
- Transformers: (auto-detected)
- PyTorch: (auto-detected)
Evaluation
Evaluated on 47 held-out Slips IDS incidents using gpt-oss-120b as an independent LLM-as-judge. For each incident, the judge scored all 5 model responses together on a 1–10 scale and ranked them, with model labels randomized per incident to prevent position bias. Inference was performed with the merged prompt format (instructions + DAG combined in a single user message, no system prompt) at 4096 max input tokens.
Overall Results
| Rank | Model | Avg Score | Avg Position | Win Rate | Wins |
|---|---|---|---|---|---|
| 1 | GPT-4o-mini | 6.89/10 | 1.81 | 42.6% | 20 |
| 2 | GPT-4o | 5.87/10 | 2.38 | 29.8% | 14 |
| 3 | Qwen2.5-1.5B (finetuned) | 4.70/10 | 3.21 | 19.1% | 9 |
| 4 | Qwen2.5 3B (baseline) | 4.57/10 | 3.40 | 8.5% | 4 |
| 5 | Qwen2.5 1B (baseline) | 3.36/10 | 4.19 | 0.0% | 0 |
The finetuned 1.5B model beats both untuned baselines (+1.34 avg score vs Qwen2.5 1B, +0.13 vs Qwen2.5 3B) and achieves a 19.1% win rate — higher than the 3B baseline (8.5%).
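The columns in the table above can be derived from per-incident judge output. The sketch below shows one way to aggregate them; the input layout (a list of `{model: (score, position)}` dictionaries) and the toy values are assumptions for illustration, not the actual evaluation data.

```python
from collections import defaultdict


def aggregate(results):
    """Aggregate per-incident (score, position) pairs into per-model metrics."""
    scores = defaultdict(list)
    positions = defaultdict(list)
    wins = defaultdict(int)
    for incident in results:
        for model, (score, position) in incident.items():
            scores[model].append(score)
            positions[model].append(position)
            if position == 1:  # a win means being ranked first for the incident
                wins[model] += 1
    n_incidents = len(results)
    return {
        model: {
            "avg_score": sum(scores[model]) / len(scores[model]),
            "avg_position": sum(positions[model]) / len(positions[model]),
            "wins": wins[model],
            "win_rate": wins[model] / n_incidents,
        }
        for model in scores
    }


# Toy judge output for three incidents and two models.
results = [
    {"finetuned": (6, 1), "baseline": (4, 2)},
    {"finetuned": (5, 2), "baseline": (7, 1)},
    {"finetuned": (7, 1), "baseline": (3, 2)},
]
stats = aggregate(results)
print(stats["finetuned"])

# The reported win rate follows directly from the win count: 9 of 47 incidents.
reported_win_rate = round(100 * 9 / 47, 1)
print(reported_win_rate)
```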
By Complexity
| Complexity | Events | Finetuned Score | GPT-4o-mini Score | GPT-4o Score |
|---|---|---|---|---|
| Simple | < 500 (31 incidents) | 5.45/10 | 6.74/10 | 5.61/10 |
| Medium | 500–1999 (7 incidents) | 3.43/10 | 6.71/10 | 5.71/10 |
| Complex | ≥ 2000 (9 incidents) | 3.11/10 | 7.56/10 | 6.89/10 |
On simple incidents the finetuned model is competitive with GPT-4o (5.45 vs 5.61). Medium and complex incidents are the primary weakness, consistent with context length limitations at 4096 tokens.
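As a consistency check, the per-tier scores can be combined into an incident-weighted average, which should land close to the overall scores reported earlier (up to rounding of the per-tier values). The numbers below are copied from the complexity table.

```python
# (incident count, score) per complexity tier, copied from the table above.
tiers = {
    "finetuned":   {"simple": (31, 5.45), "medium": (7, 3.43), "complex": (9, 3.11)},
    "gpt-4o-mini": {"simple": (31, 6.74), "medium": (7, 6.71), "complex": (9, 7.56)},
    "gpt-4o":      {"simple": (31, 5.61), "medium": (7, 5.71), "complex": (9, 6.89)},
}


def weighted_average(per_tier):
    """Average the tier scores, weighting each tier by its incident count."""
    total = sum(count for count, _ in per_tier.values())
    return sum(count * score for count, score in per_tier.values()) / total


overall = {model: weighted_average(per_tier) for model, per_tier in tiers.items()}
for model, score in overall.items():
    print(f"{model}: {score:.2f}")
```

The large share of simple incidents (31 of 47) explains why the overall finetuned score sits well above its medium and complex tier scores.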
By Category
| Category | Finetuned Score | Finetuned Win Rate |
|---|---|---|
| Malware (45 incidents) | 4.82/10 | 20.0% |
| Normal (2 incidents) | 2.00/10 | 0.0% |
Readability
An automated readability analysis on the 47 held-out incidents shows the model achieves a compression ratio of 0.26 with 373 abstracted bullets, 256 verbatim lines, and 44 markdown fences — indicating the model learned to paraphrase and summarize rather than echo the input DAG.
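The exact definitions behind these readability numbers are not given here. The sketch below shows one plausible way to compute them, assuming that the compression ratio is summary length divided by input length (in characters), that a verbatim line is a summary line copied exactly from the input, and that an abstracted bullet is a bullet line that is not verbatim. All three definitions are assumptions.

```python
FENCE = "`" * 3  # markdown code fence marker


def readability_stats(dag_text, summary):
    """Compute rough readability metrics for one (input, summary) pair."""
    input_lines = {line.strip() for line in dag_text.splitlines() if line.strip()}
    summary_lines = [line.strip() for line in summary.splitlines() if line.strip()]
    verbatim = sum(1 for line in summary_lines if line in input_lines)
    bullets = sum(
        1
        for line in summary_lines
        if line.startswith("•") and line not in input_lines
    )
    fences = sum(1 for line in summary_lines if line.startswith(FENCE))
    return {
        "compression_ratio": len(summary) / len(dag_text),
        "abstracted_bullets": bullets,
        "verbatim_lines": verbatim,
        "markdown_fences": fences,
    }


dag_text = (
    "Incident: abc123\n"
    "14:00 | conn to 1.2.3.4:443\n"
    "14:01 | conn to 1.2.3.4:443\n"
    "14:02 | conn to 1.2.3.4:443\n"
)
summary = (
    "Incident: abc123\n"
    "• 14:00-14:02 - Repeated connections to 1.2.3.4 on port 443 [MEDIUM]\n"
)
stats = readability_stats(dag_text, summary)
print(stats)
```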
Known Limitation: Complex Incident Performance
The model struggles on medium and complex incidents (≥ 500 events), scoring 3.43/10 (medium) and 3.11/10 (complex) with 0 wins in both tiers. Large DAGs exceed the effective 4096-token context window, resulting in inference errors on the largest inputs. Reducing the input token limit to match the training sequence length mitigates but does not fully resolve this.
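One way to keep large DAGs within the token limit is to truncate the event list before building the prompt. The sketch below keeps the incident header and as many leading event lines as fit in a budget. It is an illustration only: `truncate_events` is a hypothetical helper, and the whitespace-based counter is a stand-in for the real tokenizer (for example `len(tokenizer(text)["input_ids"])`).

```python
def whitespace_tokens(text):
    """Stand-in token counter; swap in the model tokenizer for real use."""
    return len(text.split())


def truncate_events(header_lines, event_lines, max_tokens, count_tokens):
    """Keep the header plus as many leading event lines as fit in max_tokens.

    Returns the kept lines and the number of dropped event lines.
    """
    budget = max_tokens - sum(count_tokens(line) for line in header_lines)
    kept = []
    for line in event_lines:
        cost = count_tokens(line)
        if cost > budget:
            break  # stop at the first event that no longer fits
        kept.append(line)
        budget -= cost
    dropped = len(event_lines) - len(kept)
    return header_lines + kept, dropped


header = ["Incident: abc123"]
events = [f"14:{minute:02d} | scan of port 22" for minute in range(10)]
lines, dropped = truncate_events(header, events, max_tokens=20, count_tokens=whitespace_tokens)
print(len(lines), dropped)
```

Truncation discards later events, so the resulting summary covers only part of the incident. This is consistent with the note above that limiting input length mitigates the problem without resolving it.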
Intended Use
- Automated triage of Slips IDS alerts for security analysts
- First-pass summarization of network incident logs
- Input to downstream reporting or ticketing workflows
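For downstream reporting or ticketing, the structured output can be parsed back into fields. The sketch below extracts the `Severity breakdown:` line defined in the output format and derives the highest severity present; the helper names and the sample summary are hypothetical, and real model output may deviate from the format and need more defensive parsing.

```python
import re

# Ordered from most to least severe, matching the labels used by the model.
SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO")


def parse_severity_breakdown(summary):
    """Return {SEVERITY: count} parsed from the 'Severity breakdown:' line."""
    for line in summary.splitlines():
        if line.strip().lower().startswith("severity breakdown:"):
            pairs = re.findall(
                r"(critical|high|medium|low|info)\s*:\s*(\d+)",
                line,
                flags=re.IGNORECASE,
            )
            return {name.upper(): int(count) for name, count in pairs}
    return {}  # line missing: the output did not follow the expected format


def highest_severity(breakdown):
    """Return the most severe label with a non-zero count, or None."""
    for level in SEVERITIES:
        if breakdown.get(level, 0) > 0:
            return level
    return None


sample = (
    "Incident: abc123\n"
    "• 14:00-14:30 - Repeated port scans against internal hosts [HIGH]\n"
    "Total Evidence: 10 events\n"
    "Severity breakdown: High: 5, Medium: 3, Info: 2\n"
)
breakdown = parse_severity_breakdown(sample)
print(breakdown, highest_severity(breakdown))
```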
Out-of-Scope Use
- General-purpose chat or instruction following
- Security domains outside network IDS (malware analysis, vulnerability scanning, etc.)
- Non-English inputs
Citation
@misc{qwen2.5-1.5b-slips-immune,
title = {Qwen2.5-1.5B fine-tuned for Slips IDS security summarization},
author = {Stratosphere Laboratory, CTU Prague},
year = {2026},
howpublished = {\url{https://huggingface.co/stratosphere/qwen2.5-1.5b-slips-immune-summarization}}
}
Model Details
- Model size: 1.5B params
- Tensor type: FP16
- License: Apache-2.0
- Tags: Text Generation, Transformers, Safetensors, Network Security, IDS, SLIPS, Summarization, Cybersecurity, LoRA, SFT, TRL, Unsloth
Model tree for stratosphere/qwen2.5-1.5b-slips-immune-summarization
Base model
Qwen/Qwen2.5-1.5B