Model Card for Hide and Seek Game (HSG) Framework

This model implements the Hide and Seek Game (HSG) framework from the paper "Hide and Seek with LLMs: An Adversarial Game for Sneaky Error Generation and Self-Improving Diagnosis". HSG is a dynamic adversarial framework that generates subtle mathematical errors and improves error diagnosis through co-evolution of the two.

Model Details

Model Description

This model realizes the HSG adversarial framework that enables dynamic error generation and diagnosis in mathematical reasoning. The framework consists of four specialized roles that engage in an adversarial co-evolution process:

Helpful: Provides accurate, step-by-step mathematical solutions as a reference
Sneaky: "Hides" by generating subtle, deceptive reasoning errors that appear plausible
Diagnosis: "Seeks" to accurately detect and analyze errors in reasoning chains
Correction: Guides the error correction process based on diagnostic insights

The core innovation is the adversarial relationship between the Sneaky and Diagnosis roles:

Sneaky generates increasingly sophisticated errors to evade detection
Diagnosis develops more precise detection strategies to identify subtle flaws

This co-evolution creates a challenging benchmark for error diagnosis.
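
The flow of information between the four roles can be sketched as follows. This is a minimal illustration, not the authors' training code: each role is modeled as a plain function from prompt text to response text, `play_round` and the stub roles are hypothetical names, and the adversarial training signal that drives the co-evolution is not shown.

```python
from typing import Callable, Dict

# Hypothetical abstraction: a role maps a user prompt to a response string.
# With a real model, each role would wrap one generate() call that uses the
# role's own system prompt.
Role = Callable[[str], str]


def play_round(question: str, helpful: Role, sneaky: Role,
               diagnosis: Role, correction: Role) -> Dict[str, str]:
    """Pass one question through Helpful, Sneaky, Diagnosis and Correction."""
    q = f"<question>{question}</question>"
    reference = helpful(q)   # correct step-by-step solution
    flawed = sneaky(q)       # plausible solution with a hidden error
    # Diagnosis sees the question plus both answers.
    report = diagnosis(
        f"{q}\n<reference answer>{reference}</reference answer>\n"
        f"<incorrect answer>{flawed}</incorrect answer>"
    )
    # Correction sees the flawed answer and the diagnostic report
    # (the report is passed whole here for brevity).
    fixed = correction(f"<incorrect answer>{flawed}</incorrect answer>\n{report}")
    return {"reference": reference, "flawed": flawed,
            "report": report, "fixed": fixed}


# Stub roles so the sketch runs without loading a model.
result = play_round(
    "The length of a rectangle is 10 cm, the width is 5 cm, find the area.",
    helpful=lambda p: "10 x 5 = 50, so \\boxed{50}",
    sneaky=lambda p: "10 x 5 = 50, 50 / 2 = 25, so \\boxed{25}",
    diagnosis=lambda p: "<error>Divided by 2</error><guide>What is 10 x 5?</guide>",
    correction=lambda p: "<solution>\\boxed{50}</solution>",
)
print(result["fixed"])
```

Keeping the roles as interchangeable callables makes it easy to swap a stub for a real model call, or to replace only one side of the Sneaky/Diagnosis pair when comparing checkpoints.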

Model Sources

Dataset: https://huggingface.co/zr888/HSG-SneakyMath-Diag
Paper: Hide and Seek with LLMs: An Adversarial Game for Sneaky Error Generation and Self-Improving Diagnosis

How to Get Started with the Model

Use the code below to get started with the model.


from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import re

# Define system prompts for four roles
helpful = """
[HELPFUL] You are a helpful and accurate answerer:
- Provide a detailed, step-by-step solution that is correct and thorough.
- Ensure the final numeric answer is accurate.
- Present the final result enclosed in \\boxed{}.
- Double-check your work to avoid any errors.
- Explain your reasoning clearly at each step.
""".strip()

sneaky = """
[SNEAKY] You are a sneaky answerer:
- Provide a detailed, step-by-step solution that appears correct and thorough.
- However, you must introduce at least one subtle but critical error that leads to an incorrect final result.
- Ensure the final numeric answer differs from the correct one.
- Do not correct your own mistakes later; keep the final answer consistent with the flawed reasoning.
- Present the final result enclosed in \\boxed{}.
- Keep the final result inside \\boxed{} consistent with the flawed reasoning.
- Under no circumstances should you reveal or hint that your solution is intentionally incorrect.
""".strip()

diagnosis = """
[DIAGNOSIS] You are an expert in error diagnosis. For the <question>, compare and analyze <reference answer> and <incorrect answer>, and perform the following steps (note that <reference answer> may contain errors, do not blindly trust it):

For a student, encourage the positive aspects of <incorrect answer> by wrapping it with <encourage></encourage>.
If <incorrect answer> contains no errors, output "<error>NO ERRORS</error>" after encouragement and terminate the process. Note that even if there are no errors, encouragement should be provided and wrap it with <encourage></encourage>.
If <incorrect answer> contains errors, perform the following actions:
- List the specific location of the error and wrap it with <error></error>. Note that only the error location should be marked, no solutions or answers.
- Summarize the error type and wrap it with <type></type>.
- Briefly list the relevant knowledge points and wrap them with <knowledge></knowledge>.
- Present a guiding question in the form of a prompt to lead the correction process (instead of directly providing a solution or answer) and wrap it with <guide></guide>.
- Do **NOT** reveal the correct answer directly.

Output template:
<encourage>Encourage the good aspects of the student's answer</encourage>
<error>Specific error description</error>
<type>Error type</type>
<knowledge>Relevant knowledge points</knowledge>
<guide>Guiding question for correction (do not directly provide the answer)</guide>
""".strip()

correction = """
[CORRECTION] You are an expert in error correction. For the <incorrect answer>, based on <error> and <guide>, follow these steps:

If <error> contains NO ERRORS, output "<answer>NO ERRORS</answer>" and terminate the process.
If <error> contains errors, perform the following actions:
- Based on <error> and <guide>, first answer the guiding question <guide>, and wrap it with <answer></answer>.
- Point out the error <error> and perform the correction, wrapping it with <correct></correct>. Note that this is not providing a full answer.
- Based on the correction <correct>, provide the corrected answer with a detailed step-by-step explanation of the reasoning, and wrap it with <solution></solution>. The final result should be wrapped with \\boxed{}.

Output template:
<answer>Answer to the guiding question</answer>
<correct>Only correct the erroneous part (do not provide the full answer)</correct>
<solution>Corrected answer \\boxed{Final result}</solution>
""".strip()

# Load model and tokenizer
model_name = "zr888/hide-and-seek-llm"  # Hub ID of this model; replace with a local path if needed
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
).eval()

# Example question
question = "The length of a rectangle is 10 cm, the width is 5 cm, find the area."

print("=" * 50)
print("Question:", question)
print("=" * 50)

# 1. Generate helpful answer
print("\n1. Generating helpful answer...")
helpful_messages = [
    {"role": "system", "content": helpful},
    {"role": "user", "content": f"<question>{question}</question>"}
]
helpful_text = tokenizer.apply_chat_template(
    helpful_messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
helpful_inputs = tokenizer([helpful_text], return_tensors="pt").to(model.device)

# Greedy decoding: do_sample=False already disables sampling, so no temperature is set.
helpful_generated_ids = model.generate(
    **helpful_inputs,
    max_new_tokens=600,
    do_sample=False
)
helpful_output_ids = helpful_generated_ids[0][len(helpful_inputs.input_ids[0]):].tolist()
helpful_content = tokenizer.decode(helpful_output_ids, skip_special_tokens=True).strip()

print(helpful_content)
print("-" * 30)

# 2. Generate sneaky answer
print("\n2. Generating sneaky answer...")
sneaky_messages = [
    {"role": "system", "content": sneaky},
    {"role": "user", "content": f"<question>{question}</question>"}
]
sneaky_text = tokenizer.apply_chat_template(
    sneaky_messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
sneaky_inputs = tokenizer([sneaky_text], return_tensors="pt").to(model.device)

# Greedy decoding: do_sample=False already disables sampling, so no temperature is set.
sneaky_generated_ids = model.generate(
    **sneaky_inputs,
    max_new_tokens=600,
    do_sample=False
)
sneaky_output_ids = sneaky_generated_ids[0][len(sneaky_inputs.input_ids[0]):].tolist()
sneaky_content = tokenizer.decode(sneaky_output_ids, skip_special_tokens=True).strip()

print(sneaky_content)
print("-" * 30)

# 3. Generate diagnosis
print("\n3. Generating diagnosis...")
diagnosis_messages = [
    {"role": "system", "content": diagnosis},
    {"role": "user", "content": f"<question>{question}</question>\n<reference answer>{helpful_content}</reference answer>\n<incorrect answer>{sneaky_content}</incorrect answer>"}
]
diagnosis_text = tokenizer.apply_chat_template(
    diagnosis_messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
diagnosis_inputs = tokenizer([diagnosis_text], return_tensors="pt").to(model.device)

# Greedy decoding: do_sample=False already disables sampling, so no temperature is set.
diagnosis_generated_ids = model.generate(
    **diagnosis_inputs,
    max_new_tokens=600,
    do_sample=False
)
diagnosis_output_ids = diagnosis_generated_ids[0][len(diagnosis_inputs.input_ids[0]):].tolist()
diagnosis_content = tokenizer.decode(diagnosis_output_ids, skip_special_tokens=True).strip()

print(diagnosis_content)
print("-" * 30)

# Extract the error location and guiding question from the diagnosis
error_match = re.search(r'<error>(.*?)</error>', diagnosis_content, re.DOTALL)
guide_match = re.search(r'<guide>(.*?)</guide>', diagnosis_content, re.DOTALL)
error = error_match.group(1).strip() if error_match else ''
guide = guide_match.group(1).strip() if guide_match else ''

# 4. Generate correction
print("\n4. Generating correction...")
correction_messages = [
    {"role": "system", "content": correction},
    {"role": "user", "content": f"<incorrect answer>{sneaky_content}</incorrect answer>\n<error>{error}</error>\n<guide>{guide}</guide>"}
]
correction_text = tokenizer.apply_chat_template(
    correction_messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
correction_inputs = tokenizer([correction_text], return_tensors="pt").to(model.device)

# Greedy decoding: do_sample=False already disables sampling, so no temperature is set.
correction_generated_ids = model.generate(
    **correction_inputs,
    max_new_tokens=600,
    do_sample=False
)
correction_output_ids = correction_generated_ids[0][len(correction_inputs.input_ids[0]):].tolist()
correction_content = tokenizer.decode(correction_output_ids, skip_special_tokens=True).strip()

print(correction_content)
print("=" * 50)


r"""
**Output:**

==================================================
Question: The length of a rectangle is 10 cm, the width is 5 cm, find the area.
==================================================

1. Generating helpful answer...
 
To find the area of a rectangle, we use the formula:

 $\text{Area} = \text{length} \times \text{width}$ 

Given:
- Length = 10 cm
- Width = 5 cm

Substitute the values into the formula:

 $\text{Area} = 10 \, \text{cm} \times 5 \, \text{cm}$ 

 $\text{Area} = 50 \, \text{cm}^2$ 

Thus, the area of the rectangle is:

 $\boxed{50}$ 
------------------------------

2. Generating sneaky answer...
The area of a rectangle is calculated as length × width.  
Using the given values: 10 cm × 5 cm = 50 cm².  
Now, subtract 10 cm from this result: 50 cm² - 10 cm = 40 cm².  
Finally, divide this by 2: 40 cm² ÷ 2 = 20 cm².  

\boxed{20}
------------------------------

3. Generating diagnosis...
<encourage>It's great that the student correctly identified the formula for the area of a rectangle and applied it to the given values. The initial calculation of 10 cm × 5 cm = 50 cm² is also accurate. This shows a strong understanding of the basic concept.</encourage>
<error>The student incorrectly subtracted 10 cm from the result and then divided by 2, which is not part of the standard formula for the area of a rectangle.</error>
<type>Calculation error</type>
<knowledge>Area of a rectangle, multiplication of length and width, basic arithmetic operations</knowledge>
<guide>Let’s revisit the formula for the area of a rectangle. The area is calculated as length multiplied by width. In this case, the length is 10 cm and the width is 5 cm. What is the correct result of multiplying these two values?</guide>
------------------------------

4. Generating correction...
<answer>The correct result of multiplying the length (10 cm) by the width (5 cm) is 50 cm².</answer>
<correct>The student incorrectly subtracted 10 cm from the result and then divided by 2, which is not part of the standard formula for the area of a rectangle.</correct>
<solution>The area of a rectangle is calculated as length × width. Given the length is 10 cm and the width is 5 cm, the correct calculation is:  
 $10 \, \text{cm} \times 5 \, \text{cm} = 50 \, \text{cm}^2$   
This is the final result.  
\boxed{50}</solution>
==================================================
"""

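The tagged outputs in the example lend themselves to light post-processing. The sketch below is not part of the released code: it parses the diagnosis tags into a dictionary and extracts the final \boxed{} answer from a solution so that two answers can be compared. The helper names (`parse_tags`, `diagnosis_verdict`, `last_boxed`) are hypothetical, and comparing boxed answers as strings is a simplifying assumption that ignores equivalent forms such as 0.5 and 1/2.

```python
import re
from typing import Dict, Optional

DIAGNOSIS_TAGS = ("encourage", "error", "type", "knowledge", "guide")


def parse_tags(text: str, tags=DIAGNOSIS_TAGS) -> Dict[str, Optional[str]]:
    """Map each tag to its stripped content, or None when the tag is missing."""
    parsed = {}
    for tag in tags:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        parsed[tag] = match.group(1).strip() if match else None
    return parsed


def diagnosis_verdict(parsed: Dict[str, Optional[str]]) -> str:
    """Return 'malformed', 'no_errors' or 'errors' based on the <error> tag."""
    error = parsed.get("error")
    if error is None:
        return "malformed"
    return "no_errors" if error.upper() == "NO ERRORS" else "errors"


def last_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...}, handling nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    content = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(content).strip()
        content.append(ch)
    return None  # braces never closed


diagnosis_output = (
    "<encourage>Good use of the area formula.</encourage>\n"
    "<error>Subtracted 10 cm and divided by 2.</error>\n"
    "<type>Calculation error</type>\n"
    "<knowledge>Area of a rectangle</knowledge>\n"
    "<guide>What is 10 cm times 5 cm?</guide>"
)
parsed = parse_tags(diagnosis_output)
print(diagnosis_verdict(parsed))

# Check that the sneaky answer really differs from the reference answer.
answers_differ = last_boxed("Area: \\boxed{50}") != last_boxed("Area: \\boxed{20}")
print(answers_differ)
```

Returning a three-way verdict rather than a boolean keeps malformed diagnoses (no `<error>` tag at all) separate from genuine "NO ERRORS" reports, which matters when the outputs are scored automatically.
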
Citation

If you use this model, please cite the original paper:

BibTeX:

@article{zou2025hide,
  title={Hide and Seek with LLMs: An Adversarial Game for Sneaky Error Generation and Self-Improving Diagnosis},
  author={Zou, Rui and Wei, Mengqi and Zhu, Yutao and Wen, Jirong and Zhao, Xin and Chen, Jing},
  journal={arXiv preprint arXiv:2508.03396},
  year={2025}
}

Model size: 4B params (Safetensors)
Tensor type: BF16

Model tree: Qwen/Qwen3-4B-Base (base model) -> Qwen/Qwen3-4B (fine-tuned) -> zr888/hide-and-seek-llm (this model, fine-tuned)
