Instructions for using AbdulrhmanEldeeb/lora_model with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use AbdulrhmanEldeeb/lora_model with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AbdulrhmanEldeeb/lora_model")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("AbdulrhmanEldeeb/lora_model", dtype="auto")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use AbdulrhmanEldeeb/lora_model with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "AbdulrhmanEldeeb/lora_model"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AbdulrhmanEldeeb/lora_model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```shell
docker model run hf.co/AbdulrhmanEldeeb/lora_model
```
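The vLLM endpoint above is OpenAI-compatible, so the same call can be made from Python rather than curl. A minimal sketch that only builds the JSON request body shown in the curl example (the helper name `build_completion_request` is ours, not part of vLLM; actually POSTing it, e.g. with `requests`, is left as a comment):

```python
import json

def build_completion_request(model: str, prompt: str,
                             max_tokens: int = 512,
                             temperature: float = 0.5) -> str:
    """Build the JSON body for an OpenAI-compatible /v1/completions call."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })

body = build_completion_request("AbdulrhmanEldeeb/lora_model", "Once upon a time,")
# POST `body` to http://localhost:8000/v1/completions with the header
# Content-Type: application/json (e.g. via requests.post or urllib).
```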
- SGLang
How to use AbdulrhmanEldeeb/lora_model with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "AbdulrhmanEldeeb/lora_model" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AbdulrhmanEldeeb/lora_model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "AbdulrhmanEldeeb/lora_model" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AbdulrhmanEldeeb/lora_model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Unsloth Studio
How to use AbdulrhmanEldeeb/lora_model with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
# Install Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for AbdulrhmanEldeeb/lora_model to start chatting.
```
Install Unsloth Studio (Windows)
```powershell
# Install Unsloth Studio:
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for AbdulrhmanEldeeb/lora_model to start chatting.
```
Using HuggingFace Spaces for Unsloth
No setup required: open https://huggingface.co/spaces/unsloth/studio in your browser and search for AbdulrhmanEldeeb/lora_model to start chatting.
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="AbdulrhmanEldeeb/lora_model",
    max_seq_length=2048,
)
```

- Docker Model Runner
How to use AbdulrhmanEldeeb/lora_model with Docker Model Runner:
```shell
docker model run hf.co/AbdulrhmanEldeeb/lora_model
```
Fine-Tuned Llama Model for Metallurgy and Materials Science
- Developed by: Abdulrhman37
- License: Apache-2.0
- Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit
This fine-tuned Llama model specializes in metallurgy, materials science, and engineering. It has been enhanced to provide precise and detailed responses to technical queries, making it a valuable tool for professionals, researchers, and enthusiasts in the field.
Training Details
This model was fine-tuned with:
- Unsloth: Enabled 2x faster training using efficient parameter optimization.
- Hugging Face TRL: Used for advanced fine-tuning and training capabilities.
Fine-tuning focused on enhancing domain-specific knowledge using a dataset curated from various metallurgical research and practical case studies.
For a detailed walkthrough of the fine-tuning process, refer to this notebook.
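The parameter-efficient training that Unsloth accelerates here is LoRA (the low-rank adapters this repo's name refers to). The core idea can be sketched in a few lines of plain Python; the dimensions and values below are toy choices for illustration, not anything from the actual training run:

```python
# Toy illustration of the LoRA idea: instead of updating the full weight
# matrix W, train a low-rank update B @ A whose output is added to W's.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=1.0, r=1):
    """y = W x + (alpha / r) * B (A x) -- a LoRA-adapted linear layer."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 1.0]]               # rank-1 down-projection (r = 1)
B = [[0.0], [0.0]]             # up-projection, initialized to zero
x = [2.0, 3.0]

# With B = 0 the adapter is a no-op, exactly as at the start of training:
print(lora_forward(W, A, B, x))  # [2.0, 3.0]
```

Only A and B are trained, which is why the adapter checkpoint is tiny compared to the 8B-parameter base model.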
Features
- Supports text generation with scientific and technical insights.
- Provides domain-specific reasoning with references to key metallurgical principles and mechanisms.
- Built for fast inference with bnb-4bit quantization for optimized performance.
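The memory benefit of the bnb-4bit quantization mentioned above follows from simple arithmetic: 4-bit weights take a quarter of the space of fp16 weights. A back-of-envelope sketch (the parameter count is an approximation for Llama 3.1 8B, and quantization constants and activation memory are ignored):

```python
# Rough memory arithmetic behind the bnb-4bit claim.
params = 8_030_000_000  # approximate parameter count of Llama 3.1 8B

gib = 1024 ** 3
fp16_gib = params * 2 / gib     # fp16: 2 bytes per weight
int4_gib = params * 0.5 / gib   # 4-bit: 0.5 bytes per weight

print(f"fp16: ~{fp16_gib:.1f} GiB, 4-bit: ~{int4_gib:.1f} GiB")
```

This is why the 4-bit model fits on a single consumer GPU where the fp16 weights alone would not.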
Example Use Cases
- Material property analysis (e.g., "How does adding rare earth elements affect magnesium alloys?").
- Failure mechanism exploration (e.g., "What causes porosity in gas metal arc welding?").
- Corrosion prevention methods (e.g., "How does cathodic protection work in marine environments?").
How to Use
- Install dependencies:

```python
%%capture
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
```
- Load the model:

```python
from unsloth import FastLanguageModel
import torch

metallurgy_prompt = """You are a highly knowledgeable assistant specializing in metallurgy, materials science,
and engineering. Below is a technical instruction. Your task is to provide an accurate, domain-specific response that appropriately addresses the request.
Ensure your response is detailed, provide scientifically rigorous and quantitative responses, reference fundamental principles and mechanisms,
include potential equations, calculations, or microstructural insights where relevant, support statements with scientific reasoning,
and discuss potential variations or alternative interpretations.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

max_seq_length = 2048  # Choose any! RoPE scaling is supported automatically.
dtype = None  # None for auto detection; float16 for Tesla T4/V100, bfloat16 for Ampere+.
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Abdulrhman37/lora_model",  # the model you used for training
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
```
- Use the fine-tuned model:
```python
# Function to process a question
def answer(q: str):
    """
    Generates a detailed response to a metallurgy-related question using the
    fine-tuned language model.

    Args:
        q (str): The question or instruction to be answered.

    Returns:
        str: The generated response from the model, specifically the content
        after "### Response:".
    """
    # Prepare the language model for fast inference
    FastLanguageModel.for_inference(model)  # Enables 2x faster native inference

    # Format the input question using the metallurgy prompt template
    inputs = tokenizer(
        [
            metallurgy_prompt.format(
                q,   # Instruction: the main question
                "",  # Input: empty, as no specific input is provided
                "",  # Output: placeholder for the generated response
            )
        ],
        return_tensors="pt",  # Return PyTorch tensors
    ).to("cuda")  # Move tensors to the GPU

    # Generate the model's output based on the formatted input
    outputs = model.generate(**inputs, use_cache=True)  # Cached decoding for speed

    # Decode the model's output into readable text
    result = tokenizer.batch_decode(outputs)

    # Split the result into the parts before and after "### Response:"
    split_content = result[0].split("### Response:")
    before_response = split_content[0].strip()  # Content before the response
    after_response = split_content[1].strip().replace("<|end_of_text|>", "")  # Clean response text

    # Detailed dictionary, useful for debugging or additional processing
    detailed = {
        "after_response": after_response,    # The generated response itself
        "before_response": before_response,  # The prompt echoed before the response
        "full_result": result,               # The full raw output from the model
    }

    # Return only the generated response content
    return detailed["after_response"]

# Ask the model a technical question
q = "To improve strength, toughness, and shock-resistance in Mg-Al-Mn system cast magnesium alloys (e.g. AM100A), what should I do?"
from pprint import pprint
pprint(answer(q))
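The post-processing inside `answer` can be exercised on its own, without a GPU. A small sketch of the same split-and-clean logic; the raw string below is a made-up stand-in for a decoded model output:

```python
def extract_response(decoded: str) -> str:
    """Return only the text after the '### Response:' marker,
    with the end-of-text token stripped."""
    after = decoded.split("### Response:")[1]
    return after.strip().replace("<|end_of_text|>", "")

raw = ("### Instruction:\nWhat is pearlite?\n"
       "### Response:\nA lamellar mixture of ferrite and cementite.<|end_of_text|>")
print(extract_response(raw))  # A lamellar mixture of ferrite and cementite.
```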
For further help using the model, follow this notebook.
Contact
For any inquiries, feedback, or collaboration opportunities, feel free to reach out:
- Email: abdodebo3@gmail.com
- GitHub
- Phone: +20 1026821545
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
