---
title: LexiMind
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: scripts/demo_gradio.py
pinned: false
---
## LexiMind: A Multi-Task NLP Model
LexiMind is a multi-task Natural Language Processing model for complex document understanding. It features a **custom-built Transformer architecture** initialized with weights from Google's **FLAN-T5**, combining the flexibility of a from-scratch implementation with the strength of a modern pre-trained model.
The model performs three tasks from a single shared backbone: **text summarization**, **emotion classification**, and **topic clustering**.
This project is built with industry-standard MLOps practices, including configuration management with Hydra, experiment tracking with MLflow, and containerization with Docker, making it reproducible and scalable.
## Core Features
* **Abstractive Summarization:** Generates concise, coherent summaries of long-form text using encoder-decoder attention.
* **Emotion Classification:** Identifies emotions (Joy, Sadness, Anger, Fear, Love, Surprise) conveyed in a document.
* **Topic Clustering:** Classifies documents into thematic categories (World, Sports, Business, Sci/Tech).
## Model Architecture
LexiMind implements a **from-scratch Transformer** with modern architectural choices:
### Custom Transformer Features
* **Pre-Layer Normalization (Pre-LN):** RMSNorm applied before each sublayer for stable training
* **FlashAttention:** Via PyTorch 2.0's `scaled_dot_product_attention` for efficient computation
* **Learned Positional Embeddings:** Trainable position representations
* **Multi-Head Attention:** 12 heads with 768-dimensional representations
* **RMSNorm:** Modern normalization without bias (more efficient than LayerNorm)
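To make the Pre-LN wiring concrete, here is a minimal, self-contained sketch of one such sublayer. It is illustrative rather than the project's actual code: the RMSNorm epsilon, the bias-free projections, and the class names are assumptions, while `scaled_dot_product_attention` dispatching to FlashAttention is standard PyTorch 2.x behavior.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Bias-free RMS normalization (eps value is assumed)."""
    def __init__(self, dim: int, eps: float = 1e-6) -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal RMS over the feature dimension; no mean
        # subtraction and no bias, which is why it is cheaper than LayerNorm.
        return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

class PreLNSelfAttention(nn.Module):
    """One Pre-LN sublayer: normalize -> attend -> add residual."""
    def __init__(self, d_model: int = 768, n_heads: int = 12) -> None:
        super().__init__()
        self.n_heads, self.head_dim = n_heads, d_model // n_heads
        self.norm = RMSNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(self.norm(x)).chunk(3, dim=-1)  # Pre-LN: norm first
        q, k, v = (z.view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v)  # FlashAttention path when available
        y = y.transpose(1, 2).reshape(b, t, d)
        return x + self.out(y)  # residual around the whole sublayer

print(PreLNSelfAttention()(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```
Normalizing before each sublayer, rather than after as in the original Transformer, keeps the residual stream well-scaled at depth, which is the usual reason Pre-LN models train more stably.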
### Pre-trained Weight Initialization
The model loads weights from **Google's FLAN-T5-base**, which provides:
* Strong language understanding from instruction-tuning
* Excellent performance on summarization and classification tasks
* Encoder-decoder architecture matching our custom implementation
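As a quick sanity check that FLAN-T5-base matches the dimensions in the table below, you can inspect the checkpoint directly. The tensor names here are the Hugging Face `transformers` T5 keys; the actual renaming and copying into the custom modules is handled by `src/models/factory.py` and is only gestured at in the final comment.
```python
from transformers import T5ForConditionalGeneration

t5 = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
sd = t5.state_dict()

# Shapes line up with the spec table: d_model=768, 12 heads, d_ff=2048.
print(sd["shared.weight"].shape)                                       # (32128, 768) token embeddings
print(sd["encoder.block.0.layer.0.SelfAttention.q.weight"].shape)      # (768, 768)
print(sd["encoder.block.0.layer.1.DenseReluDense.wi_0.weight"].shape)  # (2048, 768)

# factory.py's job, roughly: translate keys like the above to the custom
# module's names and copy each tensor whose shape matches.
```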
### Multi-Task Learning
A shared encoder-decoder backbone with task-specific heads:
* **Summarization Head:** Language modeling head with weight tying
* **Emotion Head:** Mean-pooled classification with dropout
* **Topic Head:** Mean-pooled classification with dropout
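The heads are small enough to show in full. This sketch uses the dimensions from the spec table and the label counts from the feature list; the 0.1 dropout rate and all names are assumptions, not the project's actual code.
```python
import torch
import torch.nn as nn

D_MODEL, VOCAB = 768, 32128      # FLAN-T5-base dimensions
N_EMOTIONS, N_TOPICS = 6, 4      # label counts from the feature list above

token_embedding = nn.Embedding(VOCAB, D_MODEL)
lm_head = nn.Linear(D_MODEL, VOCAB, bias=False)
lm_head.weight = token_embedding.weight        # weight tying: one shared matrix

emotion_head = nn.Sequential(nn.Dropout(0.1), nn.Linear(D_MODEL, N_EMOTIONS))
topic_head = nn.Sequential(nn.Dropout(0.1), nn.Linear(D_MODEL, N_TOPICS))

def classify(encoder_out: torch.Tensor, head: nn.Module) -> torch.Tensor:
    pooled = encoder_out.mean(dim=1)           # mean-pool over sequence positions
    return head(pooled)                        # (batch, num_classes) logits

enc = torch.randn(2, 128, D_MODEL)             # stand-in encoder output
print(classify(enc, emotion_head).shape)       # torch.Size([2, 6])
```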
## Technical Specifications
| Component | Specification |
|-----------|--------------|
| Architecture | Encoder-Decoder Transformer |
| Pre-trained Base | google/flan-t5-base |
| Hidden Dimension | 768 |
| Encoder Layers | 12 |
| Decoder Layers | 12 |
| Attention Heads | 12 |
| FFN Dimension | 2048 |
| Normalization | RMSNorm (Pre-LN) |
| Position Encoding | Learned Embeddings |
| Max Sequence Length | 512 tokens |
## Getting Started
### Prerequisites
* Python 3.10+
* Poetry for dependency management
* Docker (for containerized deployment)
* An NVIDIA GPU with CUDA support (for training and accelerated inference)
### Installation
1. **Clone the repository:**
```bash
git clone https://github.com/OliverPerrin/LexiMind.git
cd LexiMind
```
2. **Install dependencies:**
```bash
poetry install
```
3. **Download and preprocess data:**
```bash
poetry run python scripts/download_data.py
poetry run python scripts/preprocess_data.py
```
## Usage
### Configuration
All training and model parameters are managed via Hydra. Configurations are located in the `configs/` directory.
Available configurations:
* `model=base` - FLAN-T5-base (default, 12 layers)
* `model=small` - Smaller model for testing (no pretrained weights)
* `model=large` - FLAN-T5-large (24 layers, requires more VRAM)
* `training=dev` - Quick development run
* `training=medium` - Balanced training (~2-3 hours on an RTX 4070)
* `training=full` - Full training run
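Under the hood these options are ordinary Hydra config groups. A minimal sketch of how `scripts/train.py` plausibly consumes them (the `config_name` and the exact config keys are assumptions):
```python
import hydra
from omegaconf import DictConfig, OmegaConf

# config_path is relative to this file (scripts/train.py).
@hydra.main(config_path="../configs", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))  # composed view: cfg.model, cfg.training, cfg.data
    # Build the model from cfg.model and run the training loop from cfg.training.

if __name__ == "__main__":
    main()
```
Overrides like `training=medium` swap a whole config group, while dotted overrides like `training.optimizer.lr=5e-5` patch a single field; Hydra merges both into `cfg` before `main` runs.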
### Training
```bash
# Default training with FLAN-T5-base
poetry run python scripts/train.py
# Quick development run
poetry run python scripts/train.py training=dev
# Medium training run (recommended for RTX 4070)
poetry run python scripts/train.py training=medium
# Override parameters
poetry run python scripts/train.py training.optimizer.lr=5e-5
# Resume from a checkpoint
poetry run python scripts/train.py training=full resume_from=checkpoints/epoch_5.pt
```
Experiments are automatically tracked with MLflow. View results with `mlflow ui`.
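The logging itself is the standard MLflow API. A hedged sketch of what a run presumably records (run name, parameter keys, and values are illustrative):
```python
import mlflow

with mlflow.start_run(run_name="train-medium"):
    mlflow.log_params({"model": "base", "training": "medium", "lr": 5e-5})
    for step, loss in enumerate([2.31, 1.87, 1.52]):    # stand-in loss values
        mlflow.log_metric("train/loss", loss, step=step)
    mlflow.log_artifact("checkpoints/best.pt")          # assumes the checkpoint exists
```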
### Evaluation
```bash
poetry run python scripts/evaluate.py --checkpoint checkpoints/best.pt
```
### Inference & Demo
```bash
# Command-line inference
poetry run python scripts/inference.py "Your text to analyze"
# Gradio web demo
poetry run python scripts/demo_gradio.py
```
## Docker
```bash
# Build
docker build -t leximind .
# Run demo
docker run -p 7860:7860 leximind
```
## Project Structure
```text
├── configs/              # Hydra configuration files
│   ├── model/            # Model architectures (base, small, large)
│   ├── training/         # Training configs (dev, medium, full)
│   └── data/             # Dataset configurations
├── src/
│   ├── models/           # Custom Transformer implementation
│   │   ├── encoder.py    # TransformerEncoder with Pre-LN RMSNorm
│   │   ├── decoder.py    # TransformerDecoder with KV-cache
│   │   ├── attention.py  # Multi-Head Attention with FlashAttention
│   │   └── factory.py    # Model building with FLAN-T5 weight loading
│   ├── data/             # Data loading and preprocessing
│   ├── training/         # Training loop with mixed precision
│   └── inference/        # Inference pipeline
├── scripts/              # Entry points
├── tests/                # Unit tests
└── notebooks/            # Analysis notebooks
```
## Code Quality
* **Ruff:** Fast linting and formatting
* **MyPy:** Static type checking
* **Pytest:** Full test suite covering data, models, and training
* **Pre-commit hooks:** Automated quality checks
```bash
# Install hooks
poetry run pre-commit install
# Lint
poetry run ruff check .
# Type check
poetry run mypy .
# Tests
poetry run pytest
```
## Performance Optimizations
* **torch.compile:** JIT compilation with Inductor backend
* **Mixed Precision:** bfloat16 training on Ampere/Ada GPUs
* **TF32:** Enabled for RTX 30xx/40xx series
* **KV-Cache:** Efficient autoregressive decoding
* **FlashAttention:** Memory-efficient attention via SDPA
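Most of these switches are one-liners in PyTorch; this condensed sketch shows the standard calls, assuming a CUDA device, with a plain `nn.Linear` standing in for the full model:
```python
import torch

# TF32 matmuls on Ampere/Ada GPUs (RTX 30xx/40xx)
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

model = torch.nn.Linear(768, 768).cuda()   # stand-in for the Transformer
model = torch.compile(model)               # JIT via the Inductor backend (default)

x = torch.randn(8, 768, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)                           # bfloat16 mixed-precision forward
# The KV-cache and SDPA/FlashAttention live inside the model code
# (decoder.py, attention.py) and are not shown here.
```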
## License
MIT License - see [LICENSE](LICENSE) for details.