Update README.md (#4)
- Update README.md (d232994c04302cda8c2b5d0b891199816284a1d0)
Co-authored-by: Blake S <[email protected]>
README.md CHANGED

@@ -153,4 +153,6 @@ The following factors can influence MAI-DS-R1's behavior and performance:
 - **Architecture**: Based on DeepSeek-R1, a transformer-based autoregressive language model utilizing multi-head self-attention and Mixture-of-Experts (MoE) for scalable and efficient inference.
 - **Objective**: Post-trained to reduce CCP-aligned restrictions and enhance harm protection, while preserving the original model’s strong chain-of-thought reasoning and general-purpose language understanding capabilities.
 - **Pre-trained Model Base**: DeepSeek-R1 (671B)
 
+### Data Summary
+https://huggingface.co/microsoft/MAI-DS-R1-FP8/blob/main/data_summary_card.md
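For readers unfamiliar with the Mixture-of-Experts design referenced in the Architecture bullet, the sketch below shows a minimal top-k MoE layer in PyTorch. It is an illustrative toy, not DeepSeek-R1's actual implementation; the class name, expert sizes, and routing loop are assumptions chosen for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy top-k Mixture-of-Experts layer: a router scores all experts,
    only the top k run per token, and their outputs are combined using
    the normalized router weights. Illustrative only."""

    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating scores per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        scores = self.router(x)                     # (B, S, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e          # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out

# Only k of n_experts execute per token, so total parameter count scales
# with n_experts while per-token compute stays roughly constant.
moe = TopKMoE(d_model=64)
y = moe(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```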