Update README.md
README.md
CHANGED
@@ -23,7 +23,7 @@ We provide the [quantized pte](https://huggingface.co/pytorch/Phi-4-mini-instruc
 
 # Running in a Mobile App
 
-The [pte file](https://huggingface.co/pytorch/Phi-4-mini-instruct-parq-2w-4e-shared/blob/main/phi4_model_2bit.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://
+The [pte file](https://huggingface.co/pytorch/Phi-4-mini-instruct-parq-2w-4e-shared/blob/main/phi4_model_2bit.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/apple) for doing this in iOS. On iPhone 15 Pro, the model runs at 27 tokens/second and uses 1453 Mb of memory.
 
 # Quantization Recipe
 