Update README.md

Files changed (1)

README.md CHANGED

@@ -23,7 +23,7 @@ We provide the [quantized pte](https://huggingface.co/pytorch/Phi-4-mini-instruc
 # Running in a Mobile App
-The [pte file](https://huggingface.co/pytorch/Phi-4-mini-instruct-parq-2w-4e-shared/blob/main/phi4_model_2bit.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://docs.pytorch.org/executorch/0.7/llm/llama-demo-ios.html) for doing this in iOS. On iPhone 15 Pro, the model runs at 27 tokens/second and uses 1453 Mb of memory.
+The [pte file](https://huggingface.co/pytorch/Phi-4-mini-instruct-parq-2w-4e-shared/blob/main/phi4_model_2bit.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/apple) for doing this in iOS. On iPhone 15 Pro, the model runs at 27 tokens/second and uses 1453 Mb of memory.
 # Quantization Recipe