This tutorial guides you through running large language models (LLMs) on the NPU. The recommended operating system for this tutorial is available here. Credit to Joshua Riek for providing the OS.
First, clone the repository by running the following command:
git clone https://github.com/Pelochus/ezrknn-llm
Install the build dependencies:
sudo apt install -y git git-lfs python-is-python3 python3-pip python3-dev build-essential cmake libxslt1-dev zlib1g-dev libglib2.0-dev libsm6 libgl1-mesa-glx libprotobuf-dev libhdf5-dev
Navigate into the cloned directory and run the installation script:
cd ezrknn-llm && sudo bash install.sh
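After the install script finishes, it is worth confirming that the runtime binary actually landed on your PATH before moving on. The helper below is just a sketch (it is not part of install.sh); "rkllm" is the command this tutorial invokes later.

```shell
# Post-install sanity check: confirm a command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "not found: $1 (re-run install.sh and check its output)" >&2
        return 1
    fi
}
```

Run it with `require_cmd rkllm`; if it reports "not found", re-run the install script and look for errors in its output.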
Pelochus has provided a collection of pre-converted RKLLM models, which can be found here. For this example, we will use the phi-3-mini-rk3588 model.
Clone the model repository using the following command:
git clone LINK_HERE
Replace LINK_HERE with the actual link to the model you wish to use.
Navigate into the model's directory and pull the large files, if necessary:
cd model_folder
git lfs pull
Replace model_folder with the name of the directory you cloned.
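If `git lfs pull` is skipped or fails, the .rkllm file in the clone is only a tiny git-lfs pointer stub (a short text file), and rkllm will fail on it. The check below distinguishes a stub from a real model by size; the 1 MB threshold is an arbitrary sketch value, since real RKLLM models are hundreds of megabytes while pointer files are around 130 bytes.

```shell
# Guard against running rkllm on an un-pulled git-lfs pointer file.
check_model() {
    f="$1"
    if [ ! -f "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    size=$(wc -c < "$f")
    if [ "$size" -lt 1048576 ]; then
        echo "too small ($size bytes): $f looks like an LFS pointer; run 'git lfs pull'"
        return 1
    fi
    echo "ok: $f ($size bytes)"
}
```

Use it as `check_model ./yourmodel.rkllm` before launching the runtime.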
Finally, run the model using the following command:
rkllm ./pathtomodel.rkllm
Replace ./pathtomodel.rkllm
with the path to your RKLLM model file.
You should now be able to run your chosen LLM on the NPU.
You can monitor the NPU usage with:
watch -n 1 sudo cat /sys/kernel/debug/rknpu/load
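If you want a single summary number instead of the raw per-core line, a small filter can average the percentages. The three-core "Core0/Core1/Core2" format in the sample below is what RK3588 kernels typically print to /sys/kernel/debug/rknpu/load; treat that exact layout as an assumption, since the filter simply averages every "N%" figure it finds.

```shell
# Reads one NPU load line on stdin and prints the mean of all N% figures.
npu_avg() {
    grep -oE '[0-9]+%' | tr -d '%' \
        | awk '{ s += $1; n++ } END { if (n) printf "%.1f%%\n", s / n }'
}
```

Example: `sudo cat /sys/kernel/debug/rknpu/load | npu_avg` prints something like `30.0%` while a model is generating.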