TAAFT
Free mode
100% free
Freemium
Free Trial
Deals

Recomputer Rk Llm

Seeed-Projects / reComputer-RK-LLM

This repository utilizes Docker to package large language models and multimodal models optimized for Rockchip platforms. It provides a unified calling interface that is compatible with the OpenAI API, making it easy for users to integrate and use these models.

3 1 Language: Python License: MIT Updated: 2mo ago

README

Contributors
Forks
Stargazers
Issues
MIT License

Introduction

This repository utilizes Docker to package large language models and multimodal models optimized for Rockchip platforms. It provides a unified calling interface that is compatible with the OpenAI API, making it easy for users to integrate and use these models.

Hardware Prepare

For reComputer RK3588 and reComputer RK3576.

LLM

Fast start

Device Model
RK3588 rk3588-deepseek-r1-distill-qwen:7b-w8a8-latest<br>rk3588-deepseek-r1-distill-qwen:1.5b-fp16-latest<br>rk3588-deepseek-r1-distill-qwen:1.5b-w8a8-latest <br>rk3588-qwen3:1.7b-w8a8-latest<br>rk3588-qwen3:4b-w8a8-latest<br>rk3588-qwen3:0.6b-w8a8-latest<br>rk3588-gemma3:4b-w8a8-latest
RK3576 rk3576-deepseek-r1-distill-qwen:7b-w4a16-g128-latest<br>rk3576-deepseek-r1-distill-qwen:7b-w4a16-latest<br>rk3576-deepseek-r1-distill-qwen:1.5b-fp16-latest<br>rk3576-deepseek-r1-distill-qwen:1.5b-w4a16-g128-latest<br>rk3576-deepseek-r1-distill-qwen:1.5b-w4a16-latest<br>rk3576-qwen3:4b-w4a16-latest<br>rk3576-qwen3:1.7b-w4a16-latest<br>rk3576-qwen3:4b-w4a16-g128-latest<br>rk3576-qwen3:1.7b-w4a16-g128-latest<br>rk3576-qwen3:0.6b-w4a16-latest

VLM

Fast start

Device Model
RK3588 rk3588-qwen2-vl:7b-w8a8-latest<br>rk3588-qwen2-vl:2b-w8a8-latest<br>rk3588-qwen3-vl:4b-instruct_w8a8-latest<br>rk3588-qwen3-vl:2b-Instruct_w8a8-latest<br>rk3588-qwen2-vl:7b-w8a8-latest<br>rk3588-qwen2-vl:2b-w8a8-latest<br>rk3588-qwen2.5-vl:3b-w8a8-latest<br>rk3588-deepseekocr:w8a8-latest<br>rk3588-internvl3:1b-w8a8-latest
RK3576 rk3576-qwen2.5-vl:3b-w4a16-latest<br>rk3576-qwen2.5-vl:3b-w4a16-latest<br>rk3576-qwen3-vl:3b-Instruct_w4a16_g128-latest<br>rk3576-qwen3-vl:2b-Instruct_w4a16_g128-latest<br>rk3576-deepseekocr:w4a16-latest<br>rk3576-internvl3:1b-w4a16-g128-latest

Speed test

Note: A rough estimate of a model's inference speed includes both TTFT and TPOT.
Note: You can use python test_inference_speed.py --help to view the help function.

python -m venv .env && source .env/bin/activate
pip install requests
python llm_speed_test.py

๐Ÿ’ž Top contributors:

<a href="https://github.com/Seeed-Projects/reComputer-RK-LLM/graphs/contributors">
<img src="https://contrib.rocks/image?repo=Seeed-Projects/reComputer-RK-LLM" alt="contrib.rocks image" />
</a>

๐ŸŒŸ Star History

Star History Chart

Reference: rknn-llm

0 AIs selected
Clear selection
#
Name
Task