Differences Between Loading Native and RKNN-Converted Models
Models developed in frameworks such as PyTorch or TensorFlow, or exported to ONNX, must be converted into the proprietary RKNN format to leverage Rockchip’s NPU acceleration. The RKNN format is optimized for Rockchip’s neural processing units, enabling efficient execution on embedded platforms such as the RK3588, RK3566, or RV1126.
Model Format
Native models (e.g., .pt, .onnx) are incompatible with the RKNN runtime. Only after conversion to the .rknn binary format can they be executed directly on Rockchip hardware. Supported source formats include:
- Caffe
- TensorFlow
- ONNX
- PyTorch
- MXNet
- MindSpore
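To make the format distinction concrete, here is a minimal sketch that maps a model file's suffix to the RKNN loader method that would handle it. The table and the helper names (LOADER_BY_SUFFIX, loader_name_for, needs_conversion) are hypothetical conveniences, not part of the toolkit, and only a few single-file formats are listed.

```python
from pathlib import Path

# Hypothetical lookup table: file suffix -> name of the RKNN loader method.
# Only a few single-file formats are listed; extend it for your own toolchain.
LOADER_BY_SUFFIX = {
    ".pt": "load_pytorch",
    ".onnx": "load_onnx",
    ".pb": "load_tensorflow",
    ".tflite": "load_tflite",
    ".rknn": "load_rknn",
}


def loader_name_for(model_path: str) -> str:
    """Return the loader method name for a model file, based on its suffix."""
    suffix = Path(model_path).suffix.lower()
    try:
        return LOADER_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"No known RKNN loader for '{suffix}' files") from None


def needs_conversion(model_path: str) -> bool:
    """Native formats must go through build(); .rknn files are already converted."""
    return loader_name_for(model_path) != "load_rknn"


print(loader_name_for("./resnet18.pt"), needs_conversion("./resnet18.pt"))
print(loader_name_for("resnet18.rknn"), needs_conversion("resnet18.rknn"))
```

A dispatcher like this is only useful when one script has to accept both native and pre-converted models; otherwise calling the loader directly is simpler.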
Precision Modes
During conversion, two precision modes are available:
- Float32: Preserves full numerical precision, yielding higher accuracy but consuming more memory and computational resources.
- Quantized (INT8): Reduces model size and power consumption by converting weights and activations to 8-bit integers, at the cost of potential accuracy loss.
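The accuracy trade-off of INT8 mode can be illustrated with a generic asymmetric (affine) quantization round trip in NumPy. This is a sketch of the general technique only; the article does not describe the exact quantization algorithm the toolkit uses, and the function names here are illustrative.

```python
import numpy as np


def affine_params(x):
    """Derive a scale and zero point that map the range of x onto [-128, 127]."""
    lo = float(min(x.min(), 0.0))
    hi = float(max(x.max(), 0.0))
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = int(round(-128 - lo / scale))
    return scale, zero_point


def quantize_int8(x, scale, zero_point):
    """Map float values to 8-bit integers."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)


def dequantize_int8(q, scale, zero_point):
    """Map 8-bit integers back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale


rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)

scale, zero_point = affine_params(weights)
restored = dequantize_int8(quantize_int8(weights, scale, zero_point), scale, zero_point)
max_error = float(np.max(np.abs(weights - restored)))

print(f"scale={scale:.6f} zero_point={zero_point} max_error={max_error:.6f}")
print(f"storage: {weights.nbytes} bytes as float32, {weights.size} bytes as int8")
```

The round-trip error stays within about one quantization step, while storage drops to a quarter of the float32 size. Whether that error matters depends on the model, which is why conversion uses a calibration dataset.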
Performance Optimizations
The RKNN toolkit automatically applies optimizations during conversion:
- Operator fusion to reduce kernel launches
- Memory reuse and allocation optimization
- Parallel execution scheduling across NPU cores
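As an illustration of what operator fusion means, the following NumPy sketch folds a batch-normalization step into the linear layer that precedes it, so two operations become one. It is a generic example of the technique; the article does not say which specific fusions the toolkit performs, and fold_batchnorm is a name chosen for this sketch.

```python
import numpy as np


def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (W x + b - mean) / sqrt(var + eps) + beta into W', b'."""
    factor = gamma / np.sqrt(var + eps)  # one factor per output channel
    fused_weight = weight * factor[:, None]
    fused_bias = (bias - mean) * factor + beta
    return fused_weight, fused_bias


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
b = rng.normal(size=4)
gamma = rng.normal(size=4)
beta = rng.normal(size=4)
mean = rng.normal(size=4)
var = rng.uniform(0.5, 2.0, size=4)
x = rng.normal(size=8)

# Two separate operators: linear layer, then batch normalization.
unfused = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta

# One fused operator with pre-computed weights.
W_fused, b_fused = fold_batchnorm(W, b, gamma, beta, mean, var)
fused = W_fused @ x + b_fused

print("outputs match:", np.allclose(unfused, fused))
```

Because the fused weights are computed once at conversion time, the runtime executes a single operator instead of two.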
Deployment Scope
Converted RKNN models can run on:
- Rockchip NPU (primary target)
- GPU or CPU (fallback, with reduced performance)
Crucially, only converted RKNN models can be deployed on actual hardware. Models loaded via load_rknn() cannot be executed in the RKNN simulator; they require a connected physical device.
Preparing the Environment
Ensure that the RKNN toolkit (rknn-toolkit2 for the RK3588 target used here) is installed on your host system. The accompanying code package can be obtained by scanning the QR code in the original article and replying with "RKNN评估与推理" ("RKNN Evaluation and Inference") on the associated WeChat public account.
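The conversion script that follows reads a calibration list named dataset.txt, a plain text file with one image path per line. A minimal sketch for generating it is shown here; the helper name write_dataset_file and the accepted suffixes are assumptions made for illustration, and the demo uses empty placeholder files instead of real images.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def write_dataset_file(image_dir, output_path="dataset.txt"):
    """Write one image path per line and return the number of entries."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise FileNotFoundError(f"No calibration images found in {image_dir}")
    Path(output_path).write_text("\n".join(str(p) for p in images) + "\n")
    return len(images)


# Self-contained demo: empty placeholder files stand in for real images.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.png", "notes.txt"):
        (Path(tmp) / name).touch()
    listing = Path(tmp) / "dataset.txt"
    count = write_dataset_file(tmp, listing)
    print(count, "entries")
    print(listing.read_text())
```

In practice the calibration images should resemble the inputs the deployed model will see, since they determine the quantization ranges.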
Inferring from a Native Model (PyTorch → RKNN)
The full pipeline involves conversion, export, and runtime initialization:
from rknn.api import RKNN
import cv2
import numpy as np


def show_top5(output):
    # Sort class indices by descending probability
    sorted_indices = np.argsort(output)[::-1]
    top5_str = '\n----------top5-----------\n'
    for i in range(5):
        idx = sorted_indices[i]
        prob = output[idx]
        top5_str += f"{idx}:{prob:.6f}\n"
    print(top5_str)


def softmax(x):
    exp_x = np.exp(x - np.max(x))  # Numerical stability
    return exp_x / np.sum(exp_x)


if __name__ == '__main__':
    rknn = RKNN(verbose=True)

    # Configuration for preprocessing and target platform
    rknn.config(
        mean_values=[[123.675, 116.28, 103.53]],
        std_values=[[58.395, 58.395, 58.395]],
        target_platform='rk3588'
    )

    # Load PyTorch model
    rknn.load_pytorch(
        model='./resnet18.pt',
        input_size_list=[[1, 3, 224, 224]]
    )

    # Build and quantize the model
    rknn.build(
        do_quantization=True,
        dataset='dataset.txt',
        rknn_batch_size=-1
    )

    # Export the RKNN model
    rknn.export_rknn('resnet18.rknn')

    # Initialize runtime for NPU execution
    rknn.init_runtime(
        target='rk3588',
        perf_debug=False,
        eval_mem=False,
        core_mask=RKNN.NPU_CORE_AUTO
    )

    # Load and preprocess input image
    img_bgr = cv2.imread('space_shuttle_224.jpg')
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

    # Run inference
    outputs = rknn.inference(inputs=[img_rgb], data_format='nhwc')

    # Post-process and display top-5 predictions
    prob = softmax(np.array(outputs[0][0]))
    show_top5(prob)

    rknn.release()
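The script above passes the raw uint8 RGB image to inference() because the mean_values and std_values given to rknn.config() are applied per channel inside the converted model. When comparing results against the original PyTorch model on the host, the same normalization has to be done by hand. A minimal NumPy sketch of that host-side equivalent, with normalize_like_config as an illustrative name:

```python
import numpy as np

# Same per-channel values as in the rknn.config() call (RGB order).
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 58.395, 58.395], dtype=np.float32)


def normalize_like_config(img_rgb_uint8):
    """Apply (pixel - mean) / std per channel to an HxWx3 uint8 image."""
    return (img_rgb_uint8.astype(np.float32) - MEAN) / STD


# A uniform gray image stands in for a real photo.
img = np.full((224, 224, 3), 128, dtype=np.uint8)
normalized = normalize_like_config(img)

print(normalized.shape, normalized.dtype)
print(normalized[0, 0])
```

Applying this normalization on the host and again in the model would normalize twice, which is a common cause of wrong predictions after conversion.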
Inferring Directly from an RKNN Model
Once the model is exported as resnet18.rknn, you can skip conversion and load it directly:
from rknn.api import RKNN
import cv2
import numpy as np


def show_top5(output):
    # Sort class indices by descending probability
    sorted_indices = np.argsort(output)[::-1]
    top5_str = '\n----------top5-----------\n'
    for i in range(5):
        idx = sorted_indices[i]
        prob = output[idx]
        top5_str += f"{idx}:{prob:.6f}\n"
    print(top5_str)


def softmax(x):
    exp_x = np.exp(x - np.max(x))  # Numerical stability
    return exp_x / np.sum(exp_x)


if __name__ == '__main__':
    rknn = RKNN(verbose=True)

    # Directly load the pre-converted RKNN model
    rknn.load_rknn('resnet18.rknn')

    # Initialize runtime (same as above)
    rknn.init_runtime(
        target='rk3588',
        perf_debug=False,
        eval_mem=False,
        core_mask=RKNN.NPU_CORE_AUTO
    )

    # Load and preprocess input
    img_bgr = cv2.imread('space_shuttle_224.jpg')
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

    # Execute inference
    outputs = rknn.inference(inputs=[img_rgb], data_format='nhwc')

    # Output top-5 predictions
    prob = softmax(np.array(outputs[0][0]))
    show_top5(prob)

    rknn.release()
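Both scripts ignore the integer status codes that the loading, build, export, and runtime calls return, where 0 indicates success. A small guard makes failures such as a missing model file or an unreachable board visible immediately. The helper name check is an assumption for this sketch, and the RKNN calls are shown only as comments because they need the toolkit and a connected device.

```python
def check(ret, step):
    """Raise if an RKNN API call reported failure (non-zero return code)."""
    if ret != 0:
        raise RuntimeError(f"{step} failed with return code {ret}")


# Intended usage with the scripts above (requires the toolkit and a board):
#   check(rknn.load_rknn('resnet18.rknn'), 'load_rknn')
#   check(rknn.init_runtime(target='rk3588'), 'init_runtime')

# Stand-alone demonstration with literal return codes.
check(0, 'demo step')  # success: returns silently
try:
    check(-1, 'demo step')
except RuntimeError as err:
    print(err)
```

Failing fast at the step that went wrong is easier to debug than an unrelated error later in inference().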
Key Observations
Both workflows use identical runtime initialization and inference calls. The critical difference is whether you call load_pytorch() followed by build() (conversion on the fly) or load_rknn() (direct deployment of an already converted model). Loading the .rknn file directly starts up faster for repeated inference runs because conversion and quantization are already complete. Always verify the output class index against the ImageNet label file; in this case, index 812 corresponds to "space shuttle".
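Checking indices against the label file can be automated. The sketch below pairs the top-k indices with their names; it assumes a label list with one entry per class index, and a toy 1000-entry list stands in for the real ImageNet label file. The function name top_k_labels is illustrative.

```python
import numpy as np


def top_k_labels(probs, labels, k=5):
    """Return (index, label, probability) tuples for the k most likely classes."""
    probs = np.asarray(probs)
    order = np.argsort(probs)[::-1][:k]
    return [(int(i), labels[int(i)], float(probs[i])) for i in order]


# Toy stand-in for the ImageNet label list (one name per class index).
labels = [f"class_{i}" for i in range(1000)]
labels[812] = "space shuttle"

# Synthetic probabilities with a clear winner at index 812.
probs = np.full(1000, 1e-4)
probs[812] = 0.9

results = top_k_labels(probs, labels)
for idx, name, p in results:
    print(f"{idx:4d}  {name:15s}  {p:.6f}")
```

With a real label file, the list can be built by reading the file line by line, provided its line order matches the class indices used during training.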