This benchmark evaluates neural processing unit (NPU) performance across Rockchip chipsets using the SSD Inception V2 object detection model with 300×300 RGB input tensors.
Performance Results
| Platform | Inference Time | Throughput (FPS) |
|---|---|---|
| RK1808 | 25 ms/frame | 40 |
| RK3588 | 35–50 ms/frame | 20 |
| RK3568 | 150 ms/frame | 6 |
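The two numeric columns are related by FPS = 1000 / (latency in ms). The following is a minimal sketch of that conversion; the helper name `latency_ms_to_fps` is hypothetical and is not part of the benchmark tool.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Nominal single-value latencies taken from the table above.
nominal_latency_ms = {"RK1808": 25.0, "RK3568": 150.0}

for platform, ms in nominal_latency_ms.items():
    print(f"{platform}: {latency_ms_to_fps(ms):.1f} FPS")
```

By this formula 25 ms/frame corresponds to 40 FPS and 150 ms/frame to about 6.7 FPS, so the table's throughput figures are rounded.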
RK3588 Benchmark Output
$ ./rknn_benchmark ssd_inception_v2.rknn 10 7
rknn_api/rknnrt version: 1.4.0, driver version: 0.7.2
total weight size: 35947840, total internal size: 7473600
model input num: 1, output num: 2
input tensors:
index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3]
n_elems=270000, size=270000, fmt=NHWC, type=INT8, qnt_type=AFFINE
output tensors:
index=0, name=concat:0, dims=[1, 1917, 1, 4], n_elems=7668
index=1, name=concat_1:0, dims=[1, 1917, 91, 1], n_elems=174447
Warmup phase:
0: Elapse Time = 32.03ms, FPS = 31.23
1: Elapse Time = 31.38ms, FPS = 31.86
2: Elapse Time = 36.38ms, FPS = 27.48
3: Elapse Time = 43.76ms, FPS = 22.85
4: Elapse Time = 42.17ms, FPS = 23.71
5: Elapse Time = 42.99ms, FPS = 23.26
6: Elapse Time = 44.48ms, FPS = 22.48
7: Elapse Time = 44.33ms, FPS = 22.56
8: Elapse Time = 44.96ms, FPS = 22.24
9: Elapse Time = 47.09ms, FPS = 21.24
Performance measurement:
0: Elapse Time = 46.54ms, FPS = 21.49
1: Elapse Time = 47.15ms, FPS = 21.21
2: Elapse Time = 50.46ms, FPS = 19.82
3: Elapse Time = 49.54ms, FPS = 20.18
4: Elapse Time = 49.52ms, FPS = 20.19
5: Elapse Time = 49.88ms, FPS = 20.05
6: Elapse Time = 49.91ms, FPS = 20.03
7: Elapse Time = 47.64ms, FPS = 20.99
8: Elapse Time = 48.64ms, FPS = 20.56
9: Elapse Time = 50.26ms, FPS = 19.90
Avg FPS = 20.426
RK3568 Benchmark Output
$ ./rknn_benchmark ssd_inception_v2.rknn 10
rknn_api/rknnrt version: 1.4.0, driver version: 0.7.2
total weight size: 35009792, total internal size: 3873600
model input num: 1, output num: 2
input tensors:
index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3]
n_elems=270000, size=270000, fmt=NHWC, type=INT8, qnt_type=AFFINE
output tensors:
index=0, name=concat:0, dims=[1, 1917, 1, 4], n_elems=7668
index=1, name=concat_1:0, dims=[1, 1917, 91, 1], n_elems=174447
Warmup phase:
0: Elapse Time = 157.74ms, FPS = 6.34
1: Elapse Time = 176.21ms, FPS = 5.67
2: Elapse Time = 156.27ms, FPS = 6.40
3: Elapse Time = 143.38ms, FPS = 6.97
4: Elapse Time = 143.27ms, FPS = 6.98
5: Elapse Time = 143.13ms, FPS = 6.99
6: Elapse Time = 145.36ms, FPS = 6.88
7: Elapse Time = 146.58ms, FPS = 6.82
8: Elapse Time = 142.40ms, FPS = 7.02
9: Elapse Time = 145.87ms, FPS = 6.86
Performance measurement:
0: Elapse Time = 145.39ms, FPS = 6.88
1: Elapse Time = 146.34ms, FPS = 6.83
2: Elapse Time = 147.25ms, FPS = 6.79
3: Elapse Time = 143.72ms, FPS = 6.96
4: Elapse Time = 144.48ms, FPS = 6.92
5: Elapse Time = 145.42ms, FPS = 6.88
6: Elapse Time = 139.06ms, FPS = 7.19
7: Elapse Time = 173.21ms, FPS = 5.77
8: Elapse Time = 148.35ms, FPS = 6.74
9: Elapse Time = 143.97ms, FPS = 6.95
Avg FPS = 6.770
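The logs above can be post-processed with a short script. The sketch below splits a log into its warmup and measurement phases and recomputes the average FPS. It assumes the average is derived as 1000 / (mean elapsed time in ms); that formula is an inference from the printed numbers, not something confirmed from the tool's source. The names `parse_phases` and `average_fps` are hypothetical.

```python
import re
from statistics import mean

# Matches lines such as "3: Elapse Time = 143.72ms, FPS = 6.96".
ITERATION_RE = re.compile(r"^\d+: Elapse Time = ([\d.]+)ms, FPS = ([\d.]+)$")
PHASE_HEADERS = ("Warmup phase:", "Performance measurement:")


def parse_phases(log_text):
    """Return {phase name: [elapsed ms, ...]} for each phase found in the log."""
    phases = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line in PHASE_HEADERS:
            current = line.rstrip(":")
            phases[current] = []
        elif current is not None:
            match = ITERATION_RE.match(line)
            if match:
                phases[current].append(float(match.group(1)))
    return phases


def average_fps(elapsed_ms):
    """Average FPS as 1000 / mean latency (assumed formula, see lead-in)."""
    return 1000.0 / mean(elapsed_ms)


# Excerpt of the RK3568 log above (warmup shortened to two iterations).
rk3568_log = """\
Warmup phase:
0: Elapse Time = 157.74ms, FPS = 6.34
1: Elapse Time = 176.21ms, FPS = 5.67
Performance measurement:
0: Elapse Time = 145.39ms, FPS = 6.88
1: Elapse Time = 146.34ms, FPS = 6.83
2: Elapse Time = 147.25ms, FPS = 6.79
3: Elapse Time = 143.72ms, FPS = 6.96
4: Elapse Time = 144.48ms, FPS = 6.92
5: Elapse Time = 145.42ms, FPS = 6.88
6: Elapse Time = 139.06ms, FPS = 7.19
7: Elapse Time = 173.21ms, FPS = 5.77
8: Elapse Time = 148.35ms, FPS = 6.74
9: Elapse Time = 143.97ms, FPS = 6.95
Avg FPS = 6.770
"""

measured = parse_phases(rk3568_log)["Performance measurement"]
print(f"Avg FPS = {average_fps(measured):.3f}")
```

Under that assumption the recomputed value agrees with the reported `Avg FPS = 6.770` for the RK3568 run.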
Analysis
The RK1808 delivers the highest throughput at 40 FPS, followed by the RK3588 at approximately 20 FPS. The RK3568 is significantly slower at roughly 6–7 FPS. On the RK3588, per-frame inference time climbs from about 31 ms to 47 ms over the warmup iterations and then settles at 46–50 ms during the measured run. The RK3568 holds a relatively consistent latency of around 140–150 ms per frame throughout the benchmark, apart from a few slower iterations of up to about 176 ms.
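To put numbers on the spread and the relative speed of the two logged platforms, the measured-phase latencies can be summarized as in the sketch below. The sample lists are copied from the benchmark outputs above; the `summarize` helper is hypothetical, and the use of population standard deviation is an illustrative choice.

```python
from statistics import mean, pstdev

# Measured-phase latencies (ms) copied from the benchmark outputs above.
rk3588_ms = [46.54, 47.15, 50.46, 49.54, 49.52, 49.88, 49.91, 47.64, 48.64, 50.26]
rk3568_ms = [145.39, 146.34, 147.25, 143.72, 144.48, 145.42, 139.06, 173.21, 148.35, 143.97]


def summarize(samples):
    """Return min, max, mean, and population standard deviation in ms."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
        "stdev": pstdev(samples),
    }


for name, samples in (("RK3588", rk3588_ms), ("RK3568", rk3568_ms)):
    s = summarize(samples)
    print(
        f"{name}: min={s['min']:.2f} max={s['max']:.2f} "
        f"mean={s['mean']:.2f} stdev={s['stdev']:.2f} ms"
    )

# Ratio of mean latencies: how many times faster the RK3588 ran this model.
speedup = mean(rk3568_ms) / mean(rk3588_ms)
print(f"RK3588 is about {speedup:.1f}x faster than RK3568 on this model")
```

On these samples the RK3588 comes out roughly three times faster than the RK3568. No RK1808 log is included in this section, so it is left out of the comparison.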