Deploying YOLOv8 on ITX-3588J with RKNN Toolkit

Environment Setup

Target Device Configuration

Refer to previous documentation for ITX-3588J development board setup. This guide assumes the board has rknn-toolkit-lite2 version 2.0 installed, with the NPU driver and runtime updated to match.

Host Development Environment

The PC-side toolkit has limited compatibility with Windows, so an Ubuntu-based environment is recommended; a virtual machine works fine.

Toolkit repository: https://github.com/airockchip/rknn-toolkit2/tree/master

Check out the 2.0 release tag and navigate to the packages directory, which contains the wheel and requirements files.
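The checkout can be sketched as follows; the tag name shown here is an assumption, so list the repository's tags and pick the 2.0 release:

```shell
# Sketch only: the exact tag name is an assumption -- run `git tag`
# and check out whichever tag corresponds to the 2.0 release.
git clone https://github.com/airockchip/rknn-toolkit2.git
cd rknn-toolkit2
git tag                      # inspect available release tags
git checkout v2.0.0-beta0    # assumed tag name for the 2.0 release
```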

Download the wheel package and requirements file matching your Python version. For Python 3.9:

conda create -n rknn_env python=3.9
conda activate rknn_env

Install required dependencies. It's advisable to modify the torch entry in the requirements file to torch==1.11.0:

pip install -r requirements_cp39-2.0.0b0.txt

Proceed with toolkit installation:

pip install rknn_toolkit2-2.0.0b0+9bab5682-cp39-cp39-linux_x86_64.whl

Model Preparation

For standard applications, download the corresponding ONNX files from rknn_model_zoo.

To train YOLOv8 with custom datasets and generate ONNX files, use the official airockchip version. The training process aligns with standard YOLOv8 procedures.
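Since training follows the standard YOLOv8 workflow, it can be sketched with the usual ultralytics API; the weights file, dataset config name `data.yaml`, and hyperparameters below are placeholders to substitute with your own:

```python
# Sketch, assuming the airockchip fork of ultralytics is installed.
# "yolov8n.pt" and "data.yaml" are placeholder names, not from this guide.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                            # pretrained starting weights
model.train(data="data.yaml", epochs=100, imgsz=640)  # standard training call
```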

Modify the default activation function in conv.py (under ultralytics/nn/modules/), replacing SiLU with ReLU:

class ConvBlock(nn.Module):
    # Standard convolution implementation
    activation_func = nn.ReLU()   # Replaced SiLU for RKNN optimization
    
    def __init__(self, input_channels, output_channels, kernel_size=1, stride=1, padding=None, groups=1, dilation=1, activate=True):
        super().__init__()
        self.conv_layer = nn.Conv2d(input_channels, output_channels, kernel_size, stride, autopad(kernel_size, padding, dilation), groups=groups, dilation=dilation, bias=False)
        self.batch_norm = nn.BatchNorm2d(output_channels)
        self.activation = self.activation_func if activate is True else activate if isinstance(activate, nn.Module) else nn.Identity()
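This swap is commonly made because ReLU is piecewise-linear and maps cleanly onto the NPU's quantized operators, whereas SiLU (x · sigmoid(x)) is non-zero and non-monotone for negative inputs. A dependency-free sketch of the two activations:

```python
import math

def silu(x: float) -> float:
    # SiLU (a.k.a. swish): x * sigmoid(x) -- YOLOv8's default activation
    return x / (1.0 + math.exp(-x))

def relu(x: float) -> float:
    # ReLU: cheap, piecewise-linear, and friendly to INT8 quantization
    return max(0.0, x)

# Both are 0 at the origin and nearly identical for large positive x,
# but they diverge for negative inputs.
print(relu(-2.0), silu(-2.0))
```

Retraining (or at least fine-tuning) with ReLU is needed, since the pretrained SiLU weights are not directly transferable.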

Add ONNX export support in head.py within the DetectionHead class:

class DetectionHead(nn.Module):
    dynamic_mode = False  
    export_mode = False  
    tensor_shape = None
    anchor_points = torch.empty(0)  
    stride_values = torch.empty(0)  

    # RKNN ONNX export enhancement
    point_conv = nn.Conv2d(16, 1, 1, bias=False).requires_grad_(False)
    index_tensor = torch.arange(16, dtype=torch.float)
    point_conv.weight.data[:] = nn.Parameter(index_tensor.view(1, 16, 1, 1))
    
    def __init__(self, num_classes=80, channels=()):  
        super().__init__()
        self.num_classes = num_classes  
        self.num_layers = len(channels)  
        self.distribution_channels = 16  
        self.outputs_per_anchor = num_classes + self.distribution_channels * 4  
        self.stride_computed = torch.zeros(self.num_layers)  
        channel_2, channel_3 = max((16, channels[0] // 4, self.distribution_channels * 4)), max(channels[0], min(self.num_classes, 100))  
        self.regression_branch = nn.ModuleList(
            nn.Sequential(ConvBlock(x, channel_2, 3), ConvBlock(channel_2, channel_2, 3), nn.Conv2d(channel_2, 4 * self.distribution_channels, 1)) for x in channels)
        self.classification_branch = nn.ModuleList(nn.Sequential(ConvBlock(x, channel_3, 3), ConvBlock(channel_3, channel_3, 3), nn.Conv2d(channel_3, self.num_classes, 1)) for x in channels)
        self.distribution_focal = DFL(self.distribution_channels) if self.distribution_channels > 1 else nn.Identity()

    def forward(self, input_features):
        feature_shape = input_features[0].shape  
        
        if self.export_mode and self.format in ('onnx',):  # note the comma: ('onnx') is just a string
            output_list = []
            for layer_idx in range(self.num_layers):
                regression_output = self.regression_branch[layer_idx](input_features[layer_idx])
                classification_output = self.classification_branch[layer_idx](input_features[layer_idx])
                output_list.append(self.point_conv(regression_output.view(regression_output.shape[0], 4, 16, -1).transpose(2, 1).softmax(1)))
                output_list.append(classification_output)
            return output_list

        for layer_idx in range(self.num_layers):
            input_features[layer_idx] = torch.cat((self.regression_branch[layer_idx](input_features[layer_idx]), self.classification_branch[layer_idx](input_features[layer_idx])), 1)
        if self.training:
            return input_features
        # inference path (anchor construction and DFL decoding) is unchanged from the stock head
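The class-level `point_conv` is a 1×1 convolution whose fixed weights are 0…15: applied after a softmax over the 16 distribution bins, it computes the expected bin index, which is exactly the DFL box-offset decode. A dependency-free sketch of that expectation for a single box side:

```python
import math

def dfl_decode(logits):
    """Decode one box side from 16 DFL logits: softmax, then the
    expectation over bin indices 0..15 -- the same result a 1x1 conv
    with weights arange(16) produces after a softmax."""
    m = max(logits)                           # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))

# Uniform logits spread mass equally over all 16 bins,
# so the expectation is the mean of 0..15, i.e. 7.5.
print(dfl_decode([0.0] * 16))
```

Baking this into a fixed convolution lets the RKNN compiler fuse it into the graph instead of running the decode in Python post-processing.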

Create export_onnx.py script:

from ultralytics import YOLO

model_weights = "runs/detect/train/weights/best.pt" 
neural_network = YOLO(model_weights)
conversion_result = neural_network.export(format='onnx', imgsz=(640, 640), opset=12, simplify=True)  
assert conversion_result
print("ONNX conversion completed successfully")

Converting ONNX to RKNN

The official rknn_model_zoo repository contains conversion scripts for various models. Locate the YOLOv8 conversion script and execute it directly.
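The invocation can be sketched as below; the model paths are placeholders, and the positional arguments follow the convert.py usage in the model zoo's yolov8 example (`<onnx_model> <platform> <dtype> <output_path>`):

```shell
# Sketch, assuming rknn_model_zoo's example layout; substitute your own
# ONNX path. "i8" requests INT8 quantization, "fp" keeps floating point.
cd rknn_model_zoo/examples/yolov8/python
python convert.py ../model/yolov8n.onnx rk3588 i8 ../model/yolov8n.rknn
```

The resulting .rknn file is then copied to the ITX-3588J and loaded on the board with rknn-toolkit-lite2.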

Tags: YOLOv8 RKNN ITX-3588J ONNX Neural Network Deployment

Posted on Fri, 15 May 2026 17:39:35 +0000 by BenMo