Version Checking
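# Check the installed CUDA Toolkit version (Command Prompt); either form works
nvcc -V
nvcc --version
# Check Python version (Command Prompt)
python --version
# Check the highest CUDA version the installed driver supports (Command Prompt)
nvidia-smi # the value is shown after the "CUDA Version:" label in the output header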
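The toolkit check can also be scripted. The sketch below is a minimal, hypothetical helper (the function names are illustrative, not part of any NVIDIA tool). It assumes the text printed by nvcc contains a phrase of the form "release 11.8" and reports nothing if nvcc is not on PATH.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Return the 'major.minor' release found in nvcc's version text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Run `nvcc --version` if nvcc is on PATH; return None when it is not."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA Toolkit release:", installed_cuda_release() or "nvcc not found")
```

Comparing this value with the "CUDA Version" reported by nvidia-smi shows whether the installed toolkit is within what the driver supports.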
Installation Process
1. Visual Studio Installation
- Version Selection: For CUDA 11.8.0 (can be higher than system requirements), visit the CUDA Toolkit Archive at https://developer.nvidia.com/cuda-toolkit-archive to find compatible Visual Studio versions.
- Download Visual Studio: Get the 2022 Community edition from https://visualstudio.microsoft.com/zh-hans/downloads/
- Installation: After launching the installer, wait for initial loading to complete.
- Path Configuration: Modify installation paths for the three main components to a non-system drive (e.g., D:) to save space.
- Component Selection: Choose the necessary workloads. For deep learning work, select "Desktop development with C++" and "Python development".
- Installation: Click the install button to begin the process.
2. CUDA Installation
- Download: Get the appropriate version from https://developer.nvidia.com/cuda-toolkit-archive
- Installation: Run the downloaded .exe file. The temporary extraction path can be changed, but it must not be the same as the final CUDA installation path.
- Environment Configuration: Verify that the system variables were created automatically (the 11_0 part of the name varies with the installed version). If they are missing, add them manually:
- NVCUDASAMPLES_ROOT
- NVCUDASAMPLES11_0_ROOT
- Verification: In Command Prompt, run the following to check the CUDA version and the CUDA environment variables:
nvcc --version
set cuda
3. cuDNN Installation
- Download: Get the compatible version from https://developer.nvidia.com/rdp/cudnn-archive
- Installation: Extract the downloaded archive and copy all folders to the CUDA installation directory (default: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8)
- Environment Configuration: Add these paths to the system PATH variable:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp
- Verification: In Command Prompt, navigate to extras\demo_suite under the CUDA installation directory and run both tools; each should end with "Result = PASS":
bandwidthTest.exe
deviceQuery.exe
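To confirm that the PATH entries were actually saved, a short script can compare the required directories with the current PATH. This is a minimal sketch: the helper name `missing_path_entries` is hypothetical, and the directory list simply repeats the default v11.8 paths from this guide. Open a new Command Prompt after editing system variables so the updated PATH is visible.

```python
import os
from typing import Iterable, List


def missing_path_entries(required: Iterable[str], path_value: str, sep: str = ";") -> List[str]:
    """Return the required directories that do not appear in a PATH-style string.

    The comparison ignores case and trailing slashes, as Windows does.
    """
    def normalize(entry: str) -> str:
        return entry.strip().rstrip("\\/").lower()

    present = {normalize(p) for p in path_value.split(sep) if p.strip()}
    return [r for r in required if normalize(r) not in present]


if __name__ == "__main__":
    cuda_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"
    required = [cuda_root + "\\" + sub for sub in ("bin", "include", "lib", "libnvvp")]
    # os.pathsep is ";" on Windows, which is where this check is meant to run.
    for entry in missing_path_entries(required, os.environ.get("PATH", ""), os.pathsep):
        print("Missing from PATH:", entry)
```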
4. Conda Environment Setup
# List environments
conda env list
# Create environment (name: MC, Python: 3.8)
conda create --name MC python=3.8
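# Create environment at a custom path (path: D:/env, Python: 3.8); --name and --prefix cannot be combined
conda create --prefix D:/env python=3.8
# Activate environment
conda activate MC
conda activate D:/env # For an environment created with --prefix
source activate MC # For Linux with conda versions older than 4.4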
# Remove environment
conda remove --name MC --all
5. PyCharm Installation
- Download: Get the Community edition from https://www.jetbrains.com/pycharm/download/?section=windows
- Installation: Run the .exe file
- Path Configuration: Can use default path
- Component Selection: Generally select all components
6. Python Installation
- Download: From https://www.python.org/downloads/, select an appropriate version (the latest release is not recommended) and the build that matches your system architecture
- Installation: Run the installer. The option to add Python to PATH can be checked during first setup
- Path Configuration: Remember the installation directory for the environment variables
- Verification: In Command Prompt, check with:
python
pip
- Environment Variables: Add the Python installation path to the system variables
- Final Check: Restart the computer and verify that Python and pip work
7. PyTorch Installation
- Download: From https://pytorch.org/get-started/previous-versions/, select the version that matches your CUDA version
- Installation: Use the following command in the PyCharm terminal:
# CUDA 11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
- Troubleshooting: If the download fails, configure an alternative mirror or download the .whl files for offline installation
Package Management
Mirror Configuration
# List current mirrors
pip config list
# Configure mirrors
# Configure Tsinghua mirror
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# Trust the mirror
pip config set install.trusted-host pypi.tuna.tsinghua.edu.cn
# Common mirrors
# Tsinghua: https://pypi.tuna.tsinghua.edu.cn/simple
# Douban: https://pypi.douban.com/simple/
# Aliyun: https://mirrors.aliyun.com/pypi/simple/
Pip Commands
# List installed packages
pip list
# Install package
pip install package_name
# Install specific version
pip install package_name==1.1.1
# Uninstall package
pip uninstall package_name
# Update package
pip install --upgrade package_name
# Show package information
pip show package_name
# Install local .whl file
cd D:\package
pip install package_name.whl
Project Setup and Execution
Dataset Organization
RTTS Dataset Structure
\---RTTS
    +---images
    |   +---train
    |   |       image_001.png
    |   |       image_002.png
    |   \---val
    \---labels
        |   rtts.json
        |   rtts_100_val.json
        +---train
        |       image_001.txt
        |       image_002.txt
        \---val
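Before training, it can help to confirm that every image has a matching label file. The sketch below assumes the layout shown above, with .png images and same-named .txt labels in parallel train and val folders. The helper name `images_without_labels` is hypothetical and the dataset root is only an example.

```python
from pathlib import Path
from typing import List


def images_without_labels(images_dir: Path, labels_dir: Path, image_ext: str = ".png") -> List[str]:
    """Return names of images that have no same-named .txt file in labels_dir."""
    missing = []
    for image in sorted(images_dir.glob("*" + image_ext)):
        if not (labels_dir / (image.stem + ".txt")).is_file():
            missing.append(image.name)
    return missing


if __name__ == "__main__":
    root = Path("D:/project/dataset/RTTS")  # example dataset root; adjust to your setup
    for split in ("train", "val"):
        images, labels = root / "images" / split, root / "labels" / split
        if images.is_dir():
            print(split, "images without labels:", images_without_labels(images, labels))
```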
DINO Framework Configuration
Library Installation
panopticapi
git clone https://github.com/cocodataset/panopticapi.git
cd .\panopticapi\
python setup.py build_ext --inplace
python setup.py build_ext install
MultiScaleDeformableAttention
cd .\models\dino\ops\
python setup.py build install
If installation fails with "unsupported Microsoft Visual Studio version", modify:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include\crt\host_config.h
Change the number after "_MSC_VER >=" to 2000, then retry installation.
ResNet50 Backbone Modifications
Dataset Configuration
Edit DINO\datasets\coco.py (around line 615):
# Original paths
# PATHS = {
#     "train": (root / "train2017", root / "annotations" / f'{mode}_train2017.json'),
#     "train_reg": (root / "train2017", root / "annotations" / f'{mode}_train2017.json'),
#     "val": (root / "val2017", root / "annotations" / f'{mode}_val2017.json'),
#     "eval_debug": (root / "val2017", root / "annotations" / f'{mode}_val2017.json'),
#     "test": (root / "test2017", root / "annotations" / 'image_info_test-dev2017.json'),
# }
# Modified paths
PATHS = {
    "train": (root / "images/train", root / "labels/rtts.json"),
    "train_reg": (root / "images/train", root / "labels/rtts.json"),
    "val": (root / "images/val", root / "labels/rtts_100_val.json"),
    "eval_debug": (root / "images/val", root / "labels/rtts_100_val.json"),
    "test": (root / "images/val", root / "labels/rtts_100_val.json"),
}
Training Hyperparameters
Edit DINO\config\DINO\DINO_4scale_mc.py:
num_classes=5
batch_size = 2
epochs = 12
Training Script Modifications
Edit DINO\main.py:
parser.add_argument('--config_file', '-c', type=str, default='config/DINO/DINO_4scale_mc.py')
parser.add_argument('--output_dir', default='logs/DINO/R50-MS4', help='path where to save, empty for no saving')
parser.add_argument('--coco_path', type=str, default='D:/project/dataset/RTTS/')
Edit DINO\config\DINO\DINO_4scale.py:
embed_init_tgt = True
use_ema = False
dn_box_noise_scale = 1.0
dn_scalar=100
dn_label_coef=1.0
dn_bbox_coef=1.0
Swin Backbone Modifications
Dataset Configuration
Same as ResNet50 configuration above
Training Hyperparameters
Edit DINO\config\DINO\DINO_4scale_swin_mc.py:
num_classes=5
batch_size = 2
epochs = 12
Training Script Modifications
Edit DINO\main.py:
parser.add_argument('--config_file', '-c', type=str, default='config/DINO/DINO_4scale_swin_mc.py')
Common Issues and Solutions
Shared Folder Error
Error: "No shared folder available dino"
Solution: Create a folder named "comp_robot" in the project directory with an "experiments" subfolder. In run_with_submitit.py, modify the get_shared_folder() function to use your custom folder path.
Resume Training Issues
When resuming training after an interruption, setting only the resume parameter can lead to the wrong checkpoint or starting epoch being used.
Solution: Specify the checkpoint and the starting epoch manually (the epoch after the one stored in the checkpoint):
parser.add_argument('--resume', default='logs/DINO/checkpoint0010.pth', help='resume from checkpoint')
parser.add_argument('--start_epoch', default=11, type=int)
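Looking up the values by hand is error-prone, so the lookup can be scripted. The sketch below is a hypothetical helper, not part of DINO. It assumes checkpoints are named like checkpoint0010.pth, as in the example above, and it picks the highest-numbered checkpoint, which is not necessarily the best-scoring one.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

CHECKPOINT_PATTERN = re.compile(r"checkpoint(\d+)\.pth$")


def latest_checkpoint(log_dir: Path) -> Optional[Tuple[Path, int]]:
    """Return (path, start_epoch) for the highest-numbered checkpoint, or None."""
    best = None
    for path in log_dir.glob("checkpoint*.pth"):
        match = CHECKPOINT_PATTERN.search(path.name)
        if match is None:
            continue  # skips unnumbered files such as checkpoint.pth
        epoch = int(match.group(1))
        if best is None or epoch > best[1]:
            best = (path, epoch)
    if best is None:
        return None
    # Training continues with the epoch after the one that was saved.
    return best[0], best[1] + 1


if __name__ == "__main__":
    found = latest_checkpoint(Path("logs/DINO"))
    if found is not None:
        print(f"--resume {found[0].as_posix()} --start_epoch {found[1]}")
```

For a folder containing checkpoint0010.pth as its newest file, this suggests a start epoch of 11, matching the values set above.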
Library Version Conflicts
Error: "FormatCode() got an unexpected keyword argument 'verify'"
Solution: Install compatible versions:
pip install yapf==0.40.0
pip install mmcv==2.1.0
pip install mmengine==0.10.3
pip install mmdet==3.3.0
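After installing, it is worth confirming that the environment really holds the pinned versions, since a later install can silently upgrade one of them. The following is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8). The helper name `check_pins` is hypothetical.

```python
from importlib import metadata
from typing import Dict


def check_pins(pins: Dict[str, str]) -> Dict[str, str]:
    """Map each package whose installed version differs from its pin to a short status."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if installed != wanted:
            problems[name] = f"installed {installed}, expected {wanted}"
    return problems


if __name__ == "__main__":
    pins = {"yapf": "0.40.0", "mmcv": "2.1.0", "mmengine": "0.10.3", "mmdet": "3.3.0"}
    print(check_pins(pins) or "All pinned versions match")
```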
Encoding Issues
Error: "UnicodeDecodeError: 'gbk' codec can't decode byte"
Solution: Modify util/slconfig.py:
# with open(filename) as f:
with open(filename, encoding='UTF-8') as f:
# with open(filename, 'r') as f:
with open(filename, 'r', encoding='UTF-8') as f:
DINOv2 Framework
DINOv2 is a self-supervised learning model that doesn't rely on manually labeled data; instead, it uses the inherent structure of the data to generate pseudo-labels for training.
TogetherNet Framework
Environment Setup
pip install h5py==2.10.0
pip install tensorboard
Dataset Structure
\---RTTS
    +---images
    |   +---train
    |   |       image_001.png
    |   |       image_002.png
    |   +---train_dehaze
    |   |       image_001_dehaze.png
    |   |       image_002_dehaze.png
    |   +---val
    |   |       0.png
    |   |       1.png
    |   \---val_dehaze
    |           0_MSBDN.png
    |           1_MSBDN.png
    \---labels
        |   rtts.json
        |   rtts_100_val.json
        +---train
        |       image_001.txt
        |       image_002.txt
        \---val
Configuration Files
Additional configuration files are needed: train.txt, train_dehaze.txt, val.txt, val_dehaze.txt
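The format of these files is not spelled out here. Assuming each one simply lists one image name per line without the file extension (an assumption, based on how the dataloader excerpt in this section builds paths from a bare name plus ".png"), a list could be generated as follows. The helper name `write_name_list` and the example paths are hypothetical.

```python
from pathlib import Path


def write_name_list(images_dir: Path, out_file: Path, image_ext: str = ".png") -> int:
    """Write the stem of every image in images_dir to out_file, one per line.

    Returns the number of names written.
    """
    names = sorted(p.stem for p in images_dir.glob("*" + image_ext))
    out_file.write_text("".join(name + "\n" for name in names), encoding="utf-8")
    return len(names)


if __name__ == "__main__":
    root = Path("D:/project/dataset/RTTS/images")  # example location; adjust as needed
    for folder in ("train", "train_dehaze", "val", "val_dehaze"):
        if (root / folder).is_dir():
            count = write_name_list(root / folder, Path(folder + ".txt"))
            print(f"{folder}.txt: {count} entries")
```

If the project expects a different line format, only the line-building expression needs to change.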
Code Modifications
train.py
classes_path = 'model_data/rtts_classes.txt'
# model_path = 'model_data/yolox_s.pth'
# Batch and epoch settings
Freeze_batch_size = 16
UnFreeze_Epoch = 100
Unfreeze_batch_size = 16
# Dataset paths
train_annotation_path = '2007_train_fog.txt'
val_annotation_path = '2007_val_fog.txt'
clear_annotation_path = '2007_train.txt'
val_clear_annotation_path = '2007_val.txt'
utils/dataloader.py
def get_random_data(self, annotation_line, clearimage_line, input_shape, jitter=.3, hue=.1, sat=0.7, val=0.4, random=True):
    # Split each annotation line into fields (assumed; this step is not shown in the original excerpt)
    line = annotation_line.split()
    clearline = clearimage_line.split()
    # Extract image names
    pic_name = line[0]
    pic_clear_name = clearline[0]
    # Construct image paths
    line = os.path.join("D:/project/dataset/RTTS/images/", pic_name + ".png")
    clearline = os.path.join("D:/project/dataset/RTTS/images/", pic_clear_name + ".png")
    # Load images
    image = Image.open(line)
    image = cvtColor(image)
    clearimg = Image.open(clearline)
    clearimg = cvtColor(clearimg)
    # Load label data; the first field of each row is skipped and the rest are parsed as floats
    train_label_path = os.path.join("D:/project/dataset/RTTS/labels/", pic_name + ".txt")
    with open(train_label_path, encoding='utf-8') as f:
        train_labels_lines = f.readlines()
    box = np.array([np.array(list(map(float, box_element.split()[1:]))) for box_element in train_labels_lines])
nets/yolo.py
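def forward(self, input):
    if self.training:
        # Split input into haze and clear images: the first 4 samples along dim 0
        # stay in input, the next 4 go to clear_x (the sizes must add up to the batch dimension)
        input, clear_x = input.split((4, 4), dim=0)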
Training Process
Run train.py to start training. Results are stored in the logs folder. With 4300+ images and batch size 16, each epoch takes approximately 3 minutes.
RT-DETR-V2 Framework
Environment Setup
conda create --name rt python=3.8
conda activate rt
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
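Once the packages are installed, a quick check confirms that PyTorch imports and can see the GPU. This is a minimal sketch; the helper name `describe_torch` is hypothetical. It returns a small report rather than failing when PyTorch is absent.

```python
def describe_torch() -> dict:
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If cuda_available is False while built_for_cuda shows a version, the driver is the likely place to look, for example by comparing against the nvidia-smi output.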
Dataset Notes
Class (category) IDs in the JSON annotation files should start from 1, not 0.
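If an existing annotation file numbers its classes from 0, the IDs can be shifted in one pass. The sketch below assumes COCO-style annotations, where each entry in "categories" has an "id" and each entry in "annotations" has a "category_id". The helper name and the file names are hypothetical.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict


def shift_category_ids_to_one(coco: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a COCO-style dict whose smallest category id is 1."""
    result = copy.deepcopy(coco)
    ids = [category["id"] for category in result.get("categories", [])]
    if not ids:
        return result
    offset = 1 - min(ids)
    if offset == 0:
        return result  # already starts from 1
    for category in result["categories"]:
        category["id"] += offset
    for annotation in result.get("annotations", []):
        annotation["category_id"] += offset
    return result


if __name__ == "__main__":
    source = Path("rtts.json")  # example input file
    if source.is_file():
        data = json.loads(source.read_text(encoding="utf-8"))
        fixed = shift_category_ids_to_one(data)
        Path("rtts_from1.json").write_text(json.dumps(fixed), encoding="utf-8")
```

The input is left unchanged and the result is written to a separate file, so the original annotations remain available for comparison.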