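1. Installing Docker

Reference post: "Installing the offline build of docker-20.10.0 on Ubuntu 20.04"

(1) Download the Docker archive

The development PC has an x86 processor, so download docker-20.10.0.tgz from https://download.docker.com/linux/static/stable/x86_64/.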
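
The download URL can also be assembled from the architecture and the version. The helper below is a minimal sketch: the name docker_static_url is hypothetical, and it assumes that archives for other architectures sit in a directory named after the output of uname -m, which should be checked against the download site before relying on it.

```shell
# Build the static-archive URL from an architecture and a version.
# docker_static_url is a hypothetical helper, not part of Docker.
docker_static_url() {
    arch=${1:-$(uname -m)}      # defaults to the host architecture
    version=${2:-20.10.0}       # defaults to the version used in this post
    printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' \
        "$arch" "$version"
}

docker_static_url x86_64 20.10.0
```

A download could then be started with wget "$(docker_static_url)".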

(2) Extract the GPU image from the OE (Open Explorer) package

gzip -d docker_open_explorer_ubuntu_22_j6_gpu_v3.2.0.tar.gz

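(3) Copy the Docker binaries to /usr/bin/

First extract the archive downloaded in step (1), for example with tar -zxvf docker-20.10.0.tgz, which produces a docker/ directory. Then copy its contents: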

sudo cp docker/* /usr/bin/

If the following error appears:

cp: cannot create regular file '/usr/bin/dockerd': Text file busy

an existing dockerd is still running from that path. Force-overwrite the target files with:

sudo cp -f docker/* /usr/bin/
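
Another way around "Text file busy" is to copy each binary under a temporary name and then rename it over the target, because a rename replaces the directory entry instead of writing into the executable that is currently running. The sketch below shows the idea on throwaway files; replace_binary is a hypothetical helper, and a daemon that is already running keeps using the old binary until it is restarted.

```shell
# Replace a (possibly running) executable by copy-then-rename.
# replace_binary is a hypothetical helper for illustration.
replace_binary() {
    src=$1
    dest=$2
    tmp="$dest.new.$$"            # temporary name in the same directory
    cp "$src" "$tmp" || return 1
    chmod 0755 "$tmp" || return 1
    mv -f "$tmp" "$dest"          # rename over the old file
}

# Demonstration on throwaway files instead of /usr/bin
demo_dir=$(mktemp -d)
printf 'old\n' > "$demo_dir/dockerd"
printf 'new\n' > "$demo_dir/dockerd.src"
replace_binary "$demo_dir/dockerd.src" "$demo_dir/dockerd"
cat "$demo_dir/dockerd"
```

For a real installation the function would have to run with root privileges for each file in docker/.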

(4) Register Docker as a system service

sudo gedit /etc/systemd/system/docker.service

Write the following content into the file:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
 
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
 
[Install]
WantedBy=multi-user.target
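
A typo in the unit file only shows up when the service fails to start, so a quick structural check before reloading systemd can save a round trip. The following is a minimal sketch: check_unit_file is a hypothetical helper that only looks for the three section headers and an ExecStart line, and the demonstration uses a shortened temporary file rather than the real /etc/systemd/system/docker.service.

```shell
# check_unit_file is a hypothetical helper: it only checks that the
# section headers and an ExecStart line exist, not that they are valid.
check_unit_file() {
    file=$1
    for pattern in '^\[Unit\]' '^\[Service\]' '^\[Install\]' '^ExecStart='; do
        if ! grep -q "$pattern" "$file"; then
            echo "missing: $pattern"
            return 1
        fi
    done
    echo "ok"
}

# Demonstration on a shortened temporary unit file
unit=$(mktemp)
cat > "$unit" <<'EOF'
[Unit]
Description=Docker Application Container Engine

[Service]
Type=notify
ExecStart=/usr/bin/dockerd

[Install]
WantedBy=multi-user.target
EOF
check_unit_file "$unit"
```

For a more thorough check, systemd-analyze verify can be run on the installed unit file.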

(5) Add execute permission to the file and reload systemd

sudo chmod +x /etc/systemd/system/docker.service
sudo systemctl daemon-reload

(6) Start and test

sudo systemctl start docker
docker --version

The output should look similar to:

Docker version 20.10.0, build 7287ab3

which indicates that the installation is complete.
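
When this check is scripted, it helps to compare only the version number rather than the whole line, since the build hash may differ. A minimal sketch, assuming the output format shown above; parse_docker_version is a hypothetical helper:

```shell
# parse_docker_version is a hypothetical helper that reads the text of
# `docker --version` on stdin and prints only the version number.
parse_docker_version() {
    sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

echo 'Docker version 20.10.0, build 7287ab3' | parse_docker_version
```

On the installed machine the input would come from docker --version | parse_docker_version.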

Enable start on boot:

sudo systemctl enable docker.service

(7) Uninstall

# Stop Docker
sudo systemctl stop docker
# Disable start on boot
sudo systemctl disable docker.service
# Remove the service unit file
sudo rm -f /etc/systemd/system/docker.service
# Remove the Docker binaries
sudo rm -f /usr/bin/docker*
sudo rm -f /usr/bin/containerd*
sudo rm -f /usr/bin/ctr
sudo rm -f /usr/bin/runc
# Remove the Docker data directories and container files
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
# Verify: the command should no longer be found
docker --version
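
To confirm that nothing was left behind, the same file patterns can be listed instead of removed. This is a minimal sketch run against a throwaway directory; list_docker_leftovers is a hypothetical helper, and on a real system it would be pointed at /usr/bin.

```shell
# list_docker_leftovers is a hypothetical helper: it prints any files in
# the given directory that match the names removed during uninstall.
list_docker_leftovers() {
    dir=${1:-/usr/bin}
    for f in "$dir"/docker* "$dir"/containerd* "$dir"/ctr "$dir"/runc; do
        # Unmatched patterns stay literal and fail the existence test
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

# Demonstration on a throwaway directory
demo_bin=$(mktemp -d)
touch "$demo_bin/dockerd" "$demo_bin/runc" "$demo_bin/ls"
list_docker_leftovers "$demo_bin"
```

An empty result means none of the Docker binaries remain in that directory.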

2. Installing the NVIDIA Container Toolkit

# Configure the package repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
# Install the NVIDIA Container Toolkit packages
export NVIDIA_CONTAINER_TOOLKIT_VERSION=1.17.8-1
sudo apt-get install -y \
    nvidia-container-toolkit=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    nvidia-container-toolkit-base=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    libnvidia-container-tools=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    libnvidia-container1=${NVIDIA_CONTAINER_TOOLKIT_VERSION}

# Restart Docker
sudo systemctl restart docker
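
The sed expression in the repository setup rewrites each deb line so that apt verifies the repository against the keyring saved by the first curl command. The sketch below applies the same expression to one sample line; the sample is an assumption about what the downloaded list file contains and is used only for illustration.

```shell
# Sample repository line (assumed for illustration; the real file is
# downloaded from nvidia.github.io by the curl command).
sample='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# Same sed expression as in the repository setup command
rewritten=$(printf '%s\n' "$sample" | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g')

printf '%s\n' "$rewritten"
```

The rewritten line carries a signed-by option, which is what ends up in /etc/apt/sources.list.d/nvidia-container-toolkit.list.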

3. Loading the image

# Load the tar image
docker load -i docker_open_explorer_ubuntu_22_j6_gpu_v3.2.0.tar
# List the images to find the image ID
docker images

Example output:

REPOSITORY                                   TAG       IMAGE ID       CREATED        SIZE
openexplorer/ai_toolchain_ubuntu_22_j6_gpu   v3.2.0    1d79ca1300ec   3 months ago   28.4GB
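
The image ID changes between releases, so a script may prefer to look it up by repository and tag. The following is a minimal sketch that parses the table text shown above; image_id_from_table is a hypothetical helper, and it assumes the first three columns are REPOSITORY, TAG, and IMAGE ID.

```shell
# image_id_from_table is a hypothetical helper: it reads `docker images`
# table text on stdin and prints the IMAGE ID for a repository and tag.
image_id_from_table() {
    awk -v repo="$1" -v tag="$2" '$1 == repo && $2 == tag { print $3 }'
}

printf '%s\n' \
    'REPOSITORY                                   TAG       IMAGE ID       CREATED        SIZE' \
    'openexplorer/ai_toolchain_ubuntu_22_j6_gpu   v3.2.0    1d79ca1300ec   3 months ago   28.4GB' |
    image_id_from_table openexplorer/ai_toolchain_ubuntu_22_j6_gpu v3.2.0
```

With the daemon running, docker images -q openexplorer/ai_toolchain_ubuntu_22_j6_gpu:v3.2.0 prints the ID directly without any parsing.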

Create and start a container:

sudo docker run -it --gpus all 1d79ca1300ec

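If import torch or model training fails with the following error:

OpenBLAS blas_thread_init: pthread_create failed for thread 1 of 16: Operation not permitted
OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max

add the --privileged=true flag when creating the container, replacing YOUR_IMAGE_ID with your own image ID:

sudo docker run -it --privileged=true --gpus all YOUR_IMAGE_ID

Verify inside the container that the host GPU is accessible: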

kai@kai:~/Downloads/OE_3.2.0$ sudo docker run -it --gpus all 1d79ca1300ec
root@ca8a5a317809:/open_explorer# nvidia-smi 
Fri Oct 10 16:38:41 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P2200                   Off | 00000000:65:00.0  On |                  N/A |
| 52%   46C    P8               8W /  75W |    569MiB /  5120MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

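4. Using an ARM cross-compilation image with QEMU

Download QEMU (taken from the post "Emulating an ARM64 environment with Docker on an x86 platform"):

# Download
wget https://github.com/multiarch/qemu-user-static/releases/download/v5.1.0-2/qemu-aarch64-static.tar.gz
# Extract
tar -zxvf qemu-aarch64-static.tar.gz
# Move to /usr/bin
sudo mv qemu-aarch64-static /usr/bin
# Install the qemu-user-static package
sudo apt-get install qemu-user-static
# Load the J6M cross-compilation image
sudo docker load -i latest-humble-v1.0.tar

List the images:

sudo docker images # the output should look similar to the following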
REPOSITORY                                   TAG                  IMAGE ID       CREATED         SIZE
openexplorer/ai_toolchain_ubuntu_22_j6_gpu   v3.2.0               1d79ca1300ec   4 months ago    28.4GB
autoware/autoware                            latest-humble-v1.0   1a9d0114d807   13 months ago   13.7GB

Test it, replacing 1a9d0114d807 with your own ARM image ID:

sudo docker run -it -v /usr/bin/qemu-aarch64-static:/usr/bin/qemu-aarch64-static 1a9d0114d807 /bin/bash -c "uname -m; exec /bin/bash"
# Example output; seeing aarch64 means the test succeeded
WARNING: The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64) and no specific platform was requested
aarch64
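
Because the platform warning is printed next to the result, an automated check should look for a line that is exactly aarch64 rather than compare the whole output. A minimal sketch using the sample output above; reports_aarch64 is a hypothetical helper:

```shell
# reports_aarch64 is a hypothetical helper: it reads the container output
# on stdin and succeeds if any whole line is exactly "aarch64".
reports_aarch64() {
    grep -qx 'aarch64'
}

printf '%s\n' \
    "WARNING: The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64) and no specific platform was requested" \
    'aarch64' | reports_aarch64 && echo 'ARM64 emulation works'
```

In practice the input would be the output of a non-interactive docker run that executes uname -m in the ARM image.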

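Alternatively, create an ARM container with the command below, where the path after -v mounts a host directory at the specified location inside the container. This makes it convenient to edit code locally in an IDE while compiling inside the container.

# Create the container and map the host directory
sudo docker run -d -it -v /home/nuvo/Downloads/j6m_bev/:/root/app --platform=linux/arm64/v8 1a9d0114d807
# List the running containers; the output looks like this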
sudo docker ps
CONTAINER ID   IMAGE          COMMAND       CREATED          STATUS          PORTS     NAMES
cf6ca5ec2be6   1a9d0114d807   "/bin/bash"   19 seconds ago   Up 19 seconds             optimistic_proskuriakova
ae83715f631b   1d79ca1300ec   "/bin/bash"   45 hours ago     Up 15 minutes   22/tcp    nifty_fermi
# 1a9d0114d807 is the ARM image; enter its running container with exec
sudo docker exec -it optimistic_proskuriakova /bin/bash
# Confirm that the host directory is mounted
root@cf6ca5ec2be6:/# cd /root/app/
root@cf6ca5ec2be6:~/app# ls
bevformer_v0      bevformer_v1      float-checkpoint-best.pth.tar  package.tar.gz  qemu-aarch64-static.tar.gz  samples.zip  thread_set.sh
bevformer_v0.zip  bevformer_v1.tar  latest-humble-v1.0.tar         package.zip     samples                     script.sh
