Deploying Swin Transformer on Jetson NX

Jumi爱笑笑

Published 2023-06-27 11:17:35

1. Configure the apt sources

First back up the existing file:

sudo cp /etc/apt/sources.list /etc/apt/sources.list.back

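Replace the contents of /etc/apt/sources.list with the Aliyun mirror entries below, then run an update.
(Note: the Tsinghua mirror is not required. When I used it earlier, the update needed pacman, and installing pacman took a long time.)
(Note: on arm64 boards such as the Jetson, Ubuntu packages are normally served from the ubuntu-ports tree. If apt-get update cannot find arm64 package indexes, try changing /ubuntu/ to /ubuntu-ports/ in the entries below.)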

deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse

sudo apt-get update
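
The entries above all follow one pattern (mirror URL, release codename plus a suite suffix, then the component list), so they can be generated instead of typed by hand. This is a minimal sketch; the function name build_sources_list and its parameters are made up for illustration, and it only prints text rather than writing to /etc/apt/sources.list.

```python
def build_sources_list(mirror, codename,
                       suites=("", "-security", "-updates", "-proposed", "-backports"),
                       components=("main", "restricted", "universe", "multiverse")):
    """Return sources.list text with one deb and one deb-src line per suite."""
    lines = []
    for suite in suites:
        entry = f"{mirror.rstrip('/')}/ {codename}{suite} {' '.join(components)}"
        lines.append(f"deb {entry}")
        lines.append(f"deb-src {entry}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same mirror and codename as the entries listed above.
    print(build_sources_list("http://mirrors.aliyun.com/ubuntu", "bionic"), end="")
```

Redirect the output to a temporary file and review it before copying it over the real sources.list.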

2. Pull the image for the matching version

https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch/tags
sudo docker pull nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.8-py3

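This image comes with PyTorch, torchvision and CUDA preinstalled.
Pitfall: I could never download the Jetson PyTorch builds linked from https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048; I am not sure whether access is being blocked.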

3. Run the image

sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.8-py3

Note: the key option is --runtime nvidia, because the container has to use the NVIDIA driver on the host.
[image]
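
A quick way to confirm that --runtime nvidia took effect is to ask PyTorch, inside the container, whether it can see the GPU. The sketch below assumes the l4t-pytorch image, where torch is preinstalled; the helper name cuda_report is made up for illustration. It only uses torch.cuda.is_available() and torch.cuda.get_device_name().

```python
def cuda_report():
    """Return a one-line description of what PyTorch can see in this environment."""
    try:
        import torch  # preinstalled in the l4t-pytorch image
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        # Typical symptom of starting the container without --runtime nvidia.
        return f"torch {torch.__version__}: CUDA is not available"
    return f"torch {torch.__version__}: CUDA OK, device = {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(cuda_report())
```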

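4. Install OpenCV

First, install OpenCV:
Installing with pip (failed)
Both pip install attempts below failed. The cause seems to be a problem with the opencv-python wheel on ARM, so in the end I had to build from source.
Installing opencv-python kept hanging at "Running setup.py bdist_wheel for opencv-python", so I upgraded pip first: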

pip3 install --upgrade pip

OpenCV is large and slow to build, so use --verbose to follow the build and see where it gets stuck:

pip3 install opencv-python --verbose

This approach still fails with:

  TypeError: _classify_installed_files() got an unexpected keyword argument '_cmake_install_dir'
  Building wheel for opencv-python (pyproject.toml) ... error
  ERROR: Failed building wheel for opencv-python
Failed to build opencv-python
ERROR: Could not build wheels for opencv-python, which is required to install pyproject.toml-based projects
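
pip only builds a package from source when it finds no prebuilt wheel that matches the interpreter and platform, so it helps to know exactly what pip is matching against. The sketch below prints the machine architecture and Python version using the standard library only; the helper names are made up for illustration, and it does not diagnose the error above by itself.

```python
import platform
import sys


def is_arm64(machine):
    """True for the names Linux and macOS use for 64-bit ARM."""
    return machine.lower() in ("aarch64", "arm64")


def platform_summary():
    """Collect the facts pip uses when choosing between a wheel and a source build."""
    machine = platform.machine()
    return {
        "machine": machine,
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "arm64": is_arm64(machine),
    }


if __name__ == "__main__":
    for key, value in platform_summary().items():
        print(f"{key}: {value}")
```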

Building from source (succeeded)
Download the OpenCV 4.7 source:

git clone --branch 4.7.0 https://github.com/opencv/opencv.git
cd opencv
mkdir build && cd build

Generate the Makefiles with the following command, and pay attention to which Python version you build against:

cmake -DCMAKE_BUILD_TYPE=RELEASE -DENABLE_NEON=ON -DWITH_TENGINE=ON -DOPENCV_ENABLE_NONFREE=ON -DCMAKE_INSTALL_PREFIX=/usr/local -DOPENCV_GENERATE_PKGCONFIG=ON -DPYTHON_EXECUTABLE=/usr/bin/python3.6 -DOPENCV_PYTHON3_INSTALL_PATH=/usr/local/lib/python3.6/dist-packages ..

Compile, using 8 parallel jobs:

make -j8

Install the build results:

make install

Check that the installation succeeded:

python3
import cv2
cv2.__version__

If the version prints correctly, the installation succeeded.
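
If more than one OpenCV copy ends up on the system (for example a leftover pip attempt next to the source build), it is worth checking which file Python would actually import. This is a minimal sketch using importlib.util.find_spec from the standard library; the helper name locate_module is made up for illustration. With the cmake options used above, the path should sit under /usr/local/lib/python3.6/dist-packages.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    path = locate_module("cv2")
    print("cv2 not found" if path is None else f"cv2 would be loaded from {path}")
```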
5. Install mmcv-full

pip3 install -v mmcv-full==1.4.0

(The unpinned command below later turned out to install an incompatible version, so do not use it.)

pip3 install mmcv-full --verbose

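--verbose: add this to follow the build; without it the slow build always looks as if it is stuck.

7. Download Swin Transformer and install mmdetection
Do not build the latest version from source; install a pinned version instead: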

pip3 install mmdet==2.28.2
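
Because the failures here came from mismatched mmcv-full and mmdet versions, a small range check can catch a mismatch before running any model code. This is a sketch only: the bounds MMCV_MIN and MMCV_MAX are my assumption about what mmdet 2.28.x accepts and should be confirmed against the mmdetection installation notes, and the helper names are made up for illustration.

```python
import re

# Assumed compatible mmcv-full range for mmdet 2.28.x; verify against the mmdetection docs.
MMCV_MIN, MMCV_MAX = "1.3.17", "1.8.0"


def parse_version(text):
    """Turn '1.4.0' into (1, 4, 0); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def version_in_range(version, minimum, maximum):
    """True if minimum <= version <= maximum, compared numerically."""
    return parse_version(minimum) <= parse_version(version) <= parse_version(maximum)


if __name__ == "__main__":
    try:
        import mmcv
    except ImportError:
        print("mmcv is not installed")
    else:
        ok = version_in_range(mmcv.__version__, MMCV_MIN, MMCV_MAX)
        print(f"mmcv {mmcv.__version__}: {'within' if ok else 'outside'} the assumed range")
```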

(The following does not work.)

git clone https://github.com/open-mmlab/mmdetection.git    
cd mmdetection
pip install -v -e .

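Install apex
Apex is an open-source PyTorch extension library from NVIDIA for mixed-precision and distributed training. Storing tensors in half precision (FP16) instead of FP32 reduces the model's GPU memory footprint and can speed up training. The code used here depends on apex.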

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
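
To see why half precision matters on a memory-constrained board, here is the rough arithmetic for parameter storage alone. This is an illustration, not a measurement: the parameter count is a hypothetical figure, and the real saving with apex depends on the optimization level, activations and optimizer state.

```python
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2}


def tensor_bytes(num_elements, dtype):
    """Bytes needed to store num_elements values of the given dtype."""
    return num_elements * BYTES_PER_ELEMENT[dtype]


def mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 ** 2)


if __name__ == "__main__":
    num_params = 28_000_000  # hypothetical parameter count, for illustration only
    for dtype in ("fp32", "fp16"):
        print(f"{dtype}: {mib(tensor_bytes(num_params, dtype)):.1f} MiB")
```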

Download the .pth checkpoint file for the corresponding version, then run it.

8. Save the container as an image with docker commit

 [root@docker-test1 ~]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
651a8541a47d        docker.io/ubuntu    "/bin/bash"         37 seconds ago      Up 36 seconds                           myubuntu

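docker commit: create a new image from a container.
# docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
-a: author of the committed image;
-c: apply Dockerfile instructions to the created image;
-m: commit message;
-p: pause the container during the commit.

Commit an image from this myubuntu container: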
[root@docker-test1 ~]# docker commit -a "wangshibo" -m "this is test" 651a8541a47d myubuntu:v1
sha256:6ce4aedd12cda693d0bbb857cc443a56f9f569106e09ec61aa53563d0932ea4d
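
When the commit step is scripted, building the command as an argument list avoids quoting problems with the message text. This is a minimal sketch that only assembles and prints the command; the function name build_commit_cmd is made up for illustration, and the container ID and tag are the ones from the example above.

```python
import shlex


def build_commit_cmd(container, image, author=None, message=None):
    """Assemble a 'docker commit' argument list using the -a and -m options."""
    cmd = ["docker", "commit"]
    if author:
        cmd += ["-a", author]
    if message:
        cmd += ["-m", message]
    cmd += [container, image]
    return cmd


if __name__ == "__main__":
    cmd = build_commit_cmd("651a8541a47d", "myubuntu:v1",
                           author="wangshibo", message="this is test")
    # Print a copy-pasteable command line; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```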

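Other notes:

1. Difference between ARM and x86-64
They use different instruction sets.
2. On ARM, install conda through Miniforge.
3. Ubuntu on ARM still uses apt as its package manager; pacman belongs to Arch Linux. Installing pacman, which I ended up doing along the way, is a bit of a hassle because it needs prerequisite packages such as pkg-config.
4. If the Dockerfile sets neither CMD nor ENTRYPOINT, a container started with a plain docker run exits almost immediately and leaves no logs. To stay inside the container for longer, start it with -it to get an interactive session.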
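6. CMake
CMake is a cross-platform build tool that describes the build and install process for all platforms in simple statements. It can generate many kinds of Makefiles or project files, and it can test which C++ features the compiler supports. CMake's configuration file is named CMakeLists.txt. CMake does not build the final software directly; it generates standard build files (such as Unix Makefiles), which are then used in the usual way.
7. Cross compilation
Cross compilation means compiling on one platform a program that will run on another, typically compiling on x86 a program that runs on ARM, which requires a cross compiler. It is usually used when the target platform has no build tools, compiles too slowly, or is not at hand.
8. Difference between apt install and pip install
[image]
9. pip install installs under /usr/local by default; do not put packages under /usr/lib.
/lib is for libraries the core system needs, /usr/lib is for system-level libraries, and /usr/local/lib is for locally installed ones. Shared libraries used only by programs under /usr do not need to go in /lib; only libraries required by programs in /bin and /sbin belong in /lib.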
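
To check which package directories a given interpreter actually searches, you can filter sys.path for the site-packages and dist-packages entries. This is a minimal sketch using only the standard library; the helper name package_dirs is made up for illustration.

```python
import sys


def package_dirs(paths=None):
    """Return the entries of the module search path that are package install directories."""
    paths = sys.path if paths is None else paths
    return [p for p in paths if p.rstrip("/").endswith(("site-packages", "dist-packages"))]


if __name__ == "__main__":
    for directory in package_dirs():
        print(directory)
```

On Ubuntu's system Python, the /usr/local/lib/pythonX.Y/dist-packages entry is where packages installed by pip as root normally land, while /usr/lib/python3/dist-packages holds packages installed by apt.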
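10. What configure, make and make install each do when building from source on Linux
./configure --prefix=/usr
configure generates the Makefile, make compiles the code, and make install copies the built files into the install prefix according to the rules in the Makefile.
cmake is also used to generate Makefiles.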
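
The three stages can be written down as a small command plan, which also makes the parallel job count explicit. This is a sketch that only builds the argument lists; the function name build_commands is made up for illustration, and the cmake variant assumes it is run from a build directory one level below the source tree.

```python
import os


def build_commands(prefix="/usr/local", jobs=None, use_cmake=False):
    """Return the configure, compile and install commands as argument lists."""
    jobs = jobs or os.cpu_count() or 1
    if use_cmake:
        # Assumes the current directory is <source>/build.
        configure = ["cmake", f"-DCMAKE_INSTALL_PREFIX={prefix}", ".."]
    else:
        configure = ["./configure", f"--prefix={prefix}"]
    return [configure, ["make", f"-j{jobs}"], ["make", "install"]]


if __name__ == "__main__":
    for step in build_commands(prefix="/usr"):
        print(" ".join(step))
```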
11. Docker commands explained