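AI Vision from Scratch! 5 Hands-On OpenMV Examples, from Environment Setup to Running Code in One Pass

This article gives AI vision beginners 5 classic experiment tutorials based on the OpenMV development board, covering everything from basic concepts to working code. It includes: 1) hardware preparation (OpenMV development board + TF card) and development environment setup; 2) 5 experiments ordered from easy to hard: color recognition, barcode recognition, QR code recognition, face detection, and 68-point facial landmark detection; 3) for each experiment, the complete code, line-by-line comments, operating steps, and fixes for common problems. The tutorial follows a "principle + hands-on demo" pattern with beginners in mind, and all of the code can be directly…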

xy1117111704

1014 views · Published 2025-10-19 23:56:52

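If you are new to AI vision, you have probably run into these problems: the code you find online gives you no idea where to start, you don't know how to connect the hardware, and getting the first example to run takes half a day. Don't worry. This article is written for you. It explains 5 classic AI vision experiments in plain language, from environment setup through a walkthrough of the code to running and debugging, so you can get started without any prior background.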

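I. Prerequisites: the basics you need before starting AI vision

Before the hands-on part, let's get a few core concepts straight so the code doesn't leave you confused later.

1. Core hardware: the OpenMV development board

All of our experiments run on the OpenMV development board, a low-cost board designed for machine vision. It comes with a camera and an LCD screen and is programmed in Python, which makes it a good fit for beginners.

Advantages: no soldering required, works as soon as it is plugged into a computer, and Python syntax is easy to pick up.
Required accessories: OpenMV development board, Micro USB data cable, TF card (for storing model files).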

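2. Development environment: OpenMV IDE

To write and run code you need OpenMV IDE, the official development tool, which combines code editing, flashing, and debugging.

Download: the official OpenMV website (pick the Windows or macOS build for your system).
Installation: no special configuration is needed. Double-click the installer, click Next through each step, and open the IDE when it finishes.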

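3. Core concepts: 3 key terms you need to know

ROI (region of interest): the part of the image we want to analyze. In color recognition, when you are asked to "put the object inside the box in the middle", that box is the ROI.
Threshold: the criterion for deciding whether something is the target. In color recognition, only regions whose color values fall within a given range are treated as the color we are looking for.
KPU (neural network processor): the "AI brain" on the OpenMV board, dedicated to running deep learning models. Face detection and landmark detection both rely on the KPU to accelerate the computation.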
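
To make ROI and threshold concrete, here is a minimal sketch in plain Python that runs on a PC without the board. The helper names center_roi and in_threshold and the red_threshold values are made up for illustration. Only the (x, y, w, h) box layout and the six-value LAB threshold layout follow the color recognition example later in this article.

```python
def center_roi(frame_w, frame_h, box_w, box_h):
    """Return (x, y, w, h) of a box centered in the frame."""
    return (frame_w // 2 - box_w // 2, frame_h // 2 - box_h // 2, box_w, box_h)


def in_threshold(lab, threshold):
    """True if an (L, A, B) pixel lies inside (l_lo, l_hi, a_lo, a_hi, b_lo, b_hi)."""
    l, a, b = lab
    l_lo, l_hi, a_lo, a_hi, b_lo, b_hi = threshold
    return l_lo <= l <= l_hi and a_lo <= a <= a_hi and b_lo <= b <= b_hi


roi = center_roi(320, 240, 50, 50)         # QVGA frame, 50x50 box
red_threshold = (30, 60, 40, 80, 20, 60)   # hypothetical values for a red object

print(roi)                                           # (135, 95, 50, 50)
print(in_threshold((45, 60, 35), red_threshold))     # True: every channel is in range
print(in_threshold((45, -10, 35), red_threshold))    # False: the A channel is out of range
```

A pixel counts as "the target color" only when all three channels fall inside their ranges, which is why a threshold that is too wide picks up unrelated colors.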

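II. Hands-on tutorial: 5 AI vision examples, from easy to hard

We start with the simplest one, color recognition, then move on to barcodes, QR codes, and face detection, and finish with 68-point facial landmark detection. Each example pairs commented code with running steps, tips, or common problems, so that you can follow it and get it running.

Example 1: Color recognition, the "Hello World" of AI vision

Color recognition is the most basic AI vision experiment. The board first "learns" a color from the object inside the center box, then finds every object of that color in the frame and marks it.

1. Core code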

import sensor
import image
import time
import lcd

# Initialize the hardware
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)
sensor.set_auto_gain(False)
sensor.set_auto_whitebal(False)
clock = time.clock()

# Define a 50x50 box centered in the 320x240 frame
r = [(320//2)-(50//2), (240//2)-(50//2), 50, 50]

# Show the white box (waiting phase)
for i in range(50):
    img = sensor.snapshot()
    img.draw_rectangle(r)
    lcd.display(img)

# Learn the color thresholds
print("Learning thresholds...")
threshold = [50, 50, 0, 0, 0, 0]  # Middle L, A, B values.
for i in range(50):
    img = sensor.snapshot()
    hist = img.get_histogram(roi=r)
    lo = hist.get_percentile(0.01)
    hi = hist.get_percentile(0.99)
    threshold[0] = (threshold[0] + lo.l_value()) // 2
    threshold[1] = (threshold[1] + hi.l_value()) // 2
    threshold[2] = (threshold[2] + lo.a_value()) // 2
    threshold[3] = (threshold[3] + hi.a_value()) // 2
    threshold[4] = (threshold[4] + lo.b_value()) // 2
    threshold[5] = (threshold[5] + hi.b_value()) // 2
    img.draw_rectangle(r, color=(0,255,0))  # green box
    for blob in img.find_blobs([threshold], pixels_threshold=100, area_threshold=100, merge=True):
        img.draw_rectangle(blob.rect())
    lcd.display(img)

# Main recognition loop
print("Thresholds learned...")
print("Tracking colors...")
while True:
    img = sensor.snapshot()
    for blob in img.find_blobs([threshold], pixels_threshold=100, area_threshold=100, merge=True):
        img.draw_rectangle(blob.rect())
    lcd.display(img)
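
The threshold update inside the learning loop can be tried on a PC without the board. The sketch below repeats the same integer averaging on made-up percentile values. The helper name update_threshold and the lo/hi numbers are assumptions for illustration, and a real scene would give slightly different percentiles on every frame.

```python
def update_threshold(threshold, lo, hi):
    """Blend one frame's (L, A, B) percentiles into the running threshold.

    lo and hi are (L, A, B) tuples for the 1st and 99th percentiles.
    The list layout matches the example: [l_lo, l_hi, a_lo, a_hi, b_lo, b_hi].
    """
    for ch in range(3):
        threshold[2 * ch] = (threshold[2 * ch] + lo[ch]) // 2
        threshold[2 * ch + 1] = (threshold[2 * ch + 1] + hi[ch]) // 2
    return threshold


threshold = [50, 50, 0, 0, 0, 0]
# Hypothetical percentiles for a red object, repeated for 50 identical frames.
lo, hi = (40, 50, 20), (60, 70, 40)
for _ in range(50):
    update_threshold(threshold, lo, hi)

print(threshold)  # [40, 59, 49, 69, 19, 39]
```

Because each frame is averaged with the previous result, the newest frame always carries half the weight, so the learned threshold mostly reflects the last few frames. The floor division also means a value approached from below settles one step short of the target, as the output shows.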

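2. Running steps (a hands-on walkthrough)

Open OpenMV IDE, create a new file, paste in the code above, and save it as "color_detect.py".
Connect the OpenMV board to your computer with the Micro USB data cable. The IDE detects the device automatically (the device name appears in the lower-left corner).
Click the "Run" button (the green triangle) in the IDE; the code is sent to the board automatically.
Once it is running, the screen shows a white box. Place the object you want to recognize (a red apple or a blue pen, for example) inside the box.
After the 50-frame waiting phase (about 1 second), the box turns green and the board "learns" the color over the next 50 frames. After that it finds and marks every object of the same color in the frame.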

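3. Common problems

Problem 1: Recognition is inaccurate and other colors keep getting picked up as the target?
- Fix: keep the ambient lighting stable and avoid strong light or shadows. You can also increase the number of learning iterations (change for i in range(50) in the learning loop to 100).
Problem 2: The screen shows no image?
- Fix: check that the data cable is firmly plugged in and reconnect the board; make sure lcd.init() and sensor.reset() in the code are not commented out.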

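Example 2: Barcode recognition, the principle behind supermarket scanners

Barcodes are everywhere in daily life (supermarket checkout, product tracing). This example shows how to make OpenMV read a barcode, mark it on the screen, and print its content to the serial terminal.

1. Core code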

import sensor
import image
import time
import math
import lcd

# Initialize the hardware
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)  # RGB565 color mode
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=100)
sensor.set_auto_gain(False)
sensor.set_auto_whitebal(False)
clock = time.clock()

# Map a barcode type constant to its name
def barcode_name(code):
    if(code.type() == image.EAN2):
        return "EAN2"
    if(code.type() == image.EAN5):
        return "EAN5"
    if(code.type() == image.EAN8):
        return "EAN8"
    if(code.type() == image.UPCE):
        return "UPCE"
    if(code.type() == image.ISBN10):
        return "ISBN10"
    if(code.type() == image.UPCA):
        return "UPCA"
    if(code.type() == image.EAN13):
        return "EAN13"
    if(code.type() == image.ISBN13):
        return "ISBN13"
    if(code.type() == image.I25):
        return "I25"
    if(code.type() == image.DATABAR):
        return "DATABAR"
    if(code.type() == image.DATABAR_EXP):
        return "DATABAR_EXP"
    if(code.type() == image.CODABAR):
        return "CODABAR"
    if(code.type() == image.CODE39):
        return "CODE39"
    if(code.type() == image.PDF417):
        return "PDF417"
    if(code.type() == image.CODE93):
        return "CODE93"
    if(code.type() == image.CODE128):
        return "CODE128"
    return "Unknown"

# Main recognition loop
while(True):
    clock.tick()
    img = sensor.snapshot()
    fps = clock.fps()
    
    # Find barcodes
    codes = img.find_barcodes()
    for code in codes:
        img.draw_rectangle(code.rect(), color=(0,255,0))  # mark with a green box
        print_args = (barcode_name(code), code.payload(), 
                     (180 * code.rotation()) / math.pi, 
                     code.quality(), fps)
        print("Barcode %s, Payload \"%s\", rotation %f (degrees), quality %d, FPS %f" % print_args)
    
    # Show the frame rate
    img.draw_string(0, 0, "%2.1ffps" % (fps), color=(0,60,128), scale=2.0)
    lcd.display(img)
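
The print statement above turns the barcode rotation from radians into degrees with (180 * rotation) / math.pi. Here is a small PC-side sketch of that conversion. The helper name rotation_degrees is made up for illustration, and the input value is a sample, not real sensor output.

```python
import math


def rotation_degrees(rotation_rad):
    """Convert a rotation in radians to degrees, as the print statement does."""
    return (180 * rotation_rad) / math.pi


r = math.pi / 2  # sample value: a quarter turn
print("rotation %f (degrees)" % rotation_degrees(r))
```

On a PC, math.degrees(r) gives the same result in one call.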

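2. Running steps

Copy the code into the IDE, save it as "barcode_detect.py", and run it on the board.
Find a product with a barcode (a drink bottle or a snack package, for example) and point the barcode at the board's camera.
Adjust the distance (10-20 cm works best) until a green box appears on the screen. The serial terminal then shows the barcode content (for example "6923456789012").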
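
If you want to sanity-check a 13-digit payload read this way, the standard EAN-13 check digit rule can be sketched in a few lines of plain Python. The helper name ean13_is_valid is made up for illustration and is not part of the OpenMV API. It works on the payload string only.

```python
def ean13_is_valid(payload):
    """Check the EAN-13 check digit of a 13-digit string."""
    if len(payload) != 13 or not payload.isdigit():
        return False
    digits = [int(c) for c in payload]
    # Among the first 12 digits, odd positions (index 0, 2, ...) weigh 1
    # and even positions (index 1, 3, ...) weigh 3.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


print(ean13_is_valid("6923456789012"))  # True: the sample payload above passes
print(ean13_is_valid("6923456789013"))  # False: wrong last digit
```

A failed check usually means a misread, so it is a cheap filter before you act on a scanned code.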

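3. Tips

The barcode needs to be flat. Creases or smudges can make recognition fail.
If recognition is slow (frame rate below 10 fps), change the frame size to sensor.QQVGA (160x120) to speed up processing.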
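
To see why the smaller frame size helps, compare the pixel counts. This is plain arithmetic written as a short sketch; FRAME_SIZES and pixel_count are names made up for illustration.

```python
FRAME_SIZES = {"QVGA": (320, 240), "QQVGA": (160, 120)}


def pixel_count(name):
    """Number of pixels in one frame of the named size."""
    w, h = FRAME_SIZES[name]
    return w * h


ratio = pixel_count("QVGA") / pixel_count("QQVGA")
print(pixel_count("QVGA"), pixel_count("QQVGA"), ratio)  # 76800 19200 4.0
```

QQVGA has a quarter of the pixels to scan per frame. The actual frame rate gain depends on the rest of the loop, and a smaller image also makes the bars thinner, so you may need to hold the barcode closer.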

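Example 3: QR code recognition, storing more than a barcode

A QR code can hold more content than a barcode (URLs, text, contact details). The code for this example is simpler than the barcode one, because OpenMV already wraps QR code recognition in a single call.

1. Core code (focus on the QR code recognition part)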

import sensor
import image
import time
import lcd

# Initialize the hardware
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=100)

clock = time.clock()

# Main recognition loop
while(True):
    clock.tick()
    img = sensor.snapshot()
    
    # Find and decode QR codes
    for code in img.find_qrcodes():
        img.draw_rectangle(code.rect(), color=127, thickness=3)  # mark the QR code with a gray box
        print(code)  # print the QR code information
    
    lcd.display(img)
    # print(clock.fps())  # optional frame rate output

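Fun experiment

Generate a QR code on your phone (for example with a QR code generator mini program found under WeChat "Discover - Mini Programs"), entering a piece of text or a URL.
Point the board's camera at the QR code on the phone. A gray box appears on the screen, and the serial terminal prints the QR code content (the text you entered, for example).
Try QR codes with different content, such as a phone number or a URL, and see whether the board reads them correctly.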
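
To go one step further with this experiment, you could sort the decoded text into rough categories. The sketch below is plain Python that runs on a PC. classify_payload is a hypothetical helper with deliberately simple rules, and it assumes you already have the decoded text as a string (on the board it would come from the code object, for example through its payload() method).

```python
def classify_payload(text):
    """Roughly label a decoded QR payload as 'url', 'phone', or 'text'."""
    stripped = text.strip()
    if stripped.lower().startswith(("http://", "https://")):
        return "url"
    # Drop a leading "+" and common separators before testing for digits.
    digits = stripped.lstrip("+").replace("-", "").replace(" ", "")
    if digits.isdigit() and 7 <= len(digits) <= 15:
        return "phone"
    return "text"


for sample in ("https://openmv.io", "138-0000-0000", "hello OpenMV"):
    print(sample, "->", classify_payload(sample))
```

The rules are only a starting point. Real QR codes may also carry formats such as contact cards or Wi-Fi settings, which this sketch labels as plain text.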

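Example 4: Face detection, letting the AI "see" faces

Face detection is a classic AI vision application (phone unlocking, camera beautification). This example uses the KPU (neural network processor) to run a pre-trained face detection model.

1. Preparation (this step matters)

Download the face detection model: yolo_face_detect.kmodel (from the official OpenMV model library).
Put the model file in the KPU/yolo_face_detect/ folder on the TF card, which the code loads as "/sd/KPU/yolo_face_detect/" (you need to create the folders by hand).
Insert the TF card into the TF card slot on the OpenMV board.

2. Core code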

import sensor
import image
import time
import lcd
from maix import KPU

# Initialize the hardware
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=100)
clock = time.clock()

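# Initialize the KPU parameters
od_img = image.Image(size=(320, 256))
# 9 anchors = 18 values (width/height pairs); 5.038 is assumed from the
# stock face-detection example, so check it against your model
anchor = (0.893, 1.463, 0.245, 0.389, 1.55, 2.58, 0.375, 0.594, 3.099, 5.038,
          0.057, 0.090, 0.567, 0.904, 0.101, 0.160, 0.159, 0.255)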

kpu = KPU()
kpu.load_kmodel("/sd/KPU/yolo_face_detect/yolo_face_detect.kmodel")
kpu.init_yolo2(anchor, anchor_num=9, img_w=320, img_h=240, 
               net_w=320, net_h=256, layer_w=10, layer_h=8, 
               threshold=0.7, nms_value=0.3, classes=1)

# Main detection loop; try/finally releases the KPU when the script stops
try:
    while True:
        clock.tick()
        img = sensor.snapshot()

        # Prepare the neural-network input image
        a = od_img.draw_image(img, 0, 0)
        od_img.pix_to_ai()

        # Run inference on the KPU
        kpu.run_with_output(od_img)
        dect = kpu.regionlayer_yolo2()
        fps = clock.fps()

        # Draw the detection results
        if len(dect) > 0:
            print("dect:", dect)
            for l in dect:
                a = img.draw_rectangle(l[0], l[1], l[2], l[3], color=(0, 255, 0))

        # Show the frame rate and refresh the screen
        a = img.draw_string(0, 0, "%2.1ffps" % (fps), color=(0, 60, 128), scale=2.0)
        lcd.display(img)
finally:
    # Release KPU resources
    kpu.deinit()

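3. Running steps

Make sure the TF card is inserted and the model file path is correct.
Copy the code into the IDE and run it on the board.
Point the board at a face (30-50 cm away). A green box appears on the screen around the face, and the serial terminal prints the face's position coordinates.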
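
One thing worth checking before you run: YOLOv2 anchors are width/height pairs, so anchor_num=9 needs 18 numbers in the anchor tuple. The sketch below is a PC-side sanity check in plain Python. check_anchors is a hypothetical helper and the sample tuples are made up, so paste in your own tuple to test it.

```python
def check_anchors(anchor, anchor_num):
    """Return the anchors as (w, h) pairs, or raise if the count is off."""
    if len(anchor) != 2 * anchor_num:
        raise ValueError("expected %d values for %d anchors, got %d"
                         % (2 * anchor_num, anchor_num, len(anchor)))
    return [(anchor[i], anchor[i + 1]) for i in range(0, len(anchor), 2)]


print(check_anchors((1.0, 2.0, 3.0, 4.0), anchor_num=2))  # [(1.0, 2.0), (3.0, 4.0)]

try:
    check_anchors((1.0, 2.0, 3.0), anchor_num=2)  # one value missing
except ValueError as err:
    print("bad anchors:", err)
```

A value lost while copying code from a web page is easy to miss by eye, and a count check like this catches it before the model is loaded.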

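4. Common problems

Problem 1: An error says the model file cannot be found?
- Fix: check that the TF card is firmly inserted and that the model file path is correct (it must be "/sd/KPU/yolo_face_detect/yolo_face_detect.kmodel").
Problem 2: No face is detected?
- Fix: make sure the lighting is sufficient, the face is facing the camera, and the distance is neither too close nor too far. You can also lower the threshold parameter (to 0.5, for example) to make detection more sensitive.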
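
To get a feel for what threshold and nms_value do, here is a generic sketch of score filtering followed by non-maximum suppression in plain Python. It illustrates the general technique and is not the KPU's internal implementation. The (x, y, w, h, score) tuple layout and the sample detections are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def filter_detections(dets, threshold, nms_value):
    """dets: list of (x, y, w, h, score). Keep confident, non-duplicate boxes."""
    kept = []
    candidates = sorted((d for d in dets if d[4] >= threshold),
                        key=lambda d: d[4], reverse=True)
    for d in candidates:
        # Drop a box if it overlaps an already kept box too much.
        if all(iou(d[:4], k[:4]) <= nms_value for k in kept):
            kept.append(d)
    return kept


dets = [
    (100, 80, 60, 60, 0.92),   # a face
    (104, 82, 60, 60, 0.75),   # near-duplicate box on the same face
    (220, 90, 50, 50, 0.55),   # a second, weaker face
]
print(len(filter_detections(dets, threshold=0.7, nms_value=0.3)))  # 1
print(len(filter_detections(dets, threshold=0.5, nms_value=0.3)))  # 2
```

Lowering the threshold from 0.7 to 0.5 lets the weaker second face through, while the near-duplicate box is still removed because it overlaps the best box by more than nms_value. The trade-off is that a lower threshold also lets more false detections through.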

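Example 5: 68-point facial landmark detection, letting the AI "read" facial details

68-point facial landmark detection builds on face detection and goes on to locate the eyes, nose, mouth, and eyebrows. It is the foundation for beautification and expression recognition.

1. Preparation

Download two models:
1. Face detection model: face_detect_320x240.kmodel (finds the face first).
2. 68-point landmark model: landmark68.kmodel (locates the landmarks).
Put each model where the code loads it from: face_detect_320x240.kmodel in "/sd/KPU/yolo_face_detect/" and landmark68.kmodel in "/sd/KPU/face_detect_with_68landmark/" on the TF card.

2. Core code walkthrough (drawing the landmarks is the key part)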

import sensor
import image
import time
import lcd
from maix import KPU

# Initialize the hardware
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=100)
clock = time.clock()

# Initialize the face detection KPU parameters
anchor = (0.1075, 0.126875, 0.126875, 0.175, 0.1465625, 0.2246875, 
          0.1953125, 0.25375, 0.2440625, 0.351875, 0.341875, 0.4721875, 
          0.5078125, 0.6696875, 0.8984375, 1.099687, 2.129062, 2.425937)

kpu = KPU()
kpu.load_kmodel("/sd/KPU/yolo_face_detect/face_detect_320x240.kmodel")
kpu.init_yolo2(anchor, anchor_num=9, img_w=320, img_h=240, net_w=320, net_h=240,
               layer_w=10, layer_h=8, threshold=0.7, nms_value=0.2, classes=1)

# Initialize the 68-point landmark model
lm68_kpu = KPU()
print("ready load model")
lm68_kpu.load_kmodel("/sd/KPU/face_detect_with_68landmark/landmark68.kmodel")

# Expand the face box by `scale` on each side and clamp it to the 320x240 frame
def extend_box(x, y, w, h, scale):
    x1_t = x - scale*w
    x2_t = x + w + scale*w
    y1_t = y - scale*h
    y2_t = y + h + scale*h
    x1 = int(x1_t) if x1_t > 1 else 1
    x2 = int(x2_t) if x2_t < 320 else 319
    y1 = int(y1_t) if y1_t > 1 else 1
    y2 = int(y2_t) if y2_t < 240 else 239
    cut_img_w = x2 - x1 + 1
    cut_img_h = y2 - y1 + 1
    return x1, y1, cut_img_w, cut_img_h

# Main detection loop
while True:
    clock.tick()
    img = sensor.snapshot()
    
    # Face detection
    kpu.run_with_output(img)
    dect = kpu.regionlayer_yolo2()
    fps = clock.fps()
    
    if len(dect) > 0:
        print("dect:", dect)
        for l in dect:
            # Expand the face region
            x1, y1, cut_img_w, cut_img_h = extend_box(l[0], l[1], l[2], l[3], scale=0.08)
            face_cut = img.cut(x1, y1, cut_img_w, cut_img_h)
            
            # Draw the face box
            a = img.draw_rectangle(l[0], l[1], l[2], l[3], color=(0, 255, 0))
            
            # Prepare the 128x128 input for landmark detection
            face_cut_128 = face_cut.resize(128, 128)
            face_cut_128.pix_to_ai()
            
            # Detect the 68 landmarks
            out = lm68_kpu.run_with_output(face_cut_128, getlist=True)
            
            # Map each landmark back to frame coordinates and draw it
            for j in range(68):
                x = int(KPU.sigmoid(out[2*j]) * cut_img_w + x1)
                y = int(KPU.sigmoid(out[2*j+1]) * cut_img_h + y1)
                a = img.draw_circle(x, y, 2, color=(0, 0, 255), fill=True)
            
            # Free memory
            del(face_cut_128)
            del(face_cut)
    
    # Show the frame rate and refresh the screen
    a = img.draw_string(0, 0, "%2.1ffps" % (fps), color=(0, 60, 255), scale=2.0)
    lcd.display(img)
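
The box expansion and the landmark mapping in the code above can be checked on a PC without the board. The sketch below is plain Python under a few assumptions: sigmoid stands in for KPU.sigmoid, extend_box uses min/max clamping that gives the same results as the conditional version above, landmark_to_frame is a helper name made up for illustration, and the sample box and the two-value model output are invented.

```python
import math


def sigmoid(v):
    """Squash a raw model output into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-v))


def extend_box(x, y, w, h, scale, frame_w=320, frame_h=240):
    """Grow a box by `scale` on each side and clamp it to the frame."""
    x1 = max(1, int(x - scale * w))
    y1 = max(1, int(y - scale * h))
    x2 = min(frame_w - 1, int(x + w + scale * w))
    y2 = min(frame_h - 1, int(y + h + scale * h))
    return x1, y1, x2 - x1 + 1, y2 - y1 + 1


def landmark_to_frame(out, j, box):
    """Map the j-th (x, y) output pair from crop-relative to frame pixels."""
    x1, y1, w, h = box
    px = int(sigmoid(out[2 * j]) * w + x1)
    py = int(sigmoid(out[2 * j + 1]) * h + y1)
    return px, py


box = extend_box(100, 60, 80, 90, scale=0.08)
print(box)                                     # (93, 52, 94, 106)
print(landmark_to_frame([0.0, 0.0], 0, box))   # (140, 105): the center of the crop
print(extend_box(0, 0, 320, 240, scale=0.08))  # (1, 1, 319, 239): clamped to the frame
```

A raw output of 0 maps to 0.5 after the sigmoid, which lands in the middle of the expanded face box. This is why the landmark coordinates must be scaled by the crop size and shifted by its top-left corner before drawing.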