[Learn Wan text-to-video in 1 minute] Deploying Wan2.1-T2V-1.3B on Windows with a 4080 GPU: a hands-on walkthrough



繁华落尽，寻一世真情 · Published 2025-02-26 23:41:15

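The open-source wave has reached video generation: Alibaba has open-sourced Wan (万相), the model that topped the VBench leaderboard.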

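1. Clone the source code from GitHub, or download the zip and extract it
2. Download the model
3. Run inference with the official command
4. Enter a prompt and generate the video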

Without further ado, the title says it all:

1. Clone the source code from GitHub, or download the zip and extract it:

git clone https://github.com/Wan-Video/Wan2.1

2. Download the model

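from modelscope import snapshot_download

# Model ID on ModelScope
model_name = "Wan-AI/Wan2.1-T2V-1.3B"

# Download location (a relative path here; an absolute path also works)
custom_path = "Wan2.1-T2V-1.3B"

# Download the model
model_dir = snapshot_download(
    model_name,  # model ID
    cache_dir=custom_path,  # where to store the download
    revision="master",  # optional: model revision (defaults to master)
)

# The files may land in a subdirectory of custom_path; pass the printed path to --ckpt_dir
print(f"Model downloaded to: {model_dir}")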

3. Run inference with the official command

python generate.py  --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
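If you would rather launch the run from a script, the same command can be passed to Python as an argument list, which sidesteps shell quoting of the * in the size and the quotes around the prompt. This is only a sketch: the helper names build_generate_command and run_generate are made up for illustration, and every flag is copied from the command above.

```python
import subprocess
import sys


def build_generate_command(prompt, ckpt_dir="./Wan2.1-T2V-1.3B", size="832*480"):
    """Return the generate.py invocation above as an argument list (hypothetical helper)."""
    return [
        sys.executable, "generate.py",
        "--task", "t2v-1.3B",
        "--size", size,
        "--ckpt_dir", ckpt_dir,
        "--offload_model", "True",
        "--t5_cpu",
        "--sample_shift", "8",
        "--sample_guide_scale", "6",
        "--prompt", prompt,
    ]


def run_generate(cmd, repo_dir="Wan2.1"):
    """Run the command from the cloned repository directory (assumed to be ./Wan2.1)."""
    return subprocess.run(cmd, cwd=repo_dir, check=True)


cmd = build_generate_command(
    "Two anthropomorphic cats in comfy boxing gear and bright gloves "
    "fight intensely on a spotlighted stage."
)
print(cmd)
# run_generate(cmd)  # uncomment once the repository and the model are in place
```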

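I made three changes:
1. The --t5_cpu flag runs the text encoder on the CPU, but text encoding on the CPU is slow. Instead, I run the text encoder on the GPU first (that is, without --t5_cpu) and then call torch.cuda.empty_cache() in the source to release the cached memory, which is much faster.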
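The pattern behind this change can be sketched in plain PyTorch. This is not the Wan2.1 source: encode_then_release is a hypothetical helper and the small linear layer is only a stand-in for the T5 text encoder. It shows the order of operations: encode on the GPU, move the encoder back to the CPU, then release the cached memory.

```python
import torch


def encode_then_release(encoder, inputs, device):
    """Run `encoder` on `device`, then move it to the CPU and free cached GPU memory."""
    encoder.to(device)
    with torch.no_grad():
        outputs = encoder(inputs.to(device))
    encoder.to("cpu")  # the encoder weights no longer need to occupy VRAM
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached, now-unused blocks to the driver
    return outputs


# Stand-in for the real text encoder (hypothetical; Wan2.1 uses a T5 model).
toy_encoder = torch.nn.Linear(16, 8)
device = "cuda" if torch.cuda.is_available() else "cpu"
embeddings = encode_then_release(toy_encoder, torch.randn(1, 16), device)
print(embeddings.shape, embeddings.device)
```

Note that torch.cuda.empty_cache() only releases memory that is no longer referenced, so the encoder has to be moved off the GPU (or deleted) before the call has any effect.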
2. Many GPUs, such as the V100 or 2080 Ti, do not support flash_attention. On those machines, simply replace the flash_attention import in the source file model.py with from .attention import attention as flash_attention:

# Original source
# from .attention import flash_attention
# Replace with
from .attention import attention as flash_attention
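Before patching, it can help to check whether the flash-attn package is installed at all. The sketch below uses only the standard library and assumes the package's import name is flash_attn. It is a quick pre-check, not a guarantee: an installed package can still fail on a GPU that it does not support.

```python
import importlib.util


def flash_attn_installed() -> bool:
    """Return True if a top-level module named flash_attn can be located."""
    return importlib.util.find_spec("flash_attn") is not None


if flash_attn_installed():
    print("flash_attn found; the original import may work on a supported GPU.")
else:
    print("flash_attn not found; use the fallback import shown above.")
```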

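3. This one was unexpected: Windows does not allow the * character in file names, and the default output file name includes the --size value (such as 832*480), so the unmodified source fails to save the mp4. The * therefore has to be replaced, for example with X, in generate.py: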

args.save_file = args.save_file.replace("*", "X")
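The one-line fix above handles only the asterisk. A slightly more general sketch is shown below, in case the prompt-derived part of the name ever contains other characters that Windows forbids. sanitize_filename is a hypothetical helper, not part of generate.py, and it is meant for a bare file name rather than a full path, because it also replaces path separators and the colon of a drive letter.

```python
import re

# Characters that Windows does not allow in file names.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def sanitize_filename(name: str, replacement: str = "X") -> str:
    """Replace every Windows-forbidden character in a bare file name."""
    return _WINDOWS_FORBIDDEN.sub(replacement, name)


print(sanitize_filename("t2v-1.3B_832*480_cats.mp4"))  # t2v-1.3B_832X480_cats.mp4
```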

4. Enter a prompt and generate the video

prompt:

In a realistic close-up shot with smooth camera movement, a charming woman is seen outdoors on a grassy lawn. She is wearing a white shirt paired with a white jacket, and she adorns a necklace and earrings, adding elegance to her appearance. The woman is gracefully walking around an area enclosed by a wooden fence, moving in a gentle arc as she walks past the fence. The background features a lush green lawn and tent-like structures, creating a serene and refreshing atmosphere. The lighting is ample, highlighting the natural beauty of the scene.

Resulting video: [video omitted]
