1 Data Download

1.1 Scroll further down in the official Colab to find the path for each dataset:

Dataset project link
Official Colab

gs://gresearch/robotics/fractal20220817_data/0.1.0 has size 111.07 GiB
gs://gresearch/robotics/kuka/0.1.0 has size 778.02 GiB
gs://gresearch/robotics/bridge/0.1.0 has size 387.49 GiB
gs://gresearch/robotics/taco_play/0.1.0 has size 47.77 GiB
gs://gresearch/robotics/jaco_play/0.1.0 has size 9.24 GiB
gs://gresearch/robotics/berkeley_cable_routing/0.1.0 has size 4.67 GiB
gs://gresearch/robotics/roboturk/0.1.0 has size 45.39 GiB
gs://gresearch/robotics/nyu_door_opening_surprising_effectiveness/0.1.0 has size 7.12 GiB
gs://gresearch/robotics/viola/0.1.0 has size 10.40 GiB
gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0 has size 76.39 GiB
gs://gresearch/robotics/toto/0.1.0 has size 127.66 GiB
gs://gresearch/robotics/language_table/0.0.1 has size 399.23 GiB
gs://gresearch/robotics/columbia_cairlab_pusht_real/0.1.0 has size 2.80 GiB
gs://gresearch/robotics/stanford_kuka_multimodal_dataset_converted_externally_to_rlds/0.1.0 has size 31.98 GiB
gs://gresearch/robotics/nyu_rot_dataset_converted_externally_to_rlds/0.1.0 has size 5.33 MiB
gs://gresearch/robotics/stanford_hydra_dataset_converted_externally_to_rlds/0.1.0 has size 72.48 GiB
gs://gresearch/robotics/austin_buds_dataset_converted_externally_to_rlds/0.1.0 has size 1.49 GiB
gs://gresearch/robotics/nyu_franka_play_dataset_converted_externally_to_rlds/0.1.0 has size 5.18 GiB
gs://gresearch/robotics/maniskill_dataset_converted_externally_to_rlds/0.1.0 has size 151.05 GiB
gs://gresearch/robotics/cmu_franka_exploration_dataset_converted_externally_to_rlds/0.1.0 has size 602.24 MiB
gs://gresearch/robotics/ucsd_kitchen_dataset_converted_externally_to_rlds/0.1.0 has size 1.33 GiB
gs://gresearch/robotics/ucsd_pick_and_place_dataset_converted_externally_to_rlds/0.1.0 has size 3.53 GiB
gs://gresearch/robotics/austin_sailor_dataset_converted_externally_to_rlds/0.1.0 has size 18.85 GiB
gs://gresearch/robotics/austin_sirius_dataset_converted_externally_to_rlds/0.1.0 has size 6.55 GiB
gs://gresearch/robotics/bc_z/0.1.0 has size 80.54 GiB
gs://gresearch/robotics/usc_cloth_sim_converted_externally_to_rlds/0.1.0 has size 254.52 MiB
gs://gresearch/robotics/utokyo_pr2_opening_fridge_converted_externally_to_rlds/0.1.0 has size 360.57 MiB
gs://gresearch/robotics/utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds/0.1.0 has size 829.37 MiB
gs://gresearch/robotics/utokyo_saytap_converted_externally_to_rlds/0.1.0 has size 55.34 MiB
gs://gresearch/robotics/utokyo_xarm_pick_and_place_converted_externally_to_rlds/0.1.0 has size 1.29 GiB
gs://gresearch/robotics/utokyo_xarm_bimanual_converted_externally_to_rlds/0.1.0 has size 138.44 MiB
gs://gresearch/robotics/robo_net/1.0.0 has size 799.91 GiB
gs://gresearch/robotics/berkeley_mvp_converted_externally_to_rlds/0.1.0 has size 12.34 GiB
gs://gresearch/robotics/berkeley_rpt_converted_externally_to_rlds/0.1.0 has size 40.64 GiB
gs://gresearch/robotics/kaist_nonprehensile_converted_externally_to_rlds/0.1.0 has size 11.71 GiB
gs://gresearch/robotics/stanford_mask_vit_converted_externally_to_rlds/0.1.0 has size 76.17 GiB
gs://gresearch/robotics/tokyo_u_lsmo_converted_externally_to_rlds/0.1.0 has size 335.71 MiB
gs://gresearch/robotics/dlr_sara_pour_converted_externally_to_rlds/0.1.0 has size 2.92 GiB
gs://gresearch/robotics/dlr_sara_grid_clamp_converted_externally_to_rlds/0.1.0 has size 1.65 GiB
gs://gresearch/robotics/dlr_edan_shared_control_converted_externally_to_rlds/0.1.0 has size 3.09 GiB
gs://gresearch/robotics/asu_table_top_converted_externally_to_rlds/0.1.0 has size 737.60 MiB
gs://gresearch/robotics/stanford_robocook_converted_externally_to_rlds/0.1.0 has size 124.62 GiB
gs://gresearch/robotics/eth_agent_affordances/0.1.0 has size 17.27 GiB
gs://gresearch/robotics/imperialcollege_sawyer_wrist_cam/0.1.0 has size 81.87 MiB
gs://gresearch/robotics/iamlab_cmu_pickup_insert_converted_externally_to_rlds/0.1.0 has size 50.29 GiB
gs://gresearch/robotics/uiuc_d3field/0.1.0 has size 15.82 GiB
gs://gresearch/robotics/utaustin_mutex/0.1.0 has size 20.79 GiB
gs://gresearch/robotics/berkeley_fanuc_manipulation/0.1.0 has size 8.85 GiB
gs://gresearch/robotics/cmu_play_fusion/0.1.0 has size 6.68 GiB
gs://gresearch/robotics/cmu_stretch/0.1.0 has size 728.06 MiB
gs://gresearch/robotics/berkeley_gnm_recon/0.1.0 has size 18.73 GiB
gs://gresearch/robotics/berkeley_gnm_cory_hall/0.1.0 has size 1.39 GiB
gs://gresearch/robotics/berkeley_gnm_sac_son/0.1.0 has size 7.00 GiB

1.2 Downloading the Data on a Server

Install the gsutil package:

pip install gsutil

Download a dataset:

gsutil -m cp -r gs://gresearch/robotics/fractal20220817_data/0.1.0[replace with the dataset path you need from the list above] /path/to/your/dir

Or:

/snap/bin/gsutil -m cp -r gs://gresearch/robotics/fractal20220817_data/0.1.0[replace with the dataset path you need from the list above] /path/to/your/dir

If a dataset cannot be found, first check whether it exists:

/snap/bin/gsutil ls gs://gresearch/robotics/jaco_play[replace with the dataset's Registered Dataset Name]

The Registered Dataset Name for each dataset can be found in the dataset information table.
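The download commands above all follow the same pattern: a fixed `gs://gresearch/robotics/` prefix, a dataset name, and a version. As a small sketch, the commands for several datasets can be generated programmatically (the dataset names and versions below come from the table in 1.1; `/path/to/your/dir` is the same placeholder used above):

```python
# Sketch: build gsutil download commands for a few of the datasets listed in 1.1.
DATASETS = {
    "fractal20220817_data": "0.1.0",
    "bridge": "0.1.0",
    "language_table": "0.0.1",  # note: language_table uses version 0.0.1, not 0.1.0
}

def gsutil_cmd(name, version, target_dir="/path/to/your/dir"):
    """Return the gsutil command that mirrors one dataset into target_dir."""
    src = f"gs://gresearch/robotics/{name}/{version}"
    return f"gsutil -m cp -r {src} {target_dir}"

for name, version in DATASETS.items():
    print(gsutil_cmd(name, version))
```

Printing the commands first (rather than running them directly) is a cheap dry run before committing to a multi-hundred-GiB transfer.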

2 Data Visualization

2.1 Environment

pip install tensorflow-datasets==4.9.3

2.2 Code

import numpy as np
import tensorflow_datasets as tfds
from PIL import Image
import os

datasets_name = "bridge"

b = tfds.builder_from_directory(f"/path/to/Open_X_Embodiment_Datasets/{datasets_name}")

ds = b.as_dataset(split='train')  # adjust the split as needed, e.g. 'train[:10]'

output_dir = f'{datasets_name}_videos'
os.makedirs(output_dir, exist_ok=True)


instructions_file_path = os.path.join(output_dir, f'{datasets_name}.txt')
state_file_path = os.path.join(output_dir, f'state.txt')
# Iterate over the dataset
for idx, episode in enumerate(ds):
    # Create a folder for this video
    video_folder = os.path.join(output_dir, f'video_{idx}')
    os.makedirs(video_folder, exist_ok=True)

    # Extract all steps (frames) of this episode
    frames = episode['steps']

    # Iterate over the frames and save each one
    state_list = []
    for frame_idx, step in enumerate(frames):
        # The image feature name differs across datasets; check the corresponding
        # field in the dataset's features.json file after downloading
        image = step['observation']['image']  # fractal20220817_data
        # image = step['observation']["agentview_rgb"]  # viola
        # image = step['observation']["image"]  # bridge

        # Get the natural-language instruction; again, check the corresponding
        # field in the dataset's features.json file
        # natural_language_instruction = step["language_instruction"].numpy().decode('utf-8')  # for ucsd, berkeley_fanuc_manipulation
        natural_language_instruction = step['observation']["natural_language_instruction"].numpy().decode('utf-8')

        state_list.append(step['observation']["state"])

        # Convert the image tensor to a PIL image
        image_pil = Image.fromarray(image.numpy())

        # Save the image as frame_{frame_idx}.png
        output_path = os.path.join(video_folder, f'frame_{frame_idx}.png')
        image_pil.save(output_path)

    with open(state_file_path, 'a') as f:
        f.write(f"state {idx}: {state_list}\n")

    with open(instructions_file_path, 'a') as f:
        f.write(f"Video {idx} Instruction: {natural_language_instruction}\n")

    print(f"All frames of video {idx} saved to: {video_folder} ({frame_idx + 1} frames)")

print("Frame extraction finished for all videos.")

  • The code is largely adapted from the dataset's official Colab
  • For a different dataset, adjust the field names in the code according to its features.json file. For example, in the features.json of fractal20220817_data the image field is image, so the code reads: image = step['observation']['image']
  • The natural-language-instruction field is natural_language_instruction, so the code reads: natural_language_instruction = step['observation']["natural_language_instruction"].numpy().decode('utf-8')
  • The same applies to extracting any other information: adjust the code above according to the fields in features.json
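A quick way to see which field names a dataset exposes is to load its features.json with the standard json module and walk the nested structure. The sketch below uses a generic recursive walker; the toy dict is only an illustration shaped like an observation hierarchy, not the exact features.json schema (the real file nests the same key names under TFDS-specific wrapper keys):

```python
import json  # in practice: node = json.load(open(".../features.json"))

def feature_paths(node, prefix=""):
    """Recursively collect the key paths in a nested dict,
    e.g. the parsed contents of a dataset's features.json."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}" if prefix else key
            paths.append(path)
            paths.extend(feature_paths(value, path))
    return paths

# Toy structure for illustration only:
toy = {"steps": {"observation": {"image": {}, "natural_language_instruction": {}}}}
for p in feature_paths(toy):
    print(p)  # steps, steps/observation, steps/observation/image, ...
```

Skimming these paths for names like image, rgb, or instruction is usually enough to find the fields to plug into the visualization code above.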

2.3 Visualization Results

After the code finishes, the image frames are saved:

  • along with the corresponding instructions, written to a .txt file:
  • Finally, to convert the frames into a video, you can use the following code:
import cv2
import os
import re


def natural_sort_key(s):
    # Split the string into text and digit runs, converting digit runs to integers
    return [int(text) if text.isdigit() else text for text in re.split('([0-9]+)', s)]


def images_to_video(image_folder, video_name, fps=10):
    # Collect the image files in the folder, sorted in natural order
    filenames = [img for img in os.listdir(image_folder)
                 if img.endswith((".jpg", ".png"))]
    filenames.sort(key=natural_sort_key)

    # Determine the video size from the first image
    first_image = cv2.imread(os.path.join(image_folder, filenames[0]))
    height, width, layers = first_image.shape

    # Create the video writer
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # other codecs can be used here
    video = cv2.VideoWriter(video_name, fourcc, fps, (width, height))

    for image in filenames:
        img_path = os.path.join(image_folder, image)
        frame = cv2.imread(img_path)
        video.write(frame)  # write each frame to the video

    video.release()  # release the video writer
    cv2.destroyAllWindows()  # close any OpenCV windows
    print(f"Video '{video_name}' created successfully!")

# Example usage
image_folder = "/path/to/Open_X_Embodiment_Datasets/RT-1_Robot_Action_videos/video_0"  # replace with the image folder path
video_name = "/path/to/output/RT-1_videos_277.mp4"  # output video file name
images_to_video(image_folder, video_name, fps=5)
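The natural_sort_key helper matters because a plain lexicographic sort puts frame_10.png before frame_2.png, which would scramble the video. A quick self-contained check of its behavior (the function is copied from the script above):

```python
import re

# Same natural-sort key as in the script above: split the filename into
# text and digit runs, converting the digit runs to integers.
def natural_sort_key(s):
    return [int(text) if text.isdigit() else text for text in re.split('([0-9]+)', s)]

frames = ["frame_10.png", "frame_2.png", "frame_1.png"]
print(sorted(frames))                        # lexicographic: frame_1, frame_10, frame_2
print(sorted(frames, key=natural_sort_key))  # natural: frame_1, frame_2, frame_10
```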
