Installing Docker
We need both Docker and docker-compose, so we install them in that order.

Install Docker
If you have installed Docker before, remove the old packages first:

yum remove docker docker-common docker-selinux docker-engine

Install the dependencies:

yum install -y yum-utils device-mapper-persistent-data lvm2

Download docker-ce.repo:

wget -O /etc/yum.repos.d/docker-ce.repo https://download.docker.com/linux/centos/docker-ce.repo

Point the repository URL at the TUNA mirror:

sed -i 's+download.docker.com+mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo
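To preview what that substitution does before touching the repo file, the same sed expression can be run on a sample line without -i. This is only a sketch: the function name rewrite_to_tuna and the sample baseurl line are made up for illustration, and the real docker-ce.repo may contain different lines.

```shell
#!/usr/bin/env bash
# Apply the same substitution as the sed command above to each line of stdin.
# Without -i nothing on disk is modified. "+" is the sed delimiter, which
# avoids having to escape the "/" characters in the URL.
rewrite_to_tuna() {
  sed 's+download.docker.com+mirrors.tuna.tsinghua.edu.cn/docker-ce+'
}

# Illustrative sample line, not copied from the real repo file.
echo 'baseurl=https://download.docker.com/linux/centos/7/x86_64/stable' | rewrite_to_tuna
```

After running the real command, grep baseurl /etc/yum.repos.d/docker-ce.repo should show the TUNA host in place of download.docker.com.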

Finally, install Docker (by default this installs the latest version):

yum makecache fast
yum install docker-ce

Start Docker, enable it at boot, and check the version:

systemctl start docker
systemctl enable docker
docker --version

Install docker-compose
Run the following commands in order:

curl -L "https://github.com/docker/compose/releases/download/1.26.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose

docker-compose --version
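The curl command above builds its download URL from the release version plus the output of uname -s and uname -m. A small sketch of that construction follows, in case you want to check the URL before downloading. The function name compose_download_url is hypothetical, and whether a binary exists for a given version and platform is not checked here.

```shell
#!/usr/bin/env bash
# Print the release URL used by the curl command above.
# Arguments: VERSION [OS] [ARCH]; OS and ARCH default to this machine's
# uname -s and uname -m, exactly as in the original command.
compose_download_url() {
  local version="$1"
  local os="${2:-$(uname -s)}"
  local arch="${3:-$(uname -m)}"
  printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-%s-%s\n' \
    "$version" "$os" "$arch"
}

# Show the URL for the version used in this article on the current machine.
compose_download_url 1.26.2
```

Keeping the version as a parameter makes it easy to switch releases later without editing the URL by hand.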

docker-hive

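Link: https://github.com/big-data-europe/docker-hive
Deploy it by following the README.md in that repository.

Note: every docker-compose command in this article is run from the docker-hive directory.
If you are not using the repository's own file, you can create docker-compose.yml yourself (it references ./hadoop-hive.env, which ships with the repository):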

version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
    volumes:
      - namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop-hive.env
    ports:
      - "50070:50070"
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
    volumes:
      - datanode:/hadoop/dfs/data
    env_file:
      - ./hadoop-hive.env
    environment:
      SERVICE_PRECONDITION: "namenode:50070"
    ports:
      - "50075:50075"
  hive-server:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - ./hadoop-hive.env
    environment:
      HIVE_CORE_CONF_javax_jdo_option_ConnectionURL: "jdbc:postgresql://hive-metastore/metastore"
      SERVICE_PRECONDITION: "hive-metastore:9083"
    ports:
      - "10000:10000"
  hive-metastore:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - ./hadoop-hive.env
    command: /opt/hive/bin/hive --service metastore
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432"
    ports:
      - "9083:9083"
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql:2.3.0
  presto-coordinator:
    image: shawnzhu/prestodb:0.181
    ports:
      - "8080:8080"

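volumes:
  namenode:
  datanode:

Change into the docker-hive directory:

cd docker-hive

The next command starts Hive in the background, with PostgreSQL as the metastore database.
It takes a while, so be patient.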

docker-compose up -d
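Part of the startup delay appears to come from the SERVICE_PRECONDITION entries in docker-compose.yml: as far as I can tell, each container waits until the listed host:port pairs are reachable before starting its own service. The sketch below shows the same idea from the host side. It is not the images' actual entrypoint code; wait_for_port is a hypothetical helper, it relies on bash's /dev/tcp feature, and the retry defaults are arbitrary.

```shell
#!/usr/bin/env bash
# wait_for_port HOST PORT [RETRIES] [DELAY_SECONDS]
# Returns 0 as soon as a TCP connection to HOST:PORT succeeds,
# or 1 if every attempt fails.
wait_for_port() {
  local host="$1" port="$2" retries="${3:-30}" delay="${4:-5}"
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # bash-only: opening /dev/tcp/HOST/PORT attempts a TCP connection.
    # The subshell closes the descriptor again when it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Example (commented out so that sourcing this file has no side effects):
# wait_for_port localhost 10000 && echo "hive-server is accepting connections"
```

Port 10000 is the HiveServer2 port that docker-compose.yml maps to the host, so waiting on it is a reasonable signal that beeline can connect.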

Once the command above finishes, run docker-compose ps to list the running containers:

docker-compose ps

If namenode, datanode, hive-server, and the other services are listed with state Up, the deployment succeeded.
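Checking the list by eye works, but the comparison can also be scripted. The helper below is a sketch: missing_services is a hypothetical name, and it only compares names, reading one service name per line from stdin and reporting any expected name that is absent.

```shell
#!/usr/bin/env bash
# missing_services EXPECTED...
# Reads actual service names from stdin (one per line) and prints every
# expected name that is not present. Returns 0 only if nothing is missing.
missing_services() {
  local actual name missing=0
  actual="$(cat)"
  for name in "$@"; do
    # -x: match the whole line, -F: treat the name as a fixed string.
    if ! printf '%s\n' "$actual" | grep -qxF -- "$name"; then
      printf '%s\n' "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Demo with made-up input: hive-metastore is reported as missing.
printf 'namenode\ndatanode\nhive-server\n' \
  | missing_services namenode datanode hive-server hive-metastore \
  || echo "some services are missing"
```

Assuming your docker-compose version supports ps --services together with --filter status=running, its output can be piped straight into missing_services.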

Using the Hive command line
Follow these steps in order.
Open a bash shell in the hive-server container:

docker-compose exec hive-server bash

Connect with the beeline client:

/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

Run some SQL. These two statements work as-is, because the image ships with the example file:

CREATE TABLE pokes (foo INT, bar STRING);
LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;

Query the table:

select * from pokes;
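The interactive steps above can also be collapsed into a single non-interactive command, which is handy in scripts. This is a sketch under some assumptions: run_hive_sql and the DRY_RUN variable are hypothetical, it relies on docker-compose exec -T (no pseudo-TTY) and beeline's -e option being available in your versions, and it must be run from the docker-hive directory like the other commands.

```shell
#!/usr/bin/env bash
# run_hive_sql SQL
# Runs one SQL statement through beeline inside the hive-server container.
# With DRY_RUN=1 the command is printed instead of executed.
run_hive_sql() {
  local sql="$1"
  # Reuse the positional parameters to hold the full command line.
  set -- docker-compose exec -T hive-server \
    /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000 -e "$sql"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Printed for inspection only; the quotes around the SQL are not shown.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Print the command that would be run, without needing a running cluster.
DRY_RUN=1 run_hive_sql 'SELECT * FROM pokes LIMIT 5;'
```

Dropping DRY_RUN=1 executes the query against the running hive-server container.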