Hadoop 2.7 in Action v1.0: Hive 2.0.0 + MySQL Remote-Mode Installation

Environment: Apache Hadoop 2.7 distributed cluster (HDFS HA, YARN HA, HBase HA)
The MySQL metadata database is deployed on hadoop-01
user: hive
password: hive
database: hive_remote_meta
The Hive server is deployed on hadoop-01
The Hive client is deployed on hadoop-02

1. Install MySQL 5.6.23 on hadoop-01
2. Create the database and user
hadoop-01:mysqladmin:/usr/local/mysql:>mysql -uroot -p
mysql> create database hive_remote_meta;
Query OK, 1 row affected (0.04 sec)

mysql> create user 'hive' identified by 'hive';
Query OK, 0 rows affected (0.05 sec)

mysql> grant all privileges on hive_remote_meta.* to 'hive'@'%';
Query OK, 0 rows affected (0.03 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
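The `%` in `'hive'@'%'` is MySQL's host wildcard, so the grant allows the hive user to connect from any machine, including the client on hadoop-02. Its effect can be illustrated with a pattern match (an illustrative sketch only; MySQL performs its own host matching server-side):

```python
# Illustrate MySQL's host wildcard: '%' in 'hive'@'%' matches any client host.
from fnmatch import fnmatch

def host_matches(pattern, host):
    # MySQL uses '%' (any run) and '_' (single char);
    # translate them to fnmatch's '*' and '?' for this sketch.
    return fnmatch(host, pattern.replace("%", "*").replace("_", "?"))

print(host_matches("%", "hadoop-02"))          # True
print(host_matches("hadoop-0_", "hadoop-02"))  # True
```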

3. Install hive-2.0.0
[root@hadoop-01 tmp]# wget http://apache.communilink.net/hive/hive-2.0.0/apache-hive-2.0.0-bin.tar.gz

[root@hadoop-01 tmp]# tar zxvf apache-hive-2.0.0-bin.tar.gz
[root@hadoop-01 tmp]# mv apache-hive-2.0.0-bin /hadoop/hive-remote-server
[root@hadoop-01 tmp]# cd /hadoop/hive-remote-server
[root@hadoop-01 hive-remote-server]# ll
total 588
drwxr-xr-x 3 root root   4096 Mar 29 23:19 bin
drwxr-xr-x 2 root root   4096 Mar 29 23:19 conf
drwxr-xr-x 4 root root   4096 Mar 29 23:19 examples
drwxr-xr-x 7 root root   4096 Mar 29 23:19 hcatalog
drwxr-xr-x 4 root root  12288 Mar 29 23:19 lib
-rw-r--r-- 1 root root  26335 Jan 22 12:28 LICENSE
-rw-r--r-- 1 root root    513 Jan 22 12:28 NOTICE
-rw-r--r-- 1 root root   4348 Feb 10 09:50 README.txt
-rw-r--r-- 1 root root 527063 Feb 10 09:56 RELEASE_NOTES.txt
drwxr-xr-x 4 root root   4096 Mar 29 23:19 scripts
[root@hadoop-01 hive-remote-server]# 

4. Configure /etc/profile on hadoop-01
[root@hadoop-01 ~]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export HADOOP_HOME=/hadoop/hadoop-2.7.2
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HBASE_HOME=/hadoop/hbase-1.2.0
export ZOOKEEPER_HOME=/hadoop/zookeeper
export HIVE_HOME=/hadoop/hive-remote-server
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
[root@hadoop-01 ~]# source /etc/profile
[root@hadoop-01 ~]# 

5. Install the MySQL JDBC driver jar
[root@hadoop-01 tmp]# wget http://ftp.nchu.edu.tw/Unix/Database/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz
[root@hadoop-01 tmp]# tar zxvf  mysql-connector-java-5.1.36.tar.gz
[root@hadoop-01 tmp]# cd mysql-connector-java-5.1.36
[root@hadoop-01 mysql-connector-java-5.1.36]# ll
total 1428
-rw-r--r-- 1 root root  90430 Jun 20  2015 build.xml
-rw-r--r-- 1 root root 235082 Jun 20  2015 CHANGES
-rw-r--r-- 1 root root  18122 Jun 20  2015 COPYING
drwxr-xr-x 2 root root   4096 Mar 29 23:35 docs
-rw-r--r-- 1 root root 972009 Jun 20  2015 mysql-connector-java-5.1.36-bin.jar
-rw-r--r-- 1 root root  61423 Jun 20  2015 README
-rw-r--r-- 1 root root  63674 Jun 20  2015 README.txt
drwxr-xr-x 8 root root   4096 Jun 20  2015 src
[root@hadoop-01 mysql-connector-java-5.1.36]# cp mysql-connector-java-5.1.36-bin.jar $HIVE_HOME/lib/

6. Configure the Hive server
[root@hadoop-01 ~]# cd $HIVE_HOME/conf
[root@hadoop-01 conf]# cp hive-default.xml.template hive-default.xml

# "hdfs://mycluster" is the value of fs.defaultFS in $HADOOP_HOME/etc/hadoop/core-site.xml (the NameNode HA URI)
# Connect-string format: jdbc:mysql://<host>:<port>/<database>?createDatabaseIfNotExist=true
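The connect string used below follows this pattern. As a minimal sketch (plain Python; the helper name is hypothetical, not part of Hive), the pieces assemble like this:

```python
# Build a MySQL JDBC connect string for the Hive metastore
# (illustrative helper; not part of Hive itself).
def build_jdbc_url(host, db, port=3306, **params):
    query = "&".join(f"{k}={v}" for k, v in params.items())
    url = f"jdbc:mysql://{host}:{port}/{db}"
    return f"{url}?{query}" if query else url

print(build_jdbc_url("hadoop-01", "hive_remote_meta",
                     createDatabaseIfNotExist="true"))
# jdbc:mysql://hadoop-01:3306/hive_remote_meta?createDatabaseIfNotExist=true
```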


[root@hadoop-01 conf]# vi hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

      <property>
         <name>hive.metastore.warehouse.dir</name>
         <value>hdfs://mycluster/user/hive_remote/warehouse</value>
      </property>

      <property>
         <name>javax.jdo.option.ConnectionURL</name>
         <value>jdbc:mysql://hadoop-01:3306/hive_remote_meta?createDatabaseIfNotExist=true</value>
         <description>JDBC connect string for a JDBC metastore</description>
      </property>
      <property>
         <name>javax.jdo.option.ConnectionDriverName</name>
         <value>com.mysql.jdbc.Driver</value>
         <description>Driver class name for a JDBC metastore</description>
      </property>

      <property>
         <name>javax.jdo.option.ConnectionUserName</name>
         <value>hive</value>
         <description>username to use against metastore database</description>
      </property>
      <property>
         <name>javax.jdo.option.ConnectionPassword</name>
         <value>hive</value>
         <description>password to use against metastore database</description>
      </property>
      <property>
         <name>hive.hwi.war.file</name>
         <value>${HIVE_HOME}/lib/hive-hwi-2.0.0.jar</value>
         <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
      </property>
</configuration>

"hive-site.xml" 26L, 1056C written
[root@hadoop-01 conf]#
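Before starting the metastore it can help to sanity-check the file's name/value pairs; a sketch using only the Python standard library (the `hive_conf` helper is hypothetical, shown on a shortened copy of the config):

```python
# Parse Hive-style configuration XML and return {name: value} pairs.
import xml.etree.ElementTree as ET

def hive_conf(xml_text):
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

conf = hive_conf("""
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop-01:3306/hive_remote_meta?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>
""")
print(conf["javax.jdo.option.ConnectionDriverName"])  # com.mysql.jdbc.Driver
```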
7. scp Hive to the client
[root@hadoop-01 hadoop]# pwd
/hadoop
[root@hadoop-01 hadoop]# scp -r hive-remote-server root@hadoop-02:/hadoop/hive-remote-client

8. Configure /etc/profile on hadoop-02
[root@hadoop-02 ~]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export HADOOP_HOME=/hadoop/hadoop-2.7.2
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HBASE_HOME=/hadoop/hbase-1.2.0
export ZOOKEEPER_HOME=/hadoop/zookeeper
export HIVE_HOME=/hadoop/hive-remote-client
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
[root@hadoop-02 ~]# source /etc/profile
[root@hadoop-02 ~]# 

9. Configure the Hive client


[root@hadoop-02 conf]# vi hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<!-- thrift://<host_name>:<port>; the default port is 9083 -->
<property>
 <name>hive.metastore.uris</name>
 <value>thrift://hadoop-01:9083</value>
 <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>

<!-- default storage path for Hive tables -->
<property>
 <name>hive.metastore.warehouse.dir</name>
 <value>hdfs://mycluster/user/hive_remote/warehouse</value>
</property>
</configuration>

[root@hadoop-02 conf]#
10. On the server, the first run requires initializing the metastore schema: schematool -initSchema -dbType mysql
[root@hadoop-01 bin]# schematool -initSchema -dbType mysql
Metastore connection URL:        jdbc:mysql://hadoop-01.telenav.cn:3306/hive_remote_meta?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       hive
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.mysql.sql
Initialization script completed
schemaTool completed
[root@hadoop-01 bin]# 

11. Start the Hive server and client
[Server]:
hive --service metastore -p [port]
If no port is given (hive --service metastore), the default listening port is 9083; the port configured on the client must match the one the server listens on. Once the server is up, the client can run Hive operations.
### trailing & runs it in the background
[root@hadoop-01 bin]# hive --service metastore &
Starting Hive Metastore Server
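Whether the metastore is actually listening can be probed with a small TCP check before pointing the client at it (a sketch; the host/port are the defaults assumed above, and `port_open` is a hypothetical helper):

```python
# Probe a TCP port to see whether the metastore (default 9083) is reachable.
import socket

def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("hadoop-01", 9083) should return True once the server is up
```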

[Client]:
[root@hadoop-02 hive-remote-client]# cd bin
[root@hadoop-02 bin]# hive

Logging initialized using configuration in jar:file:/hadoop/hive-remote-client/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> 

12. Create a table and load data to verify
## fields are separated by tab characters
[root@hadoop-02 bin]#  vi /tmp/studentInfo.txt 
1       a       26      110
1       a       26      113
2       b       11      222

[root@hadoop-02 bin]# hive
Logging initialized using configuration in jar:file:/hadoop/hive-remote-client/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> 
    > create table studentinfo (id int,name string, age int,tel string)
    > row format delimited  fields terminated by '\t'
    > stored as textfile;

hive> load data local inpath '/tmp/studentInfo.txt' overwrite into table studentinfo;
Loading data to table default.studentinfo
Moved: 'hdfs://mycluster/user/hive_remote/warehouse/studentinfo/studentInfo.txt' to trash at: hdfs://mycluster/user/root/.Trash/Current
OK
Time taken: 2.941 seconds

hive> select * from studentinfo;
OK
1       a       26      113
2       b       11      222
Time taken: 1.607 seconds, Fetched: 2 row(s)
hive> 
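The mapping from the tab-delimited file to the table's columns can be mimicked outside Hive; a sketch of how lines with fields terminated by '\t' split into the (id, name, age, tel) schema (the `parse_rows` helper is illustrative, not part of Hive):

```python
# Split tab-delimited lines into (id, name, age, tel) tuples,
# mirroring the table's "fields terminated by '\t'" declaration.
def parse_rows(text):
    rows = []
    for line in text.strip().splitlines():
        id_, name, age, tel = line.split("\t")
        rows.append((int(id_), name, int(age), tel))
    return rows

data = "1\ta\t26\t113\n2\tb\t11\t222\n"
print(parse_rows(data))  # [(1, 'a', 26, '113'), (2, 'b', 11, '222')]
```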

13. Inspect the HDFS file system
[root@hadoop-01 bin]# ps -ef|grep hive
root     11629  4509  1 19:39 pts/0    00:00:21 /usr/java/jdk1.7.0_67-cloudera/bin/java -Xmx256m -Djava.library.path=/hadoop/hadoop-2.7.2/lib -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/hadoop/hadoop-2.7.2/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/hadoop/hadoop-2.7.2 -Dhadoop.id.str=root -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /hadoop/hive-remote-server/lib/hive-service-2.0.0.jar org.apache.hadoop.hive.metastore.HiveMetaStore
root     13351  4509  0 19:57 pts/0    00:00:00 grep hive
[root@hadoop-01 bin]# hadoop fs -ls /user/hive_remote/warehouse/studentinfo
Found 1 items
-rwx------   3 root root         22 2016-04-16 19:54 /user/hive_remote/warehouse/studentinfo/studentInfo.txt
[root@hadoop-01 bin]# hadoop fs -cat  /user/hive_remote/warehouse/studentinfo/studentInfo.txt
1       a       26      113
2       b       11      222
[root@hadoop-01 bin]# 
