Installing and Configuring Sqoop 1.4.6 on Hadoop 2.6.0 (Single Node) (data access involving Hadoop, HBase, and Hive)


Download Sqoop

           http://sqoop.apache.org/

   or

           http://archive-primary.cloudera.com/cdh5/cdh/5/    (here the versions are already matched for you, so there are no version-compatibility issues to sort out)

   or via CM (Cloudera Manager) or Ambari
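If you prefer the command line, a minimal download sketch (the archive URL is an assumption; substitute any Apache mirror or version as needed):

# Fetch the Sqoop 1.4.6 binary tarball (URL assumed; adjust mirror/version to taste)
wget http://archive.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz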

As many colleagues know, the mainstream options for building out a big-data stack today are Apache, Cloudera, and Ambari.

     I won't say much about the latter two; they are staples in companies and in most university research environments.

     For details on each, see my blog post below:

Cloudera: installing, building, and deploying a big-data cluster (a five-step illustrated walkthrough) (strongly recommended)

Environment Preparation

  Java

  Hadoop (HDFS/YARN)
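Before installing, a quick sanity check that these prerequisites are in place can save time; something like the following (assuming java and hadoop are already on the PATH):

java -version     # Sqoop 1.4.6 runs on the JVM; a JDK (1.6+) is required
hadoop version    # confirms the Hadoop client is installed and on the PATH
jps               # the HDFS/YARN daemons (NameNode, DataNode, ResourceManager, NodeManager) should be listed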

Installing Sqoop under Hadoop 2.6.0 (Single Node)

  Step 1: Upload the Sqoop installation package; I won't belabor this here.

[hadoop@djt002 sqoop]$ pwd
/usr/local/sqoop
[hadoop@djt002 sqoop]$ ls
sqoop-1.4.6.bin__hadoop-2.0.4-alpha
[hadoop@djt002 sqoop]$ mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha/ sqoop-1.4.6
[hadoop@djt002 sqoop]$ ls
sqoop-1.4.6
[hadoop@djt002 sqoop]$ cd sqoop-1.4.6/
[hadoop@djt002 sqoop-1.4.6]$ pwd
/usr/local/sqoop/sqoop-1.4.6
[hadoop@djt002 sqoop-1.4.6]$
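(For completeness: the extract step implied before this transcript would look roughly as follows; the tarball name matches the directory listed above.)

cd /usr/local/sqoop
tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz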

[hadoop@djt002 sqoop-1.4.6]$ ls
bin CHANGELOG.txt conf ivy lib NOTICE.txt README.txt sqoop-patch-review.py src
build.xml COMPILING.txt docs ivy.xml LICENSE.txt pom-old.xml sqoop-1.4.6.jar sqoop-test-1.4.6.jar testdata
[hadoop@djt002 sqoop-1.4.6]$ cd conf/
[hadoop@djt002 conf]$ pwd
/usr/local/sqoop/sqoop-1.4.6/conf
[hadoop@djt002 conf]$ ls
oraoop-site-template.xml sqoop-env-template.cmd sqoop-env-template.sh sqoop-site-template.xml sqoop-site.xml
[hadoop@djt002 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@djt002 conf]$ vim sqoop-env.sh

   Step 2: Edit the configuration file

[hadoop@djt002 conf]$ vim sqoop-env.sh

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=    (recommended: set this)

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=     (recommended: set this)

#set the path to where bin/hbase is available
#export HBASE_HOME=          (recommended: set this)

#Set the path to where bin/hive is available
#export HIVE_HOME=           (recommended: set this)

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=            (not needed here: this is a single-node Hadoop 2.6.0 deployment, so there is no ZooKeeper to configure)

    If your data transfers do not involve HBase or Hive, the HBASE_HOME and HIVE_HOME entries can be left unset; likewise, set ZOOCFGDIR only if the cluster has a standalone ZooKeeper ensemble, and skip it otherwise.

Here, for everyone's convenience, I configure everything except ZooKeeper, which this single-node deployment does not need:

export HADOOP_COMMON_HOME=/usr/local/hadoop/hadoop-2.6.0

export HADOOP_MAPRED_HOME=/usr/local/hadoop/hadoop-2.6.0

export HBASE_HOME=/usr/local/hbase/hbase-1.2.3

export HIVE_HOME=/usr/local/hive/hive-1.0.0

  Step 3: Configure the environment variables (append the following to /etc/profile):

#sqoop
export SQOOP_HOME=/usr/local/sqoop/sqoop-1.4.6
export PATH=$PATH:$SQOOP_HOME/bin

  Step 4: Apply the environment variables

source /etc/profile
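You can then confirm that the shell sees Sqoop (a quick check; exact output varies):

which sqoop      # should print /usr/local/sqoop/sqoop-1.4.6/bin/sqoop
sqoop version    # should report Sqoop 1.4.6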

  Step 5: Remember to grant ownership of the Sqoop installation directory to the hadoop user:

chown -R hadoop:hadoop sqoop-1.4.6
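(Run the chown as root, from /usr/local/sqoop. A quick verification afterwards:)

ls -ld sqoop-1.4.6    # owner and group should now both read hadoop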

  Step 6: Copy the relevant driver jars into the sqoop/lib directory.

   I am glossing over a lot here, including the relevant core jars for Hadoop, Hive, and HBase (to be filled in later).
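At the very least, Sqoop needs the JDBC driver for the database it will talk to. A minimal sketch for MySQL (the connector jar version below is an assumption; use whichever mysql-connector-java jar matches your MySQL installation):

# Copy the MySQL JDBC driver into Sqoop's lib directory (jar version assumed)
cp mysql-connector-java-5.1.38-bin.jar /usr/local/sqoop/sqoop-1.4.6/lib/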

 Testing

   For example, let me open the database here.

Navicat for MySQL: downloading, installing, and using this MySQL client

A personal recommendation: a fairly good MySQL client tool.

   First, start the MySQL instance installed earlier.

[hadoop@djt002 ~]$ su root
Password: 
[root@djt002 hadoop]# cd /usr/local/
[root@djt002 local]# pwd
/usr/local
[root@djt002 local]# service mysqld start
Starting mysqld:                                           [  OK  ]
[root@djt002 local]# 

  Then, on the Navicat side, choose Connect, and verify with Sqoop:

[hadoop@djt002 sqoop-1.4.6]$ sqoop list-databases --connect jdbc:mysql://djt002/ --username hive --password hive
Warning: /usr/local/sqoop/sqoop-1.4.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
17/03/17 20:30:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/03/17 20:30:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/17 20:30:27 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hive
mysql
test
[hadoop@djt002 sqoop-1.4.6]$ sqoop list-tables --connect jdbc:mysql://djt002/hive --username hive --password hive
Warning: /usr/local/sqoop/sqoop-1.4.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
17/03/17 20:30:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/03/17 20:30:48 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/17 20:30:50 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
BUCKETING_COLS
CDS
COLUMNS_V2
DATABASE_PARAMS
DBS
FUNCS
FUNC_RU
GLOBAL_PRIVS
IDXS
INDEX_PARAMS
PARTITIONS
PARTITION_KEYS
PARTITION_KEY_VALS
PARTITION_PARAMS
PART_COL_PRIVS
PART_COL_STATS
PART_PRIVS
ROLES
SDS
SD_PARAMS
SEQUENCE_TABLE
SERDES
SERDE_PARAMS
SKEWED_COL_NAMES
SKEWED_COL_VALUE_LOC_MAP
SKEWED_STRING_LIST
SKEWED_STRING_LIST_VALUES
SKEWED_VALUES
SORT_COLS
TABLE_PARAMS
TAB_COL_STATS
TBLS
TBL_COL_PRIVS
TBL_PRIVS
VERSION
[hadoop@djt002 sqoop-1.4.6]$ pwd
/usr/local/sqoop/sqoop-1.4.6
[hadoop@djt002 sqoop-1.4.6]$

   So far, though, we have only listed databases and tables; we have not yet reached the real goal of moving data. How do we continue?
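The natural next step is a sqoop import. A hedged sketch, using a hypothetical table and target directory (these names are not from the original post):

# Import one MySQL table into HDFS with a single map task (table and paths hypothetical)
sqoop import \
  --connect jdbc:mysql://djt002/test \
  --username hive --password hive \
  --table mytable \
  --target-dir /user/hadoop/mytable \
  -m 1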

Appendix: how are the Sqoop commands used?

[hadoop@djt002 sqoop-1.4.6]$ sqoop help
Warning: /usr/local/sqoop/sqoop-1.4.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/sqoop-1.4.6/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
17/03/17 20:03:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
[hadoop@djt002 sqoop-1.4.6]$
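Each subcommand has its own help page as well; for example:

sqoop help import    # lists every import-specific option (--table, --target-dir, -m, ...)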

  It is best to get into the habit of reading the official documentation:

 http://sqoop.apache.org/

http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html

The following is still a rough draft awaiting cleanup; feel free to skip it for now.

2. Installing Sqoop2 under Hadoop 2.6.0

   Here, for now, I use an Ubuntu environment.

     Step 1: Download the Sqoop2 package:

http://mirrors.hust.edu.cn/apache/sqoop/1.99.6/   or   http://archive.apache.org/dist/sqoop/   (recommended)

   I usually upload it to /usr/local/sqoop on the Linux machine; set this however you like.

Step 2: Extract the package: sudo tar -zxvf sqoop-1.99.6-bin-hadoop200.tar.gz

Step 3: Rename the extracted directory (not the .tar.gz) to sqoop-1.99.6:

           sudo mv sqoop-1.99.6-bin-hadoop200 sqoop-1.99.6

Step 4: Enter the sqoop-1.99.6 directory and configure it: cd sqoop-1.99.6

  1. Configure the environment variables: sudo gedit ~/.bashrc, adding the Sqoop install path and PATH entries:

        export SQOOP2_HOME=/usr/local/sqoop/sqoop-1.99.6

               export PATH=.:$SQOOP2_HOME/bin:$PATH

               export CATALINA_BASE=$SQOOP2_HOME/server

  2. Copy the MySQL connector jar into sqoop-1.99.6/server/lib.

  3. Edit the configuration file $SQOOP2_HOME/server/conf/sqoop.properties,

   and edit $SQOOP2_HOME/server/conf/catalina.properties, changing the highlighted paths (shown in a screenshot in the original post) to your own installation directories; a sketch of the typical edits follows.
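For reference, the edits usually amount to something like the following (the property names come from Sqoop 1.99.x; the paths are this post's and must be adapted to your machine):

# sqoop.properties: point Sqoop2 at your Hadoop configuration directory (path assumed)
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/hadoop-2.6.0/etc/hadoop/

# catalina.properties: extend common.loader so the server can find the Hadoop jars
# (one long line; jar paths assumed)
common.loader=${catalina.base}/lib/*.jar,/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/*.jar,/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/*.jar,/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/*.jar,/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/*.jar,/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/*.jar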

 Step 5: Start the Sqoop2 server.

  From the Sqoop installation directory, run: bin/sqoop.sh server start

Step 6: Start the Sqoop2 client.

  From the Sqoop installation directory, run: bin/sqoop.sh client (this launches the interactive client shell; a sample session follows).
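Once the client shell is up, you can point it at the server and verify the connection. A typical session (12000 is the default Sqoop2 server port; the host here is an assumption; show version --all prints client and server versions if the connection works):

sqoop:000> set server --host localhost --port 12000 --webapp sqoop
sqoop:000> show version --all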

This article is reproduced from the cnblogs blog 大数据躺过的坑. Original link: http://www.cnblogs.com/zlslch/p/6116363.html. Please contact the original author before reprinting.
