
When Kubernetes Meets Alibaba Cloud: Rapid Deployment of Version 1.6.7

初扬 2017-04-20 19:55:03 Views: 8652 Comments: 17

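Alibaba Cloud offers a wide range of cloud products, including ECS, VPC networks, classic networks, and Server Load Balancer (SLB), that make it easy to run Docker applications in the cloud. Besides its Container Service, which provides a one-stop solution for managing containerized applications, Alibaba Cloud also keeps working on integrating other open-source container technologies to better meet users' diverse needs.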

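This is a how-to article that aims to show the simplest and fastest way to deploy a Kubernetes cluster on Alibaba Cloud with a single command. It is based on Kubernetes 1.6.7, the latest version at the time of writing, and integrates the Alibaba Cloud CloudProvider for Kubernetes so that you can easily use Alibaba Cloud services such as VPC networking, SLB, and NAS file storage.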

You can also deploy the Kubernetes cluster described here quickly by using an Alibaba Cloud ROS template.

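Prerequisites

  • Alibaba Cloud CentOS 7.2 x64 and Ubuntu 16.04 images are supported
  • Alibaba Cloud VPC networks are supported
  • Prepare the KeyID and KeySecret of your Alibaba Cloud account
  • If you need to pull images hosted outside mainland China, use the Alibaba Cloud image service accelerator
  • Prepare at least two ECS instances: node1 will be the master node and node2 a worker node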

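Installing Kubernetes

Installing Kubernetes is very simple and takes just two steps: 1. create the master; 2. add worker nodes.

Creating the Master Node

Creating the master requires only two parameters: the ACCESS_KEY_ID and ACCESS_KEY_SECRET of your Alibaba Cloud account. Record the token and endpoint printed in the output.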

[root@master ~]# export ACCESS_KEY_ID=your_key_id
[root@master ~]# export ACCESS_KEY_SECRET=your_key_secret
[root@master ~]# curl -sSL http://aliacs-k8s.oss-cn-hangzhou.aliyuncs.com/installer/kubemgr-1.6.7.sh \
| bash -s nice --node-type master --key-id ${ACCESS_KEY_ID} --key-secret ${ACCESS_KEY_SECRET} 

.......

Preparing...                          ################################# [100%]
Updating / installing...
   1:kubernetes-cni-0.5.1-0           ################################# [ 20%]
   2:kubelet-1.6.7-0                  ################################# [ 40%]
   3:kubectl-1.6.7-0                  ################################# [ 60%]
   4:kubeadm-1.6.7-0                  ################################# [ 80%]
   5:ossfs-1.80.0-1                   ################################# [100%]
TOKEN: 612391.bcb426dc8367e04f
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.0
[init] Using Authorization mode: RBAC
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster.
    (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)

......

[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready

.......

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run (as a regular user):

  sudo cp /etc/kubernetes/admin.conf $HOME/
  sudo chown $(id -u):$(id -g) $HOME/admin.conf
  export KUBECONFIG=$HOME/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join --token aea23c.721b254c602d82c6 10.24.2.46:6443
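The token and endpoint can also be pulled out of saved installer output instead of being copied by hand. The following is a minimal sketch, assuming the output was saved to a file; the helper name parse_join_line and the file name master.log are hypothetical and not part of the installer.

```shell
# Hypothetical helper: reads installer output on stdin and prints
# "TOKEN ENDPOINT" taken from the last "kubeadm join --token ..." line.
# Exits with status 1 if no such line is found.
parse_join_line() {
  awk '
    $1 == "kubeadm" && $2 == "join" && $3 == "--token" {
      token = $4
      endpoint = $5
    }
    END {
      if (token == "") exit 1
      print token, endpoint
    }
  '
}

# Usage sketch (bash; master.log is an assumed file holding the output above):
#   read -r TOKEN ENDPOINT < <(parse_join_line < master.log)
#   export TOKEN ENDPOINT
```

The helper only matches the exact "kubeadm join --token" form shown in the output above; if the installer prints the join command differently, the pattern would need to be adjusted.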

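Adding a Worker Node

Create another ECS instance to serve as a worker node of the Kubernetes cluster and log in to it over SSH. Use the values recorded from the output of the previous step: TOKEN=aea23c.721b254c602d82c6 and ENDPOINT=10.24.2.46:6443.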

[root@node1 ~]# export ACCESS_KEY_ID=your_key_id
[root@node1 ~]# export ACCESS_KEY_SECRET=your_key_secret
[root@node1 ~]# export TOKEN=aea23c.721b254c602d82c6
[root@node1 ~]# export ENDPOINT=10.24.2.46:6443
[root@node1 ~]# curl -sSL http://aliacs-k8s.oss-cn-hangzhou.aliyuncs.com/installer/kubemgr-1.6.7.sh \
| bash -s nice --node-type node --key-id ${ACCESS_KEY_ID} --key-secret ${ACCESS_KEY_SECRET} --token ${TOKEN} --endpoint ${ENDPOINT}

.......

[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "10.24.2.46:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.24.2.46:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://10.24.2.46:6443"
[discovery] Successfully established connection with API Server "10.24.2.46:6443"
[bootstrap] Detected server version: v1.6.7-2+555a0aa47c5afb
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
  received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

At this point a minimal Kubernetes cluster has been created. You can repeat the add-node step as many times as you like to add more nodes to the cluster.
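To confirm that each added node actually registered, the output of kubectl get nodes on the master can be counted. The following is a small sketch; the helper name count_ready_nodes is hypothetical, and it assumes the column layout NAME STATUS AGE VERSION that kubectl prints for this version.

```shell
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints
# how many nodes have STATUS exactly "Ready" (second column).
# The header row is skipped; "NotReady" or combined statuses are not counted.
count_ready_nodes() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage sketch on the master:
#   kubectl get nodes | count_ready_nodes
```

With one master and one worker, the count should reach 2 once both kubelets have registered. A node that never shows up usually calls for a look at the kubelet log on that node.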

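Using the Kubernetes Cluster

Log in to the master (ssh root@master) and use kubectl to operate the cluster. The following commands run an nginx application and use --type=LoadBalancer so that the Alibaba Cloud CloudProvider creates an Alibaba Cloud SLB for it.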

[root@master ~]# export KUBECONFIG=/etc/kubernetes/admin.conf
[root@master ~]# kubectl get po --namespace=kube-system
[root@master ~]# kubectl run nginx --image=registry.cn-hangzhou.aliyuncs.com/spacexnice/nginx:latest --replicas=2 --labels run=nginx
[root@master ~]# kubectl expose deployment nginx --port=80 --target-port=80 --type=LoadBalancer

We have also pre-deployed the Kubernetes dashboard for you. Use the following command to look up the dashboard's NodePort.

[root@master ~]# kubectl --namespace=kube-system get svc kubernetes-dashboard
NAME                   CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   172.19.52.104   <nodes>       80:31432/TCP   3h 

The output above shows port 31432, so you can open http://masterip:31432 in a browser to reach the dashboard. If no NodePort is shown, change the Service's type to NodePort with kubectl --namespace=kube-system edit svc kubernetes-dashboard.
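When the port is needed in a script, it can be extracted from the PORT(S) column rather than read off the table. This is a minimal sketch; the helper name node_port_from_ports is hypothetical.

```shell
# Hypothetical helper: given a PORT(S) value such as "80:31432/TCP",
# prints the NodePort, i.e. the number between ":" and "/".
# Returns status 1 when the value has no NodePort part (e.g. "80/TCP").
node_port_from_ports() {
  case "$1" in
    *:*/*)
      _port=${1#*:}                 # drop everything up to the first ":"
      printf '%s\n' "${_port%%/*}"  # drop the "/TCP" suffix and anything after
      ;;
    *)
      return 1
      ;;
  esac
}

# Usage sketch:
#   node_port_from_ports "80:31432/TCP"
# kubectl can also print the field directly via jsonpath:
#   kubectl --namespace=kube-system get svc kubernetes-dashboard \
#     -o jsonpath='{.spec.ports[0].nodePort}'
```

If the Service exposes several ports, the helper returns only the NodePort of the first one.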

Enjoy your Kubernetes!

Comments

1F
weizoom_robert

Apr 21 16:02:07 iZbp126lnzvkiahroz4t5qZ kubelet[13038]: I0421 16:02:07.468344 13038 alicloud.go:213] Alicloud.ExternalID("izbp126lnzvkiahroz4t5qz")
Apr 21 16:02:07 iZbp126lnzvkiahroz4t5qZ kubelet[13038]: I0421 16:02:07.468375 13038 alicloud_instances.go:132] Alicloud.findInstanceByNodeName("izbp126lnzvkiahroz4t5qz")
Apr 21 16:02:07 iZbp126lnzvkiahroz4t5qZ kubelet[13038]: E0421 16:02:07.468439 13038 kubelet_node_status.go:73] Unable to construct v1.Node object for kubelet: failed to get external ID from cloud provider: instance not found

The script failed while running; the kubelet log is shown above.

初扬

Do not change the instance name.

291939600367437427

And then what?

2F
earthdistance

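Does version 1.6.1 only work on ECS instances in a VPC network? I can install 1.6.0 on my classic-network ECS instance, but 1.6.1 just won't install. It says the region parameter needs to be configured, but I have already replaced it.

初扬

Only VPC networks are supported.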

3F
cloudexp

[apiclient] Created API client, waiting for the control plane to become ready
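It hangs here. What should I do?

偏执的码农

I modified the script a bit; you can refer to https://gist.github.com/XdaTk/95cb57f9949c76fd078ab2d136e981f6

yufeiok

It still seems to hang for me; I waited a long time and it never got past this point.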

4F
cloudexp

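I changed the hostname. Does that mean this script can no longer be used?

偏执的码农

Yes. Reinstall, then change the VPC remark to the hostname, and it will work.

1461184991924141

@262173491044034621 It should be enough to keep the hostname and the instance name identical, right? It seemed to work in my test.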

5F
times21
I ran into the following problem when installing in a VPC:

Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787228 23726 remote_image.go:61] ListImages with filter "nil" from image service failed: rpc error: code = 2 desc = Cannot conne
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787238 23726 kuberuntime_image.go:106] ListImages failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: W0424 23:57:36.787242 23726 image_gc_manager.go:176] [imageGCManager] Failed to update image list: rpc error: code = 2 desc = Cannot connect to
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787266 23726 remote_image.go:61] ListImages with filter "nil" from image service failed: rpc error: code = 2 desc = Cannot conne
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787276 23726 kuberuntime_image.go:106] ListImages failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: W0424 23:57:36.787280 23726 image_gc_manager.go:176] [imageGCManager] Failed to update image list: rpc error: code = 2 desc = Cannot connect to
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787300 23726 remote_image.go:61] ListImages with filter "nil" from image service failed: rpc error: code = 2 desc = Cannot conne
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: E0424 23:57:36.787309 23726 kuberuntime_image.go:106] ListImages failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the
Apr 24 23:57:37 iZm5ehezcx91hfoqsswrdrZ kubelet[23726]: W0424 23:57:36.787314 23726 image_gc_manager.go:176] [imageGCManager] Failed to update image list: rpc error: code = 2 desc = Cannot connect to
A

初扬

Docker did not start successfully.

6F
master12

@初扬 After installing the heapster (1.2.0) add-on for the dashboard, heapster cannot reach the API server. I suspected that heapster cannot access the API server's secure port, so I changed --source as follows:
command:

  • /heapster
  • --source=kubernetes:http://192.168.163.136:8080?inClusterConfig=false
  • --sink=influxdb:http://monitoring-influxdb:8086

It still fails after the change: Failed to list *api.Pod: Get http://192.168.163.136:8080/api/v1/pods?resourceVersion=0: dial tcp 192.168.163.136:8080: getsockopt: connection refused. Is there a fix?

初扬

What error is reported when the API server cannot be reached?

7F
cloudexp

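@初扬 The installation succeeded, but there is one problem whose cause and fix I have not been able to find: when I try to perform operations in the dashboard, I get a permission error. How do I configure the serviceaccount so that I can operate the cluster through the dashboard?
The error is: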
User "system:serviceaccount:kube-system:dashboard" cannot list deployments.extensions in the namespace "kube-system". (get deployments.extensions)

初扬

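By default the dashboard is granted read-only permissions. You can change the dashboard's serviceAccount to clusteradmin, which has the highest privileges; configure it according to your needs.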
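An alternative to switching the serviceAccount, for readers who prefer to keep the existing account, is an RBAC binding. The sketch below binds the service account named in the error message (kube-system:dashboard) to the built-in cluster-admin ClusterRole. This is not the method the reply describes; the function name grant_dashboard_admin and the binding name dashboard-cluster-admin are assumptions chosen for illustration. Full cluster-admin rights are broad, so a narrower role may be more appropriate.

```shell
# Sketch: bind the "dashboard" service account in kube-system to the
# built-in cluster-admin ClusterRole.
# "dashboard-cluster-admin" is an arbitrary, assumed binding name.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# command without touching the cluster.
grant_dashboard_admin() {
  "${KUBECTL:-kubectl}" create clusterrolebinding dashboard-cluster-admin \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:dashboard
}

# Usage sketch on the master:
#   export KUBECONFIG=/etc/kubernetes/admin.conf
#   grant_dashboard_admin
```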

8F
龙芩

The node installed successfully, but the master cannot find it
...
[discovery] Successfully established connection with API Server "192.168.5.26:6443"
[bootstrap] Detected server version: kubernetes-v1.6.1-alicloud.release.1-3-ga9a63c03d8554b-dirty
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:

  • Certificate signing request sent to master and response
    received.
  • Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

[root@master ~]# kubectl get nodes --namespace=all
NAME STATUS AGE VERSION
izwz9ha54l79m5erg053tmz Ready 2h v1.6.1-2+ed9e3d33a07093

初扬

Run journalctl -u kubelet -f on the node and check the log; you should be able to find the cause there.

9F
289874284219650557

Does this deploy a single-master cluster?
How do I set up multiple master and etcd nodes?

10F
七神之光

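How can ECS instances in a different account be supported? How can a public IP be used?
Why isn't ENDPOINT a public IP?
I replaced the IP with the public IP, and the log from installing the node is as follows: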

Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: hostname "iz2zebf5a2" could not be reached
[preflight] WARNING: hostname "iz2zebf5a2" lookup iz2zebf5a2yp2rnjoznjuez on 100.100.2.136:53: no such host
[discovery] Trying to connect to API Server "47.93.xxx:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://47.93.xxx:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://47.93.xxx:6443"
[discovery] Successfully established connection with API Server "47.93.xxx:6443"
failed to check server version: Get https://172.17.39.xx:6443/version: dial tcp 172.17.39.xx:6443: i/o timeout

11F
七神之光

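How can I change the Endpoints of this default cluster service to the public IP?
When I change the IP in kube-apiserver's startup parameters to the public IP, it will not start at all.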

kubectl describe svc kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver

        provider=kubernetes

Annotations:
Selector:
Type: ClusterIP
IP: 10.254.0.1
Port: https 443/TCP
Endpoints: 172.17.39.128:6443
Session Affinity: ClientIP
Events:

12F
少校-tt

I have a VPC network with flannel installed from binaries. Can host-gw be supported? And without using your script, how do I use ACCESS_KEY_ID and ACCESS_KEY_SECRET?

13F
chinayie

unkonw option [nice]
cat: cat: No such file or directory
RTNETLINK answers: No such process
docker has been installed
3.0: Pulling from google-containers/pause-amd64
Digest: sha256:3b3a29e3c90ae7762bdf587d19302e62485b6bef46e114b741f7d75dba023bd3
Status: Image is up to date for registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0
Error: Container network CIDR not found.

1258168999730472

I have the same problem as you.

14F
初扬

Run ip route and ip addr to check whether your 172.x, 192.168.x, and 10.x ranges are already in use. Enable set -x to debug the script while it runs.

15F
elvisliu

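The script has a few small bugs; this is based on the kubemgr-1.6.7.sh version.

In the detect_cidr function:
"""
code=$(ip route |grep default|grep "$prefix"|wc -l)
code=$(ip addr |grep "$p"|wc -l)
"""
In both lines, grep "$prefix" and grep "$p" need grep's -F option so that "." is treated literally and grep does not interpret the pattern as a regular expression.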
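The difference is easy to reproduce outside the installer. The sketch below is a standalone demonstration, not code from kubemgr-1.6.7.sh: the helper names count_regex and count_fixed, the prefix 10.0, and the sample MAC address line are all made up for illustration.

```shell
# Count lines on stdin that match the argument as a regular expression,
# which is what plain `grep "$prefix"` does. "|| true" keeps the exit
# status at 0 even when the count is 0.
count_regex() {
  grep -c -- "$1" || true
}

# Count lines on stdin that contain the argument as a literal string,
# which is what `grep -F "$prefix"` does.
count_fixed() {
  grep -c -F -- "$1" || true
}

# Usage sketch with a made-up line in the style of `ip addr` output:
#   line='    link/ether 00:16:3e:10:0c:5f brd ff:ff:ff:ff:ff:ff'
#   printf '%s\n' "$line" | count_regex '10.0'
#   printf '%s\n' "$line" | count_fixed '10.0'
```

In the regex form, "." matches any character, so the pattern 10.0 matches the "10:0" inside the MAC address and the line is counted even though no 10.0.x.x address is present. With -F the same line is not counted.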

16F
1258168999730472

unkonw option [nice]
cat: cat: No such file or directory
cat: /etc/centos-release: No such file or directory
RTNETLINK answers: No such process
docker has been installed
3.0: Pulling from google-containers/pause-amd64
Digest: sha256:3b3a29e3c90ae7762bdf587d19302e62485b6bef46e114b741f7d75dba023bd3
Status: Image is up to date for registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0
Error: Container network CIDR not found.
root@iZj6c6589vorqplvg02w35Z:~# cat

17F
233125908292412655

Unable to construct v1.Node object for kubelet: failed to get external ID from cloud provider: instance not found

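I ran into a similar problem. When the ECS instance is provisioned with terraform, the instance name ECS-Instance is added by default, so Kubernetes cannot be installed automatically via user-data. Trying to change the instance name through the API or the SDK did not help either.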