What is the association?
与子例程一样,协程也是一种程序组件。 相对子例程而言,协程更为一般和灵活,但在实践中使用没有子例程那样广泛。 协程源自Simula和Modula-2语言,但也有其他语言支持。 协程更适合于用来实现彼此熟悉的程序组件,如合作式多任务,迭代器,无限列表和管道。
协程拥有自己的寄存器上下文和栈,协程调度切换时,将寄存器上下文和栈保存到其他地方,在切回来的时候,恢复先前保存的寄存器上下文和栈。因此:协程能保留上一次调用时的状态(即所有局部状态的一个特定组合),每次过程重入时,就相当于进入上一次调用的状态,换种说法:进入上一次离开时所处逻辑流的位置。
协程的优缺点:
优点
-
无需线程上下文切换的开销
-
无需原子操作锁定及同步的开销(更改一个变量)
-
方便切换控制流,简化编程模型
-
高并发+高扩展性+低成本:一个CPU支持上万的协程都不是问题。所以很适合用于高并发处理。
缺点:
-
无法利用多核资源:协程的本质是个单线程,它不能多核,协程需要和进程配合才能运行在多CPU上,当然我们日常所编写的绝大部分应用都没有这个必要,除非是CPU密集型应用。
-
进行阻塞(Blocking)操作(如IO时)会阻塞掉整个程序
实现协程实例
yield
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
|
def
consumer(name):
print
(
"--->starting eating baozi..."
)
while
True
:
new_baozi
=
yield
# 直接返回
print
(
"[%s] is eating baozi %s"
%
(name, new_baozi))
def
producer():
r
=
con.__next__()
r
=
con2.__next__()
n
=
0
while
n <
5
:
n
+
=
1
con.send(n)
# 唤醒生成器的同时传入一个参数
con2.send(n)
print
(
"\033[32;1m[producer]\033[0m is making baozi %s"
%
n)
if
__name__
=
=
'__main__'
:
con
=
consumer(
"c1"
)
con2
=
consumer(
"c2"
)
p
=
producer()
|
Greenlet
安装greenlet
1
|
pip3 install greenlet
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
# -*- coding:utf-8 -*-
from
greenlet
import
greenlet
def
func1():
print
(
12
)
gr2.switch()
print
(
34
)
gr2.switch()
def
func2():
print
(
56
)
gr1.switch()
print
(
78
)
# 创建两个携程
gr1
=
greenlet(func1)
gr2
=
greenlet(func2)
gr1.switch()
# 手动切换
|
Gevent
Gevent可以实现并发同步或异步编程,在gevent中用到的主要模式是Greenlet, 它是以C扩展模块形式接入Python的轻量级协程,Greenlet全部运行在主程序操作系统进程的内部,但它们被协作式地调度。
安装Gevent
1
|
pip3 install gevent
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
import
gevent
def
foo():
print
(
'Running in foo'
)
gevent.sleep(
2
)
print
(
'Explicit context switch to foo again'
)
def
bar():
print
(
'Explicit context to bar'
)
gevent.sleep(
3
)
print
(
'Implicit context switch back to bar'
)
# 自动切换
gevent.joinall([
gevent.spawn(foo),
# 启动一个协程
gevent.spawn(bar),
])
|
页面抓取
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
|
from
urllib
import
request
from
gevent
import
monkey
import
gevent
import
time
monkey.patch_all()
# 当前程序中只要设置到IO操作的都做上标记
def
wget(url):
print
(
'GET: %s'
%
url)
resp
=
request.urlopen(url)
data
=
resp.read()
print
(
'%d bytes received from %s.'
%
(
len
(data), url))
urls
=
[
'https://www.python.org/'
,
'https://www.python.org/'
,
'https://github.com/'
,
'https://blog.ansheng.me/'
,
]
# 串行抓取
start_time
=
time.time()
for
n
in
urls:
wget(n)
print
(
"串行抓取使用时间:"
, time.time()
-
start_time)
# 并行抓取
ctrip_time
=
time.time()
gevent.joinall([
gevent.spawn(wget,
'https://www.python.org/'
),
gevent.spawn(wget,
'https://www.python.org/'
),
gevent.spawn(wget,
'https://github.com/'
),
gevent.spawn(wget,
'https://blog.ansheng.me/'
),
])
print
(
"并行抓取使用时间:"
, time.time()
-
ctrip_time)
|
输出
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
|
C:\Python\Python35\python.exe E:
/
MyCodeProjects
/
协程
/
s4.py
GET: https:
/
/
www.python.org
/
47424
bytes received
from
https:
/
/
www.python.org
/
.
GET: https:
/
/
www.python.org
/
47424
bytes received
from
https:
/
/
www.python.org
/
.
GET: https:
/
/
github.com
/
25735
bytes received
from
https:
/
/
github.com
/
.
GET: https:
/
/
blog.ansheng.me
/
82693
bytes received
from
https:
/
/
blog.ansheng.me
/
.
串行抓取使用时间:
15.143015384674072
GET: https:
/
/
www.python.org
/
GET: https:
/
/
www.python.org
/
GET: https:
/
/
github.com
/
GET: https:
/
/
blog.ansheng.me
/
25736
bytes received
from
https:
/
/
github.com
/
.
47424
bytes received
from
https:
/
/
www.python.org
/
.
82693
bytes received
from
https:
/
/
blog.ansheng.me
/
.
47424
bytes received
from
https:
/
/
www.python.org
/
.
并行抓取使用时间:
3.781306266784668
Process finished with exit code
0
|
本文转自 Edenwy 51CTO博客,原文链接:http://blog.51cto.com/edeny/1924915,如需转载请自行联系原作者