046. Python Coroutines
Date: 2023-07-08

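1 Generators

Calling a generator function returns a generator object, or "generator" for short.

def gen():
    for i in range(10):
        # yield hands back a value and preserves the function's state
        yield i

mygen = gen()  # create the generator object
for i in mygen:
    print(i)

Run: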

[root@node10 python]# python3 test.py
0
1
2
3
4
5
6
7
8
9

Use next() to control how many values are taken

def gen():
    for i in range(10):
        yield i

# Calling the generator function returns a generator object
mygen = gen()
for i in range(3):
    res = next(mygen)
    print(res)

Run:

[root@node10 python]# python3 test.py
0
1
2
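The state-preserving behaviour of yield can be seen by continuing to read from the same generator after the three next() calls. The following is a minimal sketch; the variable names first_three, remaining and exhausted are illustrative and not from the original.

```python
def gen():
    for i in range(10):
        yield i

mygen = gen()

# Take three values by hand; the generator pauses after each yield
first_three = [next(mygen) for _ in range(3)]

# Iterating again resumes where the generator stopped, not from the start
remaining = list(mygen)

# Once exhausted, next() raises StopIteration
exhausted = False
try:
    next(mygen)
except StopIteration:
    exhausted = True

print(first_three)
print(remaining)
print(exhausted)
```

A for loop handles StopIteration for you, which is why the earlier loop over mygen simply ends after the last value.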

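2 Rewriting the producer-consumer model with coroutines

Here the producer is a generator, and the consumer pulls one value at a time from it.

def producer():
    for i in range(100):
        yield i

def consumer():
    g = producer()
    for i in g:
        print(i)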

consumer()
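In the version above the consumer drives the loop and the producer only yields. A common variant, shown here as a sketch rather than as the original author's method, turns the consumer into the paused side and lets the producer push items into it with send(). The log list and the function signatures are assumptions made for illustration.

```python
def consumer(log):
    """Coroutine-style consumer: receives items through send()."""
    while True:
        item = yield  # pause here until the producer sends a value
        log.append(f"consumed {item}")

def producer(c, n, log):
    next(c)  # prime the consumer so it runs up to its first yield
    for i in range(n):
        log.append(f"produced {i}")
        c.send(i)  # resume the consumer and hand it the item
    c.close()  # stop the consumer at its paused yield

log = []
producer(consumer(log), 3, log)
print("\n".join(log))
```

Each send() switches control to the consumer and back, so the "produced" and "consumed" entries alternate.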

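3 A concrete coroutine implementation

switch(): when a task reaches a blocking operation, call this method manually to switch to another task.

Drawback: greenlet cannot work around I/O waits automatically; it does not switch on its own when a task blocks.

from greenlet import greenlet
import time

def plane():
    print("plane one")
    print("Plane two")

def fly():
    print("fly to newyork")
    print("fly to beijing")

g1 = greenlet(plane)
g2 = greenlet(fly)
g1.switch()  # start running plane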

Before running it, install the greenlet module (installing gevent also pulls in greenlet as a dependency):

[root@node10 python]# pip-3 install wheel

[root@node10 python]# pip-3 install gevent

Run the script:

[root@node10 python]# python3 test.py
plane one
Plane two

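Only plane ran above, because nothing ever switched to g2. Now add blocking calls and one switch:

from greenlet import greenlet
import time

def plane():
    print("plane one")
    g2.switch()  # hand control to fly
    time.sleep(2)
    print("Plane two")

def fly():
    print("fly to newyork")
    time.sleep(2)
    print("fly to beijing")

g1 = greenlet(plane)
g2 = greenlet(fly)
g1.switch()

Run: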

[root@node10 python]# python3 test.py
plane one
fly to newyork
fly to beijing

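Blocking alone does not trigger a switch. In the previous run "Plane two" never printed: when fly finished, control returned to the main program rather than to plane. Here fly switches back to g1 explicitly:

from greenlet import greenlet
import time

def plane():
    print("plane one")
    g2.switch()  # hand control to fly
    time.sleep(2)
    print("Plane two")

def fly():
    print("fly to newyork")
    time.sleep(2)
    print("fly to beijing")
    g1.switch()  # hand control back to plane

g1 = greenlet(plane)
g2 = greenlet(fly)
g1.switch()

Run: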

[root@node10 python]# python3 test.py
plane one
fly to newyork
fly to beijing
Plane two
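The manual hand-off that switch() performs can be imitated with plain generators, which may help show what "switching by hand" means. This is a standard-library analogue, not greenlet itself: each bare yield marks a voluntary switch point, and the scheduler run_round_robin is a hypothetical helper written for this sketch.

```python
from collections import deque

def plane(log):
    log.append("plane one")
    yield  # voluntary switch point, playing the role of g2.switch()
    log.append("Plane two")

def fly(log):
    log.append("fly to newyork")
    yield  # voluntary switch point
    log.append("fly to beijing")

def run_round_robin(tasks):
    """Resume each generator in turn until all of them are exhausted."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        queue.append(task)  # task yielded; put it at the back of the queue

log = []
run_round_robin([plane(log), fly(log)])
print(log)
```

As with greenlet, nothing switches unless a task gives up control itself, so a blocking call placed between two switch points would still stall every task.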

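4 Using gevent

Limitation: gevent does not recognize time.sleep() as a blocking call by default.

from greenlet import greenlet
import gevent
import time

def plane():
    print("plane one")
    time.sleep(2)
    print("Plane two")

def fly():
    print("fly to newyork")
    time.sleep(2)
    print("fly to beijing")

# Create coroutine object g1 with gevent
g1 = gevent.spawn(plane)
# Create coroutine object g2 with gevent
g2 = gevent.spawn(fly)
g1.join()  # block until coroutine g1 has finished
g2.join()  # block until coroutine g2 has finished
print("Main thread finished")

Run:

[root@node10 python]# python3 test.py
plane one
Plane two
fly to newyork
fly to beijing
Main thread finished

No switch happened at the blocking calls: time.sleep() blocked the whole thread, so the two tasks ran one after the other.

An improved version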

5 Replacing time.sleep() with gevent.sleep()

from greenlet import greenlet
import gevent
import time

def plane():
    print("plane one")
    gevent.sleep(2)  # cooperative sleep: lets other coroutines run
    print("Plane two")

def fly():
    print("fly to newyork")
    gevent.sleep(2)
    print("fly to beijing")

# Create coroutine object g1 with gevent
g1 = gevent.spawn(plane)
# Create coroutine object g2 with gevent
g2 = gevent.spawn(fly)
g1.join()  # block until coroutine g1 has finished
g2.join()  # block until coroutine g2 has finished
print("Main thread finished")

Run; the tasks now switch automatically:

[root@node10 python]# python3 test.py
plane one
fly to newyork
Plane two
fly to beijing
Main thread finished

The definitive fix for the unrecognized blocking calls

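6 Introducing monkey.patch_all()

monkey.patch_all() patches the blocking calls in the modules imported below it so that gevent can recognize them and switch tasks. Call it before those imports.

from greenlet import greenlet
from gevent import monkey
monkey.patch_all()
import gevent
import time

def plane():
    print("plane one")
    time.sleep(2)  # now patched, so it yields to other coroutines
    print("Plane two")

def fly():
    print("fly to newyork")
    time.sleep(2)
    print("fly to beijing")

# Create coroutine object g1 with gevent
g1 = gevent.spawn(plane)
# Create coroutine object g2 with gevent
g2 = gevent.spawn(fly)
g1.join()  # block until coroutine g1 has finished
g2.join()  # block until coroutine g2 has finished
print("Main thread finished")

Run:

[root@node10 python]# python3 test.py
plane one
fly to newyork
Plane two
fly to beijing
Main thread finished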

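7 Coroutine examples

  1. spawn(function, arg1, arg2, arg3, ...): create a coroutine and schedule it to run
  2. join(): block until the given coroutine has finished
  3. joinall(): wait until all the given coroutines have finished
    • g1.join() followed by g2.join() can be shortened with joinall
    • gevent.joinall([g1, g2]) is equivalent to the two join() calls; its argument is a list
  4. value: get the coroutine's return value

Using joinall and value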

from gevent import monkey; monkey.patch_all()
import time
import gevent

def plane():
    print("plane one")
    time.sleep(2)
    print("Plane two")
    return "There are two planes"

def fly():
    print("fly to newyork")
    time.sleep(2)
    print("fly to beijing")
    return "fly two place"

g1 = gevent.spawn(plane)
g2 = gevent.spawn(fly)
gevent.joinall([g1, g2])

# Get the coroutines' return values
print(g1.value)
print(g2.value)
print("Main thread finished")

Run:

plane one
fly to newyork
Plane two
fly to beijing
There are two planes
fly two place
Main thread finished
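The example above calls spawn() without arguments. As a small sketch of item 1 in the list, the snippet below forwards positional arguments through spawn() and collects the results with value. The add function and the numbers are made up for illustration, and the snippet assumes gevent is installed.

```python
import gevent

def add(a, b):
    """Hypothetical task: pause cooperatively, then return a result."""
    gevent.sleep(0.01)
    return a + b

# Positional arguments after the function are forwarded to it
jobs = [gevent.spawn(add, i, 10) for i in range(3)]
gevent.joinall(jobs)

# value holds each coroutine's return value once it has finished
results = [job.value for job in jobs]
print(results)
```

Reading value before the coroutine has finished gives None, which is why joinall() comes first.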

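Crawling page data with coroutines

Install the requests module:

[root@node10 python]# pip-3 install requests

import gevent
import requests
import time

# Fetch the site and get back a response object
print("<++++++++++++fetch the site, return the response object+++++++++++++++++>")
response = requests.get("http://www.baidu.com")
print(response)

# Get the status code
print("<++++++++++++get the status code+++++++++++++++++>")
res = response.status_code
print(res)

# Detect the character encoding with apparent_encoding
print("<++++++++++++get the character encoding (apparent_encoding)+++++++++++++++++>")
res_code = response.apparent_encoding
print(res_code)

# Set the encoding
print("<++++++++++++++set the encoding+++++++++++++++++>")
response.encoding = res_code
print("<++++++++++++get the page content+++++++++++++++++>")
res = response.text
print(res)

import re
# The sample HTML string was lost when this page was extracted; judging by the
# output below, it held a tag whose src attribute pointed at the Baidu logo
strvar = r''
obj = re.search("src=(.*?) ", strvar)
if obj:  # re.search returns None when nothing matches
    res = obj.group()  # the whole match
    print(res)
    res = obj.groups()  # tuple of all captured groups
    print(res)
    res = obj.groups()[0]  # the first captured group
    print(res)

Run:

<++++++++++++fetch the site, return the response object+++++++++++++++++>
<Response [200]>
<++++++++++++get the status code+++++++++++++++++>
200
<++++++++++++get the character encoding (apparent_encoding)+++++++++++++++++>
utf-8
<++++++++++++++set the encoding+++++++++++++++++>
<++++++++++++get the page content+++++++++++++++++>
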
百度一下,你就知道

关于百度 About Baidu

©2017 Baidu 使用百度前必读  意见反馈 京ICP证030173号 

src="https://www.baidu.com/img/bd_logo1.png"
('"https://www.baidu.com/img/bd_logo1.png"',)
"https://www.baidu.com/img/bd_logo1.png"

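The three regex results above can be reproduced with a stand-in string. The img tag below is a hypothetical sample, since the original string is not preserved in the article; only the src value is taken from the printed output. The second pattern is an alternative suggested here, not part of the original.

```python
import re

# Hypothetical sample tag; only the src value comes from the article's output
strvar = r'<img hidefocus="true" src="https://www.baidu.com/img/bd_logo1.png" width="270" height="129">'

# Original pattern: capture everything after src= up to the next space
obj = re.search("src=(.*?) ", strvar)
whole = obj.group()      # the whole match, including "src=" and the trailing space
captured = obj.groups()  # tuple of captured groups
first = obj.groups()[0]  # first captured group, still wrapped in quotes

# Alternative pattern: capture only the text between the quotes
url = re.search(r'src="([^"]*)"', strvar).group(1)

print(whole)
print(captured)
print(first)
print(url)
```

The original pattern relies on a space following the attribute, so it misses a src that is the last attribute in a tag; the quote-delimited pattern does not have that limitation.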
Crawler example

# Patch first so that the socket calls inside requests cooperate with gevent (see section 6);
# without it, requests.get() blocks the whole thread and the greenlets run one after another
from gevent import monkey; monkey.patch_all()
import gevent
import requests
import time

# Fetch the site and get back a response object
response = requests.get("http://www.baidu.com")
print(response)

# Get the status code
res = response.status_code
print(res)

# Detect the character encoding with apparent_encoding
res_code = response.apparent_encoding
print(res_code)

# Set the encoding
response.encoding = res_code
res = response.text
print(res)

url_list = [
    "http://www.baidu.com",
    "http://www.4399.com",
    "http://www.7k7k.com",
    "http://www.jingdong.com",
    "http://www.taobao.com",
]

def get_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        pass
        # print(response.text)

(1) Crawling sequentially

startime = time.time()
for i in url_list:
    get_url(i)
endtime = time.time()
print("<=1=1=1=1=1=1=1=1=>")
print(endtime - startime)

(2) Crawling with coroutines (faster)

startime = time.time()
lst = []
for i in url_list:
    g = gevent.spawn(get_url, i)
    lst.append(g)

gevent.joinall(lst)
endtime = time.time()
print("<=2=2=2=2=2=2=2=2=2=>")
print(endtime - startime)

Run:

<Response [200]>
200
utf-8

百度一下,你就知道

关于百度 About Baidu

©2017 Baidu 使用百度前必读  意见反馈 京ICP证030173号 

<=1=1=1=1=1=1=1=1=>
14.512077331542969
<=2=2=2=2=2=2=2=2=2=>
6.321309566497803

The coroutine version is faster: about 6.3 s versus about 14.5 s for the sequential loop.
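The same comparison can be tried without network access by simulating the wait. In this sketch fake_fetch is a hypothetical stand-in for requests.get(), the example.com URLs are placeholders, and the 0.1 s latency is an arbitrary assumption; gevent must be installed.

```python
import time
import gevent

def fake_fetch(url, latency=0.1):
    """Stand-in for a network request: wait cooperatively, then return the URL."""
    gevent.sleep(latency)
    return url

urls = [f"http://example.com/page{i}" for i in range(5)]  # placeholder URLs

# Sequential: each simulated request finishes before the next one starts
start = time.perf_counter()
for url in urls:
    fake_fetch(url)
sequential = time.perf_counter() - start

# Concurrent: all greenlets wait at the same time
start = time.perf_counter()
jobs = [gevent.spawn(fake_fetch, url) for url in urls]
gevent.joinall(jobs)
concurrent = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The sequential loop pays the latency once per URL, while the concurrent run pays it roughly once in total. The gain only appears for waiting-bound work; CPU-bound tasks do not speed up this way.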
