Python crawler: simulating a browser
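The listing below shows two ways to make a urllib request look like it comes from a real browser: sending a fuller set of request headers (left commented out in the code), and rotating the User-Agent string by picking one at random from a list.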

import urllib.request
import random

url="http://www.badu.com" ''' #设置一个较完整的请求头 headers={ "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" " Content-Type":"text/html;charset=utf-8", "User-Agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36" } #设置一个请求体 req=urllib.request.Request(url,headers=headers) #发起请求 response=urllib.request.urlopen(req) data=response.read().decode("utf-8") print(data) ''' #多弄几个UA就可以防止封ip agentsList=[ "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
"Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11" ] #随机拿取一个AG agentStr=random.choice(agentsList) req=urllib.request.Request(url) #用add_header直接向请求体里添加了User-Agent req.add_header("User-Agent",agentStr) response=urllib.request.urlopen(req) print(response.read().decode("utf-8"))
