5t4we 发表于 2017-11-27 15:54:59

Python3中request模块访问网页以及客户端伪装

                                                在python3中我们使用request模块访问一个网页,可以选择对文件的读写或者urllib.request.urlretrieve()方法将我们浏览的页面保存到本地。
方法1:
url_list=["http://www.bundcredit.com","http://www.baidu.com","http://www.winnerlook.com","http://www.winnertoke.com"]
forurlinfoin url_list:
    file=urllib.request.urlopen(urlinfo)
    data=file.read()
    withopen(str(urlinfo).split(".")+".html","wb") as fileinfo:
      fileinfo.write(data)
方法2:
filename=urllib.request.urlretrieve("http://www.cniao5.com/course/sz.html",filename=str(fileline)
检查Web服务器Nginx的访问日志:
IP地址   时间   访问方法访问协议访问状态等
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Python-urllib/3.5" "-"
伪装其信息
import   urllib.request
importre
url="http://www.bundcredit.com"
headers = ("User-Agent","Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0")
opener = urllib.request.build_opener()
opener.addheaders=
data=opener.open(url).read()
withopen( "1.html", "wb") as fileinfo:
    fileinfo.write(data)
伪装后的请求:
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0" "-"
180.156.222.228 - - "GET / HTTP/1.1" 200 4462 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0" "-"

                                       

页: [1]
查看完整版本: Python3中request模块访问网页以及客户端伪装