Python Basics: Common Modules (to be continued)

mancha · Posted on 2018-8-13 07:14:05

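The re module (regular expressions)
A regular expression combines characters that carry special meaning into a pattern that describes a character or a string. Put another way, a regular expression is a rule for describing a class of things. Regular expressions are built into Python and exposed through the re module. A pattern is compiled into a series of bytecodes, which are then executed by a matching engine written in C.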
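\w       matches letters, digits, and the underscore
\W       matches any character that is not a letter, digit, or underscore
\s       matches any whitespace character, equivalent to [ \t\n\r\f\v]
\S       matches any non-whitespace character
\d       matches any digit, equivalent to [0-9] for ASCII text
\D       matches any non-digit
\A       matches the start of the string
\Z       matches the end of the string (in Python's re, only at the very end; in Perl-style engines it also matches just before a trailing newline)
\z       matches the end of the string (Perl-style; in Python's re, use \Z)
\G       matches the position where the last match finished (Perl-style; not supported by Python's re)
\n       matches a newline
\t       matches a tab
^        matches the start of the string
$        matches the end of the string (also just before a trailing newline)
.        matches any character except a newline; with the re.DOTALL flag it matches any character, including a newline
[…]      a set of characters, listed individually: [amk] matches 'a', 'm', or 'k'
[^…]     any character not listed inside the brackets
*        matches 0 or more of the preceding expression (greedy)
+        matches 1 or more of the preceding expression (greedy)
?        matches 0 or 1 of the preceding expression; placed after another quantifier (*?, +?, ??, {n,m}?) it makes that quantifier non-greedy
{n}      matches exactly n of the preceding expression
{n,m}    matches n to m repetitions of the preceding expression (greedy)
a|b      matches a or b
()       matches the expression inside the parentheses and also defines a group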
import re

# Raw strings (r'...') keep the backslashes intact for the regex engine.
print(re.findall(r'\w', 'hello_ | egon 123'))
print(re.findall(r'\W', 'hello_ | egon 123'))
print(re.findall(r'\s', 'hello_ | egon 123 \n \t'))
print(re.findall(r'\S', 'hello_ | egon 123 \n \t'))
print(re.findall(r'\d', 'hello_ | egon 123 \n \t'))
print(re.findall(r'\D', 'hello_ | egon 123 \n \t'))
print(re.findall(r'h', 'hello_ | hello h egon 123 \n \t'))
print(re.findall(r'\Ahe', 'hello_ | hello h egon 123 \n \t'))
print(re.findall(r'^he', 'hello_ | hello h egon 123 \n \t'))
print(re.findall(r'123\Z', 'hello_ | hello h egon 123 \n \t123'))
print(re.findall(r'123$', 'hello_ | hello h egon 123 \n \t123'))
print(re.findall(r'\n', 'hello_ | hello h egon 123 \n \t123'))
print(re.findall(r'\t', 'hello_ | hello h egon 123 \n \t123'))

Output:
['h', 'e', 'l', 'l', 'o', '_', 'e', 'g', 'o', 'n', '1', '2', '3']
[' ', '|', ' ', ' ']
[' ', ' ', ' ', ' ', '\n', ' ', '\t']
['h', 'e', 'l', 'l', 'o', '_', '|', 'e', 'g', 'o', 'n', '1', '2', '3']
['1', '2', '3']
['h', 'e', 'l', 'l', 'o', '_', ' ', '|', ' ', 'e', 'g', 'o', 'n', ' ', ' ', '\n', ' ', '\t']
['h', 'h', 'h']
['he']
['he']
['123']
['123']
['\n']
['\t']
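The findall calls above only exercise single symbols. The following is a minimal sketch of how quantifiers and groups from the symbol table behave, and of the other re functions (search, match, split, sub, subn, compile). The sample string and the variable names are made up for illustration.

```python
import re

text = 'egon:18,alex:73'   # hypothetical sample data

# Greedy vs. non-greedy: '.*' takes as much as it can, '.*?' as little.
print(re.findall(r'<.*>', '<a><b>'))        # ['<a><b>']
print(re.findall(r'<.*?>', '<a><b>'))       # ['<a>', '<b>']

# With groups, findall returns only the grouped parts (as tuples).
pairs = re.findall(r'(\w+):(\d+)', text)
print(pairs)                                # [('egon', '18'), ('alex', '73')]

# search returns a match object for the first hit, or None.
m = re.search(r'\d+', text)
print(m.group())                            # 18

# match only tries at the start of the string.
print(re.match(r'\d+', text))               # None
print(re.match(r'\w+', text).group())       # egon

# split cuts the string at every match of the pattern.
print(re.split(r'[:,]', text))              # ['egon', '18', 'alex', '73']

# sub replaces matches; subn also reports how many were replaced.
print(re.sub(r'\d+', 'N', text, count=1))   # egon:N,alex:73
print(re.subn(r'\d+', 'N', text))           # ('egon:N,alex:N', 2)

# compile builds a reusable pattern object.
name_re = re.compile(r'[a-z]+')
print(name_re.findall(text))                # ['egon', 'alex']
```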
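Functions provided by the re module:
re.findall()    finds every match and returns the results in a list
re.search()     finds only the first match and returns an object holding the match information; call group() on it to get the matched string; returns None if nothing matches
re.match()      same as search, but matches only at the start of the string; search with a leading ^ can be used instead (as long as re.MULTILINE is not set)
re.split()      splits the string at each match of the pattern
re.sub()        replacement; the arguments are (pattern, replacement, string, count); if count is omitted, every match is replaced
re.subn()       same as sub, but the result also includes the number of replacements
re.compile()    compiles a pattern so it can be reused
3. time module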
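In Python, time is usually represented in one of three ways:
a. Timestamp:
A timestamp is the offset in seconds from 00:00:00 on January 1, 1970. Running "type(time.time())" shows that it is a float.
b. Formatted time string
c. Structured time
A struct_time tuple has 9 elements: (year, month, day, hour, minute, second, day of the week, day of the year, daylight saving flag)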
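The three representations can be converted into one another. Below is a small sketch using standard time functions; the format string is just one common choice, not something the original post prescribes.

```python
import time

ts = time.time()                                  # a. timestamp (float seconds since the epoch)
st = time.localtime(ts)                           # c. struct_time in local time
text = time.strftime('%Y-%m-%d %H:%M:%S', st)     # b. formatted time string

print(type(ts))
print(len(st))                                    # 9 elements
print(st.tm_year, st.tm_wday, st.tm_yday, st.tm_isdst)
print(text)

# And back again: string -> struct_time -> timestamp
st2 = time.strptime(text, '%Y-%m-%d %H:%M:%S')
ts2 = time.mktime(st2)
print(type(st2).__name__, ts2)
```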
4. random module
5. os module
6. sys module
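7. json and pickle modules (serialization modules)
Turning an object (variable) in memory into something that can be stored or transmitted is called serialization.
In Python it is called pickling; in other languages it is also known as serialization, marshalling, flattening, and so on.
What serialization is for:
a. Persisting state
Before a power-off or a program restart, all the data the program currently holds in memory is saved (to a file), so that the next run can load the earlier data from the file and carry on. That is serialization.
b. Cross-platform data exchange
Serialized content can be written to disk, and it can also be sent over the network to another machine. If sender and receiver agree on a serialization format, the limits imposed by platform and language differences disappear and data can be exchanged across platforms. In the other direction, reading a variable's content from its serialized form back into memory is called deserialization, that is, unpickling.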
　　
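json module
To pass objects between different programming languages, the objects must be serialized into a standard format such as XML. A better choice is JSON: its serialized form is simply a string, which every language can read and which is easy to store on disk or send over the network. JSON is a standard format, is generally faster to process than XML, and can be read directly in web pages, so it suits cross-platform data exchange. (Being cross-platform also means it cannot support every data type of a particular language; for example, Python functions cannot be serialized.)
In-memory structured data <---> JSON format <---> string <---> saved to a file or transmitted over the network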
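Usage:
dumps       serialize to a string
loads       deserialize from a string
import json

dic = {'name': 'egon', 'age': 18}

with open('a.json', 'w') as f:    # serialize the dict and write it to the file
    f.write(json.dumps(dic))

with open('a.json', 'r') as f:    # read the file and deserialize
    data = f.read()
    dic = json.loads(data)
dump        serialize straight into a file object
load        deserialize straight from a file object
import json

dic = {'name': 'egon', 'age': 18}

with open('b.json', 'w') as f:    # serialize the dict into the file
    json.dump(dic, f)

with open('b.json', 'r') as f:    # deserialize and print
    print(json.load(f)['name'])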
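The note about unsupported types can be checked without touching the disk. This is a small sketch; the dictionary contents are invented for illustration.

```python
import json

# Hypothetical data mixing several Python types.
dic = {'name': 'egon', 'age': 18, 'tags': ('a', 'b'), 'ok': True, 'none': None}

text = json.dumps(dic)
print(type(text), text)

restored = json.loads(text)
print(restored['tags'])        # ['a', 'b']  (a tuple comes back as a list)

# Python-specific objects such as functions are rejected.
try:
    json.dumps({'func': len})
except TypeError as exc:
    print('not serializable:', exc)
```

The round trip is lossy for types JSON does not have: the tuple above returns as a list, which is one reason JSON suits data exchange rather than exact snapshots of Python state.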
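pickle module
pickle works only with Python (and handles almost all of its data types), and different Python versions may not be compatible with one another. So use pickle only for unimportant data, where failing to deserialize does not matter.
In-memory structured data <---> pickle format <---> bytes <---> saved to a file or transmitted over the network
dumps       serialize to bytes
loads       deserialize from bytes
dump        serialize into a file object
load        deserialize from a file object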
import pickle

dic = {'name': 'egon', 'age': 18}

with open('d.pkl', 'wb') as f:    # serialize the dict and write it to the file
    f.write(pickle.dumps(dic))

with open('d.pkl', 'rb') as f:    # read the file and deserialize
    dic = pickle.loads(f.read())
    print(dic['name'])
import pickle

dic = {'name': 'egon', 'age': 18}

with open('e.pkl', 'wb') as f:    # serialize the dict into the file
    pickle.dump(dic, f)

with open('e.pkl', 'rb') as f:    # deserialize and print
    print(pickle.load(f)['name'])
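pickle stores functions and classes by reference to their module and name, not by value (and not by memory address), so when unpickling, that name must already be defined (importable) in the corresponding namespace.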
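A short sketch of that behavior: plain data round-trips through bytes, a function is stored only as a module-and-name reference, and an object with no importable name (a lambda here) cannot be pickled. The sample data is invented for illustration.

```python
import pickle

# Plain data, including types JSON lacks (tuple, set), round-trips via bytes.
data = {'name': 'egon', 'scores': (90, 85), 'tags': {'a', 'b'}}
blob = pickle.dumps(data)
print(type(blob))
print(pickle.loads(blob) == data)            # True

# A function is pickled as a reference: its module and name appear in the bytes.
ref = pickle.dumps(len)
print(b'builtins' in ref, b'len' in ref)     # True True
print(pickle.loads(ref) is len)              # True

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x)
except (pickle.PicklingError, AttributeError) as exc:
    print('cannot pickle:', exc)
```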
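8. shelve module
9. shutil module
High-level operations on files, directories, and archives
Common functions:
Copy the contents of one file object into another:
shutil.copyfileobj(fsrc, fdst[, length])    # both arguments are open file objects
Copy a file:
shutil.copyfile(src, dst) # the destination file does not need to exist
Copy only the permission bits; content, group, and owner are left unchanged
shutil.copymode(src, dst) # the destination file must exist
Copy only the status information, including: mode bits, atime, mtime, flags
shutil.copystat(src, dst)    # the destination file must exist
Copy the file and its permissions
shutil.copy(src, dst)
Copy the file and its status information
shutil.copy2(src, dst)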
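Recursively copy a directory tree
shutil.ignore_patterns(*patterns)    # builds the callable that is passed as copytree's ignore argument
shutil.copytree(src, dst, symlinks=False, ignore=None)  # the destination directory must not already exist, and its parent directory must be writable; ignore names the entries to exclude
Copying symbolic links
import shutil
shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))
By default (symlinks=False) a symbolic link is not preserved: the file it points to is copied as a new regular file. With symlinks=True the link itself is copied.
Recursively delete a directory tree
shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively move a file or directory; this is similar to the mv command, and on the same filesystem it amounts to a rename
shutil.move(src, dst)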
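The copy, move, and delete functions above can be tried safely inside a temporary directory. This is a sketch only; the file and directory names (f1, f2, f3, a.py and so on) are placeholders.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as work:
    # Build a small source tree: f1/a.py, f1/a.pyc, f1/sub/tmp_cache
    src = os.path.join(work, 'f1')
    os.makedirs(os.path.join(src, 'sub'))
    for name in ('a.py', 'a.pyc', os.path.join('sub', 'tmp_cache')):
        with open(os.path.join(src, name), 'w') as f:
            f.write('data')

    # copy2: file content plus status information
    shutil.copy2(os.path.join(src, 'a.py'), os.path.join(work, 'a_copy.py'))

    # copytree with ignore_patterns: skip *.pyc and tmp* entries
    dst = os.path.join(work, 'f2')
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))
    copied_names = sorted(os.listdir(dst))
    print(copied_names)                                  # ['a.py', 'sub']

    # move: the old path disappears, the new one exists
    moved = shutil.move(dst, os.path.join(work, 'f3'))
    print(os.path.exists(dst), os.path.isdir(moved))     # False True

    # rmtree: delete the whole tree
    shutil.rmtree(moved)
    removed = not os.path.exists(moved)
    print(removed)                                       # True
```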
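Create an archive (for example zip or tar) and return its file path
shutil.make_archive(base_name, format, ...)
base_name: the archive's file name without the format extension; it may also include a path. With a bare file name the archive is saved in the current directory, otherwise in the given path
e.g.  data_bak          => saved in the current directory
e.g.  /tmp/data_bak     => saved in /tmp/
format: the archive type, "zip", "tar", "bztar", or "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the owner, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object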
Exercises:
# Archive the files under /data and put the archive in the program's current directory
import shutil

ret = shutil.make_archive("data_bak", 'gztar', root_dir='/data')

# Archive the files under /data and put the archive in /tmp/
import shutil

ret = shutil.make_archive("/tmp/data_bak", 'gztar', root_dir='/data')
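Since /data may not exist on the reader's machine, here is a self-contained variant of the same exercise that runs in a temporary directory. The directory and file names are placeholders.

```python
import os
import shutil
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as work:
    # A stand-in for /data with a single file in it.
    data_dir = os.path.join(work, 'data')
    os.makedirs(data_dir)
    with open(os.path.join(data_dir, 'a.log'), 'w') as f:
        f.write('hello')

    # base_name carries the target path; the extension is added by make_archive.
    archive_path = shutil.make_archive(os.path.join(work, 'data_bak'),
                                       'gztar', root_dir=data_dir)
    archive_name = os.path.basename(archive_path)
    print(archive_name)            # data_bak.tar.gz

    # Inspect the result with tarfile.
    with tarfile.open(archive_path) as t:
        members = t.getnames()
    print(members)
```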
shutil handles archives by calling the zipfile and tarfile modules. In detail:
import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall(path='.')
z.close()

import tarfile

# Compress
t = tarfile.open('/tmp/egon.tar', 'w')
t.add('/test1/a.py', arcname='a.bak')
t.add('/test1/b.py', arcname='b.bak')
t.close()

# Extract
t = tarfile.open('/tmp/egon.tar', 'r')
t.extractall('/egon')
t.close()
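10. xml module
XML is a format for exchanging data between different languages or programs. It serves much the same purpose as JSON, though JSON is simpler to use. Because XML appeared earlier than JSON, the interfaces of many systems in traditional sectors such as finance are still mainly XML.
XML uses <> nodes (tags) to express the structure of the data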
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
Working with XML:
import xml.etree.ElementTree as ET    # import the module

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the XML document
for child in root:
    print('========>', child.tag, child.attrib, child.attrib['name'])
    for i in child:
        print(i.tag, i.attrib, i.text)

# Visit only the year nodes
for node in root.iter('year'):
    print(node.tag, node.text)
#---------------------------------------

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# Modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set('updated', 'yes')
    node.set('version', '1.0')
tree.write('test.xml')

# Delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')
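The same modify and delete steps can be tried without an xmltest.xml file by parsing from a string. This is a sketch on a trimmed-down copy of the sample document; XML_TEXT and the other variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# A shortened version of the sample document, kept inline.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""

root = ET.fromstring(XML_TEXT)

# Modify: add one to every year and mark it as updated.
for node in root.iter('year'):
    node.text = str(int(node.text) + 1)
    node.set('updated', 'yes')

# Delete: drop countries ranked worse than 50.
# findall returns a list, so removing from root while looping is safe.
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

remaining = [c.attrib['name'] for c in root]
years = [y.text for y in root.iter('year')]
print(remaining)    # ['Liechtenstein']
print(years)        # ['2009']
print(ET.tostring(root, encoding='unicode'))
```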
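11. configparser module
Mainly used to parse configuration files
The configuration file has the following format:
[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31
[section2]
k1 = v1
It is used as follows:
import configparser                   # import the module
config=configparser.ConfigParser()    # create a ConfigParser object and bind it to config
config.read('a.cfg')                  # load the configuration file (a.cfg is the file name used by the write call at the end)
List the section titles:
config.sections()
List the keys of every key=value pair under section1
config.options('section1')
List every key=value pair under section1 as (key, value) tuples
config.items('section1')
Get the value of user under section1 as a string
config.get('section1','user')
Get the value of age under section1 as an integer
val1=config.getint('section1','age')
Get the value of is_admin under section1 as a boolean
config.getboolean('section1','is_admin')
Get the value of salary under section1 as a float
config.getfloat('section1','salary')
Remove the whole of section2
config.remove_section('section2')
Remove k1 and k2 from section1
config.remove_option('section1','k1')
config.remove_option('section1','k2')
Check whether a section exists
config.has_section('section1')
Check whether section1 has a user option
config.has_option('section1','user')
Add a section
config.add_section('egon')
Add name=egon and age=18 under the egon section
config.set('egon','name','egon')
config.set('egon','age',18) # raises an error: the value must be a string
Finally, write the changes to a file to make them permanent
config.write(open('a.cfg','w'))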
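To try these calls without creating a file, the sample configuration can be loaded from a string and written to an in-memory buffer. A minimal sketch follows; CFG_TEXT and buffer are illustrative names.

```python
import configparser
import io

# The sample configuration from above, kept inline.
CFG_TEXT = """
[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(CFG_TEXT)

print(config.sections())                 # ['section1', 'section2']
print(config.options('section1'))

age = config.getint('section1', 'age')
is_admin = config.getboolean('section1', 'is_admin')
salary = config.getfloat('section1', 'salary')
print(age, is_admin, salary)             # 18 True 31.0

config.remove_section('section2')
config.add_section('egon')
config.set('egon', 'name', 'egon')
config.set('egon', 'age', str(18))       # values must be strings

# Write to an in-memory buffer instead of a file.
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())
```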
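12. hashlib module
hash: a kind of algorithm. In Python 3.x, hashlib replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms
Three properties:
1. The same content always produces the same hash; a small change in content changes the hash
2. The hash cannot be reversed to recover the content
3. For a given algorithm, the hash has a fixed length no matter how long the input is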
import hashlib

m = hashlib.md5()    # or: m = hashlib.sha256()
m.update('hello'.encode('utf8'))
print(m.hexdigest())  # 5d41402abc4b2a76b9719d911017c592

m.update('alvin'.encode('utf8'))
print(m.hexdigest())  # 92a7e713c30abbb0319fa07da2a5c4af

m2 = hashlib.md5()
m2.update('helloalvin'.encode('utf8'))
print(m2.hexdigest()) # 92a7e713c30abbb0319fa07da2a5c4af

'''
Note: calling update several times on pieces of a long message gives the same
result as calling update once on the whole message.
Calling update in pieces is what makes it possible to checksum large files.
'''
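As a sketch of the large-file case mentioned in the note, the helper below feeds a file to md5 one chunk at a time. The function name file_md5 and the chunk sizes are assumptions made for this example.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=8192):
    """Hash a file chunk by chunk so it never has to fit in memory."""
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps reading until f.read returns b''.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            m.update(chunk)
    return m.hexdigest()

# Write some sample bytes to a temporary file and hash it both ways.
payload = b'hello' * 10000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name
try:
    chunked = file_md5(tmp_path, chunk_size=1024)
finally:
    os.remove(tmp_path)

one_shot = hashlib.md5(payload).hexdigest()
print(chunked == one_shot)    # True
```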
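Strong as these hash algorithms are, they have a weakness: a hash can be reversed by looking it up in a table of precomputed hashes of known passwords. It is therefore necessary to mix a custom key into the data before hashing.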
import hashlib

# ######## 256 ########
hash = hashlib.sha256('898oaFs09f'.encode('utf8'))    # start from a custom key
hash.update('alvin'.encode('utf8'))
print(hash.hexdigest())  # e79e68f070cdedcfe63eaf1a2e92c83b4cfb1b5c6bc452d214c1b7e77cdfd1c7


import hashlib

# Candidate passwords to try
passwds = [
    'alex3714',
    'alex1313',
    'alex94139413',
    'alex123456',
    '123456alex',
    'a123lex',
]

def make_passwd_dic(passwds):
    # Map each candidate password to its md5 hash
    dic = {}
    for passwd in passwds:
        m = hashlib.md5()
        m.update(passwd.encode('utf-8'))
        dic[passwd] = m.hexdigest()
    return dic

def break_code(cryptograph, passwd_dic):
    # Look for a candidate whose hash equals the captured hash
    for k, v in passwd_dic.items():
        if v == cryptograph:
            print('The password is ===>\033[46m%s\033[0m' % k)

cryptograph = 'aee949757a2e698417463d47acac93df'
break_code(cryptograph, make_passwd_dic(passwds))
　　
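Python also has an hmac module, which internally processes the key we supply together with the content before hashing:
import hmac

# digestmod is required since Python 3.8; 'md5' was the old default
h = hmac.new('alvin'.encode('utf8'), digestmod='md5')
h.update('hello'.encode('utf8'))
print(h.hexdigest())  # 320df9832eab4c038b6c1d7ed73a5940

# For the final hmac result to be the same, two things must hold:
# 1: the initial key passed to hmac.new is the same
# 2: however many times update is called, the accumulated content is the same

import hmac

h1 = hmac.new(b'egon', digestmod='md5')
h1.update(b'hello')
h1.update(b'world')
print(h1.hexdigest())

h2 = hmac.new(b'egon', digestmod='md5')
h2.update(b'helloworld')
print(h2.hexdigest())

h3 = hmac.new(b'egonhelloworld', digestmod='md5')    # different key, no content
print(h3.hexdigest())

'''
f1bf38d054691688f89dcd34ac3c27f2
f1bf38d054691688f89dcd34ac3c27f2
bcca84edd9eeb86f30539922b28f3981
'''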
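13. subprocess module
Starts a child process from the Python interpreter to run a shell command
stdout          standard output    # the output is bytes; decode it with the console's encoding, e.g. decode('gbk') on a Chinese-locale Windows system or decode('utf-8') on Linux
stderr          standard error
stdin           standard input
shell=True      run the command through the shell
subprocess.PIPE put the output into a pipe
import subprocess
res1=subprocess.Popen('ls /Users/jieli/Desktop',shell=True,stdout=subprocess.PIPE)
# First list the files on the desktop
res2=subprocess.Popen('grep txt$',shell=True,stdin=res1.stdout,stdout=subprocess.PIPE)
# Feed the output above to this command as its input and keep only the names ending in txt
print(res2.stdout.read().decode('utf-8'))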
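The ls and grep commands above depend on the operating system and on that particular desktop path. Here is a portable sketch of the same two-stage pipeline, with both stages replaced by small Python child processes started through sys.executable. The scripts and file names are made up for illustration.

```python
import subprocess
import sys

# Stage 1 stands in for "ls": it prints three file names.
LIST_SCRIPT = "print('a.txt'); print('b.py'); print('c.txt')"

# Stage 2 stands in for "grep txt$": it keeps lines ending in 'txt'.
FILTER_SCRIPT = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if line.rstrip().endswith('txt'):\n"
    "        sys.stdout.write(line)\n"
)

res1 = subprocess.Popen([sys.executable, '-c', LIST_SCRIPT],
                        stdout=subprocess.PIPE)
res2 = subprocess.Popen([sys.executable, '-c', FILTER_SCRIPT],
                        stdin=res1.stdout, stdout=subprocess.PIPE)
res1.stdout.close()            # the parent no longer needs its copy of the pipe
raw, _ = res2.communicate()    # read everything stage 2 wrote
res1.wait()

print(type(raw))               # the pipe delivers bytes
names = raw.decode('utf-8').split()
print(names)                   # ['a.txt', 'c.txt']
```

Passing the command as a list without shell=True avoids shell quoting issues; shell=True is only needed when the command string relies on shell features.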
