lujiguo115 posted on 2017-5-8 08:35:11

Converting between binary files and hexadecimal text files in Python

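  Python has a binhex module (http://docs.python.org/library/binhex.html) that is used to "Encode and decode binhex4 files". I could not make sense of the binhex4 format; after a long search, the best explanation I found was http://www.5dmail.net/html/2006-3-2/200632222823.htm.
  People with a strong need for control are afraid of the unknown and the uncontrollable. Writing my own converter between binary files and hexadecimal text files may look like "reinventing the wheel", but it turned out to be less complicated than expected, it gives more control over the result, and it brought a few unexpected gains.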

  ·filehelper
  In the earlier post "Checking whether an object is a file object in Python" I described ways to tell whether an object is a file object. They come in handy here.
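  That earlier post is not reproduced here, so the definitions of isfilelike_r and isfilelike_w used in the code that follows are not shown. As a rough sketch only, assuming they do plain duck typing (an object counts as readable if it has a callable read(), and as writable if it has a callable write()), they could look like this; the original post may well do it differently:

```python
import io


def isfilelike_r(obj):
    """Assumed behaviour: a readable file-like object has a callable read()."""
    return callable(getattr(obj, "read", None))


def isfilelike_w(obj):
    """Assumed behaviour: a writable file-like object has a callable write()."""
    return callable(getattr(obj, "write", None))


# A file name is a plain string, so it is not file-like
print(isfilelike_r("data.bin"))            # False
# In-memory streams pass the check just like objects returned by open()
print(isfilelike_r(io.BytesIO(b"\x00")))   # True
print(isfilelike_w(io.StringIO()))         # True
```

  With checks like these, a caller can pass either a file name or an already opened file, which is what the helper function relies on.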
  We also need a helper function that handles both the input and the output file:

def fileinoutpattern(inp, out, callback=None, inmode="r", outmode="wb"):
    """
    Make sure that 'inp' and 'out' have been 'converted' to file objects,
    call 'callback' with them, and finally clean up.
    """
    # Set up: anything that is not file-like is treated as a file name
    fin = inp
    if not isfilelike_r(fin):
        fin = open(inp, inmode)
    fout = out
    if not isfilelike_w(fout):
        fout = open(out, outmode)
    # Call the 'callback'
    result = None
    if callback is not None:
        result = callback(fin, fout)
    # Clean up: close only the files that were opened here
    if not isfilelike_r(inp):
        fin.close()
    if not isfilelike_w(out):
        fout.close()
    return result


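  1. Check whether inp is a readable file object; if it is not, call open to open the file named by inp in mode inmode and assign the result to fin;
  2. Check whether out is a writable file object; if it is not, call open to open the file named by out in mode outmode and assign the result to fout;
  3. If callback is not None, call it and store its return value in result;
  4. If we opened fin ourselves, close it; if a file object was passed in, leave it open, which avoids closing it twice or closing it when the caller did not want that;
  5. If we opened fout ourselves, close it; if a file object was passed in, leave it open, for the same reasons;
  6. Return the result of callback.
  This helper is important, yet very simple. It could also be written as a class with setUp and tearDown methods that run before and after the callback, in the style of JUnit's TestCase, but this function is enough here.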
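  For comparison, the set-up / tear-down idea can also be sketched with a context manager from the standard library's contextlib instead of a class. This is only an illustration: the name opened_pair is hypothetical, and the hasattr checks are stand-ins for isfilelike_r and isfilelike_w, whose real definitions live in the earlier post:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def opened_pair(inp, out, inmode="r", outmode="wb"):
    """Yield (fin, fout); close only the files that were opened here."""
    # Stand-ins for isfilelike_r / isfilelike_w (assumed duck typing)
    own_in = not hasattr(inp, "read")
    own_out = not hasattr(out, "write")
    fin = open(inp, inmode) if own_in else inp
    try:
        fout = open(out, outmode) if own_out else out
        try:
            yield fin, fout      # the body of the with block plays the callback
        finally:
            if own_out:
                fout.close()     # tear-down for the output side
    finally:
        if own_in:
            fin.close()          # tear-down for the input side


# Usage: copy a small file through the helper, using temporary paths
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "src.bin")
dst = os.path.join(tmpdir, "dst.bin")
with open(src, "wb") as f:
    f.write(b"\x01\x02\x03")

with opened_pair(src, dst, inmode="rb") as (fin, fout):
    fout.write(fin.read())

with open(dst, "rb") as f:
    copied = f.read()
print(copied == b"\x01\x02\x03")
```

  Unlike the plain function, this variant also closes the files when the body raises an exception, at the cost of a little more nesting.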
  ·binhex
  Back to the main topic: with this helper in place, how do we convert a binary file to hexadecimal text?

def binhex(inp, out, extfun=lambda x: x, blocksize=256):
    """
    Convert a binary file 'inp' to a hexadecimal text file 'out'.
    'inp' may be a filename or a file-like object supporting read() and close().
    'out' may be a filename or a file-like object supporting write() and close().
    Returns the number of bytes processed.
    """
    def _binhex(fin, fout):
        filesize = 0
        while True:
            chunk = fin.read(blocksize)
            if not chunk:
                break
            # bytearray() yields integers on both Python 2 and Python 3
            for b in bytearray(chunk):
                fout.write('%02X ' % extfun(b))
            # One text line per block
            fout.write('\n')
            filesize += len(chunk)
        return filesize
    return fileinoutpattern(inp, out, _binhex, inmode="rb", outmode="w")


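  1. It takes inp and out, which name the input and output files, a function extfun that is applied to every byte, and a block size that defaults to 256 bytes;
  2. It defines a nested function _binhex that implements the processing logic: read blocksize bytes at a time, call extfun on each byte, convert the result to two-digit hexadecimal form and write it to fout; add a newline after each block; return the number of bytes processed;
  3. It calls fileinoutpattern with the corresponding arguments.
  Here we can see that, thanks to the helper, our function becomes very concise and its logic is easy to follow.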
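  To make the output format concrete, here is a small in-memory sketch of the same formatting rule, written without files. The name hex_lines is made up for this illustration and is not part of the original code:

```python
def hex_lines(data, blocksize=256, extfun=lambda x: x):
    """Return the text lines that the conversion would write for 'data'."""
    lines = []
    for start in range(0, len(data), blocksize):
        # bytearray() gives integers on both Python 2 and Python 3
        chunk = bytearray(data[start:start + blocksize])
        lines.append("".join("%02X " % extfun(b) for b in chunk))
    return lines


# The first bytes of a zip file, split into blocks of 4 bytes
sample = b"PK\x03\x04\x14"
print(hex_lines(sample, blocksize=4))  # ['50 4B 03 04 ', '14 ']
```

  Every byte becomes two hex digits followed by a space, so each byte takes exactly three characters in the text file. The decoder depends on that fixed width.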
  ·hexbin
  As with binhex, converting a hexadecimal text file back to a binary file is simple:

import struct

def hexbin(inp, out, extfun=lambda x: x):
    """
    Decode a hexadecimal text file 'inp' to a binary file 'out'.
    'inp' may be a filename or a file-like object supporting read() and close().
    'out' may be a filename or a file-like object supporting write() and close().
    """
    def _hexbin(fin, fout):
        for line in fin:
            # Each byte takes three characters: two hex digits and a space
            for i in range(len(line) // 3):
                x = int(line[3 * i:3 * i + 2], 16)
                fout.write(struct.pack('B', extfun(x)))
    fileinoutpattern(inp, out, _hexbin, inmode="r", outmode="wb")

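  1. It takes inp and out as the input and output files, and a function extfun that is applied to every byte after conversion;
  2. It defines a nested function _hexbin that implements the processing logic: read the input line by line, take the characters in groups of three, drop the space, convert the two hex digits to an integer, call extfun on that integer, and write the resulting byte to fout;
  3. It calls fileinoutpattern with the corresponding arguments.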
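  The decoding of a single line can be tried on its own. The sketch below follows the steps listed above for one string; parse_hex_line is a hypothetical name used only for this example:

```python
import struct


def parse_hex_line(line, extfun=lambda x: x):
    """Turn one line such as '50 4B 03 04 ' back into bytes."""
    out = bytearray()
    # Two hex digits plus one space per byte; a trailing newline is ignored
    # because integer division drops the leftover character
    for i in range(len(line) // 3):
        value = int(line[3 * i:3 * i + 2], 16)
        out += struct.pack("B", extfun(value))
    return bytes(out)


print(parse_hex_line("50 4B 03 04 \n") == b"PK\x03\x04")  # True
```

  Note that struct.pack("B", ...) only accepts values from 0 to 255, so any extfun passed in has to map a byte to a byte.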
  ·Test code

def test():
    """
    Test case.
    """
    # binhex test: every combination of file name and file object
    zipfilename = r"D:\2.zip"
    binhex(zipfilename, r"D:\2.1.txt")
    with open(zipfilename, "rb") as fin:
        binhex(fin, r"D:\2.2.txt")
    with open(r"D:\2.3.txt", "w") as fout:
        binhex(zipfilename, fout)
    with open(zipfilename, "rb") as fin, open(r"D:\2.4.txt", "w") as fout:
        binhex(fin, fout)
    # hexbin test: the same four combinations in the other direction
    txtfile = r"D:\2.1.txt"
    hexbin(txtfile, r"D:\2.1.zip")
    with open(txtfile, "r") as fin:
        hexbin(fin, r"D:\2.2.zip")
    with open(r"D:\2.3.zip", "wb") as fout:
        hexbin(txtfile, fout)
    with open(txtfile, "r") as fin, open(r"D:\2.4.zip", "wb") as fout:
        hexbin(fin, fout)
    # extfun test: XOR every byte with 0x13 on the way out and on the way back
    def XOR(x):
        def _XOR(y):
            return x ^ y
        return _XOR
    xor0x13 = XOR(0x13)
    binhex(zipfilename, r"D:\2.encode.txt", extfun=xor0x13)
    hexbin(r"D:\2.encode.txt", r"D:\2.5.zip", extfun=xor0x13)
    print("OK.")

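  1. This test code is enough to check whether the functions work as intended;
  2. We define a factory method XOR(x) that produces XOR function objects; in the code we create xor0x13 = XOR(0x13), a function object that XORs its argument with 0x13;
  3. XOR-ing twice with the same value restores the original data.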
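  The round-trip property in point 3 is easy to check for every possible byte value. The following short sketch repeats the XOR factory from the test code and verifies it in memory, without touching any files:

```python
def XOR(x):
    """Factory: return a function that XORs its argument with x."""
    def _XOR(y):
        return x ^ y
    return _XOR


xor0x13 = XOR(0x13)
# Applying the same XOR twice gives back every possible byte value
roundtrip_ok = all(xor0x13(xor0x13(b)) == b for b in range(256))
print(roundtrip_ok)            # True
print("%02X" % xor0x13(0x50))  # 43
```

  Because x ^ 0x13 stays within 0 to 255 for any byte x, the same function object can be passed as extfun to both binhex and hexbin.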
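  ·Summary
  AOP (aspect-oriented programming): this example may not show it very clearly. We separate "binary/hexadecimal conversion" from "file (name) handling and file closing" and treat them as two different aspects of the problem. The benefit is that either aspect can be changed with little or no effect on the other; in other words, we gain flexibility, adaptability, and code reuse.
  Nested functions: a distinctive Python feature, and we have seen them put to use here.
  Factory Method: we used the XOR factory method to produce XOR function objects.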

Source code download: binhex.zip