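Python small examples, part 3 -- parsing XML text
Topic: parsing XML text
Environment: WinXP Pro + SP2 + Python 2.5
Note: any source file that contains Chinese characters should be saved as UTF-8.
The test case sample.xml should also be saved as UTF-8.
Code:
Python code: parsexml.py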
# parsexml.py
# Adapted from the example in the Python online documentation,
# with minor changes and additions

import xml.parsers.expat

# Current nesting depth; controls the print indentation
level = 0

# Called at each start tag with the element name and its attribute dict
def start_element(name, attrs):
    global level
    print ' '*level, 'Start element:', name, attrs
    level = level + 1

# Called at each end tag with the element name
def end_element(name):
    global level
    level = level - 1
    print ' '*level, 'End element:', name

# Called with the text between tags
def char_data(data):
    # Skip whitespace-only chunks (this also covers a lone '\n')
    if data.isspace():
        return
    print ' '*level, 'Character data:', data

p = xml.parsers.expat.ParserCreate()

p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data
# Python 2 only: pass UTF-8 byte strings to the handlers instead of unicode
p.returns_unicode = False

f = open('sample.xml', 'rb')
p.ParseFile(f)
f.close()
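The script above targets Python 2.5. For readers on Python 3, the following is a minimal sketch of the same three handlers, not the original author's code. It assumes Python 3 semantics: `print` is a function and handlers receive `str` objects, so the `returns_unicode` setting is dropped. The names `parse_events`, `lines` and `SAMPLE` are illustrative only, and the sketch calls `Parse` on an in-memory byte string rather than reading sample.xml so that it is self-contained. Lines are collected in a list instead of being printed directly, which makes the result easy to inspect.

```python
import xml.parsers.expat

# Illustrative stand-in for sample.xml (shortened, not the original file).
SAMPLE = b"""<?xml version="1.0"?>
<contacts id="bluecrystal">
  <item name="keen">
    <telephone type="phone">222222222</telephone>
  </item>
</contacts>
"""


def parse_events(xml_bytes):
    """Parse XML bytes and return one indented line per parser event."""
    lines = []
    level = 0  # current nesting depth, used for indentation

    def start_element(name, attrs):
        nonlocal level
        lines.append("%sStart element: %s %r" % (" " * level, name, attrs))
        level += 1

    def end_element(name):
        nonlocal level
        level -= 1
        lines.append("%sEnd element: %s" % (" " * level, name))

    def char_data(data):
        # Skip whitespace-only chunks such as newlines between tags.
        if data.isspace():
            return
        lines.append("%sCharacter data: %s" % (" " * level, data))

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    p.Parse(xml_bytes, True)  # True marks this as the final chunk
    return lines


if __name__ == "__main__":
    for line in parse_events(SAMPLE):
        print(line)
```

Keeping `level` inside the function with `nonlocal` avoids the module-level global of the original, so the function can be called more than once without resetting state.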
Test case:
XML code: sample.xml
<?xml version="1.0"?>
<contacts id="bluecrystal">
  <item name="keen" fff="ddd">
    <telephone type="phone">222222222</telephone>
    <telephone type="mobile">134567890</telephone>
  </item>
  <item name="bcm">
    <telephone type="phone">11111111</telephone>
    <telephone type="mobile">15909878909</telephone>
  </item>
</contacts>
Test results:
 Start element: contacts {'id': 'bluecrystal'}
  Start element: item {'fff': 'ddd', 'name': 'keen'}
   Start element: telephone {'type': 'phone'}
    Character data: 222222222
   End element: telephone
   Start element: telephone {'type': 'mobile'}
    Character data: 134567890
   End element: telephone
  End element: item
  Start element: item {'name': 'bcm'}
   Start element: telephone {'type': 'phone'}
    Character data: 11111111
   End element: telephone
   Start element: telephone {'type': 'mobile'}
    Character data: 15909878909
   End element: telephone
  End element: item
 End element: contacts
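The trace above is handy for inspection, but a program usually wants the data itself. As a rough sketch (Python 3, not part of the original post), the same three handlers can fold the events into a dictionary that maps each item's name to its telephone numbers by type. The function name `load_contacts`, the `state` dictionary and the result layout are assumptions chosen for illustration. `buffer_text` is a standard expat parser attribute that asks the parser to merge adjacent text chunks before calling the character data handler.

```python
import xml.parsers.expat

# Same content as sample.xml, inlined so the sketch is self-contained.
SAMPLE = b"""<?xml version="1.0"?>
<contacts id="bluecrystal">
  <item name="keen" fff="ddd">
    <telephone type="phone">222222222</telephone>
    <telephone type="mobile">134567890</telephone>
  </item>
  <item name="bcm">
    <telephone type="phone">11111111</telephone>
    <telephone type="mobile">15909878909</telephone>
  </item>
</contacts>
"""


def load_contacts(xml_bytes):
    """Return {item name: {telephone type: number}} (hypothetical layout)."""
    contacts = {}
    state = {"item": None, "type": None}  # where the parser currently is

    def start_element(name, attrs):
        if name == "item":
            state["item"] = attrs.get("name")
            contacts[state["item"]] = {}
        elif name == "telephone":
            state["type"] = attrs.get("type")

    def end_element(name):
        if name == "telephone":
            state["type"] = None
        elif name == "item":
            state["item"] = None

    def char_data(data):
        # Only keep text that sits inside an <item>'s <telephone> element.
        if state["item"] is None or state["type"] is None:
            return
        numbers = contacts[state["item"]]
        numbers[state["type"]] = numbers.get(state["type"], "") + data.strip()

    p = xml.parsers.expat.ParserCreate()
    p.buffer_text = True  # merge adjacent text chunks into one callback
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    p.Parse(xml_bytes, True)  # True marks this as the final chunk
    return contacts


if __name__ == "__main__":
    print(load_contacts(SAMPLE))
```

Tracking the current item and telephone type in `state` is what lets the character data handler know which number it is looking at, since expat itself reports only a flat stream of events.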