Python for Data Analysis (Wes McKinney), Appendix A: Python Language Essentials

淡淡回忆 posted on 2015-12-2 16:04:05

　　Reference site: http://docs.python.org
　　Latest IPython version: IPython 2.1.0
　　http://ipython.org/
　　Installing IPython (excerpted from samoobook)

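In short, IPython is an enhanced Python shell with syntax highlighting; as of 1.2.0 it no longer needs the pyreadline module.
Required software: python3.3.4, pypa-setuptools, ipython1.2.0
1. Install python3.3.4 by clicking Next through the installer. The default path is C:\Python33
2. Install setuptools first, because the IPython installer calls that module:
In cmd, cd into the setuptools directory and run python setup.py install
When that finishes, cd into the ipython directory and run python setup.py install
3. Set the environment variable for IPython: add C:\Python33\Scripts to PATH, confirm each dialog, and exit.
4. Note that if you reopen cmd and type ipython, you get an error. Why?
Open C:\Python33\Scripts and you will see that the executable is named ipython3.exe, so run: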
C:\>ipython3
Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:12:08) [MSC v.1600 32 bit (Intel)]
Type "copyright", "credits" or "license" for more information.

IPython 1.2.0 -- An enhanced Interactive Python.
?       -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help    -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.

　　1.1 Organizing code with indentation
　　Python uses indentation to structure code.
　　for x in array:
　　    if x < pivot:
　　        less.append(x)
　　    else:
　　        greater.append(x)
　　By convention, use 4 spaces per indentation level.
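　　The loop above is only a fragment. A minimal runnable sketch follows, assuming a hypothetical helper named partition (not part of the original notes) that wraps the same loop:

```python
def partition(array, pivot):
    """Split array into values below the pivot and all other values."""
    less, greater = [], []
    for x in array:            # first indentation level: loop body
        if x < pivot:          # second level: branch bodies
            less.append(x)
        else:
            greater.append(x)
    return less, greater


less, greater = partition([3, 8, 1, 9, 4], pivot=4)
print(less, greater)
```

　　Each nested block is indented by exactly 4 more spaces than the line that opens it.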
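　　1.2 Variables and pass-by-reference
　　When you assign to a variable in Python, you are creating a reference to the object on the right-hand side of the equals sign.
　　a = [1, 2, 3]
　　b = a
　　a.append(4)
　　b is now [1, 2, 3, 4], because a and b refer to the same list.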
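　　The same rule applies to function arguments. A small sketch, assuming a hypothetical function append_element that is not in the original notes:

```python
def append_element(some_list, element):
    # some_list refers to the caller's list; no copy is made
    some_list.append(element)


data = [1, 2, 3]
append_element(data, 4)
print(data)
```

　　The caller sees the change because the function received a reference to the same list object.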

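　　1.3 Dynamic references, strong types
　　A name can be rebound to an object of any type, but every object keeps its own specific type.
　　a = 5
　　isinstance(a, int)
　　True
　　
　　a = [1, 2, 3]
　　isinstance(a, list)
　　True
　　a = (1, 2, 3)
　　isinstance(a, tuple)
　　True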
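　　The "strong" part can be illustrated with a short sketch (the variable names here are only for illustration): Python refuses to silently convert between unrelated types, while closely related numeric types are promoted.

```python
try:
    "5" + 5                      # str + int is not implicitly converted
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

total = 4.5 + 2                  # int is promoted to float
print(total)
```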
　　1.4 Attributes and methods
　　a = "foo"
　　a.<tab>
　　In IPython, pressing Tab after the dot lists all of a's attributes and methods.
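　　Outside IPython, a rough equivalent of tab completion is the built-in dir, and attributes can be fetched by name with getattr. A minimal sketch:

```python
a = "foo"

# Public attribute and method names, similar to what tab completion shows
names = [name for name in dir(a) if not name.startswith("_")]

# Look up a method by its name and call it
split_method = getattr(a, "split")
print(split_method())
```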
　　
　　1.5 Checking whether an object is iterable
　　def isiterable(obj):
　　    try:
　　        iter(obj)
　　        return True
　　    except TypeError:  # not iterable
　　        return False
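　　One possible use of such a check is to accept any kind of sequence and normalize it to a list. In this sketch, as_list is a hypothetical helper name:

```python
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False


def as_list(x):
    # Convert any non-list iterable to a list; leave everything else alone
    if not isinstance(x, list) and isiterable(x):
        return list(x)
    return x


print(isiterable("a string"), isiterable(5))
print(as_list((1, 2, 3)))
```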
　　1.6 Checking whether two references point to the same object
　　a = [1, 2, 3]
　　b = a
　　c = list(a)  # list() always creates a new list
　　a is b
　　True
　　a is not c
　　True
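　　Note that is compares identity, while == compares values. A short sketch of the difference, with illustrative variable names:

```python
a = [1, 2, 3]
c = list(a)              # a new list with equal contents

same_value = (a == c)    # equal contents
same_object = (a is c)   # different objects

value = None
is_missing = value is None   # the usual way to test for None
print(same_value, same_object, is_missing)
```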
　　
　　

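　　Built-in data types: strings and tuples behave differently from lists. This needs further analysis.
　　1.7 Strict versus lazy evaluation (to be added)
　　1.8 Mutable and immutable objects
　　Most Python objects are mutable, such as lists, dicts, NumPy arrays, and user-defined types; the objects and values they contain can be changed. Others, such as strings and tuples, are immutable: the data in the original memory block cannot be modified, so an apparent modification only creates a new object and assigns its reference to the original variable.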
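　　A minimal sketch of the difference, using made-up values:

```python
a_list = ["foo", 2, [4, 5]]
a_list[2] = (3, 4)               # lists can be changed in place

a_tuple = (3, 5, (4, 5))
try:
    a_tuple[1] = "four"          # tuples cannot
except TypeError as exc:
    tuple_error = str(exc)

s = "python"
t = s                            # t refers to the same string object
s += "!"                         # builds a new string and rebinds s only
print(a_list, a_tuple, s, t)
```

　　After the last statement, t still refers to the original string, which shows that the string itself was never modified.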
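　　1.9 Scalar types
　　None
　　str
　　unicode (Python 2 only; in Python 3, str is already Unicode)
　　float
　　bool
　　int
　　long (Python 2 only; Python 3 merges it into int)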
　　1.10 Tuple unpacking
　　Basic tuple unpacking
　　tup = (4, 5, 6)
　　a, b, c = tup
　　b
　　5
　　Nested tuple unpacking
　　tup = 4, 5, (6, 7)
　　a, b, (c, d) = tup
　　d
　　7
　　Swapping variables
　　a = 1
　　b = 2
　　b, a = a, b
　　Tuple methods
　　a = (1, 2, 3, 3, 3, 3)
　　a.count(3)
　　4
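　　Unpacking is also commonly used when looping over a sequence of tuples. A small sketch with made-up data:

```python
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

totals = []
for a, b, c in seq:          # each tuple is unpacked into a, b, c
    totals.append(a + b + c)

print(totals)
```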
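　　1.11 Lists
　　a_list = [2, 3, 7, None]
　　tup = ("Tom", "Marry", "Jack")
　　b_list = list(tup)
　　b_list[1] = "ren"  # replace the element at index 1
　　b_list.append("Hawk")
　　b_list.insert(1, "red")  # insert "red" at index 1; later elements shift right
　　b_list.pop(2)  # remove and return the element at index 2
　　b_list.remove("Jack")  # remove the first element with this value
　　Use the in keyword to check whether a list contains a value.
　　"Hawk" in b_list
　　Note: checking whether a list contains a value is much slower than doing so with a dict or set, because Python scans the list linearly, whereas the other two (based on hash tables) can answer in roughly constant time.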
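　　If many membership tests are needed, one option is to build a set from the list once and test against that. A minimal sketch with illustrative names:

```python
names = ["Tom", "Marry", "Jack", "Hawk"]
name_set = set(names)            # one-time conversion

in_list = "Hawk" in names        # linear scan over the list
in_set = "Hawk" in name_set      # hash lookup
missing = "Bob" in name_set
print(in_list, in_set, missing)
```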
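　　Concatenating lists
　　Option 1: +
　　Option 2: extend
　　x = [4, None, "foo"]
　　x.extend([7, 8, (2, 3)])
　　Option 2 is usually faster than option 1, because + builds a new list and copies the elements over, while extend appends to the existing list in place.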
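　　The difference matters most when building a large list from many pieces. A sketch of both styles, using made-up data:

```python
chunks = [[1, 2], [3, 4], [5]]

# Preferred: grow one list in place
combined = []
for chunk in chunks:
    combined.extend(chunk)

# Works, but each + creates a new list and copies everything so far
copied = []
for chunk in chunks:
    copied = copied + chunk

print(combined, copied)
```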
　　Sorting
　　a = [7, 2, 5, 1, 3]
　　a.sort()  # sorts in place; a is now [1, 2, 3, 5, 7]
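　　Binary search and maintaining a sorted list
　　import bisect
　　c = [1, 2, 2, 2, 3, 4, 7]
　　bisect.bisect(c, 2)  # returns 4
　　bisect.bisect() finds the position where a new element should be inserted to keep the list sorted, while bisect.insort() actually inserts the element at that position.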
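　　A short sketch of both functions on an already sorted list (the list contents are only an example):

```python
import bisect

scores = [1, 2, 2, 2, 3, 4, 7]

position = bisect.bisect(scores, 5)   # index where 5 would go
bisect.insort(scores, 6)              # insert 6, keeping the list sorted
print(position, scores)
```

　　Note that the bisect functions do not check whether the list is sorted; using them on an unsorted list gives meaningless results.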
　　Slicing
　　seq[start:stop]  seq[start:]  seq[:stop]  seq[::step]
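　　A few concrete slices on a sample list may make the forms above clearer:

```python
seq = [7, 2, 3, 7, 5, 6, 0, 1]

middle = seq[1:5]        # start is included, stop is excluded
head = seq[:3]           # from the beginning
tail = seq[-4:]          # negative indices count from the end
every_other = seq[::2]   # every second element
backwards = seq[::-1]    # a step of -1 reverses the sequence
print(middle, head, tail, every_other, backwards)
```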
　　Built-in sequence functions
　　enumerate
　　some_list = ["foo", "bat", "baz"]
　　mapping = dict((v, i) for i, v in enumerate(some_list))
　　mapping
　　{"foo": 0, "bat": 1, "baz": 2}
　　sorted
　　sorted([7, 1, 2, 6, 0, 3, 2])
　　[0, 1, 2, 2, 3, 6, 7]
　　
　　zip
　　seq1 = ["foo", "bar", "baz"]
　　seq2 = ["one", "two", "three"]
　　list(zip(seq1, seq2))  # in Python 3, zip returns an iterator
　　[("foo", "one"), ("bar", "two"), ("baz", "three")]
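　　zip can also be used the other way round, to turn a list of rows into a list of columns. A sketch with made-up data:

```python
pairs = [("foo", "one"), ("bar", "two"), ("baz", "three")]

# The * operator passes each pair as a separate argument to zip
firsts, seconds = zip(*pairs)
print(firsts, seconds)
```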
　　dict
　　empty_dict = {}
　　empty_dict = {"name": "shihanbing", "gender": "male", "age": 40}
　　empty_dict.update({"career": "software engineer"})
　　del empty_dict["name"]
　　temp = empty_dict.pop("gender")
　　
　　mapping = {}
　　key_list = [1, 2, 3, 4]  # placeholder keys, one per value
　　value_list = ["Maozedong", "zhude", "zhouenlai", "dongbiwu"]
　　for key, value in zip(key_list, value_list):
　　    mapping[key] = value
　　Default values
　　some_dict = {"port": "com1", "baudrate": 15200, "stop": 1, "databit": 8}
　　value = some_dict.get("port", "the key is invalid! please enter the correct key")  # returns "com1"
　　value = some_dict.get("parity", "the key is invalid! please enter the correct key")  # key missing, returns the message
　　
　　Grouping words by their first letter:
　　words = ["apple", "pear", "peach", "berry", "banana", "pineapple", "nut", "watermelon"]
　　by_letter = {}
　　for word in words:
　　    letter = word[0]
　　    if letter not in by_letter:
　　        by_letter[letter] = [word]
　　    else:
　　        by_letter[letter].append(word)

　　The if ... else ... block can be replaced with a single statement:
　　by_letter.setdefault(letter, []).append(word)
　　An even simpler way:
　　
　　from collections import defaultdict
　　words = ["apple", "pear", "peach", "berry", "banana", "pineapple", "nut", "watermelon"]
　　by_letter = defaultdict(list)
　　for word in words:
　　    by_letter[word[0]].append(word)
　　
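　　Valid dict key types
　　Keys must be immutable, hashable objects. A list cannot be used as a key; if you need one as a key, the best approach is to convert it to a tuple.
　　d = {}
　　d[tuple([1, 2, 3])] = 5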
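　　Whether an object can serve as a key can be checked with the built-in hash. In this sketch, is_hashable is a hypothetical helper name:

```python
def is_hashable(obj):
    try:
        hash(obj)
        return True
    except TypeError:  # unhashable, e.g. a list
        return False


print(is_hashable("string"))
print(is_hashable((1, 2, (2, 3))))
print(is_hashable((1, 2, [2, 3])))   # a tuple containing a list is not hashable
```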
　　Sets
　　s1 = set()
　　s2 = {2, 4, 5, 6, 7, 9}
　　| set union
　　& set intersection
　　- set difference
　　^ set symmetric difference (exclusive or)
　　s1.issuperset(s2) checks whether s1 is a superset of s2
　　
　　List, set, and dict comprehensions
　　
　　Nested list comprehensions
　　all_data = [["Tom", "Billy", "Jefferson", "Andrew", "Wesley", "Steven", "Joe"], ["Susie", "Casey", "Jill", "Ana", "Eva", "Jennifer", "Stephanie"]]
　　names_of_interest = []
　　for names in all_data:
　　    enough_es = [name for name in names if name.count("e") >= 2]
　　    names_of_interest.extend(enough_es)
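　　A compact sketch of the three comprehension forms, followed by the loop above rewritten as a single nested comprehension (the sample data is shortened and made up for illustration):

```python
strings = ["a", "as", "bat", "car", "dove", "python"]

upper_long = [s.upper() for s in strings if len(s) > 2]    # list
unique_lengths = {len(s) for s in strings}                 # set
positions = {s: i for i, s in enumerate(strings)}          # dict

all_data = [["Tom", "Jefferson", "Wesley"], ["Susie", "Jennifer"]]
# The for clauses are written in the same order as the nested loops
names_of_interest = [name for names in all_data
                     for name in names if name.count("e") >= 2]
print(upper_long, unique_lengths, names_of_interest)
```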
　　Functions are objects
　　import re
　　states = [' Alabama','Georgia!','Georgia','georgia','Fl0rIda','south carolina##','West virginia?']
　　def remove_punctuation(value):
　　    return re.sub('[!#?]', '', value)
　　clean_ops = [str.strip, remove_punctuation, str.title]
　　def clean_strings(strings, ops):
　　    result = []
　　    for value in strings:
　　        for function in ops:
　　            value = function(value)
　　        result.append(value)
　　    return result
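　　Because functions are ordinary objects, the list of operations can be passed around like any other value. A self-contained sketch of calling clean_strings, with a shortened sample list and an assumed operation list of strip, remove_punctuation, and title:

```python
import re


def remove_punctuation(value):
    return re.sub("[!#?]", "", value)


def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:       # apply each operation in order
            value = function(value)
        result.append(value)
    return result


states = [" Alabama", "Georgia!", "georgia", "south carolina##", "West virginia?"]
clean_ops = [str.strip, remove_punctuation, str.title]
cleaned = clean_strings(states, clean_ops)
print(cleaned)
```

　　Changing how the strings are cleaned then only requires editing the list, not the loop.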

　　Anonymous (lambda) functions
　　def apply_to_list(some_list, f):
　　    return [f(x) for x in some_list]
　　ints = [4, 0, 1, 5, 6]
　　apply_to_list(ints, lambda x: x * 2)
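　　Closures: functions that return functions
　　A closure is a function that is dynamically generated and returned by another function. Its key property is that the returned function can access variables in the local namespace where it was created. The difference from a standard Python function is that a closure can still access that namespace even after its creator has finished executing.
　　def make_closure(a):
　　    def closure():
　　        print("I know the secret: %d" % a)
　　    return closure
　　my_closure = make_closure(6)
　　my_closure()  # this call prints the following
　　I know the secret: 6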
　　Using mutable objects (dicts, sets, lists, and so on) in closures
　　def make_watcher():
　　    have_seen = {}
　　    def has_been_seen(x):
　　        if x in have_seen:
　　            return True
　　        else:
　　            have_seen[x] = True
　　            return False
　　    return has_been_seen
　　watcher = make_watcher()
　　vals = [5, 6, 1, 5, 1, 6, 3, 5]
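　　A self-contained sketch of how such a watcher behaves when called repeatedly. This variant keeps its state in a set rather than a dict, which is an assumption made here for brevity:

```python
def make_watcher():
    have_seen = set()              # state kept alive by the closure

    def has_been_seen(x):
        if x in have_seen:
            return True
        have_seen.add(x)           # mutate the object; do not rebind the name
        return False

    return has_been_seen


watcher = make_watcher()
results = [watcher(x) for x in [5, 6, 1, 5, 1, 6, 3, 5]]
print(results)
```

　　The inner function can modify have_seen freely, but rebinding the name inside it would require the nonlocal keyword in Python 3.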

　　Creating a float formatter that always returns a 15-character string
　　def format_and_pad(template, space):
　　    def formatter(x):
　　        return (template % x).rjust(space)
　　    return formatter
　　fmt = format_and_pad("%.4f", 15)
　　fmt(1.756)
　　'         1.7560'
　　Extended call syntax with *args and **kwargs
　　When you write func(a, b, c, d=some, e=value), the positional and keyword arguments are actually packed into a tuple and a dict, respectively. What the function actually receives is a tuple args and a dict kwargs.
　　def say_hello_then_call_f(f, *args, **kwargs):
　　    print("args is", args)
　　    print("kwargs is", kwargs)
　　    print("Hello! Now I'm going to call %s" % f)
　　    return f(*args, **kwargs)
　　def g(x, y, z=1):
　　    return (x + y) / z
　　say_hello_then_call_f(g, 1, 2, z=5)
　　args is (1, 2)
　　kwargs is {'z': 5}
　　Hello! Now I'm going to call <function g at 0x2dd5cf8>
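　　Generators
　　A generator is a simple way to construct a new iterable object. A normal function executes and returns a single value, whereas a generator returns a sequence of values lazily. Replacing return with yield in a function creates a generator.
　　def squares(n=10):
　　    print("Generating squares from 1 to %d" % (n ** 2))
　　    for i in range(1, n + 1):
　　        yield i ** 2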
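　　Calling a generator function does not run its body immediately; values are produced only as they are requested. A minimal sketch, with the print statement left out to keep the example short:

```python
def squares(n=10):
    for i in range(1, n + 1):
        yield i ** 2


gen = squares(3)       # nothing has been computed yet
first = next(gen)      # runs the body up to the first yield
rest = list(gen)       # consumes the remaining values
print(first, rest)
```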
　　Generator expressions
　　A generator expression is the simplest way to build a generator.
　　gen = (x ** 2 for x in range(100))
　　sum(x ** 2 for x in range(100))
　　dict((i, i ** 2) for i in range(5))
　　
　　itertools module
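　　The notes stop at this heading. As a small illustration of what the standard-library module offers, here is a sketch using itertools.groupby and itertools.combinations on made-up data:

```python
import itertools

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

# groupby only groups consecutive items that share the same key
groups = [(letter, list(group))
          for letter, group in itertools.groupby(names, lambda x: x[0])]

pairs = list(itertools.combinations([1, 2, 3], 2))
print(groups)
print(pairs)
```

　　Because groupby works on consecutive runs, "Albert" ends up in its own group here; sorting by the key first would merge the groups.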
　　
　　
　　
　　
　　
