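Python provides a serialization mechanism called pickle, which converts almost any Python object to and from a byte stream (a human-readable text form with protocol 0, a compact binary form with the later protocols). In other words, pickle lets you store Python objects and restore them later.

  • Serialization (pickling): turning a variable held in memory into a form that can be stored or transmitted. Once an object is serialized, it can be written to disk or sent to another machine.
  • Deserialization (unpickling): the reverse, reading the contents of a serialized object back into memory.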


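Python 2 has two modules that implement object serialization: pickle and cPickle. cPickle is implemented in C and pickle in pure Python, so cPickle reads and writes faster. The usual idiom is to try importing cPickle first and fall back to pickle if that fails.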

    try:
        import cPickle as pickle
    except ImportError:
        import pickle

In Python 3 this import fallback is no longer needed:

A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example, pickle and cPickle. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions. Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version. The pickle / cPickle pair received this treatment. The profile module is on the list for 3.1. The StringIO module has been turned into a class in the io module. https://docs.python.org/3.1/whatsnew/3.0.html#library-changes

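The pickle module provides the following four functions:

  • dumps(): serializes a Python object and returns the result as a bytes object
  • loads(): reads serialized bytes and converts them back into a Python object
  • dump(): serializes a Python object and writes the result to a file opened in binary mode
  • load(): reads serialized data from a file and returns the reconstructed object

These four functions fall into two groups: dumps and loads convert between Python objects and bytes in memory, while dump and load do the same conversion through a file.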
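
To make the two groups concrete, here is a minimal sketch of both round trips. The `record` data and the `record.pkl` file name are made up for illustration, and a temporary directory is used so nothing is left on disk.

```python
import os
import pickle
import tempfile

record = {"name": "Tom", "scores": [90, 85], "tags": ("a", "b")}

# In memory: dumps() returns a bytes object, loads() rebuilds the object.
blob = pickle.dumps(record)
restored_from_bytes = pickle.loads(blob)

# On disk: dump() writes to a file opened in binary mode, load() reads it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "record.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(record, f)
    with open(path, "rb") as f:
        restored_from_file = pickle.load(f)

print(type(blob).__name__)
print(restored_from_bytes == record, restored_from_file == record)
```

The restored objects are equal to the original but are new objects, not the same instance.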


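How pickle compares with JSON:

  • JSON is a text format; pickle output is binary
  • JSON is human-readable; pickle output is not
  • JSON is widely used outside the Python ecosystem; pickle is specific to Python
  • JSON can only dump a subset of Python's built-in types; pickle can store almost any object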
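
The differences are easy to see side by side. The following is a small sketch with invented sample data: JSON yields text and rejects a set, while pickle yields bytes and round-trips the same structure.

```python
import json
import pickle

plain = {"name": "Tom", "age": 19}
with_set = {"name": "Tom", "hobbies": {"basketball", "guitar"}}

json_text = json.dumps(plain)      # a str, readable as-is
pickle_blob = pickle.dumps(plain)  # a bytes object, not meant for reading

# JSON has no representation for a set, so json.dumps() raises TypeError.
try:
    json.dumps(with_set)
    json_handles_set = True
except TypeError:
    json_handles_set = False

# pickle serializes the set and restores it as a set.
set_round_trip = pickle.loads(pickle.dumps(with_set))

print(json_text)
print(json_handles_set, set_round_trip == with_set)
```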


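The pickle module defines two constants:

  • pickle.HIGHEST_PROTOCOL: an integer, the highest protocol version available. It can be passed as the protocol argument to dump() and dumps().
  • pickle.DEFAULT_PROTOCOL: an integer, the protocol version used for pickling by default. It may be lower than HIGHEST_PROTOCOL.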
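
A short sketch of how the two constants are typically used. The actual values depend on the Python version that runs the code, so none are shown here; the sample list is arbitrary.

```python
import pickle

print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

data = list(range(10))

default_blob = pickle.dumps(data)  # uses DEFAULT_PROTOCOL
highest_blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# A negative protocol number is shorthand for HIGHEST_PROTOCOL.
also_highest = pickle.dumps(data, -1)

# loads() needs no protocol argument: the version is stored in the data.
print(pickle.loads(default_blob) == pickle.loads(highest_blob) == data)
```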

Function signatures in the pickle module:

  • dump(obj, file, protocol=None, *, fix_imports=True, buffer_callback=None)
  • dumps(obj, protocol=None, *, fix_imports=True, buffer_callback=None)
  • load(file, *, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)
  • loads(data, /, *, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)
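
One consequence of these signatures is that each dump() call appends one complete pickle to the file object, and each load() call reads exactly one back. The sketch below uses an in-memory `io.BytesIO` as a stand-in for a real file; the sample items are invented.

```python
import io
import pickle

items = [["pink", "green"], {"a": 23}, 42]

# Write three separate pickles into the same binary stream.
buffer = io.BytesIO()
for item in items:
    pickle.dump(item, buffer, protocol=pickle.HIGHEST_PROTOCOL)

# Read them back one at a time; load() raises EOFError when nothing is left.
buffer.seek(0)
loaded = []
while True:
    try:
        loaded.append(pickle.load(buffer))
    except EOFError:
        break

print(loaded == items)
```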


The protocol argument selects one of the following versions (quoted from the Python documentation):

  • Protocol version 0 is the original “human-readable” protocol and is backwards compatible with earlier versions of Python.
  • Protocol version 1 is an old binary format which is also compatible with earlier versions of Python.
  • Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
  • Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This was the default protocol in Python 3.0–3.7.
  • Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. It is the default protocol starting with Python 3.8. Refer to PEP 3154 for information about improvements brought by protocol 4.
  • Protocol version 5 was added in Python 3.8. It adds support for out-of-band data and speedup for in-band data. Refer to PEP 574 for information about improvements brought by protocol 5.
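
As a rough illustration of how the protocols differ, the sketch below pickles the same made-up dictionary with every protocol the running interpreter supports. Output sizes are collected but not quoted here, since they vary with the data and the Python version.

```python
import pickle

data = {"colors": ["pink", "green", "blue"], "count": 3}

# Pickle the same object with every available protocol.
sizes = {}
for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=proto)
    sizes[proto] = len(blob)
    # loads() detects the protocol on its own.
    assert pickle.loads(blob) == data

# Protocol 0 output for this ASCII-only data decodes as plain ASCII text.
text_blob = pickle.dumps(data, protocol=0)
print(text_blob.decode("ascii"))

# From protocol 2 on, the stream starts with byte 0x80 and the version number.
binary_blob = pickle.dumps(data, protocol=2)
print(binary_blob[:2])

print(sizes)
```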



import pickle

# take objects list, dictionary and class
mylist = ['pink', 'green', 'blue', 'red']
mydict = {'a': 23, 'b': 17, 'c': 9}

class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def display_info(self):
        return ("Student name is {name} & is {age} years old".format(name=self.name, age=self.age))

# object created for student
myobj = Student('Maria', 18)

# pickling
# byte stream of objects written in binary format;
# each with block closes its file as soon as the write finishes
with open('mylist.pkl', 'wb') as f:
    pickle.dump(mylist, f)
with open('mydict.pkl', 'wb') as f:
    pickle.dump(mydict, f)
with open('myobj.pkl', 'wb') as f:
    pickle.dump(myobj, f)

# delete objects
del mylist
del mydict
del myobj

# unpickling
with open('mylist.pkl', 'rb') as f:
    mylist = pickle.load(f)
with open('mydict.pkl', 'rb') as f:
    mydict = pickle.load(f)
with open('myobj.pkl', 'rb') as f:
    myobj = pickle.load(f)

# printing objects and their types
print('list object: ', mylist, type(mylist))
print('dictionary object: ', mydict, type(mydict))
print('student info: ', myobj.display_info())

list object:  ['pink', 'green', 'blue', 'red'] <class 'list'>
dictionary object:  {'a': 23, 'b': 17, 'c': 9} <class 'dict'>
student info:  Student name is Maria & is 18 years old



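According to its own description, pickleDB is a lightweight and simple key-value store, built on Python's simplejson module and inspired by Redis. How it relates to pickle is not obvious from that description.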


>>> import pickledb

>>> db = pickledb.load('test.db', False)

>>> db.set('key', 'value')

>>> db.get('key')

>>> db.dump()


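# -*- coding:utf-8 -*-
import base64
import pickle

import pickledb

dataList = [[1, 1, 'yes'],
            [1, 1, 'yes'],
            [1, 0, 'no'],
            [0, 1, 'no'],
            [0, 1, 'no']]
dataDic = {0: [1, 2, 3, 4],
           1: ('a', 'b'),
           2: {'c': 'yes', 'd': 'no'}}

# In Python 3, pickle.dumps() returns bytes, which pickleDB's JSON storage
# cannot hold, so encode them as ASCII text with base64 first
p1 = base64.b64encode(pickle.dumps(dataList)).decode('ascii')
p2 = base64.b64encode(pickle.dumps(dataDic)).decode('ascii')

db = pickledb.load('example.db', False)  # load the database from a file; it is created if missing
db.set('p1', p1)  # set: store a string value under a key
db.set('p2', p2)

print(pickle.loads(base64.b64decode(db.get('p1'))))  # get: fetch the value stored under a key
print(pickle.loads(base64.b64decode(db.get('p2'))))

db.dump()  # save the in-memory database to example.db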
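
Because pickleDB keeps its data as JSON, the bytes returned by pickle.dumps() in Python 3 need a text encoding before they can be stored. The sketch below shows one way to do that with base64, using the standard json module as a stand-in for pickleDB's storage layer (an assumption made so the example runs without third-party packages).

```python
import base64
import json
import pickle

data_list = [[1, 1, "yes"], [1, 0, "no"]]
blob = pickle.dumps(data_list)

# Raw bytes are not JSON-serializable.
try:
    json.dumps({"p1": blob})
    bytes_are_json_safe = True
except TypeError:
    bytes_are_json_safe = False

# base64 turns the bytes into ASCII text that JSON accepts.
encoded = base64.b64encode(blob).decode("ascii")
stored = json.dumps({"p1": encoded})

# Reverse the steps to get the original object back.
restored = pickle.loads(base64.b64decode(json.loads(stored)["p1"]))

print(bytes_are_json_safe, restored == data_list)
```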


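The shelve module builds on pickle to provide a persistent, dictionary-like store. A shelf is opened with:

shelve.open(filename, flag='c', protocol=None, writeback=False)

  • flag selects how the data file is opened:
    • 'r' opens an existing data file read-only
    • 'w' opens an existing data file for reading and writing
    • 'c' opens a data file for reading and writing, creating it if it does not exist
    • 'n' always creates a new, empty data file and opens it for reading and writing
  • protocol is the pickle protocol version used to serialize values; when left as None, it is protocol 3 up to Python 3.9 and pickle.DEFAULT_PROTOCOL from Python 3.10 on;
  • writeback controls write-back caching: when True, every entry accessed is cached in memory and written back when the shelf is synced or closed, so in-place changes to mutable values are persisted.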
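
The effect of writeback is easiest to see with a mutable value. In this sketch (the shelf name `student` and the sample data are invented, and a temporary directory keeps the files out of the way), the same in-place append is tried with and without writeback.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "student")  # hypothetical shelf name

    # 'c' creates the shelf if it does not exist yet.
    with shelve.open(path, flag="c") as db:
        db["hobby"] = ["basketball"]

    # writeback=False (the default): db["hobby"] returns a fresh copy,
    # so appending to it never reaches the shelf.
    with shelve.open(path) as db:
        db["hobby"].append("guitar")
    with shelve.open(path, flag="r") as db:
        without_writeback = db["hobby"]

    # writeback=True: accessed entries are cached and written back on close.
    with shelve.open(path, writeback=True) as db:
        db["hobby"].append("guitar")
    with shelve.open(path, flag="r") as db:
        with_writeback = db["hobby"]

print(without_writeback)
print(with_writeback)
```

Without writeback, the usual workaround is to read the value, modify it, and assign it back to the same key.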


# -*- coding:utf-8 -*-
import shelve

with shelve.open('student.db') as db:
    db['name'] = 'Tom'
    db['age'] = 19
    db['hobby'] = ['篮球', '看电影', '弹吉他']
    db['other_info'] = {'sno': 1, 'addr': 'xxxx'}

# read the data back
with shelve.open('student.db') as db:
    for key, value in db.items():
        print(key, ': ', value)
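
Beyond assignment and iteration, a shelf supports the usual dictionary operations. The following sketch uses an invented shelf in a temporary directory to show membership tests, get() with a default, and deletion.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "student")  # hypothetical shelf name

    with shelve.open(path) as db:
        db["name"] = "Tom"
        db["age"] = 19

    with shelve.open(path) as db:
        has_name = "name" in db           # membership test on keys
        email = db.get("email", "n/a")    # default for a missing key
        del db["age"]                     # remove an entry
        remaining_keys = sorted(db.keys())

print(has_name, email, remaining_keys)
```

Keys must be strings; values can be anything pickle can handle.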



