For large volumes of data, consider swapping Python's built-in json module for a third-party library that serializes and deserializes faster:
[1] - ujson - ultra fast JSON encoder and decoder written in pure C with bindings for Python 3.7+.
[2] - orjson - fast, correct JSON library for Python; described by its authors as the fastest Python library for JSON encoding and decoding; natively serializes dataclass, datetime, numpy, and UUID instances.
Installation:
pip install ujson orjson
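Since orjson is an optional third-party package, code that uses it often needs a fallback to the standard library. The following is a minimal sketch of that idea, not a definitive implementation: the Task dataclass and the helper names _default and dumps_text are hypothetical and chosen only for illustration. It also shows the types listed above: orjson accepts a dataclass, datetime, and UUID directly, while the standard json module needs a default hook.

```python
import json
import uuid
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None


@dataclass
class Task:
    # Hypothetical record type, used only for this illustration.
    task_uuid: uuid.UUID
    started_at: datetime
    action_status: str


def _default(obj):
    """Hook for the standard json module, which rejects these types."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, uuid.UUID):
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dumps_text(obj):
    """Serialize obj to a str, preferring orjson when it is installed."""
    if orjson is not None:
        # orjson.dumps returns bytes, so decode to get a str.
        return orjson.dumps(obj).decode("utf-8")
    return json.dumps(obj, default=_default)


task = Task(
    task_uuid=uuid.UUID("0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7"),
    started_at=datetime(2019, 4, 26, 13, 1, 13),
    action_status="started",
)
text = dumps_text(task)
print(text)
```

The two code paths may differ in whitespace, so compare the parsed values rather than the raw strings.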
Speed comparison test
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import time
import json
import orjson
import ujson


def benchmark(name, dumps, loads, data, rounds=3000000):
    # Time `rounds` serialize/deserialize round trips of `data`.
    start = time.perf_counter()
    for _ in range(rounds):
        result = dumps(data)
        loads(result)
    print(name, time.perf_counter() - start)


if __name__ == "__main__":
    m = {
        "timestamp": 1556283673.1523004,
        "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
        "task_level": [1, 2, 1],
        "action_status": "started",
        "action_type": "main",
        "key": "value",
        "another_key": 123,
        "and_another": ["a", "b"],
    }
    benchmark("Python", json.dumps, json.loads, m)
    benchmark("ujson", ujson.dumps, ujson.loads, m)
    # orjson only outputs bytes, but often we need unicode:
    benchmark("orjson", lambda s: str(orjson.dumps(s), "utf-8"), orjson.loads, m)
Sample output (elapsed time in seconds; exact figures vary by machine):
Python 24.219083547592163
ujson 9.381672620773315
orjson 5.3264000415802
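To put these sample timings in relative terms, the short calculation below derives each library's speedup over the standard json module and the average cost of one round trip. It is only a sketch: the timings are copied from the sample output, and the variable names are illustrative.

```python
# Timings copied from the sample output (seconds for 3,000,000 round trips).
ROUNDS = 3_000_000
timings = {
    "Python": 24.219083547592163,
    "ujson": 9.381672620773315,
    "orjson": 5.3264000415802,
}

baseline = timings["Python"]
# Speedup relative to the standard json module.
speedups = {name: baseline / seconds for name, seconds in timings.items()}
# Average time for one dumps + loads round trip, in microseconds.
per_round_trip_us = {name: seconds / ROUNDS * 1e6 for name, seconds in timings.items()}

for name in timings:
    print(f"{name:7s} {speedups[name]:.2f}x  {per_round_trip_us[name]:.2f} us per round trip")
```

On these figures, ujson is roughly 2.6 times and orjson roughly 4.5 times faster than the standard json module for this small payload; larger or differently shaped data may give different ratios.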