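For large volumes of data, it is worth trying third-party Python libraries as replacements for the built-in json module's serialization and deserialization: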

[1] - ujson - ultra fast JSON encoder and decoder written in pure C with bindings for Python 3.7+.

[2] - orjson - fast, correct JSON library for Python; described by its project as the fastest Python library for JSON encoding and decoding; supports serializing dataclass, datetime, numpy, and UUID instances.
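
To illustrate the extra type support mentioned for orjson, here is a minimal sketch. The `Task` dataclass and the `to_json_bytes` helper are hypothetical names made up for this example. The sketch assumes orjson may or may not be installed: if it is missing, it falls back to the standard json module with a hand-written `default` hook, which is the extra work orjson saves you. Only dataclass, datetime, and UUID values are exercised here.

```python
import dataclasses
import datetime
import json
import uuid

try:
    import orjson  # third-party; optional in this sketch
except ImportError:
    orjson = None


@dataclasses.dataclass
class Task:  # hypothetical example type, not from the original post
    task_uuid: uuid.UUID
    started_at: datetime.datetime
    task_level: list


def _default(obj):
    """Fallback hook for the standard json module, which rejects these types."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return dataclasses.asdict(obj)
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    if isinstance(obj, uuid.UUID):
        return str(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def to_json_bytes(obj):
    """Serialize obj to UTF-8 encoded JSON bytes."""
    if orjson is not None:
        # orjson serializes dataclass, datetime and UUID instances natively.
        return orjson.dumps(obj)
    # The standard library needs an explicit hook for the same types.
    return json.dumps(obj, default=_default).encode("utf-8")


task = Task(
    task_uuid=uuid.UUID("0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7"),
    started_at=datetime.datetime(2022, 6, 29, 16, 11),
    task_level=[1, 2, 1],
)
print(to_json_bytes(task))
```

Either path returns bytes that decode to the same JSON object, so callers do not need to know which library did the work.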

Installation:

pip install ujson orjson

Speed comparison test:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import time
import json
import orjson
import ujson


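def benchmark(name, dumps, loads):
    """Time 3,000,000 dumps/loads round trips of the module-level dict m."""
    # perf_counter is a monotonic, high-resolution clock intended for timing.
    start = time.perf_counter()
    for _ in range(3000000):
        result = dumps(m)
        loads(result)
    print(name, time.perf_counter() - start)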


if __name__ == "__main__":
    m = {
        "timestamp": 1556283673.1523004,
        "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
        "task_level": [1, 2, 1],
        "action_status": "started",
        "action_type": "main",
        "key": "value",
        "another_key": 123,
        "and_another": ["a", "b"],
    }

    benchmark("Python", json.dumps, json.loads)
    benchmark("ujson", ujson.dumps, ujson.loads)

    # orjson only outputs bytes, but often we need unicode:
    benchmark("orjson", lambda s: str(orjson.dumps(s), "utf-8"), orjson.loads)

Sample output (elapsed seconds; lower is better):

Python 24.219083547592163
ujson 9.381672620773315
orjson 5.3264000415802
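
A single timed loop is sensitive to whatever else the machine is doing. As a hedged alternative, the standard timeit module can repeat the measurement and report the best run. The sketch below is one possible variant, not the script that produced the numbers above: `round_trip_seconds` is a name invented here, and the iteration counts are arbitrary and much smaller than 3,000,000. Only the built-in json module is timed so that the block runs without extra installs; other dumps/loads pairs can be passed in the same way.

```python
import json
import timeit


def round_trip_seconds(dumps, loads, payload, number=10_000, repeat=5):
    """Return the best time, in seconds, for `number` dumps+loads round trips."""
    timings = timeit.repeat(
        lambda: loads(dumps(payload)), number=number, repeat=repeat
    )
    # The minimum is the run least disturbed by other system activity.
    return min(timings)


payload = {
    "timestamp": 1556283673.1523004,
    "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
    "task_level": [1, 2, 1],
    "action_status": "started",
    "action_type": "main",
}

print("json", round_trip_seconds(json.dumps, json.loads, payload))
```

Passing the payload as an argument also avoids relying on a module-level variable, which makes the helper easier to reuse with different test data.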
Last modification: June 29th, 2022 at 04:11 pm