This page describes the JSON Lines text format, also called newline-delimited JSON. JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It works well with Unix-style text processing tools and shell pipelines. It's a great format for log files. It's also a flexible format for passing messages between cooperating processes.
The JSON Lines format has three requirements:
1. UTF-8 Encoding
JSON allows Unicode strings to be encoded using only ASCII escape sequences; however, those escapes are hard to read when viewed in a text editor. The author of a JSON Lines file may still choose to escape characters so that the file works as plain ASCII.
Encodings other than UTF-8 are very unlikely to be valid when decoded as UTF-8, so the chance of accidentally misinterpreting characters in JSON Lines files is low.
As in the JSON standard, a byte order mark (U+FEFF) must NOT be included.
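The encoding rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `dump_line` and `decode_jsonl_bytes` are hypothetical and not part of any JSON Lines library.

```python
import json

def dump_line(value, ascii_only=False):
    """Serialize one value as a single JSON Lines record (no trailing newline)."""
    # ensure_ascii=False keeps non-ASCII characters readable in a text editor;
    # ensure_ascii=True emits \uXXXX escapes so the output stays plain ASCII.
    return json.dumps(value, ensure_ascii=ascii_only)

def decode_jsonl_bytes(data):
    """Decode raw bytes as UTF-8, rejecting a leading byte order mark."""
    # b"\xef\xbb\xbf" is U+FEFF encoded as UTF-8.
    if data.startswith(b"\xef\xbb\xbf"):
        raise ValueError("JSON Lines data must not start with a byte order mark")
    # Strict decoding raises UnicodeDecodeError for bytes that are not valid UTF-8.
    return data.decode("utf-8")

readable = dump_line({"name": "Zoë"})                   # contains a literal "ë"
escaped = dump_line({"name": "Zoë"}, ascii_only=True)   # contains the escape \u00eb
```

Both `readable` and `escaped` parse back to the same value; the choice only affects how the file looks in an editor.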
2. Each Line is a Valid JSON Value
The most common values will be objects or arrays, but any JSON value is permitted.
See json.org for a definition of JSON values.
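As a small illustration of this requirement, the Python sketch below parses each line independently with the standard `json` module. The sample records are made up for the example.

```python
import json

# Five lines, each a complete JSON value: object, array, string, number, null.
lines = [
    '{"name": "Gilbert", "wins": 2}',
    '["Alexa", 3]',
    '"a bare string"',
    '42',
    'null',
]

# Every line is parsed on its own; no line depends on any other.
values = [json.loads(line) for line in lines]
```

Objects and arrays are the usual choice in practice, but the string, number, and `null` lines are equally valid records.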
3. Line Separator is '\n'
This means '\r\n' is also supported because surrounding white space is implicitly ignored when parsing JSON values.
The last character in a file following the last JSON value may be a line separator. In this case the line separator does not indicate the start of another JSON value.
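The separator rules in this section can be sketched in Python roughly as below. The function name `parse_jsonl` is hypothetical, and the sketch assumes the whole input fits in memory as a single string.

```python
import json

def parse_jsonl(text):
    """Parse JSON Lines text into a list of values, splitting on the newline character only."""
    lines = text.split("\n")
    # A trailing newline leaves one empty final piece; it does not start a new value.
    if lines and lines[-1] == "":
        lines.pop()
    # json.loads ignores surrounding whitespace, so a carriage return left over
    # from Windows-style line endings is harmless.
    return [json.loads(line) for line in lines]

unix = '{"a": 1}\n{"b": 2}\n'
windows = '{"a": 1}\r\n{"b": 2}\r\n'
no_final_newline = '{"a": 1}\n{"b": 2}'
```

All three inputs yield the same two values. A blank line in the middle of the input is not a valid JSON value, so this sketch lets `json.loads` raise an error for it rather than skipping it silently.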
Suggested Conventions
JSON Lines files may be saved with the file extension .jsonl.
Stream compressors like gzip or bzip2 are recommended for saving space, resulting in .jsonl.gz or .jsonl.bz2 files.
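For example, a .jsonl.gz file can be written and read back with Python's standard `gzip` module, as in the sketch below. The file name `example.jsonl.gz` and the sample records are assumptions made for illustration.

```python
import gzip
import json
import os
import tempfile

records = [{"id": 1, "tags": ["a", "b"]}, {"id": 2, "tags": []}]

# Hypothetical output location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.jsonl.gz")

# Text mode with an explicit UTF-8 encoding; newline="\n" turns off
# platform-specific newline translation when writing.
with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Reading streams one line at a time, so the whole file is never held in memory.
with gzip.open(path, "rt", encoding="utf-8", newline="\n") as f:
    loaded = [json.loads(line) for line in f]
```

The same pattern should work for .jsonl.bz2 files by swapping in `bz2.open`, which accepts the same text-mode arguments.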
The MIME type may be application/jsonl, but this is not yet standardized; any help writing the RFC would be greatly appreciated (see issue).
Text editing programs call the first line of a text file "line 1". The first value in a JSON Lines file should also be called "value 1".
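One way to follow this numbering convention when reporting parse errors is sketched below in Python. The function name `iter_values` and the error message format are hypothetical choices, not something the format prescribes.

```python
import json

def iter_values(lines):
    """Yield (value_number, value) pairs, numbering from 1 like a text editor."""
    for number, line in enumerate(lines, start=1):
        try:
            yield number, json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the 1-based value number so it matches the editor's line number.
            raise ValueError(f"value {number}: {exc.msg}") from exc

good = list(iter_values(['{"a": 1}', '[2]']))
```

With this numbering, an error reported for "value 2" points at line 2 of the file when it is opened in a text editor.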