This page describes the JSON Lines text format, also called newline-delimited JSON. JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It works well with Unix-style text processing tools and shell pipelines. It's a great format for log files. It's also a flexible format for passing messages between cooperating processes.
The JSON Lines format has three requirements:
1. UTF-8 Encoding
JSON allows encoding Unicode strings with only ASCII escape sequences; however, those escapes are hard to read when viewed in a text editor. The author of a JSON Lines file may choose to escape characters so that the file works as plain ASCII.
Encodings other than UTF-8 are very unlikely to be valid when decoded as UTF-8, so the chance of accidentally misinterpreting characters in JSON Lines files is low.
As in the JSON standard, a byte order mark (U+FEFF) must NOT be included.
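The encoding rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names encode_record, decode_file and UTF8_BOM are hypothetical and do not come from the format description. The sketch assumes Python's standard json module, whose ensure_ascii flag switches between readable UTF-8 output and ASCII-only escapes.

```python
import json

# The UTF-8 encoding of U+FEFF; a JSON Lines file must not start with it.
UTF8_BOM = b"\xef\xbb\xbf"


def encode_record(value, ascii_only=False):
    """Serialize one value as a UTF-8 encoded JSON Lines record (bytes).

    With ascii_only=True, non-ASCII characters are written as \\uXXXX
    escapes, so the output is also valid plain ASCII.
    """
    text = json.dumps(value, ensure_ascii=ascii_only)
    # The "utf-8" codec never emits a byte order mark.
    return (text + "\n").encode("utf-8")


def decode_file(data):
    """Decode raw bytes as JSON Lines text, rejecting a leading BOM."""
    if data.startswith(UTF8_BOM):
        raise ValueError("JSON Lines data must not start with a byte order mark")
    # Decoding is strict by default, so bytes that are not valid UTF-8
    # raise UnicodeDecodeError instead of being silently misread.
    return data.decode("utf-8")
```

Strict decoding is what makes accidental misinterpretation unlikely in practice: text saved in another encoding, such as Latin-1 with accented characters, typically fails to decode rather than producing wrong characters.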
2. Each Line is a Valid JSON Value
The most common values will be objects or arrays, but any JSON value is permitted. For example, null is a valid value, but a blank line is not.
See json.org for a definition of JSON values.
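As a rough illustration of this requirement, the sketch below checks whether a single line holds exactly one JSON value. The function name is_valid_line and the helper _reject_constant are hypothetical. One assumption worth stating: Python's json module accepts NaN and Infinity by default, which are not JSON values, so the sketch rejects them through the parse_constant hook.

```python
import json


def _reject_constant(name):
    # json.loads calls this hook for NaN, Infinity and -Infinity,
    # none of which is a JSON value.
    raise ValueError(f"{name} is not a valid JSON value")


def is_valid_line(line):
    """Return True if the line contains exactly one JSON value."""
    try:
        json.loads(line, parse_constant=_reject_constant)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError, so this also
        # covers blank lines, malformed JSON and trailing extra data.
        return False
    return True
```

With this sketch, is_valid_line("null") is true, while is_valid_line("") is false, matching the rule that a blank line is not a valid value.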
3. Line Separator is '\n'
This means '\r\n' is also supported because surrounding whitespace is implicitly ignored when parsing JSON values.
The last character in a file following the last JSON value may be a line separator. In this case the line separator does not indicate the start of another JSON value.
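The separator rules can be sketched as a small parser. This is one possible reading of the rules, with a hypothetical function name parse_jsonl; it splits on '\n' only and relies on the JSON parser to ignore a leftover '\r'.

```python
import json


def parse_jsonl(text):
    """Parse JSON Lines text into a list of values."""
    lines = text.split("\n")
    # A final line separator does not start another value, so drop the
    # empty string that split() leaves after it.
    if lines and lines[-1] == "":
        lines.pop()
    # json.loads ignores surrounding whitespace, so the '\r' left over
    # from a '\r\n' line ending is harmless. A blank line elsewhere in
    # the text still raises json.JSONDecodeError.
    return [json.loads(line) for line in lines]
```

Under these assumptions, text with '\n' endings, text with '\r\n' endings, and text without a final separator all produce the same list of values.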
Conventions
JSON Lines files may be saved with the file extension .jsonl.
Stream compressors like gzip or bzip2 are recommended for saving space, resulting in .jsonl.gz or .jsonl.bz2 files.
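Because gzip is a stream compressor, a compressed file can still be read and written one value at a time. The following sketch uses Python's standard gzip module in text mode; the function names write_jsonl_gz and read_jsonl_gz are hypothetical, and the path is supplied by the caller.

```python
import gzip
import json


def write_jsonl_gz(path, values):
    """Write an iterable of values to a gzip-compressed JSON Lines file."""
    # newline="\n" keeps the line separator as '\n' on every platform.
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
        for value in values:
            f.write(json.dumps(value, ensure_ascii=False) + "\n")


def read_jsonl_gz(path):
    """Yield values one at a time from a gzip-compressed JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="\n") as f:
        for line in f:
            yield json.loads(line)
```

A call such as write_jsonl_gz("events.jsonl.gz", records) followed by iterating over read_jsonl_gz("events.jsonl.gz") never needs the whole file in memory. The same pattern should carry over to the bz2 module for .jsonl.bz2 files.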
The MIME type may be application/jsonl, but this is not yet standardized; any help writing the RFC would be greatly appreciated (see issue).
Text editing programs call the first line of a text file "line 1". The first value in a JSON Lines file should also be called "value 1".
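One way to follow this numbering convention in error messages is sketched below. The function name iter_numbered is hypothetical, and the message format is only an illustration.

```python
import json


def iter_numbered(lines):
    """Yield (number, value) pairs, numbering values from 1."""
    for number, line in enumerate(lines, start=1):
        try:
            yield number, json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the 1-based value number, which matches the line
            # number a text editor shows for the same file.
            raise ValueError(f"value {number}: {exc.msg}") from exc
```

Counting from 1 means a message such as "value 2: ..." points at the line a text editor labels as line 2.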