JSON Lines

Documentation for the JSON Lines text file format

This page describes the JSON Lines text format, also called newline-delimited JSON. JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It works well with Unix-style text processing tools and shell pipelines. It's a great format for log files. It's also a flexible format for passing messages between cooperating processes.

The JSON Lines format has three requirements:

1. UTF-8 Encoding

JSON allows encoding Unicode strings with only ASCII escape sequences; however, those escapes will be hard to read when viewed in a text editor. The author of the JSON Lines file may choose to escape characters to work with plain ASCII files.

Encodings other than UTF-8 are very unlikely to be valid when decoded as UTF-8, so the chance of accidentally misinterpreting characters in JSON Lines files is low.

As in the JSON standard, a byte order mark (U+FEFF) must NOT be included.
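As a rough illustration of the encoding rules above, the following Python sketch shows the two ways of writing non-ASCII text and a strict decoding step. The sample record and the helper name `decode_jsonl_bytes` are hypothetical and exist only for this example; only the standard `json` module is used.

```python
import json

# Hypothetical sample record containing non-ASCII characters.
record = {"name": "Zoë", "city": "東京"}

# ensure_ascii=False keeps the characters readable; the line must then be
# stored as UTF-8.
readable = json.dumps(record, ensure_ascii=False)

# The default (ensure_ascii=True) escapes every non-ASCII character, so the
# line is plain ASCII and still valid JSON.
escaped = json.dumps(record)


def decode_jsonl_bytes(data: bytes) -> str:
    """Decode raw bytes as UTF-8, rejecting a leading byte order mark.

    Hypothetical helper, not part of any library.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        raise ValueError("JSON Lines data must not start with a byte order mark")
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return data.decode("utf-8")
```

Both `readable` and `escaped` parse back to the same value; the difference is only in how the line looks in a text editor.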

2. Each Line is a Valid JSON Value

The most common values will be objects or arrays, but any JSON value is permitted. For example, null is a valid value, but a blank line is not.

See json.org for a definition of JSON values.
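The rule that every line holds one JSON value can be sketched in Python as follows. The function name `parse_line` is an assumption made for this example; it simply wraps the standard `json.loads` and makes the blank-line case explicit.

```python
import json


def parse_line(line: str):
    """Parse one line of a JSON Lines file (hypothetical helper).

    Any JSON value is accepted; a blank line is rejected.
    """
    if line.strip() == "":
        raise ValueError("a blank line is not a valid JSON value")
    return json.loads(line)
```

For instance, `parse_line("null")` returns `None`, while `parse_line("")` raises `ValueError`.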

3. Line Separator is `'\n'`

This means '\r\n' is also supported because surrounding white space is implicitly ignored when parsing JSON values.

The last character in a file following the last JSON value may be a line separator. In this case the line separator does not indicate the start of another JSON value.
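A minimal Python sketch of the separator rules above, assuming the whole file has already been decoded into a string. The function name `read_values` is hypothetical.

```python
import json


def read_values(text: str) -> list:
    """Split decoded JSON Lines text on '\\n' and parse each line.

    Hypothetical helper for illustration only.
    """
    lines = text.split("\n")
    # A trailing '\n' leaves one empty final piece; it does not start
    # another JSON value, so drop it.
    if lines and lines[-1] == "":
        lines.pop()
    # json.loads ignores surrounding whitespace, so the '\r' left over
    # from a '\r\n' separator is harmless.
    return [json.loads(line) for line in lines]
```

With this sketch, text using `'\r\n'` separators and text using `'\n'` separators yield the same list of values, and a blank line in the middle of the text still raises an error.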

Conventions

JSON Lines files may be saved with the file extension .jsonl.

Stream compressors like gzip or bzip2 are recommended for saving space, resulting in .jsonl.gz or .jsonl.bz2 files.
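Compressed files can be read and written as a stream without unpacking them first. The sketch below uses Python's standard `gzip` module; the function names `write_jsonl_gz` and `read_jsonl_gz` are assumptions made for this example.

```python
import gzip
import json


def write_jsonl_gz(path, records):
    """Write each record as one UTF-8 JSON line to a gzip file (hypothetical helper)."""
    # newline="\n" keeps the separator as '\n' on every platform.
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl_gz(path):
    """Yield one parsed value per line from a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)
```

Because `read_jsonl_gz` is a generator, values are decompressed and parsed one at a time, which suits files too large to hold in memory.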

The MIME type may be application/jsonl, but this is not yet standardized; any help writing the RFC would be greatly appreciated (see issue).

Text editing programs call the first line of a text file "line 1". The first value in a JSON Lines file should also be called "value 1".
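One place where this numbering convention matters is error reporting. As a sketch, the hypothetical function `find_invalid_values` below reports problem lines using 1-based numbers, so they match what a text editor shows.

```python
import json


def find_invalid_values(lines):
    """Return the 1-based numbers of lines that are not valid JSON values.

    Hypothetical helper for illustration only.
    """
    bad = []
    # start=1 so that the first value is "value 1", matching "line 1".
    for number, line in enumerate(lines, start=1):
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
    return bad
```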
