JSON

JSON files can be used as data input by uploading a file, specifying an S3 path (s3a://<file_path>), or providing an HDFS location.

The Multiline option can be set to False if each JSON record sits on a single line. Its default value is True.
JSON can be flattened by checking the _Flatten Data_ checkbox if the JSON structure is hierarchical.
JSON elements can be selected by specifying element names in the columns section, and can be renamed using the "as" keyword. E.g., if id, first_name and last_name are the only three elements to be selected from a JSON with many distinct elements: isbn as id, author.firstName as first_name, author.lastName as last_name.
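
The flattening and "as" renaming described above can be illustrated with a minimal Python sketch. It uses only the standard library and is not the tool's actual implementation. The helper names (flatten, select_columns), the sample record, and the dotted-path convention for nested elements are assumptions made for illustration.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into one dict keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def select_columns(record, spec):
    """Apply a 'path as alias, ...' column spec to a single record."""
    flat = flatten(record)
    selected = {}
    for item in spec.split(","):
        # "isbn as id" -> path "isbn", alias "id"; no alias keeps the path.
        path, _, alias = item.strip().partition(" as ")
        selected[(alias or path).strip()] = flat.get(path.strip())
    return selected


# Hypothetical hierarchical record with more elements than we need.
book = json.loads(
    '{"isbn": "123-456-222", "title": "Sample", '
    '"author": {"firstName": "Jane", "lastName": "Doe"}}'
)

row = select_columns(
    book,
    "isbn as id, author.firstName as first_name, author.lastName as last_name",
)
print(row)
```

Running the sketch prints {'id': '123-456-222', 'first_name': 'Jane', 'last_name': 'Doe'}: only the three requested elements survive, each under its new name.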

Multi-line False sample:

{"isbn": "123-456-222", "lastname": "Doe", "firstname": "Jane"}
{"isbn": "123-456-777", "lastname": "Smith", "firstname": "Jane"}

Multi-line True sample:

[
	{"isbn": "123-456-222", "lastname": "Doe", "firstname": "Jane"},
	{"isbn": "123-456-777", "lastname": "Smith", "firstname": "Jane"}
]
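
To show why the two layouts need different settings, here is a minimal Python sketch that parses both samples. It uses only the standard library; the read_records helper and its multiline parameter are hypothetical stand-ins for the option described on this page, and the tool's own reader may behave differently.

```python
import json

# Multi-line False layout: one complete JSON object per line.
single_line = (
    '{"isbn": "123-456-222", "lastname": "Doe", "firstname": "Jane"}\n'
    '{"isbn": "123-456-777", "lastname": "Smith", "firstname": "Jane"}\n'
)

# Multi-line True layout: one JSON document spanning several lines.
multi_line = """[
    {"isbn": "123-456-222", "lastname": "Doe", "firstname": "Jane"},
    {"isbn": "123-456-777", "lastname": "Smith", "firstname": "Jane"}
]"""


def read_records(text, multiline=True):
    """Return a list of records from either layout."""
    if multiline:
        # Parse the whole text as a single JSON document.
        parsed = json.loads(text)
        return parsed if isinstance(parsed, list) else [parsed]
    # Parse each non-empty line as its own JSON document.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records_a = read_records(single_line, multiline=False)
records_b = read_records(multi_line, multiline=True)
print(records_a == records_b)
```

Both calls yield the same two records. Parsing the single-line sample as one whole document fails instead, because two top-level objects in a row are not a valid single JSON document.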

In a multi-root JSON structure, root-level columns can be selected in the columns section described above. To select nested columns, use the Select Columns shape from the palette.

