Flattening

Caution

This page is a work in progress. The information may not be complete, and unlike the Spreadsheet Designer's Guide, the tests backing up the documented examples are not correct. Use with caution.

So far, all the examples you've seen have been about unflattening - taking a spreadsheet and producing a JSON document.

In this section you'll learn about flattening. The main use case for wanting to flatten a JSON document is so that you can manage the data in a spreadsheet from now on.

Flatten Tool provides flatten-tool flatten sub-command for this purpose.

Generating a spreadsheet from a JSON document

Generating a spreadsheet from a JSON document is very similar to creating a template.

$ flatten-tool flatten --root-list-path=cafe --main-sheet-name=cafe --schema=examples/receipt/cafe.schema examples/receipt/normalised/expected.json -o examples/flatten/simple/actual

One difference is that the default output name is flatten, and so the command above will generate a flatten/ directory with CSV files and a flatten.xlsx file in the current working directory.

The schema is the same as the one used in the user guide and looks like this:

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "definitions": {
        "TableObject": {
            "type": "object",
            "properties": {
                "id": {
                    "type": "string",
                    "title": "Identifier"
                },
                "number": {
                    "type": "integer",
                    "title": "Number"
                },
                "dish": {
                    "items": {
                        "$ref": "#/definitions/DishObject"
                    },
                    "type": "array",
                    "title": "Dish"
                }
            }
        },
        "DishObject": {
            "type": "object",
            "properties": {
                "id": {
                    "type": "string",
                    "title": "Identifier"
                },
                "name": {
                    "type": "string",
                    "title": "Name"
                },
                "cost": {
                    "type": "number",
                    "title": "Cost"
                }
            }
        }
    },
    "type": "object",
    "properties": {
        "id": {
            "type": "string",
            "title": "Identifier"
        },
        "name": {
            "type": "string",
            "title": "Name"
        },
        "address": {
            "type": "string",
            "title": "Address"
        },
        "table": {
            "items": {
                "$ref": "#/definitions/TableObject"
            },
            "type": "array",
            "title": "Table"
        }
    }
}

The input JSON file looks like this:

{
    "cafe": [
        {
            "id": "CAFE-HEALTH",
            "name": "Healthy Cafe",
            "table": [
                {
                    "id": "TABLE-1",
                    "number": "1",
                    "dish": [
                        {
                            "name": "Fish and Chips",
                            "cost": "9.95"
                        },
                        {
                            "name": "Pesto Pasta Salad",
                            "cost": "6.95"
                        }
                    ]
                },
                {
                    "id": "TABLE-2",
                    "number": "2"
                },
                {
                    "id": "TABLE-3",
                    "number": "3",
                    "dish": [
                        {
                            "name": "Fish and Chips",
                            "cost": "9.95"
                        }
                    ]
                }
            ]
        },
        {
            "id": "CAFE-VEG",
            "name": "Vegetarian Cafe",
            "table": [
                {
                    "id": "TABLE-16",
                    "number": "16",
                    "dish": [
                        {
                            "name": "Large Glass Sauvignon",
                            "cost": "5.95"
                        }
                    ]
                },
                {
                    "id": "TABLE-17",
                    "number": "17"
                }
            ]
        }
    ]
}

If you run the example above, Flatten Tool will generate the following CSV files for you, populated with the data from the input JSON file.

Warning

You can't use --use-titles with flatten.

sheet: cafe.csv
id name address
CAFE-HEALTH Healthy Cafe  
CAFE-VEG Vegetarian Cafe  
sheet: table.csv
id table/0/id table/0/number
CAFE-HEALTH TABLE-1 1
CAFE-HEALTH TABLE-2 2
CAFE-HEALTH TABLE-3 3
CAFE-VEG TABLE-16 16
CAFE-VEG TABLE-17 17
sheet: tab_dish.csv
id table/0/id table/0/dish/0/id table/0/dish/0/name table/0/dish/0/cost
sheet: dish.csv
id table/0/id table/0/dish/0/name table/0/dish/0/cost
CAFE-HEALTH TABLE-1 Fish and Chips 9.95
CAFE-HEALTH TABLE-1 Pesto Pasta Salad 6.95
CAFE-HEALTH TABLE-3 Fish and Chips 9.95
CAFE-VEG TABLE-16 Large Glass Sauvignon 5.95

Caution

If you forget the --root-list-path option and your data isn't under a top level key called main, Flatten Tool won't find your data and will instead generate empty a single empty sheet called main, which probably isn't what you want.

All flatten options

$ flatten-tool flatten -h
usage: flatten-tool flatten [-h] [-s SCHEMA] [-f OUTPUT_FORMAT]
                            [-m MAIN_SHEET_NAME] [-o OUTPUT_NAME]
                            [--root-list-path ROOT_LIST_PATH] [--rollup]
                            [-r ROOT_ID] [--use-titles]
                            input_name

positional arguments:
  input_name            Name of the input JSON file.

optional arguments:
  -h, --help            show this help message and exit
  -s SCHEMA, --schema SCHEMA
                        Path to a relevant schema.
  -f OUTPUT_FORMAT, --output-format OUTPUT_FORMAT
                        Type of template you want to create. Defaults to all
                        available options
  -m MAIN_SHEET_NAME, --main-sheet-name MAIN_SHEET_NAME
                        The name of the main sheet, as seen in the first tab
                        of the spreadsheet for example. Defaults to main
  -o OUTPUT_NAME, --output-name OUTPUT_NAME
                        Name of the outputted file. Will have an extension
                        appended if format is all.
  --root-list-path ROOT_LIST_PATH
                        Path of the root list, defaults to main
  --rollup              "Roll up" columns from subsheets into the main sheet
                        if they are specified in a rollUp attribute in the
                        schema.
  -r ROOT_ID, --root-id ROOT_ID
                        Root ID of the data format, e.g. ocid for OCDS
  --use-titles          Convert titles. Requires a schema to be specified.