Parsing JSON in Snowflake - Lateral & flatten


An interesting challenge with opportunities to delve into mixed blood Spanish-royal lineages, with a pinch of nested json. The input is a nested JSON file that looks like the one below, and the desired output is a wide table with clean attributes.

[
  {
    "Era": "Pre-Transition",
    "Houses": [
      {
        "House": "Austria",
        "Monarchs": [
          {
            "Name": "Felipe II",
            "Nickname": "el Prudente",
            "Birth": "1527-05-21",
            "Start of Reign": "1556-01-16",
            "End of Reign": "1598-09-13",
            "Duration": "42 years and 240 days",
            "Death": "1598-09-13",
            "Consort\\/Queen Consort": [
              "Maria I de Inglaterra",
              "Isabel de Valois",
              "Ana de Austria"
            ],
            "Place of Birth": "Valladolid",
            "Place of Death": "San Lorenzo de El Escorial",
            "Age at Time of Death": "71 years",
            "Burial Place": "Cripta Real del Monasterio de El Escorial"
          },
          ...
        

My full solution contains the final code in the form of a dbt docs page, but the TLDR is that I followed a three layer approach: Three layer approach The first model runs consecutive lateral flatten on house and monarch, the second (aux) model retrieves the unique list of keys (attributes), and the third one loops over the list of keys and creates the wide table. First model:

WITH input_ AS (
    SELECT $1 AS value FROM @{{ stage_name }}
),

temp AS (
    SELECT
        o_r.value:Era::varchar AS era,
        h.index AS house_index,
        h.value:"House"::varchar AS house_name,
        m.index AS monarch_index,
        m.value AS monarchs,
        array_to_string(object_keys(monarchs), ',') AS keys
    FROM input_ AS o_r,
        LATERAL flatten(o_r.value:Houses) AS h,
        LATERAL flatten(h.value:Monarchs) AS m
)

SELECT * FROM temp

And the output: wide table

Thanks for reading!

Find this and other FrostyFriday challenges with dbt in my repo .

Back to blog