Extracting data from nested JSON arrays in Snowflake database

I have a Snowflake table with one column called “Data” where each record contains JSON objects. I need help extracting specific values from nested arrays within these JSON structures.

Record 1: {
  "musician": {
    "id": "band_001",
    "name": "ThunderCats"
  },
  "metadata": {
    "genres": [
      {
        "genreID": "90201",
        "priority": "85"
      },
      {
        "genreID": "90155",
        "priority": "22"
      }
    ],
    "length": "195000",
    "trackCodes": [
      {
        "codeType": "UPC",
        "code": "upc_001"
      }
    ]
  }
}

Record 2: {
  "musician": {
    "id": "band_002",
    "name": "SkyWalkers"
  },
  "metadata": {
    "genres": [
      {
        "genreID": "90202",
        "priority": "76"
      },
      {
        "genreID": "90155",
        "priority": "18"
      }
    ],
    "length": "220000",
    "trackCodes": [
      {
        "codeType": "UPC",
        "code": "upc_002"
      },
      {
        "codeType": "UPC",
        "code": "upc_003"
      }
    ]
  }
}

My goal is to create a query that returns two columns: one with all UPC codes and another with their corresponding genres. Each record might have multiple codes and genres. I want the final result to show each code-genre combination as separate rows. How can I flatten these nested JSON arrays properly in Snowflake?

The FLATTEN approach works great, but watch your JSON path syntax. Use PARSE_JSON first if your data column’s stored as a string. Here’s a complete solution:

SELECT 
    codes.value:code::STRING as upc_code,
    genres.value:genreID::STRING as genre_id
FROM your_table_name,
LATERAL FLATTEN(input => PARSE_JSON(data):metadata.trackCodes) codes,
LATERAL FLATTEN(input => PARSE_JSON(data):metadata.genres) genres
WHERE codes.value:codeType = 'UPC';

The main changes: I added a WHERE clause to filter only UPC codes and used explicit ::STRING casting. This prevents data type mismatches and grabs only the code type you want. The PARSE_JSON wrapper handles cases where your data column has JSON as text instead of native JSON objects.

use LATERAL FLATTEN twice - once on trackCodes, then on genres. try SELECT f1.value:code, f2.value:genreID FROM your_table, LATERAL FLATTEN(input => data:metadata.trackCodes) f1, LATERAL FLATTEN(input => data:metadata.genres) f2. you’ll get the cartesian product of codes and genres, which sounds like what you want.