Understanding JSON_ARRAYAGG in MariaDB for Efficient JSON Array Creation


What is JSON_ARRAYAGG?

JSON_ARRAYAGG is a built-in aggregate function in MariaDB that allows you to construct a JSON array from the results of a query. It aggregates (combines) values from a single column or expression into a single JSON array.

Syntax

SELECT JSON_ARRAYAGG(expression) AS alias
FROM table_name
[WHERE condition];
  • alias: This is an optional name you can assign to the resulting JSON array.
  • expression: This can be a column name, a constant value, or an expression that evaluates to a single value for each row.

How it Works

  1. Evaluation
    MariaDB evaluates the expression for each row in the SELECT statement.
  2. Aggregation
    The results of the expression are collected and added to the JSON array being constructed.
  3. Output
    The final JSON array is returned as the value of the specified alias (if provided) or as the unnamed result of the query.

Example

SELECT JSON_ARRAYAGG(
    CONCAT('{"name": "', name, '", "city": "', city, '"}')
) AS customers
FROM customers
WHERE city = 'London';

This query constructs a JSON array like this (assuming there are two customers in London):

[{"name": "John Doe", "city": "London"}, {"name": "Jane Smith", "city": "London"}]

Key Points

  • The maximum length of the returned JSON array is controlled by the group_concat_max_len server system variable. You can adjust this variable if you expect very large arrays.
  • If the expression evaluates to NULL in all rows, the function returns NULL.
  • The order of elements in the resulting JSON array is not guaranteed.
  • Consider using prepared statements to prevent SQL injection vulnerabilities, especially if you're dynamically constructing the expression based on user input.
  • For more complex JSON object structures, you might need to use string concatenation functions like CONCAT to create the desired JSON format within the expression.


Simple Array of Column Values

This query retrieves a JSON array containing just the product_name values from the products table:

SELECT JSON_ARRAYAGG(product_name) AS product_names
FROM products;

Array with Filtering and Ordering

This example gets a JSON array of order_id and total_amount for orders placed after a specific date, ordered by total_amount descending:

SELECT JSON_ARRAYAGG(CONCAT('{"order_id": ', order_id, ', "total_amount": ', total_amount, '}')) AS orders
FROM orders
WHERE order_date > '2024-06-15'
ORDER BY total_amount DESC;

Array with Custom Formatting

This query constructs a JSON array with formatted product details, including id, name, and a discounted price (assuming a 10% discount):

SELECT JSON_ARRAYAGG(
    CONCAT('{"id": ', id, ', "name": "', name, '", "discounted_price": ', CAST(price * 0.9 AS DECIMAL(5,2)), '}')
) AS products
FROM products;

Filtering Based on Expression

This example retrieves a JSON array of category names for categories with at least 5 products:

SELECT JSON_ARRAYAGG(category_name) AS categories
FROM categories
WHERE (SELECT COUNT(*) FROM products WHERE category_id = categories.id) >= 5;


GROUP_CONCAT with CONCAT

This technique combines the GROUP_CONCAT function, which aggregates string values, with the CONCAT function to format the individual elements and delimit them appropriately for a JSON array.

SELECT CONCAT('[', GROUP_CONCAT(CONCAT('"', name, '"')), ']') AS names
FROM customers;

This query builds a JSON array of customer names, assuming the name column exists.

Caveats

  • Null values in the expression can lead to unexpected results. Be mindful of null handling.
  • GROUP_CONCAT has a maximum length limitation, which you might need to adjust using SET SESSION group_concat_max_len = value;.

User-Defined Functions (UDFs)

If you need more complex logic for constructing the JSON array or require specific formatting, you can create a custom UDF in MariaDB. This approach offers greater flexibility but requires additional development effort.

Application-Level Logic

For simple scenarios, it might be easier to handle JSON array creation in your application code instead of relying on the database layer. This allows you to use more control over the structure and formatting of the array. However, this approach separates the data retrieval from the data manipulation logic.

  • For basic scenarios where JSON manipulation is easier in your application
    Handle array creation in your application code.
  • For complex JSON array construction or specific formatting requirements
    Consider creating a UDF.
  • For older MariaDB versions or when you need more granular control over formatting
    Utilize GROUP_CONCAT with CONCAT.
  • For basic JSON array creation in newer MariaDB versions (10.5+)
    Use JSON_ARRAYAGG for its simplicity and efficiency.