BigQuery datasets: Routines
In BigQuery, routines refer to a resource type that encompasses stored procedures, user-defined functions (UDFs) (including remote functions), and table functions. They allow users to encapsulate and reuse logic for data processing and analysis tasks within BigQuery.
Source:
Contents:
Manage routines
Create
Depending on the routine type, run one of the following DDL statements:
Stored procedure:
CREATE PROCEDUREUser-defined function:
CREATE FUNCTIONTable function:
CREATE TABLE FUNCTION
List
SELECT
COLUMN_LIST
FROM
{ dataset | region }.INFORMATION_SCHEMA.ROUTINES;View the body of a routine
SELECT
routine_definition
FROM
{ dataset | region }.INFORMATION_SCHEMA.ROUTINES
WHERE
routine_name = routine_name;Delete
Depending on the routine type, run one of the following DDL statements:
Stored procedure:
DROP PROCEDUREUser-defined function:
DROP FUNCTIONTable function:
DROP TABLE FUNCTION
User-defined functions
A user-defined function (UDF) allows you to create a function using SQL expression or JavaScript code. It processes input columns and returns the result.
UDFs can be persistent or temporary. Persistent UDFs can be reused across queries, while temporary UDFs exist only for a single query. Persistent UDFs are safe for shared use, but they cannot mutate data or interact with external systems.
Use CREATE FUNCTION to create UDFs and DROP FUNCTION to delete persistent ones. Temporary UDFs expire after the query finishes.
Temporary UDFs
A Temporary User-Defined Function (UDF) in SQL is a custom function that is defined within a single query or session. It is not stored in the database schema and is only available for the duration of the session in which it is created. Temporary UDFs are useful for performing custom calculations or transformations that are specific to a single query or session.
-- Creating a temporary UDF
CREATE TEMP FUNCTION MyCustomFunction(arg1 TYPE, arg2 TYPE)
RETURNS RETURN_TYPE AS (
-- Function logic here
);
-- Using the temporary UDF in a query
SELECT MyCustomFunction(column1, column2) FROM TableName;Persistent UDFs
A Persistent User-Defined Function (UDF) in SQL is a custom function that is defined and stored in the database schema. Once created, it is available across multiple sessions and queries, similar to built-in functions. Persistent UDFs are useful for encapsulating complex logic or calculations that are used frequently across different queries and by multiple users.
-- Creating a persistent UDF within a specific dataset or schema CREATE FUNCTION DatasetName.MyCustomFunction(arg1 TYPE, arg2 TYPE) RETURNS RETURN_TYPE AS ( -- Function logic here ); -- Using the persistent UDF in a query SELECT DatasetName.MyCustomFunction(column1, column2) FROM TableName;Notes:
Replace
TYPEwith the actual data type of the arguments (e.g.,INT64,FLOAT64).Replace
RETURN_TYPEwith the actual data type of the function's return value.In the persistent UDF example,
DatasetNameshould be replaced with the actual name of the dataset or schema where the UDF will be stored.
Templated SQL UDF parameters
Templated SQL UDF parameters in BigQuery allow
ANY TYPEfor flexible argument types. MultipleANY TYPEparameters have no enforced relationship. Return type must be explicit or inferred, notANY TYPE. Incompatible argument types cause errors.
-- Define BigQuery UDF with ANY TYPE args
CREATE TEMP FUNCTION MyFlexibleFunction(arg1 ANY TYPE, arg2 ANY TYPE)
RETURNS <ReturnType> -- Define return type
LANGUAGE sql AS ("""
-- Compatible SQL logic for arg1, arg2
""");
-- Use UDF with various argument types
SELECT MyFlexibleFunction(10, 20); -- With INTEGER
SELECT MyFlexibleFunction('text1', 'text2'); -- With STRING
-- Ensure function logic matches argument types.Scalar subqueries
A Scalar Subquery in SQL returns a single value from one row and can be embedded in larger queries or SQL UDFs to dynamically compute values, enhancing query complexity and dynamism.
-- Create a temporary table for demonstration purposes
CREATE TEMP TABLE example_table AS (
SELECT 1 AS id, 'CategoryA' AS category UNION ALL
SELECT 2, 'CategoryB' UNION ALL
SELECT 3, 'CategoryA'
);
-- Define a SQL UDF that uses a scalar subquery
CREATE TEMP FUNCTION CountEntriesByCategory(categoryName STRING)
RETURNS INT64 AS (
-- Scalar subquery to count entries matching the specified category
(SELECT COUNT(*) FROM example_table WHERE category = categoryName)
);
-- Use the UDF to count entries for specific categories
SELECT
CountEntriesByCategory('CategoryA') AS count_category_a,
CountEntriesByCategory('CategoryB') AS count_category_b;Default project in SQL expressions
In SQL UDFs, reference BigQuery entities with their project ID unless they're in the UDF's project. For example, `CREATE FUNCTION` works without the project ID for local entities but requires it for external references. Include the project ID for entities outside the UDF's project.
See the official documentation to see the documentation for JavaScript UDF
Table functions
A table function, or table-valued function (TVF), is a type of user-defined function designed to return a table. It can be utilized in any context where a table is applicable. Unlike views, table functions can accept parameters, offering flexible data retrieval options.
Create TVF
Template example:
CREATE TEMP FUNCTION MyTableFunction(param1 TYPE, param2 TYPE)
RETURNS TABLE<column1_name column1_type, column2_name column2_type, ...>
AS (
-- SQL query that generates the table structure
SELECT
expression_as_column1,
expression_as_column2,
...
FROM
YourSourceTable
WHERE
conditions_based_on_params
);Use: You can call a table function in any context where a table is valid.
Delete TVF
DROP TABLE FUNCTION mydataset.MyTableFunctionAuthorize routines
Authorize table functions as routines to share query results while restricting access to underlying data. These routines allow specific users or groups to see computed aggregations or table lookups without direct table access.Limitations
Table functions in BigQuery accept only scalar values as parameters, not tables.
The body of table functions must contain only SELECT statements; modifications, DDL, and DML are not allowed. Use procedures for side effects.
Table functions and the tables they reference must be stored in the same location.
Remote functions
Functionality: Allows implementing functions in languages beyond SQL and JavaScript, or using libraries/services not permitted in BigQuery's UDF.
Integration: Facilitates GoogleSQL functionality with external software through integration with Cloud Functions and Cloud Run, supporting any language.
Workflow:
Create an HTTP endpoint in Cloud Functions or Cloud Run.
Set up a remote function in BigQuery.
Establish a CLOUD_RESOURCE connection.
Invoke the remote function in GoogleSQL queries like other user-defined functions.
Limitations:
Supports only certain data types: Boolean, Bytes, Numeric, String, Date, Datetime, Time, Timestamp, JSON.
Does not support ARRAY, STRUCT, INTERVAL, or GEOGRAPHY types.
Temporary or table-valued remote functions cannot be created.
Remote functions can't be used in materialized views.
Query results involving remote functions aren't cached due to their non-deterministic return value.
Repeated requests may occur due to network or internal errors.
Batching is not used when remote function evaluation is skipped for some rows, affecting efficiency.
If using cross-region dataset replication, the remote function can only be queried in its creation region.
For more information see the official documentation
SQL stored procedures
Basics:
Collection of SQL statements callable from queries or other procedures.
Supports input arguments and output values.
Stored within a BigQuery dataset.
Can access/modify data across multiple datasets.
May include multi-statement queries.
System Procedures:
Built-in, require no creation.
Documented in the System procedures reference.
Features:
Supports procedural language for variables and control flow.
Creation:
Use
CREATE PROCEDUREstatement.Encapsulate statements between
BEGINandEND.
-- Creation
CREATE OR REPLACE PROCEDURE your_dataset_name.simple_procedure()
BEGIN
SELECT "Hello, World!";
END;
-- Calling
CALL your_dataset_name.simple_procedure();Parameters:
Input: Pass values into procedures.
-- Creation CREATE OR REPLACE PROCEDURE your_dataset_name.procedure_with_input(IN input_param STRING) BEGIN SELECT FORMAT("Received input: %s", input_param); END; -- Calling CALL your_dataset_name.procedure_with_input('Sample Input');
Output (
OUT): Return values from procedures.-- Creation CREATE OR REPLACE PROCEDURE your_dataset_name.procedure_with_output(OUT output_param STRING) BEGIN SET output_param = 'Hello from output parameter'; END; -- Calling DECLARE output_var STRING; CALL your_dataset_name.procedure_with_output(output_var); SELECT output_var AS OutputValue;
Input/Output (
INOUT): Accepts and returns values.-- Creation CREATE OR REPLACE PROCEDURE your_dataset_name.procedure_with_inout(INOUT inout_param STRING) BEGIN SET inout_param = CONCAT(inout_param, ', modified'); END; -- Calling DECLARE inout_var STRING DEFAULT 'Initial Value'; CALL your_dataset_name.procedure_with_inout(inout_var); SELECT inout_var AS ModifiedValue;
Authorization:
Stored procedures can be authorized as routines, allowing shared query results without direct data access.
Calling Procedures:
Use
CALLto execute a stored procedure or a system procedure.
This summary encapsulates the creation, usage, and features of SQL stored procedures in BigQuery, focusing on procedure types, parameter handling, and authorization aspects.


