SQL-like query language for csv

Analytic Functions

Analytic functions calculate values of groups. Analytic Functions can be used only in Select Clause and Order By Clause

name description
ROW_NUMBER Return sequential numbers
RANK Return ranks
DENSE_RANK Return ranks without any gaps in the ranking
CUME_DIST Return cumulative distributions
PERCENT_RANK Return relative ranks
NTILE Return the number of groups
FIRST_VALUE Return the first value in a group
LAST_VALUE Return the last value in a group
NTH_VALUE Return the n-th value in a group
LAG Return the value in a previous row
LEAD Return the value in a following row
COUNT Return the number of values in a group
MIN Return the minimum value in a group
MAX Return the maximum value in a group
SUM Return the sum of values in a group
AVG Return the average of values in a group
STDEV Return the sample standard deviation of values
STDEVP Return the population standard deviation of values
VAR Return the sample variance of values
VARP Return the population variance of values
MEDIAN Return the median of values in a group
LISTAGG Return the concatenated string of values in a group
JSON_AGG Return the string formatted in JSON array of values in a group

Basic Syntax

analytic_function
  : function_name([args]) OVER ([partition_clause] [order_by_clause [windowing_clause]])

args
  : value [, value ...]

partition_clause
  : PARTITION BY value [, value ...]

windowing_clause
  : ROWS window_position
  | ROWS BETWEEN window_frame_low AND window_frame_high

window_position
  : {UNBOUNDED PRECEDING|offset PRECEDING|CURRENT ROW}

window_frame_low
  : {UNBOUNDED PRECEDING|offset PRECEDING|offset FOLLOWING|CURRENT ROW}

window_frame_high
  : {UNBOUNDED FOLLOWING|offset PRECEDING|offset FOLLOWING|CURRENT ROW}

value
value
order_by_clause
Order By Clause
offset
integer

Analytic Functions sort the result set by order_by_clause and calculate values within each of groups partitioned by partition_clause. If there is no partition_clause, then all records of the result set are dealt with as one group.

Definitions

ROW_NUMBER

ROW_NUMBER() OVER ([partition_clause] [order_by_clause])
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
integer

Returns the sequential numbers of records in a group.

RANK

RANK() OVER ([partition_clause] [order_by_clause])
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
integer

Returns the ranks of records in a group.

DENSE_RANK

DENSE_RANK() OVER ([partition_clause] [order_by_clause])
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
integer

Returns the ranks of records without any gaps in the ranking in a group.

CUME_DIST

CUME_DIST() OVER ([partition_clause] [order_by_clause])
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
float

Returns the cumulative distributions in a group. The return value is greater than 0 and less than or equal to 1.

PERCENT_RANK

PERCENT_RANK() OVER ([partition_clause] [order_by_clause])
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
float

Returns the relative ranks in a group. The return value is greater than or equal to 0 and less than or equal to 1.

NTILE

NTILE(number_of_groups) OVER ([partition_clause] [order_by_clause])
number_of_groups
integer
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
integer

Splits the records into number_of_groups, then returns the sequential numbers of the groups.

FIRST_VALUE

FIRST_VALUE(expr) [IGNORE NULLS] OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
primitive type

Returns the first value in a group. If IGNORE NULLS keywords are specified, then returns the first value that is not a null.

LAST_VALUE

LAST_VALUE(expr) [IGNORE NULLS] OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
primitive type

Returns the last value in a group. If IGNORE NULLS keywords are specified, then returns the last value that is not a null.

NTH_VALUE

NTH_VALUE(expr, n) [IGNORE NULLS] OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
n
integer
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
primitive type

Returns the n-th value in a group. If IGNORE NULLS keywords are specified, then returns the n-th value excluding null values.

LAG

LAG(expr [, offset [, default]]) [IGNORE NULLS] OVER ([partition_clause] [order by clause])
expr
value
offset
integer

The number of rows from current row. The default is 1.

default
value

The value to set when the offset row does not exist. The default is NULL.

partition_clause
Partition Clause
order_by_clause
Order By Clause
return
primitive type

Returns the value in a previous row. If IGNORE NULLS keywords are specified, then rows that expr values are null will be skipped.

LEAD

LEAD(expr [, offset [, default]]) [IGNORE NULLS] OVER ([partition_clause] [order by clause])
expr
value
offset
integer

The number of rows from current row. The default is 1.

default
value

The value to set when the offset row does not exist. The default is NULL.

partition_clause
Partition Clause
order_by_clause
Order By Clause
return
primitive type

Returns the value in a following row. If IGNORE NULLS keywords are specified, then rows that expr values are null will be skipped.

COUNT

COUNT([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
integer

Returns the number of non-null values of expr.

COUNT([DISTINCT] *) OVER ([partition_clause])
partition_clause
Partition Clause
return
integer

Returns the number of all values including null values.

MIN

MIN(expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
primitive type

Returns the minimum value of non-null values of expr. If all values are null, then returns a null.

MAX

MAX(expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
primitive type

Returns the maximum value of non-null values of expr. If all values are null, then returns a null.

SUM

SUM([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the sum of float values of expr. If all values are null, then returns a null.

AVG

AVG([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the average of float values of expr. If all values are null, then returns a null.

STDEV

STDEV([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the sample standard deviation of float values of expr. If all values are null, then returns a null.

STDEVP

STDEVP([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the population standard deviation of float values of expr. If all values are null, then returns a null.

VAR

VAR([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the sample variance of float values of expr. If all values are null, then returns a null.

VARP

VARP([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the population variance of float values of expr. If all values are null, then returns a null.

MEDIAN

MEDIAN([DISTINCT] expr) OVER ([partition_clause] [order_by_clause [windowing_clause]])
expr
value
partition_clause
Partition Clause
return
float

Returns the median of float or datetime values of expr. If all values are null, then returns a null.

Even if expr values are datetime values, this function returns a float or integer value. The return value can be converted to a datetime value by using the DATETIME function.

LISTAGG

LISTAGG([DISTINCT] expr [, separator]) OVER ([partition_clause] [order by clause])
expr
value
separator
string
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
string

Returns the string result with the concatenated non-null values of expr. If all values are null, then returns a null.

separator is placed between values. Empty string is the default.

JSON_AGG

JSON_AGG([DISTINCT] expr) OVER ([partition_clause] [order by clause])
expr
value
partition_clause
Partition Clause
order_by_clause
Order By Clause
return
string

Returns the string formatted in JSON array of expr.