SQL-like query language for csv

String Functions

name description
TRIM Return a string with all the specified leading and trailing characters removed
LTRIM Return a string with all the specified leading characters removed
RTRIM Return a string with all the specified trailing characters removed
UPPER Return a string with all characters mapped to their upper case
LOWER Return a string with all characters mapped to their lower case
BASE64_ENCODE Return a base64 encoding of a string
BASE64_DECODE Return a string represented by a base64 encoding
HEX_ENCODE Return a hexadecimal encoding of a string
HEX_DECODE Return a string represented by a hexadecimal encoding
LEN Return the number of characters of a string
BYTE_LEN Return the byte length of a string
WIDTH Return the string width of a string
LPAD Return a string left-side padded
RPAD Return a string right-side padded
SUBSTRING Return the substring of a string
SUBSTR Return the substring of a string using zero-based indexing
INSTR Return the index of the first occurrence of a substring
LIST_ELEM Return a element of a list
REPLACE Return a string replaced the substrings with another string
REGEXP_MATCH Verify a string matches with a regular expression
REGEXP_FIND Return a string that matches a regular expression
REGEXP_FIND_SUBMATCHES Return a string representing an array that matches a regular expression
REGEXP_FIND_ALL Return a string representing a nested array that matches a regular expression
REGEXP_REPLACE Return a string replaced substrings that match a regular expression with another strings
TITLE_CASE Returns a string converted to Title Case
FORMAT Return a formatted string
JSON_VALUE Return a value from json
JSON_OBJECT Return a string formatted in json object

Definitions

TRIM

TRIM(str)
str
string
return
string

Returns the string value that is removed all leading and trailing white spaces from str.

TRIM(str, charset)
str
string
charset
string
return
string

Returns the string value that is removed all leading and trailing characters contained in charset from str.

LTRIM

LTRIM(str)
str
string
return
string

Returns the string value that is removed all leading white spaces from str.

LTRIM(str, charset)
str
string
charset
string
return
string

Returns the string value that is removed all leading characters contained in charset from str.

RTRIM

RTRIM(str)
str
string
return
string

Returns the string value that is removed all trailing white spaces from str.

RTRIM(str, charset)
str
string
charset
string
return
string

Returns the string value that is removed all trailing characters contained in charset from str.

UPPER

UPPER(str)
str
string
return
string

Returns the string value replaced str with characters mapped to their upper case.

LOWER

LOWER(str)
str
string
return
string

Returns the string value replaced str with characters mapped to their upper case.

BASE64_ENCODE

BASE64_ENCODE(str)
str
string
return
string

Returns the Base64 encoding of str.

BASE64_DECODE

BASE64_DECODE(str)
str
string
return
string

Returns the string value represented by str that is encoded with Base64.

HEX_ENCODE

HEX_ENCODE(str)
str
string
return
string

Returns the hexadecimal encoding of str.

HEX_DECODE

HEX_DECODE(str)
str
string
return
string

Returns the string value represented by str that is encoded with hexadecimal.

LEN

LEN(str)
str
string
return
integer

Returns the number of characters of str.

BYTE_LEN

BYTE_LEN(str [, encoding])
str
string
encoding
string

“UTF8”, “UTF16” or “SJIS”. The default is “UTF8”.

return
integer

Returns the byte length of str.

WIDTH

WIDTH(str)
str
string
return
integer

Returns the string width of str. Half-width characters are counted as 1, and full-width characters are counted as 2.

LPAD

LPAD(str, len, padstr [, pad_type, encoding])
str
string
len
integer
padstr
string
pad_type
string

“LEN”, “BYTE” or “WIDTH”. The default is “LEN”.

encoding
string

“UTF8”, “UTF16” or “SJIS”. The default is “UTF8”.

return
string

Returns the string value of str padded with leading padstr to a length specified by len.

RPAD

RPAD(str, len, padstr [, pad_type, encoding])
str
string
len
integer
padstr
string
pad_type
string

“LEN”, “BYTE” or “WIDTH”. The default is “LEN”.

encoding
string

“UTF8”, “UTF16” or “SJIS”. The default is “UTF8”.

return
string

Returns the string value of str padded with trailing padstr to a length specified by len.

SUBSTRING

SUBSTRING(str FROM position [FOR len])
SUBSTRING(str, position [, len])
str
string
position
integer
len
integer
return
string

Returns the len characters in str starting from the position-th character using one-based positional indexing.

If position is 0, then it is treated as 1.
if len is not specified or len is longer than the length from position to the end, then returns the substring from position to the end.
If position is negative, then starting position is position from the end of the str.

SUBSTR

SUBSTR(str, position [, len])
str
string
position
integer
len
integer
return
string

This function behaves the same as SUBSTRING, but uses zero-based positional indexing.

INSTR

INSTR(str, substr)
str
string
substr
string
return
integer

Returns the index of the first occurrence of substr in str, or null if substr is not present in str.

LIST_ELEM

LIST_ELEM(str, sep, index)
str
string
sep
string
index
integer
return
string

Returns the string at index in the list generated by splitting with sep from str.

REPLACE

REPLACE(str, old, new)
str
string
old
string
new
string
return
string

Returns the string that is replaced all occurrences of old with new in str.

REGEXP_MATCH

REGEXP_MATCH(str, regexp [, flags])
str
string
regexp
string
flags
string

A string including the flags of regular expressions

return
ternary

Verifies the string str matches with the regular expression regexp.

REGEXP_FIND

REGEXP_FIND(str, regexp [, flags])
str
string
regexp
string
flags
string

A string including the flags of regular expressions

return
string

Returns the string that matches the regular expression regexp in str.

Examples

# Return the matched string.
$ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'cdef')"
+----------------------------------------+
| REGEXP_FIND('ABCDEFG abcdefg', 'cdef') |
+----------------------------------------+
| cdef                                   |
+----------------------------------------+

# Return the submatched string if there is a submatch expression.
$ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'c(de)f')"
+------------------------------------------+
| REGEXP_FIND('ABCDEFG abcdefg', 'c(de)f') |
+------------------------------------------+
| de                                       |
+------------------------------------------+

# Return the first matched string if there are multiple matched strings.
$ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'cdef', 'i')"
+---------------------------------------------+
| REGEXP_FIND('ABCDEFG abcdefg', 'cdef', 'i') |
+---------------------------------------------+
| CDEF                                        |
+---------------------------------------------+

REGEXP_FIND_SUBMATCHES

REGEXP_FIND_SUBMATCHES(str, regexp [, flags])
str
string
regexp
string
flags
string

A string including the flags of regular expressions

return
string

A string representing a JSON array.

Returns the string representing an array that matches the regular expression regexp in str.

Examples

# Return all the first matched strings including submatches.
$ csvq "SELECT REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i')"
+----------------------------------------------------------+
| REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i') |
+----------------------------------------------------------+
| ["CDEF","DE"]                                            |
+----------------------------------------------------------+

# Return only the submatched string.
$ csvq "SELECT JSON_VALUE('[1]', REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i'))"
+-----------------------------------------------------------------------------+
| JSON_VALUE('[1]', REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i')) |
+-----------------------------------------------------------------------------+
| DE                                                                          |
+-----------------------------------------------------------------------------+

REGEXP_FIND_ALL

REGEXP_FIND_ALL(str, regexp [, flags])
str
string
regexp
string
flags
string

A string including the flags of regular expressions

return
string

A string representing a nested JSON array.

Returns the string representing a nested array that matches the regular expression regexp in str.

Examples

# Return all the matched strings.
$ csvq "SELECT REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i')"
+---------------------------------------------------+
| REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i') |
+---------------------------------------------------+
| [["CDEF","DE"],["cdef","de"]]                     |
+---------------------------------------------------+

# Return only submatched strings as an array.
$ csvq "SELECT JSON_VALUE('[][1]', REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i'))"
+------------------------------------------------------------------------+
| JSON_VALUE('[][1]', REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i')) |
+------------------------------------------------------------------------+
| ["DE","de"]                                                            |
+------------------------------------------------------------------------+

REGEXP_REPLACE

REGEXP_REPLACE(str, regexp, replacement_value [, flags])
str
string
regexp
string
replacement_value
string
flags
string

A string including the flags of regular expressions

return
string

Returns the string replaced substrings that match the regular expression regexp with replacement_value in str.

TITLE_CASE

TITLE_CASE(str)
str
string
return
string

Returns a string with the first letter of each word in str capitalized.

FORMAT

FORMAT(format [, replace_value ... ])
format
string
replace_value
value
return
string

Returns a formatted string replaced placeholders with replace in format.

Format Placeholder

%[flag][width][.precision]specifier
flag
flag description
+ Print a plus sign for numeric values
‘ ’ (U+0020 Space) Print a space instead of a plus sign
- Pad on the right
0 Pad with zeros
width
integer

Width of the replaced string.

precision
integer

Number of digits after the decimal point for a float value, or max length for a string value.

specifier
specifier description
b Base 2 integer
o Base 8 integer
d Base 10 integer
x Base 16 integer with lower cases
X Base 16 integer with upper cases
e Exponential notation with lower cases
E Exponential notation with upper cases
f Floating point decimal number
s String representation of the value
q Quoted string representation of the value
i Quoted identifier representation of the value
T Type of the value
% ’%’

Quoted string and identifier representations are escaped for special characters.

JSON_VALUE

JSON_VALUE(json_query, json_data)
json_query
string

JSON Query to uniquely specify a value.

json_data
string
return
value

Returns a value in _json_data.

A JSON values are converted to following types.

JSON value csvq value
object string
array string
number integer or float
string string
true boolean
false boolean
null null

JSON_OBJECT

JSON_OBJECT([field [, field ...]])

field
  : value
  : value AS alias
value
value
alias
identifier
return
string

Returns a string formatted in JSON.

If no arguments are passed, then the object include all fields in the view.

Flags of regular expressions

flag description
i case-insensitive
m multi-line mode
s let . match \n
U swap meaning of x* and x*?, x+ and x+?, etc.