Command Usage
csvq [options] [subcommand] ["query"|argument]
Execution of Statements
There are three ways to execute a query or statements.
# Pass to the csvq command as an argument
$ csvq 'SELECT id, name FROM users'
# Load from file
$ cat statements.sql
VAR @id := 0;
SELECT @id := @id + 1 AS id,
name
FROM users;
$ csvq -s statements.sql
# Execute in the interactive shell
$ csvq
csvq > SELECT id, name FROM users;
+----+-------+
| id | name |
+----+-------+
| 1 | Louis |
| 2 | Sean |
+----+-------+
csvq > UPDATE users SET name = 'Mildred' WHERE id = 2;
1 record updated on "/home/mithrandie/docs/csv/users.csv".
csvq > SELECT id, name FROM users;
+----+----------+
| id | name |
+----+----------+
| 1 | Louis |
| 2 | Mildred |
+----+----------+
csvq > COMMIT;
Commit: file "/home/mithrandie/docs/csv/users.csv" is updated.
csvq >
Options
- –repository PATH, -r PATH
- Directory Path where files are located. The default is the current directory.
- –timezone value, -z value
- Default Timezone. The default is Local.
Local, UTC or a timezone name in the IANA TimeZone database(in the form of “Area/Location”. e.g. “America/Los_Angeles”).
The timezone database is required in order to use the timezone names. Most Unix-like systems provide the database. But if your system does not provide it and you have not installed Go Lang, then you must put the database file named zoneinfo.zip to the directory “$ZONEINFO” or “$GOROOT/lib/time/”.
- –datetime-format value, -t value
- Datetime Format to parse strings.
Format string is the same as the function DATETIME_FORMAT.
This option can be specified multiple formats using JSON array of strings.
- –ansi-quotes, -k
- Use double quotation mark (U+0022
"
) as identifier enclosure. - –strict-equal, -g
- Compare strictly that two values are equal for DISTINCT, GROUP BY and ORDER BY.
- –wait-timeout value, -w value
- Limit of the waiting time in seconds to wait for locked files to be released. The default is 10.
- –source FILE, -s FILE
- Load query or statements from FILE.
- –import-format value, -i value
- Default format to load files. The default is CSV.
value(case ignored) | format |
---|---|
CSV | Character separated values. Separetor can be changed by -D option. |
TSV | Tab separated values |
FIXED | Fixed-Length Format |
JSON | JSON |
JSONL | JSON Lines |
LTSV | Labeled Tab-separated Values |
Regardless of this option, files with the following extensions will be read in a specific format.
file extension | format |
---|---|
.csv | CSV |
.tsv | TSV |
.json | JSON |
.jsonl | JSON Lines |
.ltsv | Labeled Tab-separated Values |
- –delimiter value, -d value
- Field delimiter for CSV. The default is a comma(U+002C
,
).A delimiter must be one character. Special Characters can be used with backslash escaping.
- –allow-uneven-fields
- Allow loading CSV files with uneven field length.
- –delimiter-positions value, -m value
- Delimiter positions for Fixed-Length Format. The default is “SPACES”.
You can specify “SPACES” or a JSON Array of integers. “SPACES” splits lines automatically by spaces. JSON Array is used to specify delimiter positions explicitly.
Delimiter positions indicate the number of bytes from the start of the line. For example, JSON Array “[5, 10, 15]” splits “1234567890abcde” as “12345, 67890, abcde”
If you want to operate Single-Line Fixed-Length Format, then connect a JSON Array to “S”(U+0053) or “s”(U+0073). For example, “S[2, 3, 6]” imports “01aabc02bdef03cghi” as “(‘01’, ‘a’, ‘abc’), (‘02’, ‘b’, ‘def’), (‘03’, ‘c’, ‘ghi’)”.
- –json-query QUERY, -j QUERY
- QUERY for JSON.
- –encoding value, -e value
- File encoding. Following encodings are supported. The default is AUTO.
value(case ignored) | character encoding |
---|---|
AUTO | Detect file encoding automatically |
UTF8 | UTF-8. Detect BOM automatically |
UTF8M | UTF-8 with BOM |
UTF16 | UTF-16. Detect BOM and Endian automatically |
UTF16BE | UTF-16 Big-Endian |
UTF16LE | UTF-16 Little-Endian |
UTF16BEM | UTF-16 Big-Endian with BOM |
UTF16LEM | UTF-16 Little-Endian with BOM |
SJIS | Shift_JIS |
JSON Format is supported only UTF-8.
Whatever the value of this option is, if the first character in a file is a UTF-8 byte order mark, the file will be loaded as UTF-8 encoding.
- –no-header, -n
- Import the first line as a record.
First line of a CSV file is dealt with as the header line. In case “–no-header” option passed, fields are automatically named as “c” and following sequential number. e.g. “c1”, “c2”, “c3”, …
- –without-null, -a
- Parse empty fields as empty strings.
In most cases CSV fields are imported as string values, but no-quoted empty fields are imported as nulls. By using the “–without-null” option, no-quoted empty fields are imported as empty string values.
- –out FILE, -o FILE
- Export result sets of select queries to FILE.
If the output file is not specified, the result sets are written to standard output.
- –strip-ending-line-break, -T
- Strip line break from the end of files and query results.
- –format value, -f value
- Format of query results. The default is TEXT, but CSV is used for output to pipe.
value(case ignored) | format |
---|---|
CSV | Character separated values. Separetor can be changed by -D option. |
TSV | Tab separated values |
FIXED | Fixed-Length Format |
JSON | JSON |
JSONL | JSON Lines |
LTSV | Labeled Tab-separated Values |
GFM | Text Table for GitHub Flavored Markdown |
ORG | Text Table for Emacs Org-mode |
BOX | Text Table using Box-drawing characters |
TEXT | Text Table for console |
JSONH | Alias of “–format JSON –json-escape HEX” |
JSONA | Alias of “–format JSON –json-escape HEXALL” |
When this option is not specified, the file specified by the –out option will be output in a specific format if it has the following extensions.
file extension | format |
---|---|
.csv | CSV |
.tsv | TSV |
.json | JSON |
.jsonl | JSON Lines |
.ltsv | LTSV |
.md | GFM |
.org | ORG |
- –write-encoding value, -E value
- Character encoding of query results. The default is UTF8.
value(case ignored) | character encoding |
---|---|
UTF8 | UTF-8 |
UTF8M | UTF-8 with BOM |
UTF16 | An alias of UTF16BE |
UTF16BE | UTF-16 Big-Endian |
UTF16LE | UTF-16 Little-Endian |
UTF16BEM | UTF-16 Big-Endian with BOM |
UTF16LEM | UTF-16 Little-Endian with BOM |
SJIS | Shift_JIS |
- –write-delimiter value, -D value
- Field delimiter for query results in CSV format. The default is a comma(U+002C
,
). - –write-delimiter-positions value, -M value
- Delimiter positions for query results in Fixed-Length format. The default is “SPACES”.
If the field value is shorter than the length of the field, the missing part is padded with SPACE(U+0020).
For example, JSON Array “[5, 10, 15]” combines “123, abc, def” into “␣␣123abc␣␣def␣␣”. - –without-header, -N
- Export result sets of select queries without the header line.
- –line-break value, -l value
- Line break in query results. One of following values. The default is LF.
value(case ignored) | unicode character |
---|---|
CRLF | U+000D Carriage Return and U+000A Line Feed |
CR | U+000D Carriage Return |
LF | U+000A Line Feed |
- –enclose-all, -Q
- Enclose all string values in CSV.
- –json-escape, -J
- JSON escape type. The default is BACKSLASH.
value(case ignored) | description |
---|---|
BACKSLASH | Escape special characters with backslashes(U+005C \ ) |
HEX | Escape special characters with six-character sequence |
HEXALL | Escape all strings with six-character sequence |
- –pretty-print, -P
- Make JSON output easier to read in query results.
- –scientific-notation -SN
- Use Scientific Notation for large exponents in output.
- –east-asian-encoding, -W
- Count ambiguous characters as fullwidth. If not, then that characters are counted as halfwidth.
- –count-diacritical-sign, -S
- Count diacritical signs as halfwidth. If not, then that characters are counted as zero-width.
- –count-format-code, -A
- Count format characters and zero-width spaces as halfwidth. If not, then that characters are counted as zero-width.
- –color, -c
- Use ANSI color escape sequences.
- –quiet, -q
- Suppress operation log output.
- –limit-recursion
- Maximum number of iterations for recursive queries. “-1” means no limit. The default is 1000.
- –cpu, -p
- Hint for the number of cpu cores to be used. The default is the half of the number of cpu cores.
- –stats, -x
- Show execution time and memory statistics.
- Query Execution Time
- execution time of one query. select, insert, update, or delete queries are measured.
- TotalTime
- total execution time
- TotalAlloc
- cumulative bytes of allocated heap objects
- HeapSys
- bytes of heap memory obtained from the OS
- Mallocs
- cumulative count of heap objects allocated
- Frees
- cumulative count of heap objects freed
- –help, -h
- Show help
- –version, -v
- Print the version
If you want to pass “false” to a boolean command option, you can specify it as “–option-name=false”.
Some command options can also be specified in statements by using Set Flag Statements.
Determination of file format
Loading
First, the file extension to be loaded is checked. If the extension is displayed in the following table, the associated file format will be selected. Otherwise, the passed value by the “–import-format” option will be used to load.
extension | file format |
---|---|
.csv | CSV |
.tsv | TSV |
.json | JSON |
.jsonl | JSON Lines |
.ltsv | LTSV |
The following options are available for loading.
- –delimiter value, -d value
- –delimiter-positions value, -m value
- –json-query QUERY, -j QUERY
- –encoding value, -e value
- –no-header, -n
- –without-null, -a
You can also use Table Object Expressions to specify the format each file. Table Object Expression effects the first loading in a transaction. After the second loading, the specifications in the table object expression are ignored. You must use the ROLLBACK statement to discard all changes in the transaction if you reload the same file.
Updating
The table attributes that were determined when loading will be used to updating. You can modify table attributes by using SET ATTRIBUTE Statement).
Creating
First, the file extension to be loaded is checked. If the extension is displayed in the following table, the associated file format will be selected. Otherwise CSV is used as the format. You can modify table attributes by using SET ATTRIBUTE Statement).
extension | file format |
---|---|
.csv | CSV |
.tsv | TSV |
.json | JSON |
.jsonl | JSON Lines |
.ltsv | LTSV |
.md | GitHub Flavored Markdown |
.org | Emacs Org-mode |
Exporting query results with the “–out” option
The passed value by the “–format” option will be used to export.
The following options are available for exporting.
- –write-encoding value, -E value
- –write-delimiter value, -D value
- –write-delimiter-positions value, -M value
- –without-header, -N
- –line-break value, -l value
- –enclose-all, -Q
- –json-escape, -J
- –pretty-print, -P
- –east-asian-encoding, -W
- –count-diacritical-sign, -S
- –count-format-code, -A
Subcommands
subcommand | description |
---|---|
fields | Show fields in file |
calc | Calculate value from stdin |
syntax | Print syntax |
check-update | Check for updates |
help, h | Shows help |
Fields Subcommand
Show fields in a file.
csvq [options] fields CSV_FILE_PATH
Calc Subcommand
Calculate a value from stdin.
csvq [options] calc "expression"
In the expression, the value from stdin is represented as “c1”.
Example:
$ echo 'foo' | csvq calc "base64_encode(c1)"
Zm9v
$ echo 123 | csvq calc "c1 + 100"
223
Syntax Subcommand
Print syntax.
csvq [options] syntax [search_word ...]
Check Update Subcommand
Check for updates.
csvq [options] check-update [subcommand options]
Subcommand Options
- –include-pre-release
- Check including pre-release version.
Configurations
Before an execution of csvq, the following processings will be performed.
- Load Environment Configurations.
- Execute Pre-Load Statements.
- Overwrite Flags with Command Options.
Environment Configurations
Configuration Files are loaded in the following order.
- $XDG_CONFIG_HOME/csvq/csvq_env.json or $HOME/.config/csvq/csvq_env.json
- $HOME/.csvq_env.json
- $HOME/.csvq/csvq_env.json
- CURRENT_DIRECTORY/csvq_env.json
Download a JSON file from https://github.com/mithrandie/csvq/blob/master/csvq_env.json and put it to the above paths. If the files exists in the multiple paths, then all existing files will be loaded and configuration items are overwritten with the values in the file that is loaded later.
Configuration Items in JSON
Item | Format | default |
---|---|---|
datetime_format | array of strings | |
timezone | string | Local |
ansi_quotes | bool | false |
interactive_shell.history_file | string | .csvq_history |
interactive_shell.history_limit | number | 500 |
interactive_shell.prompt | string | |
interactive_shell.continuous_prompt | string | |
interactive_shell.completion | bool | true |
interactive_shell.kill_whole_line | bool | false |
interactive_shell.vi_mode | bool | false |
environment_variables | object{var_name: string} | |
palette.effectors | object{effect_name: effect_object} |
Interactive Shell
Except for Prompt and Continuous Prompt, items in this category are effective only on the following systems.
- darwin dragonfly freebsd linux netbsd openbsd solaris windows
History File
File name to save command history. If it is a relative path, then the path is interpreted as a relative path from your home directory. On UNIX-like systems, you can use environment variable such as $HOME or ${HOME} to specify the path. If not, ‘~’ can be used to represent home directory.
History Limit
Max length of command history. If history_limit is set to -1, then the command history is disabled.
Prompt, Continuous Prompt
Appearance of the prompt on the interactive shell.
You can embed some expressions in the appearance by using a Dollar Sign(U+0024 $
) and Curly Brackets(U+007B, U+007D {}
)
Completion
Whether to use completion.
Kill Whole Line
If true then keyboard shortcut “Ctrl+U” will remove the entire line. If not, the shortcut will remove before the current cursor position on the line.
Vi Mode
Whether to use vi-mode.
Effect Object
Effects
Effects are specified with an array of strings. Following strings are supported.
- Reset Bold Faint Italic Underline SlowBlink RapidBlink ReverseVideo Conceal CrossedOut
Colors
Foreground and background colors are specified with one of the following values.
- 8 Color (String)
- Following strings are supported.
- Black Red Green Yellow Blue Magenta Cyan White BrightBlack BrightRed BrightGreen BrightYellow BrightBlue BrightMagenta BrightCyan BrightWhite DefaultColor
- 256 Color (Number)
- Color number from 0 to 255.
- RGB Color (Array of 3 numbers)
- The first element is the intensity of red between 0 and 255, the second is green, and the third is blue.
Pre-Load Statements
Files in which statements are written will be loaded and executed in the following order.
- $XDG_CONFIG_HOME/csvq/csvqrc or $HOME/.config/csvq/csvqrc
- $HOME/.csvqrc
- $HOME/.csvq/csvqrc
- CURRENT_DIRECTORY/csvqrc
Special Characters
In command parameters and statements, following strings represent special characters.
string | unicode character |
---|---|
\a | U+0007 Bell |
\b | U+0008 Backspace |
\f | U+000C Form Feed |
\n | U+000A Line Feed |
\r | U+000D Carriage Return |
\t | U+0009 Horizontal Tab |
\v | U+000b Vertical Tab |
\” | U+0022 Double Quote |
\’ | U+0027 Single Quote (in strings only) |
\` | U+0060 Grave Accent (in identifiers only) |
\\ | U+005c Backslash |
Escaped Characters in JSON output
character | BACKSLASH | HEX | HEXALL |
---|---|---|---|
U+0022 Quotation Mark | \” | \u0022 | \u0022 |
U+005C Backslash | \\ | \u005C | \u005C |
U+002F Solidus | \/ | \u002F | \u002F |
U+0008 Backspace | \b | \u0008 | \u0008 |
U+000C Form Feed | \f | \u000C | \u000C |
U+000A Line Feed | \n | \u000A | \u000A |
U+000D Carriage Return | \r | \u000D | \u000D |
U+0009 Horizontal Tab | \t | \u0009 | \u0009 |
U+0000 - U+001F | \uXXXX | \uXXXX | \uXXXX |
- U+FFFF | N/A | N/A | \uXXXX |
U+10000 - | N/A | N/A | \uXXXX\uXXXX (UTF-16 surrogate pair) |
Return Code
code | description |
---|---|
0 | Normally terminated |
1 | Errors inside the csvq |
2 | Incorrect command usage |
4 | Syntax errors |
8 | Timeout errors |
16 | I/O errors |
32 | Errors outside the csvq |
64 | The default of triggered errors by TRIGGER ERROR statements |
128+n | Terminated by signal “n” |