SQLite Collating Sequences
Description
At the time of comparison of two strings, in SQLite, it uses a collating sequence or collating function to understand which string is greater or whether the two strings are equal. SQLite has three built-in collating functions: BINARY, NOCASE, and RTRIM.
- BINARY - Compares string data using memcmp(), regardless of text encoding.
- NOCASE - It is almost same as binary, except the 26 upper case characters of ASCII are folded to their lower case equivalents before the comparison is performed.
- RTRIM - The same as binary, except that trailing space characters, are ignored.
The rules for determining which collating function to use for a binary comparison operator (=, <, >, <=, >=, !=, IS, and IS NOT) are as follows and in the order shown:
- If any of the operands has an explicit collating function assignment using the postfix COLLATE operator, then the explicit collating function is used for comparison, with precedence to the collating function of the left operand.
- If any of the operands is a column, then the collating function of that column is used with precedence to the left operand.
- Otherwise, the BINARY collating function is used for comparison.
Collation Sequence Examples
Here a table test has been created with four rows.
CREATE TABLE test(
coln INTEGER PRIMARY KEY,
cola, /* collating sequence BINARY */
colb COLLATE BINARY, /* collating sequence BINARY */
colc COLLATE RTRIM, /* collating sequence RTRIM */
cold COLLATE NOCASE /* collating sequence NOCASE */
);
Now insert some rows.
INSERT INTO test VALUES(1,'pqr','pqr', 'pqr ','pqr');
INSERT INTO test VALUES(2,'pqr','pqr', 'pqr', 'PQR');
INSERT INTO test VALUES(3,'pqr','pqr', 'pqr ', 'Pqr');
INSERT INTO test VALUES(4,'pqr','pqr ','PQR', 'pqr');
sqlite> select * from test; coln cola colb colc cold ---------- ---------- ---------- ---------- ------- 1 pqr pqr pqr pqr 2 pqr pqr pqr PQR 3 pqr pqr pqr Pqr 4 pqr pqr PQR pqr
The above table shows that the 3rd column of 1st row contain two spaces at the end, the forth column of 2nd row all the character are capital, the 3rd column of 3rd row contain single space at the end and 4th column of 3rd row contain first character capital, the 3rd column of 4th row contain all the character arer capital.
Text comparison cola=colb is performed using the BINARY collating sequence.
SELECT coln FROM test WHERE cola = colb ORDER BY coln;
Here is the result.
coln ---------- 1 2 3
Here in the above result cola and colb are matching for 1st three rows because the colb column for the 4th row have space at the end and the colb column collating sequence is BINARY, so it will match.
Text comparison cola=colb is performed using the RTRIM collating sequence.
SELECT coln FROM test WHERE cola = colb COLLATE rtrim ORDER BY coln;
Here is the result.
coln ---------- 1 2 3 4
Here in the above result cola and colb are matching for all the rows because the collating sequence is rtrim defined in the select statement.
Text comparison cold=cola is performed using the NOCASE collating sequence.
SELECT coln FROM test WHERE cold = cola ORDER BY coln;
Here is the result.
coln ---------- 1 2 3 4
Here in the above result shows that the column cold and cola are matching for all the rows because the cold column is case insensitivity.
Text comparison cola=cold is performed using the BINARY collating sequence.
SELECT coln FROM test WHERE cola = cold ORDER BY coln;
Here is the result.
coln ---------- 1 4
Here in the above result shows the column cola and cold are matching for only 1st and 4th rows.
Text comparison 'pqr'=colc is performed using the RTRIM collating sequence.
SELECT coln FROM test WHERE 'pqr' = colc ORDER BY coln;
Here is the result.
coln ---------- 1 2 3
Here in the above result shows that the column colc only matching with the expression as defined because this column is collating by rtrim.
Text comparison colc='pqr' is performed using the RTRIM collating sequence.
SELECT coln FROM test WHERE colc = 'pqr' ORDER BY coln;
Here is the result.
coln ---------- 1 2 3
Here in the above result shows that the column colc only matching with the expression as defined because this column is collating by rtrim.
Grouping is performed using the NOCASE collating sequence (Values 'pqr', 'PQR', and 'Pqr' are placed in the same group).
SELECT count(*) FROM test GROUP BY cold ORDER BY 1;
Here is the result.
count(*) ---------- 4
Grouping is performed using the BINARY collating sequence. 'pqr' and 'PQR' and 'Pqr' form different groups
SELECT count(*) FROM test GROUP BY (cold || '') ORDER BY 1;
Here is the result.
count(*) ---------- 1 1 2
Sorting or column colc is performed using the RTRIM collating sequence.
SELECT coln FROM test ORDER BY colc, coln;
Here is the result.
coln ---------- 4 1 2 3
Sorting of (colc||'') is performed using the BINARY collating sequence.
SELECT coln FROM test ORDER BY (colc||''), coln;
Here is the result.
coln ---------- 4 2 3 1
Sorting of column colc is performed using the NOCASE collating sequence.
SELECT coln FROM test ORDER BY colc COLLATE NOCASE, coln;
Here is the result.
coln ---------- 2 4 3 1
Previous:
Union
Next:
SQLite Aggregate Function AVG()
It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.
https://w3resource.com/sqlite/sqlite-collating-function-or-sequence.php
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics