Oracle REGEXP_COUNT function

Last update on August 07 2024 11:38:13 (UTC/GMT +8 hours)

Description

The Oracle REGEXP_COUNT function is used to count the number of times that a pattern occurs in a string. It returns an integer indicating the number of occurrences of a pattern. If no match is found, then the function returns 0.

Uses of Oracle REGEXP_COUNT Function

Count Pattern Occurrences: Determine how many times a specific pattern appears in a string.

Case-Sensitive and Insensitive Searches: Perform both case-sensitive and case-insensitive pattern matching.

Flexible Start Position: Specify a starting position for the search within the string.

Modify Matching Behavior: Utilize various match parameters to alter the function's behavior for complex pattern matching.

Text Analysis: Useful for text analysis and data parsing tasks where pattern frequency needs to be determined.

Multiline Support: Handle multi-line strings by using the appropriate match parameters.

Whitespace Handling: Ignore or include whitespace characters in pattern matching based on requirements.

Syntax:

REGEXP_COUNT (source_char, pattern [, position [, match_param]])

Parameters:

Name	Description	Data Types
source_char	The string to search.	CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
pattern	pattern is the regular expression.	CHAR, VARCHAR2, NCHAR, or NVARCHAR2.
position	Optional. The position is a positive integer indicating the character of source_char where the search will start. If omitted, it defaults to 1 which is the first position in the string.
match_param	Optional. It lets you allow to modify the matching behavior for the REGEXP_COUNT function.

Here is the list of pattern formatting operator:

Operator	Description
\d	A digit character.
\D	A nondigit character.
\w	A word character.
\W	A nonword character.
\s	A whitespace character.
\S	A non-whitespace character.
\A	Matches only at the beginning of a string, or before a newline character at the end of a string.
\Z	Matches only at the end of a string.
*?	Matches the preceding pattern element 0 or more times (nongreedy).
+?	Matches the preceding pattern element 1 or more times (nongreedy).
??	Matches the preceding pattern element 0 or 1 time (nongreedy).
{n}?	Matches the preceding pattern element exactly n times (nongreedy).
{n,}?	Matches the preceding pattern element at least n times (nongreedy).
{n,m}?	Matches the preceding pattern element at least n but not more than mtimes (nongreedy).

Here is the list of Regular Expression Operators and Metasymbols:

Operator	Description
\	The backslash character can have four different meanings depending on the context. It can: Stand for itself, Quote the next character, Introduce an operator, Do nothing
*	Matches zero or more occurrences
+	Matches one or more occurrences
?	Matches zero or one occurrence
\|	Alternation operator for specifying alternative matches
^	Matches the beginning of a string by default. In multiline mode, it matches the beginning of any line anywhere within the source string.
$	Matches the end of a string by default. In multiline mode, it matches the end of any line anywhere within the source string.
.	Matches any character in the supported character set except NULL
[ ]	Bracket expression for specifying a matching list that should match any one of the expressions represented in the list. A non-matching list expression begins with a circumflex (^) and specifies a list that matches any character except for the expressions represented in the list. To specify a right bracket (]) in the bracket expression, place it first on the list (after the initial circumflex (^), if any). To specify a hyphen in the bracket expression, place it first on the list (after the initial circumflex (^), if any), last in the list, or as an ending range point in a range expression.
( )	Grouping expression, treated as a single subexpression
{m}	Matches exactly m times
{m,}	Matches at least m times
{m,n}	Matches at least m times but no more than n times
\n	The backreference expression (n is a digit between 1 and 9) matches the nth subexpression enclosed between '(' and ')' preceding the \n
[..]	Specifies one collation element, and can be a multicharacter element (for example, [.ch.] in Spanish)
[: :]	Specifies character classes (for example, [:alpha:]). It matches any character within the character class.
[==]	Specifies equivalence classes. For example, [=a=] matches all characters having the base letter 'a'.

Here is the list of values for match_param:

Value	Description
'c'	Perform case-sensitive matching.
'i'	Perform case-insensitive matching.
'n'	Allows the period character (.) to match the newline character. By default, the period is a wildcard.
'm'	expression is assumed to have multiple lines, where ^ is the start of a line and $ is the end of a line, regardless of the position of those characters in expression. By default, an expression is assumed to be a single line.
'x'	Whitespace characters are ignored. By default, whitespace characters are matched like any other character.

Applies to

Oracle 12c, Oracle 11g

Examples: Oracle REGEXP_COUNT function

The following example counts the number of times the character 't' appears in a string.

SELECT REGEXP_COUNT ('The web development Tutorial', 't') FROM dual;

Sample Output:

REGEXP_COUNT('THEWEBDEVELOPMENTTUTORIAL','T')
---------------------------------------------
                                            2

The above example will return 2 because it is counting the number of occurrences of 't' in the string. Here we did not specify a match_parameter value, the REGEXP_COUNT function will perform a case-sensitive search which means that the 'T' characters will not be included in the count.

In the following example, we have provided a start_position of 1 and a match_parameter of 'i', the query will return 4 as the result. This time, both 't' and 'T' values would be included in the count.

SELECT REGEXP_COUNT ('The web development Tutorial', 't',1,'i') FROM dual;

Sample Output:

REGEXP_COUNT('THEWEBDEVELOPMENTTUTORIAL','T',1,'I')
---------------------------------------------------
                                                  4

The following example will return the number of times that the word 'The' appears in the string.

SELECT REGEXP_COUNT ('Match the string for multiple character', 'The',1,'i') FROM dual;

Sample Output:

REGEXP_COUNT('MATCHTHESTRINGFORMULTIPLECHARACTER','THE',1,'I')
--------------------------------------------------------------
                                                             1

Previous: LENGTH
Next: REGEXP_INSTR