A UUID (Universally Unique Identifier) is a 128-bit number used to identify information in computer systems. It is typically written as a string of 32 hexadecimal digits divided into five hyphen-separated groups. UUIDs uniquely identify objects and records such as files, events, and logins. They also turn up in session identifiers and similar tokens, although most UUID versions are not designed to be secret and should not stand in for passwords or cryptographic keys. This article shows how to write a regex for UUIDs and how to use it to check whether a string is a valid UUID.
Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.
Structure of a UUID
A UUID should meet the following criteria and have the following structure:
- It should be a 128-bit number.
- It should be 36 characters (32 hexadecimal characters and 4 hyphens) long.
- It should be displayed in five groups separated by hyphens (-).
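The three criteria above can be checked by hand on a sample value. The following is a minimal sketch in Python. The sample string and the variable names are chosen here for illustration, and the standard-library `uuid` module is used only to confirm the 128-bit size.

```python
import uuid

# Sample UUID string, chosen for illustration only.
text = "1d5a52f1-5432-2144-8cad-9b1eda4a3a3d"

# Five hyphen-separated groups of 8, 4, 4, 4 and 12 hexadecimal digits.
groups = text.split("-")
group_lengths = [len(group) for group in groups]

# 32 hexadecimal digits remain once the 4 hyphens are removed.
hex_digits = text.replace("-", "")

# uuid.UUID parses the string into 16 raw bytes, i.e. 128 bits.
parsed = uuid.UUID(text)
bit_length = len(parsed.bytes) * 8

print(len(text), group_lengths, len(hex_digits), bit_length)
# prints: 36 [8, 4, 4, 4, 12] 32 128
```

Note that `uuid.UUID` only checks that the string holds 32 hexadecimal digits; it does not enforce the version or variant digits that the regex in this article checks.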
Regex for checking if UUID is valid or not
The regex should cover UUID versions 1 through 5, all of which are in active use today. Versions such as v1 and v2 are older, but they are still used at massive scale in systems and processes worldwide.
Regular Expression for UUID validation for all versions (v1-v5)-
/^[0-9A-F]{8}-[0-9A-F]{4}-[1-5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
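As a rough sketch of how this pattern might be used in code, here is one way to wrap it in a Python helper. The names `UUID_PATTERN` and `is_valid_uuid` are made up for this example. The sketch assumes a single candidate string is being validated, so it drops the `^`, `$`, `g` and `m` parts and uses `fullmatch` instead, which requires the whole string to match.

```python
import re

# Same character classes as the regex above. re.IGNORECASE plays the role
# of the "i" flag. The anchors are omitted because fullmatch() already
# requires the pattern to cover the entire string.
UUID_PATTERN = re.compile(
    r"[0-9A-F]{8}-[0-9A-F]{4}-[1-5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}",
    re.IGNORECASE,
)


def is_valid_uuid(candidate: str) -> bool:
    """Return True if candidate is a v1-v5 UUID in 8-4-4-4-12 form."""
    return UUID_PATTERN.fullmatch(candidate) is not None


print(is_valid_uuid("1d5a52f1-5432-2144-8cad-9b1eda4a3a3d"))  # True
print(is_valid_uuid("1231234532311231"))                      # False
```

Using `fullmatch` is a design choice: in Python, `$` also matches just before a trailing newline, so an anchored `match` would accept `"<uuid>\n"`, while `fullmatch` rejects it.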
If you need a version-specific regex, use the one for that version below-
Regex for UUID v1
/^[0-9A-F]{8}-[0-9A-F]{4}-[1][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
Regex for UUID v2
/^[0-9A-F]{8}-[0-9A-F]{4}-[2][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
Regex for UUID v3
/^[0-9A-F]{8}-[0-9A-F]{4}-[3][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
Regex for UUID v4
/^[0-9A-F]{8}-[0-9A-F]{4}-[4][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
Regex for UUID v5
/^[0-9A-F]{8}-[0-9A-F]{4}-[5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
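The five version-specific patterns differ only in the first digit of the third group, so they can be generated from one template. The sketch below does that in Python. The function name `uuid_pattern_for_version` is hypothetical, and `uuid.uuid4()` from the standard library is used only to produce a sample v4 value.

```python
import re
import uuid


def uuid_pattern_for_version(version: int) -> re.Pattern:
    """Build a case-insensitive pattern for one UUID version (1 to 5)."""
    if version not in range(1, 6):
        raise ValueError("version must be between 1 and 5")
    # Doubled braces are literal braces inside an f-string; {version}
    # becomes the single digit that starts the third group.
    return re.compile(
        rf"[0-9A-F]{{8}}-[0-9A-F]{{4}}-{version}[0-9A-F]{{3}}"
        rf"-[89AB][0-9A-F]{{3}}-[0-9A-F]{{12}}",
        re.IGNORECASE,
    )


v4_pattern = uuid_pattern_for_version(4)
sample_v4 = str(uuid.uuid4())  # random v4 UUID from the standard library

print(v4_pattern.fullmatch(sample_v4) is not None)  # True
print(v4_pattern.fullmatch("1d5a52f1-5432-2144-8cad-9b1eda4a3a3d") is not None)  # False (v2)
```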
Test string examples for the above regex-
Input String | Match Output |
---|---|
asd-asd-asd-asd-asd-asd | does not match |
1d5a52f1-5432-2144-8cad-9b1eda4a3a3d | matches |
1231234532311231 | does not match |
adda88a2-aaaa-1234-8cad-1cce2a2a3a3e | matches |
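To reproduce the table, the four input strings can be run through the all-versions pattern. This sketch keeps the `^` and `$` anchors from the article and maps the `m` and `i` flags to `re.MULTILINE` and `re.IGNORECASE`. The variable names are illustrative.

```python
import re

# All-versions pattern from the article, anchors included.
PATTERN = re.compile(
    r"^[0-9A-F]{8}-[0-9A-F]{4}-[1-5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$",
    re.IGNORECASE | re.MULTILINE,
)

# The four input strings from the table above.
rows = [
    "asd-asd-asd-asd-asd-asd",
    "1d5a52f1-5432-2144-8cad-9b1eda4a3a3d",
    "1231234532311231",
    "adda88a2-aaaa-1234-8cad-1cce2a2a3a3e",
]

results = {row: PATTERN.search(row) is not None for row in rows}
for row, matched in results.items():
    print(f"{row} | {'matches' if matched else 'does not match'}")
```

The second row matches as a version 2 UUID and the fourth as a version 1 UUID. The first and third rows fail on length and on the hyphen layout.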
Here is a detailed explanation of the above regex-
/^[0-9A-F]{8}-[0-9A-F]{4}-[1-5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/gmi
^ asserts position at start of a line
Match a single character present in the list below [0-9A-F]
{8} matches the previous token exactly 8 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
A-F matches a single character in the range between A (index 65) and F (index 70) (case insensitive)
- matches the character - with index 45 in decimal (2D in hexadecimal, 55 in octal) literally (case insensitive)
Match a single character present in the list below [0-9A-F]
{4} matches the previous token exactly 4 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
A-F matches a single character in the range between A (index 65) and F (index 70) (case insensitive)
- matches the character - with index 45 in decimal (2D in hexadecimal, 55 in octal) literally (case insensitive)
Match a single character present in the list below [1-5]
1-5 matches a single character in the range between 1 (index 49) and 5 (index 53) (case insensitive)
Match a single character present in the list below [0-9A-F]
{3} matches the previous token exactly 3 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
A-F matches a single character in the range between A (index 65) and F (index 70) (case insensitive)
- matches the character - with index 45 in decimal (2D in hexadecimal, 55 in octal) literally (case insensitive)
Match a single character present in the list below [89AB]
89AB matches a single character in the list 89AB (case insensitive)
Match a single character present in the list below [0-9A-F]
{3} matches the previous token exactly 3 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
A-F matches a single character in the range between A (index 65) and F (index 70) (case insensitive)
- matches the character - with index 45 in decimal (2D in hexadecimal, 55 in octal) literally (case insensitive)
Match a single character present in the list below [0-9A-F]
{12} matches the previous token exactly 12 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
A-F matches a single character in the range between A (index 65) and F (index 70) (case insensitive)
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
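The effect of each flag is easiest to see by switching it off. The sketch below is a Python approximation. Python has no `g` flag, so `re.findall` is used here as the closest equivalent of "return all matches". The sample text and variable names are made up for the example.

```python
import re

PATTERN_SOURCE = (
    r"^[0-9A-F]{8}-[0-9A-F]{4}-[1-5][0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$"
)

# Three lines: a lowercase UUID, a non-UUID, and an uppercase UUID.
text = "\n".join([
    "1d5a52f1-5432-2144-8cad-9b1eda4a3a3d",
    "not-a-uuid",
    "ADDA88A2-AAAA-1234-8CAD-1CCE2A2A3A3E",
])

# "i" and "m" together: ^ and $ match at every line, case is ignored.
both_flags = re.findall(PATTERN_SOURCE, text, re.IGNORECASE | re.MULTILINE)

# Without "m": ^ and $ only match at the ends of the whole string.
without_multiline = re.findall(PATTERN_SOURCE, text, re.IGNORECASE)

# Without "i": lowercase hex digits no longer satisfy [0-9A-F].
without_ignorecase = re.findall(PATTERN_SOURCE, text, re.MULTILINE)

print(len(both_flags))      # 2
print(without_multiline)    # []
print(without_ignorecase)   # ['ADDA88A2-AAAA-1234-8CAD-1CCE2A2A3A3E']
```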
Hope this article helped you check whether a string is a valid UUID. We looked at what UUIDs are and why they matter for uniquely identifying objects and records in computer systems, walked through their structure, and saw how a regular expression can validate them for versions 1 through 5, both in general and per version.