GSTIN stands for Goods and Services Tax Identification Number, used in India. It is a standard registration number assigned to a person who has registered for Goods and Services Tax (GST). You need to register for GST once your turnover crosses a threshold. In this article, let’s understand how to create a regex for GSTIN and how to match a GSTIN against it.
Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.
Structure of GSTIN
- It should be 15 characters long.
- The first 2 characters should be digits (the state code).
- The next 10 characters should be the PAN (Permanent Account Number) of the taxpayer.
- The 13th character (entity code) should be a digit from 1-9 or an uppercase letter.
- The 14th character should be Z.
- The 15th character should be an uppercase letter or a digit.
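
The structure above can be sketched as a pattern with one named group per part, which makes it easy to pull a GSTIN apart. This is a minimal Python sketch, not an official parser: the function name `split_gstin` and the group names are hypothetical labels chosen for illustration.

```python
import re
from typing import Optional

# One named group per part of the structure listed above.
# The group names are illustrative, not official field names.
GSTIN_PARTS = re.compile(
    r"(?P<state_code>[0-9]{2})"        # characters 1-2: two digits
    r"(?P<pan>[A-Z]{5}[0-9]{4}[A-Z])"  # characters 3-12: PAN (5 letters, 4 digits, 1 letter)
    r"(?P<entity_code>[1-9A-Z])"       # character 13: 1-9 or a letter
    r"(?P<fixed_z>Z)"                  # character 14: always Z
    r"(?P<last_char>[0-9A-Z])"         # character 15: a letter or a digit
)


def split_gstin(gstin: str) -> Optional[dict]:
    """Return the parts of a well-formed GSTIN, or None if the format is wrong."""
    match = GSTIN_PARTS.fullmatch(gstin)
    return match.groupdict() if match else None


print(split_gstin("06AADCV1460H1ZI"))
# {'state_code': '06', 'pan': 'AADCV1460H', 'entity_code': '1', 'fixed_z': 'Z', 'last_char': 'I'}
```

`fullmatch` is used so that the whole string must fit the pattern, which also enforces the 15-character length.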
Regex for checking if GSTIN is valid
Regular Expression-
/^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$/gm
Test string examples for the above regex-
Input String | Match Output |
---|---|
06AAD2V1160H122 | does not match |
13BZWCV3512J1ZB | matches |
222222222222222 | does not match |
06AADCV1460H1ZI | matches |
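
To reproduce the table above in code, the same pattern can be run over the four test strings. The sketch below uses Python's built-in `re` module. The helper name `is_valid_gstin_format` is an assumption made for this example, and it checks the format only.

```python
import re

# Same pattern as in the article, without the /.../gm wrapper,
# which is JavaScript/PCRE literal syntax rather than part of the pattern.
GSTIN_RE = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$")


def is_valid_gstin_format(value: str) -> bool:
    """Return True if the whole string has the GSTIN format."""
    return GSTIN_RE.fullmatch(value) is not None


samples = [
    "06AAD2V1160H122",
    "13BZWCV3512J1ZB",
    "222222222222222",
    "06AADCV1460H1ZI",
]

for sample in samples:
    result = "matches" if is_valid_gstin_format(sample) else "does not match"
    print(f"{sample} | {result}")
```

The pattern is case sensitive, so a lowercase input such as `06aadcv1460h1zi` does not match unless it is converted to uppercase first.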
Here is a detailed explanation of the above regex-
/^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$/gm
^ asserts position at start of a line
Match a single character present in the list below [0-9]
{2} matches the previous token exactly 2 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
Match a single character present in the list below [A-Z]
{5} matches the previous token exactly 5 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Match a single character present in the list below [0-9]
{4} matches the previous token exactly 4 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
Match a single character present in the list below [A-Z]
{1} matches the previous token exactly one time (meaningless quantifier)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Match a single character present in the list below [1-9A-Z]
{1} matches the previous token exactly one time (meaningless quantifier)
1-9 matches a single character in the range between 1 (index 49) and 9 (index 57) (case sensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
Z matches the character Z with index 90 in decimal (5A in hexadecimal or 132 in octal) literally (case sensitive)
Match a single character present in the list below [0-9A-Z]
{1} matches the previous token exactly one time (meaningless quantifier)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
$ asserts position at the end of a line
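The token-by-token explanation above can also be kept next to the pattern itself. As a sketch, Python's `re.VERBOSE` flag allows whitespace and comments inside the pattern. The `{1}` quantifiers are dropped here because they do not change what is matched, and the name `matches_gstin` is a hypothetical helper for this example. No multi-line flag is set, so `^` and `$` refer to the whole string.

```python
import re

GSTIN_VERBOSE = re.compile(
    r"""
    ^             # start of the string
    [0-9]{2}      # two digits
    [A-Z]{5}      # five uppercase letters
    [0-9]{4}      # four digits
    [A-Z]         # one uppercase letter
    [1-9A-Z]      # a digit from 1-9 or an uppercase letter
    Z             # the literal character Z
    [0-9A-Z]      # a digit or an uppercase letter
    $             # end of the string
    """,
    re.VERBOSE,
)


def matches_gstin(value: str) -> bool:
    """Return True if the whole string fits the annotated pattern."""
    return GSTIN_VERBOSE.fullmatch(value) is not None


print(matches_gstin("13BZWCV3512J1ZB"))  # True
print(matches_gstin("222222222222222"))  # False
```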
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
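The two flags matter mostly when the input holds several candidates, one per line. In Python there is no `g` flag: `findall` already returns every match, and `re.MULTILINE` plays the role of `m`. The sketch below assumes one candidate per line, and `find_gstins` is a hypothetical helper name.

```python
import re

# re.MULTILINE makes ^ and $ match at the start and end of each line,
# like the m modifier described above.
GSTIN_LINE_RE = re.compile(
    r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$",
    re.MULTILINE,
)


def find_gstins(text: str) -> list:
    """Return every line of the text that has the GSTIN format."""
    # findall returns all matches, which is what the g modifier does.
    return GSTIN_LINE_RE.findall(text)


text = "06AAD2V1160H122\n13BZWCV3512J1ZB\n222222222222222\n06AADCV1460H1ZI"
print(find_gstins(text))
# ['13BZWCV3512J1ZB', '06AADCV1460H1ZI']
```

When validating a single input field, the multi-line behaviour is usually not wanted, because a multi-line string containing one good line would then count as a match.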
We hope this article helped you understand the GSTIN regex pattern. In conclusion, implementing this pattern is a useful first step for checking that a Goods and Services Tax Identification Number is well formed. The regex checks only the format; it cannot confirm that a GSTIN has actually been issued. Regex is a powerful tool for validating complex patterns and can be used effectively to screen GSTINs, contributing to accurate tax compliance and record-keeping.