A hostname is a name that identifies a computer or other device on a network. It is used to distinguish one device from another and to locate resources on the network. A hostname is typically composed of a local name and a domain name. The local name identifies the device, and the domain name identifies the network to which the device belongs. For example, the hostname “computer1.example.com” consists of the local name “computer1” and the domain name “example.com”. This article shows how to build a regex for hostnames and how to match strings against it.
Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.
Structure of a HostName
A hostname should have the following structure:
- It may optionally begin with www. or another subdomain.
- It must then contain a domain name.
- It ends with a top-level domain (TLD) such as .com, .net, or .io.
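To make the structure concrete, here is a minimal sketch in Python that splits a hostname into the parts listed above. The function name `split_hostname` is hypothetical, and the sketch assumes the TLD is always the single last label, so multi-part suffixes such as .co.uk are not handled.

```python
def split_hostname(hostname: str) -> dict:
    """Naively split a hostname into subdomains, domain name, and TLD.

    Assumption: the last label is the TLD and the label before it is the
    domain name. Everything in front of those two is treated as subdomains.
    """
    labels = hostname.split(".")
    has_tld = len(labels) >= 2
    return {
        "subdomains": labels[:-2],
        "domain": labels[-2] if has_tld else None,
        "tld": labels[-1] if has_tld else None,
    }


parts = split_hostname("www.google.com")
print(parts["subdomains"], parts["domain"], parts["tld"])
```

This only illustrates the structure; it does not validate the characters in each label, which is what the regex is for.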
Regex for checking if HostName is valid or not
Regular Expression-
/^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$/igm
Test string examples for the above regex-
Input String | Match Output |
---|---|
.as10 | does not match |
www.google.com | matches |
#@$some .qwq.eras | does not match |
something.debugpointer.com | matches |
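The results in the table can be reproduced with a short script. The following is a minimal sketch in Python; the helper name `is_valid_hostname` is hypothetical. It assumes each input is a single candidate string, so it uses `re.fullmatch` instead of the `g` and `m` flags.

```python
import re

# The article's pattern, transcribed for Python's re module.
# The /i flag becomes re.IGNORECASE; /g and /m are not needed because
# fullmatch checks one whole string at a time.
HOSTNAME_RE = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$",
    re.IGNORECASE,
)


def is_valid_hostname(candidate: str) -> bool:
    """Return True if the whole candidate string matches the pattern."""
    return HOSTNAME_RE.fullmatch(candidate) is not None


samples = [
    ".as10",
    "www.google.com",
    "#@$some .qwq.eras",
    "something.debugpointer.com",
]
for sample in samples:
    verdict = "matches" if is_valid_hostname(sample) else "does not match"
    print(f"{sample!r}: {verdict}")
```

Because the dot-terminated group may repeat zero times, the pattern also accepts a single label such as localhost.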
Here is a detailed explanation of the above regex-
/^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$/igm
^ asserts position at start of a line
1st Capturing Group (([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
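The note about repeated capturing groups is easy to see in practice. Here is a small sketch in Python, assuming the pattern is used unchanged with the standard `re` module. Groups 1 and 2 keep only the last dot-terminated label, so the sketch recovers all labels by splitting the full match.

```python
import re

HOSTNAME_RE = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$"
)

match = HOSTNAME_RE.match("www.google.com")

# Groups 1 and 2 sit inside the repeated part, so they hold only the
# final iteration ("google."), not the earlier one ("www.").
last_dotted = match.group(1)
last_label = match.group(2)
final_label = match.group(3)
print(last_dotted, last_label, final_label)

# To get every label, split the whole match on dots instead.
all_labels = match.group(0).split(".")
print(all_labels)
```

If the individual groups are not needed, the capturing groups can be replaced with non-capturing ones, written `(?:...)`, without changing which strings match.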
2nd Capturing Group ([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])
1st Alternative [a-zA-Z0-9]
Match a single character present in the list below [a-zA-Z0-9]
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
2nd Alternative [a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]
Match a single character present in the list below [a-zA-Z0-9]
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
Match a single character present in the list below [a-zA-Z0-9\-]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
\- matches the character - with index 45 in decimal (2D in hexadecimal or 55 in octal) literally (case insensitive)
Match a single character present in the list below [a-zA-Z0-9]
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
\. matches the character . with index 46 in decimal (2E in hexadecimal or 56 in octal) literally (case insensitive)
3rd Capturing Group ([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])
1st Alternative [A-Za-z0-9]
Match a single character present in the list below [A-Za-z0-9]
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
2nd Alternative [A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]
Match a single character present in the list below [A-Za-z0-9]
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
Match a single character present in the list below [A-Za-z0-9\-]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
\- matches the character - with index 45 in decimal (2D in hexadecimal or 55 in octal) literally (case insensitive)
Match a single character present in the list below [A-Za-z0-9]
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
$ asserts position at the end of a line
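The breakdown above repeats the same label sub-pattern in both halves of the regex. As an illustration, the sketch below builds an equivalent pattern in Python from one label definition and compares it with the original on a few strings. The names `LABEL`, `COMPOSED`, and `ORIGINAL` are hypothetical, and the comparison is a spot check on sample inputs, not a proof of equivalence.

```python
import re

# One label: starts and ends with a letter or digit, with hyphens
# allowed only in between. A single character is also a valid label.
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

# Zero or more dot-terminated labels, followed by one final label.
COMPOSED = re.compile(rf"^(?:{LABEL}\.)*{LABEL}$", re.IGNORECASE)

ORIGINAL = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$",
    re.IGNORECASE,
)

samples = [
    "www.google.com",
    ".as10",
    "a-b.example.com",
    "-a.com",
    "a-.com",
    "#@$some .qwq.eras",
]
agreement = [
    bool(COMPOSED.fullmatch(s)) == bool(ORIGINAL.fullmatch(s)) for s in samples
]
print(all(agreement))
```

Writing the label once makes the rule easier to read: every label starts and ends with a letter or digit, and labels are joined by single dots.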
Global pattern flags
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
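The effect of the flags can be sketched in Python, where `i` corresponds to `re.IGNORECASE`, `m` to `re.MULTILINE`, and the role of `g` is played by `re.finditer`, which returns every match. The sample text below is made up for illustration.

```python
import re

PATTERN = (
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$"
)

# Three candidate hostnames, one per line; the middle one is invalid.
text = "www.google.com\n.as10\nSomething.DebugPointer.COM"

# With MULTILINE, ^ and $ anchor at every line, so each line is
# checked on its own and finditer collects all matching lines.
multi = re.compile(PATTERN, re.IGNORECASE | re.MULTILINE)
found = [m.group(0) for m in multi.finditer(text)]
print(found)

# Without MULTILINE, ^ and $ anchor only at the ends of the whole
# string, so the three-line text is not a match.
single = re.compile(PATTERN, re.IGNORECASE)
print(single.search(text))
```

Since the character classes already list both a-z and A-Z, the `i` flag does not change which strings match this particular pattern.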
I hope this article helped you check whether a string is a valid hostname. Understanding the structure of a hostname is essential for navigating networks and locating resources, and regular expressions (regex) offer a powerful tool for validating and matching hostnames. With the pattern explained in this article, you can check whether a given string has the syntax of a valid hostname. Regex’s versatility makes it a valuable asset for programming tasks, text processing, and data validation.