An Amazon Resource Name (ARN) is an identifier that uniquely identifies an AWS resource. ARNs are unique across AWS accounts and can identify resources in any service, including Amazon S3, Amazon EC2, and Amazon RDS. They are used for access control and when tagging resources, and they appear most often in AWS Identity and Access Management (IAM) policies. ARNs have the general format arn:partition:service:region:account-id:resource, where “partition” is the AWS partition (aws for the standard regions), “service” is the name of the service, “region” is the region where the resource is located, and “account-id” is the ID of the AWS account that owns the resource. In this article, let’s see how to write a regex for AWS ARNs and how that regex matches ARN values.
Regex (short for regular expression) is a powerful tool for searching and manipulating text. A regex is a sequence of characters that defines a search pattern. It can be used to find patterns in large amounts of text, validate user input, and manipulate strings, and it is widely supported in programming languages, text editors, and command-line tools.
Structure of an AWS ARN
The pattern in this article parses an AWS ARN into the following components:
- Partition
- Service
- Region
- AccountID
- ResourceType (optional; empty string if missing)
- Resource
Here are the valid ARN structures that the pattern handles:
- arn:partition:service:region:account-id:resource
- arn:partition:service:region:account-id:resourcetype/resource
- arn:partition:service:region:account-id:resourcetype:resource
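Before turning to the regex, the colon-separated layout above can be illustrated with a plain string split. This is a minimal sketch rather than a validator: the helper name `split_arn` is hypothetical, and it only shows that the first five colons delimit fixed fields while the resource part may itself contain `:` or `/`.

```python
# Illustrative sketch: split an ARN-shaped string into its six top-level fields.
# split_arn is a hypothetical helper name, not part of any AWS SDK.
def split_arn(value: str) -> list:
    # maxsplit=5 stops after the fifth colon, so any ':' or '/' inside the
    # resource part is left untouched in the last field.
    return value.split(":", 5)


samples = [
    "arn:partition:service:region:account-id:resource",
    "arn:partition:service:region:account-id:resourcetype/resource",
    "arn:partition:service:region:account-id:resourcetype:resource",
]

for sample in samples:
    fields = split_arn(sample)
    # fields[0] is the literal "arn"; fields[5] is the whole resource part.
    print(fields[5])
```

The last field still has to be separated into resource type and resource, which is the part the regex below takes care of.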
Regex for checking if it's a valid AWS ARN
Regular expression:
/^arn:(?P<Partition>[^:\n]*):(?P<Service>[^:\n]*):(?P<Region>[^:\n]*):(?P<AccountID>[^:\n]*):(?P<Ignore>(?P<ResourceType>[^:\/\n]*)[:\/])?(?P<Resource>.*)$/gm
Test string examples for the above regex:
Input String | Match Output |
---|---|
zero | does not match |
some-random-string | does not match |
arn:partition:service:region:account-id:resourcetype/resource | matches |
arn:partition:service:region:account-id:resource | matches |
arn:partition:service:region:account-id:resourcetype:resource | matches |
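The results in the table can be reproduced with a short Python sketch. A few assumptions apply: Python's `re` module has no `/.../gm` delimiters, so they are dropped here; `\/` is written as a plain `/`; and the names `ARN_PATTERN` and `parse_arn` are hypothetical. Note also that in Python an optional group that did not take part in the match is reported as `None` rather than as an empty string.

```python
import re

# The article's pattern without the /.../gm delimiters (an adaptation for Python).
ARN_PATTERN = re.compile(
    r"^arn:(?P<Partition>[^:\n]*):(?P<Service>[^:\n]*):(?P<Region>[^:\n]*)"
    r":(?P<AccountID>[^:\n]*):(?P<Ignore>(?P<ResourceType>[^:/\n]*)[:/])?"
    r"(?P<Resource>.*)$"
)


def parse_arn(value):
    """Return the named groups as a dict, or None if the string does not match."""
    match = ARN_PATTERN.match(value)
    return match.groupdict() if match else None


test_strings = [
    "zero",
    "some-random-string",
    "arn:partition:service:region:account-id:resourcetype/resource",
    "arn:partition:service:region:account-id:resource",
    "arn:partition:service:region:account-id:resourcetype:resource",
]

for text in test_strings:
    result = parse_arn(text)
    print(text, "->", "matches" if result else "does not match")
```

For a matching string, `parse_arn` returns every component by name, so `parse_arn(test_strings[2])["ResourceType"]` gives `"resourcetype"`.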
Here is a detailed, token-by-token explanation of the above regex:
/^arn:(?P<Partition>[^:\n]*):(?P<Service>[^:\n]*):(?P<Region>[^:\n]*):(?P<AccountID>[^:\n]*):(?P<Ignore>(?P<ResourceType>[^:\/\n]*)[:\/])?(?P<Resource>.*)$/gm
^ asserts position at start of a line
arn: matches the characters arn: literally (case sensitive)
Named Capture Group Partition (?P<Partition>[^:\n]*)
Match a single character not present in the list below [^:\n]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
Named Capture Group Service (?P<Service>[^:\n]*)
Match a single character not present in the list below [^:\n]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
Named Capture Group Region (?P<Region>[^:\n]*)
Match a single character not present in the list below [^:\n]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
Named Capture Group AccountID (?P<AccountID>[^:\n]*)
Match a single character not present in the list below [^:\n]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
Named Capture Group Ignore (?P<Ignore>(?P<ResourceType>[^:\/\n]*)[:\/])?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Named Capture Group ResourceType (?P<ResourceType>[^:\/\n]*)
Match a single character not present in the list below [^:\/\n]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\/ matches the character / (code point 47 decimal, 2F hex, 57 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
Match a single character present in the list below [:\/]
: matches the character : (code point 58 decimal, 3A hex, 72 octal) literally (case sensitive)
\/ matches the character / (code point 47 decimal, 2F hex, 57 octal) literally (case sensitive)
Named Capture Group Resource (?P<Resource>.*)
. matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of a line
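One consequence of the explanation above is worth seeing in action: because ResourceType cannot contain `:` or `/`, it always stops at the first separator, and everything after that separator lands in Resource. The sketch below assumes Python's `re` module and uses an illustrative S3-style ARN with a made-up bucket name and key.

```python
import re

# Same pattern as in the article, adapted for Python (no /.../gm delimiters).
ARN_PATTERN = re.compile(
    r"^arn:(?P<Partition>[^:\n]*):(?P<Service>[^:\n]*):(?P<Region>[^:\n]*)"
    r":(?P<AccountID>[^:\n]*):(?P<Ignore>(?P<ResourceType>[^:/\n]*)[:/])?"
    r"(?P<Resource>.*)$"
)

# Illustrative value: the bucket name and key path are made up.
match = ARN_PATTERN.match("arn:aws:s3:::example-bucket/path/to/key")

print(repr(match.group("Region")))     # '' (S3 ARNs leave region empty)
print(repr(match.group("AccountID")))  # '' (and account ID empty)
print(match.group("ResourceType"))     # example-bucket
print(match.group("Ignore"))           # example-bucket/
print(match.group("Resource"))         # path/to/key
```

Here the bucket name is reported as ResourceType simply because it precedes the first `/`. The pattern describes the shape of the string, so how to interpret each captured piece still depends on the service.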
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
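Python has no literal `g` or `m` suffixes, so the two flags need a rough equivalent. In the sketch below, `re.MULTILINE` plays the role of `m`, and `finditer` returns all matches, which approximates `g`. The sample text and the name `ARN_LINES` are assumptions made for illustration; the account ID is the placeholder value commonly used in documentation.

```python
import re

# re.MULTILINE makes ^ and $ match at the start and end of every line.
ARN_LINES = re.compile(
    r"^arn:(?P<Partition>[^:\n]*):(?P<Service>[^:\n]*):(?P<Region>[^:\n]*)"
    r":(?P<AccountID>[^:\n]*):(?P<Ignore>(?P<ResourceType>[^:/\n]*)[:/])?"
    r"(?P<Resource>.*)$",
    re.MULTILINE,
)

text = (
    "arn:aws:iam::123456789012:user/alice\n"
    "not-an-arn\n"
    "arn:aws:lambda:us-east-1:123456789012:function:my-fn"
)

# finditer walks the whole input and yields every match, one per ARN line.
found = [m.groupdict() for m in ARN_LINES.finditer(text)]

for item in found:
    print(item["Service"], item["ResourceType"], item["Resource"])
# iam user alice
# lambda function my-fn
```

The middle line is skipped because `^` only anchors at the start of a line, and that line does not begin with `arn:`.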
Hope this article was useful for checking the validity of an AWS ARN value using regex. In conclusion, understanding the structure of an Amazon Resource Name (ARN) and having a regular expression (regex) that validates its format makes it easier to manage AWS resources. ARNs play a significant role in access control, resource identification, and tagging within AWS, and regex is a practical way to search text and validate input. Keep in mind that the pattern above checks only the overall shape of an ARN: every field may be empty, and a match does not mean the resource exists. Within those limits, it helps catch malformed resource references before they reach your AWS Identity and Access Management (IAM) policies.