Subnet Inspection
This guide helps prevent failures that occur when a cloud service provider (CSP) platform update runs in a subnet with too few available IPs. It covers how to identify usable subnets and keep the number of available IP addresses above a minimum threshold.
Utilizing Subnet IP Availability Metrics
If Datadog is integrated with CSPs like AWS or Azure, you can use metrics related to available IP addresses in subnets.
Set alerts to notify relevant stakeholders when the number of available IP addresses drops below 5.
Subnet Metric Collection by CSP
AWS
Metric | Description | Notes |
---|---|---|
aws.vpc.subnet.available_ip_address_count | Number of available IP addresses in the subnet | Main metric |
aws.vpc.subnet.total_ip_address_count | Total number of IP addresses in the subnet | |
Prerequisite: Requires enabling metric collection
To collect aws.vpc.subnet.* metrics, a ticket must be submitted to Datadog to enable data collection for the account. These metrics are collected via EC2 Describe* API calls, so EC2 resource access must be enabled.
Metrics Reference Guide : Amazon VPC
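As a cross-check outside Datadog, the same count can be read from the EC2 DescribeSubnets API. The following is a minimal sketch, assuming a response shaped like the DescribeSubnets output (a Subnets list whose entries carry SubnetId, CidrBlock, and AvailableIpAddressCount). The function name low_ip_subnets and the sample data are hypothetical; the threshold of 5 comes from this guide.

```python
import ipaddress


def low_ip_subnets(response, threshold=5):
    """Return (subnet_id, available, cidr_size) for subnets at or below threshold."""
    flagged = []
    for subnet in response.get("Subnets", []):
        available = subnet["AvailableIpAddressCount"]
        # Size of the CIDR block; this includes addresses reserved by the provider.
        cidr_size = ipaddress.ip_network(subnet["CidrBlock"]).num_addresses
        if available <= threshold:
            flagged.append((subnet["SubnetId"], available, cidr_size))
    return flagged


# Hypothetical sample in the shape of an EC2 DescribeSubnets response.
sample = {
    "Subnets": [
        {"SubnetId": "subnet-0aaa", "CidrBlock": "10.0.1.0/24", "AvailableIpAddressCount": 3},
        {"SubnetId": "subnet-0bbb", "CidrBlock": "10.0.2.0/24", "AvailableIpAddressCount": 120},
    ]
}

for subnet_id, available, cidr_size in low_ip_subnets(sample):
    print(f"{subnet_id}: {available} of {cidr_size} addresses available")
```

With boto3 installed and credentials configured, the response could instead come from boto3.client("ec2").describe_subnets(); that call needs the same EC2 Describe* access mentioned above.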
Azure
Metric | Description | Notes |
---|---|---|
azure.network_virtualnetworks.subnets.available_addresses | Number of available addresses in the subnet | Main metric |
azure.network_virtualnetworks.subnets.assigned_addresses | Number of assigned addresses in the subnet | |
Metrics Reference Guide : Microsoft Azure Virtual Network
Alert Configuration
Create alerts based on threshold metrics.
- Navigation : Monitors > New Monitor > Metric Monitor
Metric Monitor Configuration
Choose the Detection Method: Select Threshold Alert for subnet monitoring.
- Threshold Alert: Compares metric values to static thresholds.
- Change Alert: Compares changes over time.
- Anomaly Detection: Detects unusual behavior based on historical data.
- Outliers Alert: Detects outliers among grouped resources.
- Forecast Alert: Predicts future behavior and compares with thresholds.
- Watchdog: Datadog AI automatically detects issues.
Define the metric: Select a metric and configure query/formula, grouping, and evaluation period.
- AWS Metric: aws.vpc.subnet.available_ip_address_count
- Azure Metric: azure.network_virtualnetworks.available_subnet_addresses
Specify the metric.
For group by, select subnet for AWS or subnet_name for Azure.
You may also add tags (such as account, subscription, etc.) to include additional information you'd like to check when alerts are triggered.
Select the aggregation function and evaluation window.
- If you choose average / last 5 minutes, the system calculates the average of the last 5 minutes of data every minute and compares it to the threshold.
※ The evaluation frequency depends on the selected evaluation window:
- Less than 24 hours: evaluated every 1 minute
- 24 hours to less than 48 hours: evaluated every 10 minutes
- 48 hours or more: evaluated every 30 minutes
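The evaluation rules above can be sketched in a few lines. This is an illustrative model only, not Datadog's implementation: evaluation_frequency and windowed_average are hypothetical helper names, and data points are assumed to be (timestamp in seconds, value) pairs.

```python
from datetime import timedelta


def evaluation_frequency(window: timedelta) -> timedelta:
    """Map an evaluation window to how often the monitor is evaluated."""
    if window < timedelta(hours=24):
        return timedelta(minutes=1)
    if window < timedelta(hours=48):
        return timedelta(minutes=10)
    return timedelta(minutes=30)


def windowed_average(points, now, window_seconds):
    """Average the values whose timestamp falls inside the trailing window."""
    values = [v for ts, v in points if now - window_seconds <= ts <= now]
    if not values:
        return None  # no data in the window
    return sum(values) / len(values)


# Hypothetical samples: (timestamp in seconds, available IP count).
points = [(-100, 50), (60, 10), (120, 6), (290, 2)]
average = windowed_average(points, now=300, window_seconds=300)
print(average, average <= 5)
```

Note that the average smooths short dips: the most recent sample here is 2, but the 5-minute average is still above the threshold of 5.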
Set Alert Conditions
Set the comparison operator for threshold evaluation. For subnet monitoring, set it to "below or equal to".
Supported operators: below, above, below or equal to, above or equal to.
Set the alert threshold to 5.
For Nodata alerts, choose between "Do not notify" and "Notify".
If set to "Notify", a Nodata alert will be triggered when no data is received.
You can configure a separate recovery threshold for alerts triggered by the alert threshold.
If not set, the alert will be cleared once the value exceeds the alert threshold.
You can configure automatic alert recovery.
If the alert condition persists, it will be cleared automatically after the specified time.
However, if the condition still persists after being cleared, the alert will be triggered again.
Apply a waiting period before applying the monitor to newly added targets.
Set an evaluation delay time to account for collection intervals and network delays.
- AWS: 10-minute delay
- Azure: 2-minute delay
Set whether to calculate only when data is fully collected during the evaluation window.
do not require: calculation proceeds if there is at least one data point
require: calculation proceeds only if the data is complete in the evaluation window
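For teams that manage monitors as code, the settings in this section could be expressed as a monitor definition. The sketch below only builds the definition as a Python dict. The option names (thresholds, notify_no_data, evaluation_delay, require_full_window, new_group_delay) are taken from the Datadog Monitors API as commonly documented and should be verified against the current API reference. The recovery threshold of 10, the 60-second new-group delay, and the helper name build_subnet_monitor are assumptions for illustration.

```python
# Metric names and group-by keys as listed in this guide; delays are in seconds.
PROVIDERS = {
    "aws": {
        "metric": "aws.vpc.subnet.available_ip_address_count",
        "group_by": "subnet",
        "delay_seconds": 600,  # 10-minute evaluation delay
    },
    "azure": {
        "metric": "azure.network_virtualnetworks.subnets.available_addresses",
        "group_by": "subnet_name",
        "delay_seconds": 120,  # 2-minute evaluation delay
    },
}


def build_subnet_monitor(provider, threshold=5, recovery=10):
    """Build a threshold monitor definition for low available IPs (sketch)."""
    cfg = PROVIDERS[provider]
    query = (
        f"avg(last_5m):avg:{cfg['metric']}{{*}} "
        f"by {{{cfg['group_by']}}} <= {threshold}"
    )
    return {
        "name": "[Warning] Subnet is running low on available IPs",
        "type": "metric alert",
        "query": query,
        "options": {
            "thresholds": {"critical": threshold, "critical_recovery": recovery},
            "notify_no_data": True,        # the "Notify" choice for Nodata
            "evaluation_delay": cfg["delay_seconds"],
            "require_full_window": False,  # the "do not require" choice
            "new_group_delay": 60,         # assumed wait for newly added targets
        },
    }


print(build_subnet_monitor("aws")["query"])
```

Keeping the provider-specific values in one table makes it easy to create the AWS and Azure monitors from the same template.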
Notify your team: Notification Settings
Alert Title : This is the subject of the message sent when an alert is triggered.
- Example: [Warning] Subnet is running low on available IPs
Alert Message
- This is the content of the message sent when an alert is triggered.
Use Message Template Variables
You can check how to use templates and variables in the alert title and message body.
Reference for available variables : https://docs.datadoghq.com/monitors/notify/variables/?tab=is_alert
Notify your services and your team members
Integrated channels such as Opsgenie, Slack, Teams, webhook, and email will be displayed.
Select the channels or email addresses to notify when the alert is triggered.
Content displayed (message content settings)
You can choose whether to include automatically attached content such as the query or snapshot in the alert message.
Include Triggering tags in notification title
This adds the tags of the affected resource to the alert title in the notification.
Aggregation Settings
If a group was selected in "Set alert conditions," this will be automatically set as a multi-alert.
Renotification Settings
If the alert (or Nodata) condition continues, renotification will be sent at the interval you select.
Tags Settings
You can set tags for monitors that are used in the Monitor list, Downtime scheduling, etc.
Priority Settings
Set the severity (importance) of the alert from P1 to P5.
Set the priority according to standardized criteria.
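To make the alert message settings above more concrete, here is a minimal sketch of a message body that uses template variables from the variables reference linked earlier ({{#is_alert}}, {{#is_recovery}}, {{value}}, {{threshold}}, and a {{tag_key.name}} variable for the group-by tag). The helper name build_message and the @slack-network-alerts handle are hypothetical; substitute a channel that is actually integrated.

```python
def build_message(group_tag="subnet", handle="@slack-network-alerts"):
    """Assemble an alert message body with Datadog template variables (sketch)."""
    tag_var = "{{" + group_tag + ".name}}"  # e.g. {{subnet.name}} for AWS
    lines = [
        "{{#is_alert}}",
        "Subnet " + tag_var + " has {{value}} available IPs (threshold: {{threshold}}).",
        "{{/is_alert}}",
        "{{#is_recovery}}",
        "Subnet " + tag_var + " has recovered.",
        "{{/is_recovery}}",
        handle,  # hypothetical notification handle
    ]
    return "\n".join(lines)


print(build_message())                 # AWS monitors grouped by subnet
print(build_message("subnet_name"))    # Azure monitors grouped by subnet_name
```

The tag variable only resolves when the monitor is grouped by that tag, which is why the group-by key is passed in.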
Define permission and audit notifications: Set monitor edit permissions and change notifications
Restrict editing
Set the permission to edit the alert.
When you select a role, all users with that role will have permission to edit.
Test Notifications
Click this button to send a test alert with the current settings to the selected channels.
Create
Click this button to save the configured settings.
AWS Integration
EC2 Filtering
To avoid charges caused by collecting data from EC2 instances without a Datadog Agent installed after AWS Integration, tag-based filtering is available.
Additional Billing after AWS Integration
EC2 instances collected through AWS Integration are subject to billing, but instances with the Datadog Agent installed will not be billed twice.
Billing may also apply for Fargate and Lambda resources.
Limit Metric Collection to Specific Resources
You can use tags to filter the AWS metrics collected from specific services such as EC2 and Lambda.
Target Services : EC2, Lambda, ELB, Application ELB, Network ELB, RDS, SQS, CloudWatch custom metrics
Settings Path : Integrations > Amazon Web Services > Select Account > Metric Collection Tab
How to Configure: Filtering can be done using either blacklist or whitelist methods
- Blacklist : Excludes resources that contain specified tags.
  Example) !datadog:no
- Whitelist : Collects only resources with specified tags. When multiple conditions are added, they are applied in an OR relationship.
  Example) datadog:monitored,env:production,instance-type:c1.*
You can also use a mix of blacklist and whitelist filtering.
Uppercase letters are converted to lowercase, and spaces are replaced with underscores (_).
Example: Tag Team:Frontend App should be applied as team:frontend_app.
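The normalization and filtering rules above can be modeled locally to check a filter string before applying it. This is a sketch, not Datadog's matching logic: normalize_tag and is_collected are hypothetical helpers, wildcard matching uses Python's fnmatch, and the assumption that exclusions take precedence when blacklist and whitelist entries are mixed is not confirmed by this guide.

```python
from fnmatch import fnmatchcase


def normalize_tag(tag: str) -> str:
    """Lowercase the tag and replace spaces with underscores."""
    return tag.strip().lower().replace(" ", "_")


def is_collected(resource_tags, filter_expr):
    """Decide whether a resource passes a comma-separated tag filter (sketch)."""
    tags = {normalize_tag(t) for t in resource_tags}
    patterns = [p.strip() for p in filter_expr.split(",") if p.strip()]
    exclude = [p[1:] for p in patterns if p.startswith("!")]
    include = [p for p in patterns if not p.startswith("!")]

    def matches(pattern):
        return any(fnmatchcase(tag, pattern) for tag in tags)

    # Assumption: any matching exclusion wins over inclusions.
    if any(matches(p) for p in exclude):
        return False
    # Inclusions are OR-ed; with no inclusions, everything else is collected.
    return not include or any(matches(p) for p in include)


whitelist = "datadog:monitored,env:production,instance-type:c1.*"
print(normalize_tag("Team:Frontend App"))
print(is_collected(["instance-type:c1.medium"], whitelist))
print(is_collected(["env:production", "datadog:no"], "!datadog:no"))
```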