[ Datadog ] Graph values change depending on the selected Time Range.


When configuring a dashboard or monitor, users frequently ask why graph values differ depending on the selected time range. This article explains that behavior.


When building dashboards or monitors, metrics are typically graphed using queries like the example shown above. 


If you view the same query with different time ranges, the peak value shown for the same period of time may differ from one graph to the next.


[Screenshots: the same query graphed with time ranges of 2 min, 1 hour, and 4 hours]

 


This happens because a graph can display only a limited number of points per series (at most 1,500).

For example, if actual data is collected at 1-second intervals: 

To visualize 1 hour of data, you would need 60 seconds * 60 minutes = 3,600 points.

To visualize 4 hours of data, you would need 60 * 60 * 4 = 14,400 points.
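As a rough check of the arithmetic above, the following sketch counts the raw points for each time range and compares them with the 1,500-point limit. The function name and the constant are illustrative only and are not part of any Datadog API.

```python
# Maximum number of points a graph can display, as stated in this article.
MAX_POINTS = 1500


def raw_points(duration_seconds: int, collection_interval_seconds: int = 1) -> int:
    """Number of raw data points collected over the selected time range."""
    return duration_seconds // collection_interval_seconds


for label, seconds in [("1 hour", 60 * 60), ("4 hours", 60 * 60 * 4)]:
    points = raw_points(seconds)
    print(f"{label}: {points} raw points, over the limit: {points > MAX_POINTS}")
```

Both time ranges exceed the limit when data arrives every second, which is why the points cannot all be drawn as-is.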


Rendering every point would be too heavy for both the system and the graph, so Datadog automatically applies a rollup that aggregates values over time and keeps the number of displayed points within the limit.


The table below shows the minimum rollup interval (point interval) depending on the selected time range and display type. 


Using the example of data collected every second: 

When displaying 1 hour of data on a line graph, it would normally require 3,600 points. But because the minimum interval is 20 seconds, data is aggregated every 20 seconds, resulting in about 180 points being displayed.

For 4 hours, the minimum interval is 1 minute, so about 240 points will be shown.
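The point counts above follow from dividing the time range by the rollup interval. A minimal sketch, assuming a 20-second interval for 1 hour and a 1-minute interval for 4 hours as described in this article (the function name is hypothetical):

```python
def displayed_points(duration_seconds: int, rollup_interval_seconds: int) -> int:
    """Number of points left after aggregating once per rollup interval."""
    return duration_seconds // rollup_interval_seconds


one_hour = displayed_points(60 * 60, 20)        # 1 hour, 20-second interval
four_hours = displayed_points(4 * 60 * 60, 60)  # 4 hours, 1-minute interval
print(one_hour, four_hours)
```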


The default aggregation function for these intervals is average.

For 1 hour, the average value is shown every 20 seconds.

For 4 hours, the average value is shown every minute.

This means each displayed point represents the average of 20 or 60 actual data points, respectively.


Because a wider time range averages more raw data points into each displayed point, short peaks are smoothed out more strongly, so the displayed values naturally differ depending on the time range.


So what if you want to see the exact peak values over a month? 

You can manually configure the rollup to aggregate using the max value instead of the average. 
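The difference between averaging and taking the maximum can be illustrated with a small simulation. This is only a sketch on made-up data (one sample per second, with a short spike); the `rollup` helper below is a hypothetical stand-in for the aggregation Datadog performs and is not Datadog code.

```python
from statistics import mean


def rollup(values, interval, method):
    """Aggregate consecutive chunks of `interval` samples with `method`."""
    return [method(values[i:i + interval]) for i in range(0, len(values), interval)]


# Hypothetical data: one sample per second for 2 minutes,
# flat at 10 with a 5-second spike to 100.
samples = [10.0] * 120
samples[30:35] = [100.0] * 5

avg_20s = rollup(samples, 20, mean)  # like the 1-hour view (20-second interval)
avg_60s = rollup(samples, 60, mean)  # like the 4-hour view (1-minute interval)
max_60s = rollup(samples, 60, max)   # same interval, but aggregated with max

print("raw peak:        ", max(samples))
print("avg, 20 s peak:  ", max(avg_20s))
print("avg, 60 s peak:  ", max(avg_60s))
print("max, 60 s peak:  ", max(max_60s))
```

With averaging, the displayed peak shrinks as the interval grows, while the max-based rollup still reports the original peak of the spike.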


In your query, select a function and configure the rollup aggregation method and the time interval for aggregation. 


Note: Even if rollup is manually configured, the aggregation interval cannot be shorter than the automatic rollup interval. 


For example, if the selected time range is 1 hour, and you configure manual rollup with a 10-second interval, aggregation will still occur every 20 seconds due to the automatic rollup rule.

If you set it to aggregate every 1 minute, which is longer than the automatic interval, it will be applied as configured.
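The rule in this note amounts to taking the larger of the two intervals. A minimal sketch, assuming the 20-second automatic interval for a 1-hour range used in the example above (the function name is hypothetical):

```python
def effective_interval(requested_seconds: int, automatic_minimum_seconds: int) -> int:
    """Interval actually used: never shorter than the automatic rollup interval."""
    return max(requested_seconds, automatic_minimum_seconds)


AUTOMATIC_1_HOUR = 20  # automatic interval for a 1-hour range, per the example above

print(effective_interval(10, AUTOMATIC_1_HOUR))  # a 10-second request is raised to the minimum
print(effective_interval(60, AUTOMATIC_1_HOUR))  # a 1-minute request is applied as configured
```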


Rather than selecting a function icon, you can also configure this via the metric query string by appending .rollup(aggregation_function, interval_in_seconds) at the end of the query. 

avg:system.cpu.user{*} by {host}.rollup(max,60)
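If you generate query strings in a script, the suffix can be appended with ordinary string formatting. The helper below is a hypothetical convenience for illustration, not part of any Datadog client library; it only reproduces the `.rollup(aggregation_function, interval_in_seconds)` syntax shown above.

```python
def with_rollup(query: str, method: str, interval_seconds: int) -> str:
    """Append a .rollup(method,interval) suffix to a metric query string."""
    return f"{query}.rollup({method},{interval_seconds})"


print(with_rollup("avg:system.cpu.user{*} by {host}", "max", 60))
```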


* Note: Common confusion with metric query configuration – group aggregation functions. 

Sometimes, users misunderstand the aggregation function shown in the query as the rollup function. So here’s an explanation to clarify this point. 


The "avg by" in the query refers to a group-by aggregation function. 


Normally, when setting a query, you configure group by using tags. 

This query means you want to display the system.cpu.user metric for each host.

Group-by aggregation functions are used to display grouped data by tag.


For example, if the metric is collected with tags such as host, service, and name: 


If you group by host, the graph will display 5 lines. Because each host is already the smallest unit, nothing is merged, so changing the group-by function (avg by, max by, min by, sum by, etc.) does not change the displayed values.


If you group by service instead, the graph will display 3 lines.

To display 5 pieces of data as 3 lines, aggregation is required. The result will vary depending on the selected group-by aggregation function.


If no tag is specified in group-by, a single line will be displayed by aggregating data from all targets using that metric. 
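The grouping behavior described above can be sketched with a few made-up values. The host names, service names, and numbers below are hypothetical (5 hosts spread across 3 services, one value per host at a single timestamp), and `group_by` is an illustrative helper rather than Datadog code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical snapshot: one value per host at a single timestamp.
points = [
    {"host": "host-1", "service": "web", "value": 10.0},
    {"host": "host-2", "service": "web", "value": 30.0},
    {"host": "host-3", "service": "api", "value": 20.0},
    {"host": "host-4", "service": "api", "value": 40.0},
    {"host": "host-5", "service": "db", "value": 50.0},
]


def group_by(points, tag, method):
    """Merge values that share the same tag value; tag=None merges everything."""
    groups = defaultdict(list)
    for point in points:
        key = point[tag] if tag else "*"
        groups[key].append(point["value"])
    return {key: method(values) for key, values in groups.items()}


print(group_by(points, "host", mean))     # 5 groups, nothing is merged
print(group_by(points, "host", max))      # same values as with mean
print(group_by(points, "service", mean))  # 3 groups, values are merged
print(group_by(points, "service", max))   # 3 groups, different values
print(group_by(points, None, mean))       # a single group for all hosts
```

Grouping by host returns the same values for every function, while grouping by service or by no tag at all gives results that depend on the function chosen.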


In conclusion:

  • Rollup aggregation functions are used to merge data points over time intervals.

  • Group aggregation functions are used to merge data points by tag groupings.



Reference: Datadog Docs guides

- Aggregate and Rollup when querying : https://docs.datadoghq.com/dashboards/querying/#aggregate-and-rollup

- Rollup : https://docs.datadoghq.com/dashboards/functions/rollup/

 
