I see both COUNT and D_COUNT as options for aggregation. Which one should I use?
COUNT and D_COUNT (distinct count) are both ways of aggregating data. Data aggregation is the process of combining a list of items into one data point. For example, SUM is an aggregation in which the listed values are added together.
COUNT counts the number of items being aggregated. D_COUNT counts the number of unique items being aggregated. Consider a list of four items in which two are the same, for example: Cat, Cat, Dog, Bird.
If you aggregate the list above by COUNT, the result is 4 because there are four items. If you aggregate the same list by D_COUNT, the result is 3 because two of the items are the same. The most common use for COUNT and D_COUNT in Explore is aggregating ticket IDs: D_COUNT ensures each ticket is counted only once, while COUNT allows a ticket to be counted multiple times.
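The difference can be sketched in a few lines of Python. This is only an illustration of the idea, not how Explore computes its aggregators, and the list contents (Cat, Cat, Dog, Bird) are an assumption chosen to match the four-items, three-unique example.

```python
# Assumed example list: four items, two of which are the same.
items = ["Cat", "Cat", "Dog", "Bird"]

# COUNT: every item is counted, duplicates included.
count = len(items)

# D_COUNT: duplicates collapse, so only unique items are counted.
d_count = len(set(items))

print(f"COUNT = {count}, D_COUNT = {d_count}")
```

With this list, COUNT comes out as 4 and D_COUNT as 3.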
If a report has attributes in rows or columns, COUNT and D_COUNT are aggregated within each cell, not for the report as a whole. From the example above, imagine that "Cat," "Dog," and "Bird" are tags, and they are arranged on three tickets:
- Ticket 1: Cat
- Ticket 2: Cat, Dog
- Ticket 3: Bird
COUNT and D_COUNT for tickets both return three since there are three tickets. However, if tags are added under rows, the total changes:
- Cat: 2 tickets
- Dog: 1 ticket
- Bird: 1 ticket
The total is 4, even though there are only three tickets. This is because Ticket 2 has both the "Cat" and "Dog" tags, so it is counted once in the "Cat" row and once in the "Dog" row. If a report has multiple rows or columns, the sum of the D_COUNT values can be higher than the D_COUNT without rows or columns.
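The per-row behavior above can be reproduced with a small Python sketch. The `ticket_tags` mapping is a stand-in for the three example tickets and is not an Explore data structure; the sketch only shows why distinct counts taken per row do not add up to the overall distinct count.

```python
from collections import defaultdict

# The three example tickets and their tags (illustrative data only).
ticket_tags = {
    1: ["Cat"],
    2: ["Cat", "Dog"],
    3: ["Bird"],
}

# D_COUNT of tickets with no attribute in rows: each ticket ID once.
overall_d_count = len(ticket_tags)

# D_COUNT of tickets per tag: distinct ticket IDs within each row.
tickets_by_tag = defaultdict(set)
for ticket_id, tags in ticket_tags.items():
    for tag in tags:
        tickets_by_tag[tag].add(ticket_id)

per_tag_d_count = {tag: len(ids) for tag, ids in tickets_by_tag.items()}
sum_of_rows = sum(per_tag_d_count.values())

print(per_tag_d_count)
print(f"sum of rows = {sum_of_rows}, overall D_COUNT = {overall_d_count}")
```

Ticket 2 lands in both the Cat and Dog sets, so the rows sum to 4 while the overall distinct count stays at 3.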
For more information on the different Explore aggregators, see the article: Choosing metric aggregators.
Thanks for the explanation, Bryan. Just to confirm, for a query on the total number of SOLVED tickets: if a ticket ID is SOLVED multiple times, for example a ticket solved twice (open, solved, open, solved), will ...
- COUNT count that ticket ID twice
- D_COUNT count that ticket ID once
It would depend on which metrics or dataset you're using, but you're right. In general, if you expect a ticket ID to appear more than once in a query, it is safe to opt for D_COUNT when you need to count each unique ID only once. You may also find this section helpful: COUNT and D_COUNT.
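As a rough illustration of the reopened-ticket case, here is a Python sketch. The `events` list and its ticket IDs are hypothetical, and whether a ticket really appears once per solve depends on the dataset and metric you use in Explore.

```python
# Hypothetical status history: (ticket_id, status). Ticket 101 is
# solved, reopened, and solved again; ticket 102 is solved once.
events = [
    (101, "open"), (101, "solved"), (101, "open"), (101, "solved"),
    (102, "open"), (102, "solved"),
]

# Keep the ticket ID of every "solved" row.
solved_ids = [ticket_id for ticket_id, status in events if status == "solved"]

count_solved = len(solved_ids)         # counts ticket 101 twice
d_count_solved = len(set(solved_ids))  # counts ticket 101 once

print(f"COUNT = {count_solved}, D_COUNT = {d_count_solved}")
```

Here COUNT gives 3 solves while D_COUNT gives 2 solved tickets.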
If I'm counting the number of rated satisfaction tickets, should I use COUNT or D_COUNT? Is COUNT counting the number of ticket IDs in this case? If so, COUNT or D_COUNT shouldn't make a difference, right?
You are correct. In your case, using COUNT or D_COUNT shouldn't make a difference. One possible case for using D_COUNT for satisfaction is when you use the Ticket Updates dataset and calculated metrics, as in this recipe: Explore recipe: Determining satisfaction scores for your agents.