When asked how to aggregate data in Qlik products in the quickest way, the answer is “it depends”. While the key factor is the uniqueness/ cardinality of the aggregating dimensions, there are other elements at play.
In general, though, the fastest way to aggregate in the data load script (after loading the data into memory) is:
- When aggregating by a low cardinality dimension in a small data set, resident load and run a group by immediately (this is also the fewest lines of script)
- When aggregating by a higher cardinality dimension, or on one that requires a lot of sorting prior to aggregation, resident load and sort the table by the high cardinality dimension as the first step. Then resident load this table and run your group by as a second step.
The short version: use approach 2 as the default, unless your data is very simple.
