Creating an Aggregate

A Birst application may involve a large amount of data that can affect query performance. The report query may have to navigate to tables containing millions of rows. In this scenario, you can create aggregates that reduce the time it takes to run queries. However, the use of aggregates will add to load times and to the size of the database. See Aggregates.

To create an aggregate

  1. Go to Admin- Customize Space - Manage Aggregates.
  2. Click Create. The Add Aggregate dialog box opens.
    For example, here is an aggregate definition that has already been created:

  3. Enter a name for the new aggregate.
  4. The aggregate is enabled by default. To disable it, uncheck the Enabled check box.
    Tip: Disabling an aggregate can be helpful for testing so that you can compare which table or tables a logical query hits when the aggregate is enabled versus disabled.
  5. Enter the processing group name in the Processing Group field. Leave the field blank if you do not want to assign it to a specific processing group. In both cases, the aggregate will be executed at the end of processing ensuring that it hits the most recent data.
    If you specify the processing group for a new aggregate, it must be a processing group associated with a data source. Later, you can put it in a separate processing group just for aggregates. See Aggregates.
    The /REBUILD option can be used to drop the aggregate table and build the table entirely from scratch, as opposed to the normal process which adds new records to the existing aggregate table. To rebuild the aggregate table every time, add /REBUILD to the end of the processing group name (with no spaces). For example:
    Processing Group1/REBUILD

  6. Enter the logical query into the Logical Query field.
    Tips: When creating the aggregate logical query:
  • To help with syntax, you can copy a query from a report in Designer or in Visualizer.
  • The aggregate logical query should only contain the logical columns that will be used in a report.
  • Remove any duplicate logical column names from the query. No two columns should have the same name, otherwise the aggregate execution will fail.
  • Remove column aliases (e.g., 'F7').
  • Remove all column expressions including STAT(), LOOKUPVALUE(), GETVARIABLE(), and math. For example, if you had the expression SUBSTRING([Categories.CategoryName],0,3) you would only include [Categories.CategoryName].
  • If a report is using a column selector, you need to manually add each field in the column selector to the query.
  • If there are any report level filters, you may want to add them to the aggregate query.
  • Include dashboard prompt fields as a part of the aggregate query if they are not present at the report level.
  1. Optionally, add an Incremental Filter. During each load, this filter will be applied to a delete against the aggregate (that is, all records that satisfy the filter will be deleted). Then, an insert of new records will be applied for records that satisfy the filter. This allows for flexible incremental updating of an aggregate. It is useful for cases where you may want to not only update today’s data, but also update the aggregate for the last few weeks or months as hierarchies or values may result in changes to previously aggregated data.
    For example:
    [Time.Year/Month]>'2012/01'.
    If you use an incremental filter only (without a build filter) the incremental filter will always be ignored the first time the aggregate is built. In subsequent loads, data is first deleted from the aggregate based on the incremental filter condition and then re-inserted using the latest data in the warehouse.
  2. Optionally, add a Build Filter. The build filter is used to apply a filter only at the time that the aggregate is built, but not when it is used. This means queries without the filter will hit the aggregate despite the fact that the table only contains filtered data. For example:
    [OrderDate: Sum: Quantity] IS NOT NULL
    If using both a build filter and incremental filter, the incremental filter will be ignored the first time the aggregate is built. The build filter will be used to filter the data set to build the aggregate for the first time. During subsequent building of the aggregate, data is first deleted from the aggregate using both incremental and build filters then reloaded using the same filters against the latest data in the warehouse.
    To initially create the aggregate using an incremental filter along with a build filter, you can include the incremental filter as a part of the build filter for the initial building of the aggregate (as shown in the example below). Then, remove the incremental filter from the build filter expression for subsequent building of the aggregate.

  3. If the space has a QueryConnection specified to an in-memory database, then the Connection field becomes active. Select the connection to use to run the aggregate.
  4. Click OK. Birst creates the logical schema and the Manage Aggregates tab lists the new aggregate.
  5. Process data for the processing group to which the aggregate belongs (or for all processing groups) in order to create the physical table. Go to Process Data - Process New Data and click Process Now.

    Tip: If you want to process only the aggregate and not the other data sources related to the processing group, you can use a command to create the physical aggregate table. Go to Manage Space - Command Window and enter the following command:
    persistaggregate aggregate_name

    Birst creates the aggregate table in the background using the query you specified in the aggregate definition. You can see the details of the aggregate creation in the Last Load Log file. After the command execution has finished, the Space Owner receives an auto-generated email from Birst regarding the success or failure of the command.

After creating an aggregate table, perform testing using the Navigate Query function. Query navigation enables you to identify reasons why a logical query may or may not be hitting an aggregate table as opposed to a base table. To test, you can disable an aggregate (so that it is not used) and paste the query you are using for the aggregate into the Query Admin and click Navigate Query then compare those results with the results you get if you enable the aggregate. See Reading Query Navigation Results for more information.

To modify or delete an aggregate, select the aggregate and click either the Modify or Delete button at the top of the page.

Important: Aggregates that were imported as part of a package will appear greyed out and italicized on the Manage Aggregates page. Although they can be used in the space, they cannot be modified.

See Also
Aggregates
Running a Logical Query
Reading Query Navigation Results
Infor Logical Query Language Functions