Initial query takes a long time to execute

The first query that references a data object, or a set of data objects, initiates a Compass data storage activity. During this activity, all data in the “raw” or original Data Lake that has not yet been stored in Compass data storage is converted. The storage activity takes place before the query executes, so the Compass query runs against the most recent data in Data Lake.

For example, 1000 data objects are sent to Data Lake, and then a query is executed for those data objects. All 1000 objects are stored in the optimized Compass data storage before the query executes, regardless of the WHERE condition in the query. This conversion happens only once for a given set of data: after the data is stored in Compass data storage, it stays there unless it is cleared using an administration stored procedure.

To minimize the effect of the data storage time, run queries more frequently. In this way, fewer data objects must be stored in Compass data storage each time a query is executed.
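The effect of query frequency can be illustrated with a small Python model. This is a sketch only, not Compass code: the class `CompassStoreModel`, its methods, and the function `worst_query_backlog` are hypothetical names invented for this illustration. The model only counts objects, and it assumes that storage time grows with the number of objects waiting to be stored.

```python
from dataclasses import dataclass


@dataclass
class CompassStoreModel:
    """Toy model of the behavior described above; not a real Compass API."""

    pending: int = 0  # objects in Data Lake not yet stored in Compass data storage
    stored: int = 0   # objects already stored in Compass data storage

    def ingest(self, count: int) -> None:
        """Record that new data objects arrived in Data Lake."""
        self.pending += count

    def query(self) -> int:
        """Run a query and return how many objects had to be stored first.

        Every pending object is stored before the query runs,
        regardless of any WHERE condition.
        """
        converted = self.pending
        self.stored += converted
        self.pending = 0
        return converted


def worst_query_backlog(total_objects: int, query_count: int) -> int:
    """Return the largest backlog any single query has to store when
    total_objects arrive in equal batches, one batch before each query."""
    model = CompassStoreModel()
    batch = total_objects // query_count
    worst = 0
    for _ in range(query_count):
        model.ingest(batch)
        worst = max(worst, model.query())
    return worst


if __name__ == "__main__":
    model = CompassStoreModel()
    model.ingest(1000)
    print(model.query())                  # 1000: the first query stores everything
    print(model.query())                  # 0: nothing new to store
    print(worst_query_backlog(1000, 1))   # 1000: one query carries the whole backlog
    print(worst_query_backlog(1000, 10))  # 100: ten queries carry 100 objects each
```

In this model the total number of objects stored is the same in both cases. Running queries more often only spreads the work, so no single query has to wait for the whole backlog.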

When you are testing query syntax or do not necessarily require the current Data Lake data, use the “skip reformatting” query hint. This hint skips the process that stores Data Lake data in Compass data storage, so the query runs only against data that is already stored there.

See Query hints.