Possible causes of mismatches

The Data Lake is a large-scale data platform that is eventually consistent. As a result, a point-in-time reconciliation can be temporarily inconsistent with a system of record.
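Because the inconsistency can be temporary, one way to tell a transient gap from a lasting mismatch is to repeat the comparison after a delay. The following is a minimal sketch of that idea, not a product feature: the function name, the two count callables, and the attempt and wait values are all hypothetical and must be supplied by whatever retrieves the counts in your environment.

```python
import time
from typing import Callable


def counts_converge(
    get_application_count: Callable[[], int],  # hypothetical: count from the system of record
    get_data_lake_count: Callable[[], int],    # hypothetical: count from Data Lake
    attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if both counts match within the given number of attempts."""
    for attempt in range(attempts):
        if get_application_count() == get_data_lake_count():
            return True
        # Wait before the next comparison, but not after the last one.
        if attempt < attempts - 1:
            sleep(wait_seconds)
    return False
```

If the counts never converge, the mismatch is unlikely to be a timing effect alone and the other causes are worth investigating.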

There are several potential causes of data mismatch. This table shows the most common causes:

| Cause | Observation |
| --- | --- |
| Objects published to Data Lake by the application are not yet indexed in Data Lake. | Data object ingested and Instance count Data Lake lower than Data object sent and Instance count Application. Row count Compass possibly lower than Row count Application. |
| Application failed to publish data, but reported objects as sent to Data Lake. | Data object ingested and Instance count Data Lake lower than Data object sent and Instance count Application. Row count Compass possibly lower than Row count Application. |
| Records were deleted in application database, but information was not sent to Data Lake. | Row count Compass higher than Row count Application. |
| Records were inserted into the application database, but information was not sent to Data Lake. | Row count Compass lower than Row count Application. |
| Purge of data objects. | Data object ingested and Instance count Data Lake lower than Data object sent and Instance count Application. Row count Compass possibly lower than Row count Application. |
| Corrupt objects. | Row count Compass possibly lower than Row count Application. |
| Compass API or other retrieval method failed to provide the correct data because of another defect in the system. | Row count mismatched. |
| Inaccurate metadata model in Data Catalog. | Row count mismatched. |
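The cause and observation pairs above can be read in reverse: given the observed counts, which causes remain candidates? The sketch below illustrates that lookup under stated assumptions. The function name and its four integer parameters are hypothetical stand-ins for the Data object sent, Data object ingested, Row count Application, and Row count Compass values; it is not part of any product API, and it only narrows the list of causes rather than identifying one.

```python
from typing import List


def candidate_causes(
    objects_sent: int,       # Data object sent (application side)
    objects_ingested: int,   # Data object ingested (Data Lake side)
    rows_application: int,   # Row count Application
    rows_compass: int,       # Row count Compass
) -> List[str]:
    """List the causes from the table that are consistent with the observed counts."""
    causes: List[str] = []

    # Fewer objects ingested than sent points at publishing, indexing, or purge.
    if objects_ingested < objects_sent:
        causes += [
            "Objects not yet indexed in Data Lake",
            "Application failed to publish data but reported objects as sent",
            "Purge of data objects",
        ]

    # The direction of the row count difference separates deletes from inserts.
    if rows_compass > rows_application:
        causes.append("Records deleted in application database, not sent to Data Lake")
    elif rows_compass < rows_application:
        causes.append("Records inserted into application database, not sent to Data Lake")
        causes.append("Corrupt objects")

    # Any row count mismatch can also come from retrieval or metadata problems.
    if rows_compass != rows_application:
        causes.append("Compass API or other retrieval method defect")
        causes.append("Inaccurate metadata model in Data Catalog")

    return causes
```

For example, `candidate_causes(10, 10, 100, 120)` leaves the deletion case plus the two generic row count causes, because the object counts agree and Compass reports more rows than the application.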