Deleting data from Data Lake
You can remove unwanted data objects that are stored in Data Lake.
Purged objects cannot be retrieved from Data Lake. You can purge data objects in these ways:
- In Atlas, select one or more data objects and click the button or icon.
- In Purge, use the advanced filters to search for data objects to purge.
- In Purge, use the unique data object ID(s) to purge.
- Use the Delete APIs available in the Data Lake endpoint.
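As a minimal sketch of the API option, the snippet below only builds a purge-by-ID request and does not send it. The `purge/ids` path, the `id` query parameter, and the helper name `build_purge_request` are assumptions made for illustration. Check the Data Lake API documentation for the actual routes and parameters.

```python
from urllib.parse import urlencode


def build_purge_request(base_url: str, object_ids: list[str]) -> tuple[str, str]:
    """Return (HTTP method, URL) for a hypothetical purge-by-ID call.

    The path "purge/ids" and the repeated "id" parameter are assumed,
    not taken from the Data Lake API reference.
    """
    if not object_ids:
        # A purge is permanent, so refuse to build an unscoped request.
        raise ValueError("at least one data object ID is required")
    query = urlencode([("id", object_id) for object_id in object_ids])
    return "DELETE", f"{base_url.rstrip('/')}/purge/ids?{query}"


if __name__ == "__main__":
    method, url = build_purge_request("https://example.invalid/datalake", ["a1", "b2"])
    print(method, url)
```

Building the request separately from sending it makes it possible to review exactly which IDs will be purged before anything is removed.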
After a purge event starts, it takes some time to complete.
When a purge process completes, the matching reformatted Compass data is cleared. The Compass data is cleared for the affected object names, starting from the oldest store date of the purged data objects. The next Compass query over the same object names can take some time because existing data objects must be reformatted and made available in Compass again.
If a purge process fails, an error is displayed. You must clean the Compass cache by running the clear_data stored procedure and then perform the purge again.
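The recovery sequence for a failed purge can be sketched as a small retry wrapper. This is an illustration under assumptions: `PurgeFailed`, `purge_with_cache_reset`, and the two callables are hypothetical names, and how clear_data is actually invoked depends on your Compass connection.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class PurgeFailed(Exception):
    """Hypothetical error that signals a failed purge attempt."""


def purge_with_cache_reset(
    run_purge: Callable[[], T],
    clear_compass_cache: Callable[[], None],
    max_attempts: int = 2,
) -> T:
    """Run a purge; after a failure, clear the Compass cache and retry.

    run_purge submits the purge and raises PurgeFailed on error.
    clear_compass_cache is expected to run the clear_data stored procedure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return run_purge()
        except PurgeFailed:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            clear_compass_cache()  # Clean the cache before purging again.
    raise AssertionError("unreachable")
```

Passing both steps in as callables keeps the sketch independent of any particular client library or database driver.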
You cannot revert an active purge event. However, a user can stop a purge process to prevent any further data objects from being purged. Objects that were purged before the process was stopped are permanently removed from Data Lake and cannot be restored. If the event is not stopped, the purge continues until all objects that match the purge parameters are removed from Data Lake.
Stopping a purge event results in a partially completed purge. You can verify the status of a purge event in the purge logs.
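The effect of stopping a purge can be modeled with a short simulation. This is a toy model of the behavior described above, not how Data Lake implements purging. The function `simulate_purge`, its callables, and the status labels are hypothetical.

```python
from typing import Callable


def simulate_purge(
    object_ids: list[str],
    delete_object: Callable[[str], object],
    stop_requested: Callable[[], bool],
) -> dict:
    """Delete objects in order until finished or until a stop is requested.

    Objects deleted before the stop stay deleted; the rest are left untouched.
    """
    purged: list[str] = []
    for object_id in object_ids:
        if stop_requested():
            break  # Stopping prevents further deletions only.
        delete_object(object_id)
        purged.append(object_id)
    remaining = object_ids[len(purged):]
    status = "completed" if not remaining else "partial"
    return {"status": status, "purged": purged, "remaining": remaining}


if __name__ == "__main__":
    store = {"a", "b", "c"}
    # Request a stop once only one object is left in the store.
    print(simulate_purge(["a", "b", "c"], store.discard, lambda: len(store) <= 1))
```

In this model a stopped purge reports which objects are already gone and which remain, which is the kind of information to look for when you check a purge event in the purge logs.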