Batch production
Select the Batch Production tab to access the batch production quest canvas. This tab is active after you export, from the design quest, one of the optimize model activity configurations that you intend to use in the batch mode pipeline.
After the batch production quest is created, you can continue to modify the pipeline if necessary. When the batch quest is first created, its input datasets, whether one or several, are assumed to have the same schema as the datasets in the corresponding Dataset Collection activity that is marked in the design quest. Having the same schema means having the same variables with the same types. Note that:
- Data post-processing supports the Scripting activity and the Ingest to Data Lake activity, which sends results back into Data Lake.
- Other activities are not editable.
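The schema assumption described above (same variables, same types) can be checked before a batch run. The following is a minimal sketch in plain Python, not part of the application: the dictionary representation of a schema, the function name `schema_mismatches`, and the sample variable names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: variable name -> type name.
Schema = Dict[str, str]


def schema_mismatches(design: Schema, batch: Schema) -> List[str]:
    """Return readable differences between the design and batch schemas."""
    problems: List[str] = []
    # Variables present in the design dataset but absent from the batch input.
    for name in sorted(design.keys() - batch.keys()):
        problems.append(f"missing variable: {name}")
    # Variables present in the batch input but unknown to the design dataset.
    for name in sorted(batch.keys() - design.keys()):
        problems.append(f"unexpected variable: {name}")
    # Variables present in both, but with different types.
    for name in sorted(design.keys() & batch.keys()):
        if design[name] != batch[name]:
            problems.append(
                f"type mismatch for {name}: "
                f"expected {design[name]}, got {batch[name]}"
            )
    return problems


# Illustrative schemas only; the names and types are assumptions.
design_schema = {"customer_id": "string", "order_total": "double", "order_date": "date"}
batch_schema = {"customer_id": "string", "order_total": "integer"}

for problem in schema_mismatches(design_schema, batch_schema):
    print(problem)
```

An empty result means the batch input has the same variables and types as the design dataset, which is the condition the batch quest assumes.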
One or more Scripting and Ingest to Data Lake activities can be applied. The Scripting activity formats the output results to meet the business need, while the Ingest to Data Lake activity sends the result data to a Data Lake object.
After configuring the activities, save and run the batch quest. Upon successful completion, the results should align with those produced in the design quest. The output of the Scripting activity can be used to define the schema in the data catalog, and this configuration is then ingested back into Data Lake.
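As an illustration of the kind of formatting a Scripting activity might perform, the sketch below reshapes model output rows into a layout suited to a downstream consumer. It is plain Python under stated assumptions: the input field names (`customer_id`, `score`), the output field names, and the 0.5 threshold are hypothetical, and the actual interface that the Scripting activity exposes is not shown here.

```python
from typing import Any, Dict, List


def format_results(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rename fields, round the score, and derive a risk label per row."""
    formatted = []
    for row in rows:
        formatted.append(
            {
                # Hypothetical business-facing column names.
                "CustomerId": row["customer_id"],
                "ChurnProbability": round(row["score"], 3),
                # Assumed threshold, chosen only for illustration.
                "ChurnRisk": "HIGH" if row["score"] >= 0.5 else "LOW",
            }
        )
    return formatted


# Illustrative model output; not real data.
model_output = [
    {"customer_id": "C001", "score": 0.8123456},
    {"customer_id": "C002", "score": 0.25},
]

for record in format_results(model_output):
    print(record)
```

The formatted records have a fixed set of fields and types, which is what a schema definition for the ingested object would need to describe.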
When using a custom algorithm, the Scripting activity is optional. The output from the Setup Model can be directly ingested into Data Lake. Custom algorithms can generate one or more output files, as determined by the output file definitions provided in the custom algorithm code. A maximum of 10 output files can be specified.
For detailed guidance, refer to the instructions on the Custom Algorithm detail page in the application.
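The limit of 10 output files can be guarded against in the custom algorithm code itself. The sketch below is a hypothetical helper, not the application's API: the function name `validate_output_files` and the file names are assumptions, and the requirement of at least one file is inferred from the phrase "one or more output files".

```python
from typing import List, Sequence

MAX_OUTPUT_FILES = 10  # limit stated in this topic


def validate_output_files(names: Sequence[str]) -> List[str]:
    """Check a list of output file definitions against the documented limit."""
    if len(names) == 0:
        # Assumption: a custom algorithm defines at least one output file.
        raise ValueError("at least one output file must be defined")
    if len(names) > MAX_OUTPUT_FILES:
        raise ValueError(
            f"{len(names)} output files defined; the maximum is {MAX_OUTPUT_FILES}"
        )
    return list(names)


# Hypothetical output file names.
outputs = validate_output_files(["predictions.csv", "metrics.csv"])
print(outputs)
```

Failing early in this way surfaces a configuration problem in the algorithm code rather than later in the batch run.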
When updating an existing batch production quest, you can choose between two options. These options are available when you select the icon from the relevant Results activity:
- . This option updates only the optimized model configuration. Any existing Scripting and Ingest to Data Lake activities are preserved without changes after the batch quest is updated. Choose this option when the model has changed, but the rest of the batch production workflow remains the same.
- . This option updates the complete batch production workflow. All existing Scripting and Ingest to Data Lake activities are removed and must be reconfigured. Choose this option when broader changes to the batch production workflow are required.
In summary, you can either preserve the existing flow and update only the model, or rebuild the batch production flow entirely, based on the scope of the required changes.
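The difference between the two update options can be modeled in a few lines. This is a conceptual sketch only: the names `UpdateMode.MODEL_ONLY` and `UpdateMode.FULL_WORKFLOW` are hypothetical stand-ins, not the option labels shown in the application, and `BatchQuest` is an illustrative data structure rather than a real object.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Tuple


class UpdateMode(Enum):
    MODEL_ONLY = "model_only"        # hypothetical name for the first option
    FULL_WORKFLOW = "full_workflow"  # hypothetical name for the second option


@dataclass(frozen=True)
class BatchQuest:
    model_config: str
    # Scripting and Ingest to Data Lake activities, in pipeline order.
    post_processing: Tuple[str, ...] = ()


def update_batch_quest(quest: BatchQuest, new_model_config: str, mode: UpdateMode) -> BatchQuest:
    """Return the quest as it would look after the chosen update."""
    if mode is UpdateMode.MODEL_ONLY:
        # Only the model changes; post-processing activities are preserved.
        return replace(quest, model_config=new_model_config)
    # Full update: post-processing activities are removed and must be reconfigured.
    return BatchQuest(model_config=new_model_config, post_processing=())


quest = BatchQuest("model-v1", ("Scripting", "Ingest to Data Lake"))
print(update_batch_quest(quest, "model-v2", UpdateMode.MODEL_ONLY))
print(update_batch_quest(quest, "model-v2", UpdateMode.FULL_WORKFLOW))
```

In the first case the post-processing activities carry over unchanged; in the second they start empty, which mirrors the need to reconfigure them after a full workflow update.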