Script input

Input to a script is typically a Spark dataset.

To create a Pandas dataset from the input variable, add this (Python) comment line to the script: #:request:params-api:pandas

Either a Spark or a Pandas dataset can be returned from the script through the output variable. You do not need to convert the output to the other type before returning it. Converting from Pandas to Spark may require a lot of memory when the number of columns is high, which makes the script more likely to fail.
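
As a rough illustration, a script that requests Pandas input and returns a Pandas dataset directly might look like the sketch below. It assumes the variables are literally named input and output, which the text above suggests but does not spell out. The stand-in DataFrame and its column names (id, value) are hypothetical and are there only so the sketch runs outside the platform.

```python
#:request:params-api:pandas
import pandas as pd

# Stand-in for the platform-provided input variable (assumption: in a real
# script the platform supplies it, and this line would not be present).
# The column names "id" and "value" are hypothetical.
input = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})

# Work on the data with ordinary Pandas operations.
result = input[input["value"] > 15.0].copy()
result["value_doubled"] = result["value"] * 2

# Return the Pandas dataset as it is; it is not converted to Spark here.
output = result
```

Assigning the Pandas result to output without converting it to Spark sidesteps the memory-heavy conversion described above, which matters most for datasets with many columns.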