SparkContext shut down due to Result Tables?

The scenario:
A user joined fairly large datatables and set the result table config to 999.999.999.
The following error occurred:

  • Instances crashed

  • Did the error (“Exception in processing: IllegalStateException: SparkContext has been shutdown”) happen due to a huge amount of data in the Result Table processor?

  • What is the maximum amount of data that can and should be displayed within the RT?

  • Is it possible to set a server limit for result tables to prevent the scenario?

Hi Kristina,

  • The exact cause can only be confirmed by looking closely at the ONE DATA server logs, but yes, such a scenario can lead to a SparkContext shutdown (possibly because Spark could not obtain the memory it needed and the whole instance ran out of memory).
  • There is no specific maximum amount of data that can or should be displayed in the RT. It mainly depends on the number of columns, the average size of each record/row, and the memory and CPU available to Spark. Because result table results are saved to the database (if I remember correctly), the resources of the database server play a role too.
  • There is no dedicated property in ONE DATA server to specify a limit, but Spark properties might help: Configuration - Spark 3.0.2 Documentation (please make sure to check the documentation for your Spark version)
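
To make the sizing argument above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. The column count and bytes per cell are made-up numbers, the helper names are hypothetical, and whether ONE DATA's result table is actually subject to Spark's spark.driver.maxResultSize property (documented default: 1g) is an assumption that would need to be checked against the server setup.

```python
def estimate_result_bytes(rows: int, columns: int, avg_bytes_per_cell: int) -> int:
    """Rough raw size of a collected result; ignores JVM and serialization overhead."""
    return rows * columns * avg_bytes_per_cell


def parse_size(size: str) -> int:
    """Convert a Spark-style size string such as '1g' or '512m' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    size = size.strip().lower()
    if size[-1] in units:
        return int(size[:-1]) * units[size[-1]]
    return int(size)  # plain number: already in bytes


# Hypothetical shape of the joined data: 20 columns, ~16 bytes per cell.
estimated = estimate_result_bytes(rows=999_999_999, columns=20, avg_bytes_per_cell=16)

# Assumption: the collected result is bounded by spark.driver.maxResultSize.
limit = parse_size("1g")

print(f"estimated: {estimated / 1024 ** 3:.0f} GiB, exceeds limit: {estimated > limit}")
```

With these assumed numbers the estimate is about 298 GiB, far beyond anything a single driver would normally be given, which fits the out-of-memory explanation above.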

Best regards,
Jean Pierre

Just some additions to what JayP wrote:

  • You should not use the result table for more than a few thousand rows - everything else should be put into datatables
    • the difference in performance compared to Parquet and Postgres datatables is negligible
    • result tables heavily clutter up the ONE DATA-internal database
    • large and / or many result tables cause your workflow jobs to take significantly longer to load (both on the server and client)
  • an upper limit on the number of collected rows is being considered, exactly for the reasons you described - the only problem is that it would be a breaking change
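
Until such a limit exists, the rule of thumb above could be applied by hand, for example in a small Python check before choosing where to write results. This is only an illustrative sketch: the threshold of 5,000 rows is an arbitrary stand-in for "a few thousand", and choose_sink is a hypothetical helper, not part of ONE DATA.

```python
MAX_RESULT_ROWS = 5_000  # assumed stand-in for "a few thousand rows"


def choose_sink(row_count: int, max_rows: int = MAX_RESULT_ROWS) -> str:
    """Pick the result table only for small outputs; otherwise use a datatable."""
    if row_count < 0:
        raise ValueError("row_count must not be negative")
    return "result_table" if row_count <= max_rows else "datatable"


print(choose_sink(1_200))        # small preview
print(choose_sink(999_999_999))  # the scenario from the question
```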