I am currently working on a forecasting project, where I use the GroupedForecastingProcessor to generate forecasts. This processor has multiple models, each of which has a number of parameters, such as:
- tree depth
- features used for training
- lag-parameters for ARIMAX
Is there a way to access these parameters and save them with the output of the forecasting so that I can compare the performance of multiple model runs in light of the parameters I used?
I know that one can at least use variables for some of the parameters. But I don’t know how to apply this approach to features, as these are chosen through checkboxes in the Processor. Besides, I don’t really like the variable approach.
Thanks in advance!
Generally there are two types of ONE DATA model processors:
- “Advanced Methods” processors, like your GroupedForecastingProcessor
- “Train Model” & “Model Application” processors
The former train and apply the model directly on the training and test dataset and do not save the calculated models.
The latter work in two steps: with the “Train Model” processor you can train a model and save it in ONE DATA (meta-information such as hyperparameters is then saved with the model, and you can see it in the Model section of ONE DATA). Afterwards, with the “Model Application” processor, you can load such a stored model and apply it to your test set.
It might be worth a shot for you to have a look at this “Train Model” processor (you can even train multiple model types in one go).
As for your second question: dynamically selecting different kinds of features (columns in your dataset) is not possible in stock ONE DATA as far as I know. However, there may be a workaround with the ONE DATA API, depending on your exact use case.
I hope I shed a little light on the models in ONE DATA. If you have further questions or something is unclear, just reach out to me.
Technically, it is possible to read out the parameters that are set in the processor configuration.
Since the whole workflow (WF) configuration is JSON-based and can be obtained via the REST API, there’s always a way to custom-tailor things to your needs. The effort might be quite high in this case (depending on the usability you want to achieve). If it’s just about having a parameter “history” regardless of the UX, you can use the WF jobs to get the corresponding WF version and, for each execution, the configuration of the processor as JSON, and store it.
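To make the “store the configuration per execution” idea a bit more concrete, here is a minimal Python sketch. It assumes the workflow JSON has already been fetched from the REST API (the actual endpoints are in the API documentation and are not shown here). The JSON layout and all field names (`processors`, `type`, `configuration`, `treeDepth`, `features`, `arimaxLags`, `version`) are made up for illustration and will most likely differ from the real ONE DATA format.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical workflow configuration. The real ONE DATA JSON layout and
# field names may differ; check them against the API documentation.
example_workflow = {
    "version": 7,
    "processors": [
        {"type": "DataTableLoadProcessor", "configuration": {"table": "sales"}},
        {
            "type": "GroupedForecastingProcessor",
            "configuration": {
                "treeDepth": 5,
                "features": ["price", "promo"],
                "arimaxLags": [1, 7],
            },
        },
    ],
}


def extract_processor_configs(
    workflow: dict[str, Any], processor_type: str
) -> list[dict[str, Any]]:
    """Return the configuration of every processor of the given type."""
    return [
        proc.get("configuration", {})
        for proc in workflow.get("processors", [])
        if proc.get("type") == processor_type
    ]


def append_history(
    path: Path, job_id: str, workflow: dict[str, Any], processor_type: str
) -> dict[str, Any]:
    """Append one JSON line per job execution to a local history file."""
    record = {
        "jobId": job_id,
        "workflowVersion": workflow.get("version"),
        "configs": extract_processor_configs(workflow, processor_type),
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

Calling `append_history(Path("param_history.jsonl"), "job-001", example_workflow, "GroupedForecastingProcessor")` after each run would give you one line per execution that can later be joined with the forecast output. The file name and job ID here are placeholders as well.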
Don’t worry, the parameter history is preserved within the jobs and will wait for consumption by any script (or maybe even workflow?) you will at some point be able to throw at it.
Using our (currently internal) API documentation for ONE DATA Core/Classic is a good starting point.
I totally agree that we could use an integrated way of keeping track of certain configurations. Maybe there’s a generic enough way to achieve your goal, and you are able to share it with the community.
@Flogge @christoph.pernul Thank you for your responses. I was able to dig a bit into the API documentation and also got a nice intro to it. A nice UX does not play a big role at the moment. I mainly want to be able to compare forecast accuracy in the context of the model parameters that I used. I’ll go ahead and take my first steps in API-land and will let you know when I find something worth sharing.
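For the comparison itself, a rough Python sketch of what I have in mind: compute an error metric per run and list the runs together with their parameters. MAPE is just one possible metric, and the record layout (`params`, `forecast`) is my own assumption, not anything ONE DATA provides.

```python
from typing import Any, Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to compare against")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def rank_runs(
    runs: Sequence[dict[str, Any]], actual: Sequence[float]
) -> list[dict[str, Any]]:
    """Score each run against the actuals and sort by ascending MAPE."""
    scored = [
        {"params": run["params"], "mape": mape(actual, run["forecast"])}
        for run in runs
    ]
    return sorted(scored, key=lambda run: run["mape"])


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the comparison.
    actual = [100.0, 200.0]
    runs = [
        {"params": {"treeDepth": 3}, "forecast": [130.0, 150.0]},
        {"params": {"treeDepth": 5}, "forecast": [110.0, 180.0]},
    ]
    for run in rank_runs(runs, actual):
        print(run["params"], round(run["mape"], 2))
```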