Bugfix - load_model for RF and XGB Classifiers#1358
Bugfix - load_model for RF and XGB Classifiers#1358mail4umar merged 6 commits intovertica:masterfrom
Conversation
| try: | ||
| first_tree = self._compute_trees_arrays(self.get_tree(0), self.X, True) | ||
| unique_values = set() | ||
| for j in range(len(first_tree[4])): # first_tree[4] is the value array |
There was a problem hiding this comment.
Maybe we should have a getter specifically for these classes rather than indexing into the tree to get them. If for some reason the index of the array of classes ever changed it would be easier to just change one getter function. Also just a bit suspicious that we can count on the array being at index 4 without checking anything here. Thoughts?
There was a problem hiding this comment.
There are multiple uses of this funciton _compute_trees_arrays inside ensemble.py. Each class of algorithm is treating it differently. And all of them are relying on this array structure. That is why a getter function would not help too much.
For now I have created a getter function to find the classes. But that still leaves many instances of where we are using the indices because we are relying on this function. it is actually a helper function in itself.
In a latter PR, we may want to totally overhaul the code. but this PR is only for the bug resolution.
A customer pointed an issue in #1351
When you load a
RFClassifiermodel, it cannot predict the probabilities.The issue was that it was not able to get the classes.
I created a specific unit test to capture this. When I ran the test for all ML models, I found a similar issue with XGB. I have implemented a fix for
XGBClassifieras well.Fixed #1357