I am training an XGBoost model using essentially this code:
    X, y = read_csv_data(source=csvFile,
                         target_column_index=dpp.HEADER.target_column_index,
                         output_dtype='O')
    print('Training Set [%s] Shape = [%s]' % (csvFile, str(X.shape)))
    self.model = trainer.train(X, y, dpp.HEADER,
                               dpp.build_feature_transform(),
                               dpp.build_label_transform())
    d = self.model.transform(X)
    filter_arr = [False]
    for row in y[1:]:
        filter_arr.append(len(row) > 0)
    print('Filtered array len = ', len(filter_arr))
    dtrain = xgb.DMatrix(d[filter_arr], label=y[filter_arr])
    param = {'max_depth': 2, 'eta': 1, "objective": "reg:squarederror"}
    self.bst = xgb.train(param, dtrain)
    pickle.dump(self.bst, open('model.pickle.dat', 'wb'))
I have around 3,800 rows of data and around 250 columns. I expected a fairly complex model.
But when I managed to visualize it using graphviz with this code:
    from xgboost import plot_tree
    import matplotlib.pyplot as plt
    import pickle

    model = pickle.load(open("model.pickle.dat", "rb"))
    plot_tree(model)
    plt.show()
I actually got just this: a plot that appears to contain only a single node.
This clearly doesn't look right. What am I missing here?
https://stackoverflow.com/questions/65386722/why-does-my-xgboost-model-look-like-it-only-has-1-node December 21, 2020 at 10:06AM