Sunday, December 20, 2020

Why does my xgboost model look like it only has 1 node?

I am training an XGBoost model using essentially this code:

    X, y = read_csv_data(source=csvFile,
                         target_column_index=dpp.HEADER.target_column_index,
                         output_dtype='O')
    print('Training Set [%s] Shape = [%s]' % (csvFile, str(X.shape)))
    self.model = trainer.train(X, y,
                               dpp.HEADER,
                               dpp.build_feature_transform(),
                               dpp.build_label_transform())
    d = self.model.transform(X)
    filter_arr = [False]
    for row in y[1:]:
        filter_arr.append(len(row) > 0)
    print('Filtered array len = ', len(filter_arr))
    dtrain = xgb.DMatrix(d[filter_arr], label=y[filter_arr])
    param = {'max_depth': 2, 'eta': 1, "objective": "reg:squarederror"}
    self.bst = xgb.train(param, dtrain)
    pickle.dump(self.bst, open('model.pickle.dat', 'wb'))

I have around 3,800 rows of data and around 250 columns. I expected a fairly complex model.

But when I managed to visualize it using graphviz with this code:

    from xgboost import plot_tree
    import matplotlib.pyplot as plt
    import pickle

    model = pickle.load(open("model.pickle.dat", "rb"))

    plot_tree(model)
    plt.show()

I actually got just this:

[Image: the plotted tree, which appears to consist of a single node]

This clearly doesn't look right. What am I missing here?
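One way to narrow this down: `plot_tree` draws only one tree of the ensemble (the first by default), while `xgb.train` runs 10 boosting rounds unless `num_boost_round` is set, so it may help to look at every tree rather than one plot. The sketch below is a minimal, hypothetical helper (`summarize_dump` is not part of XGBoost) that counts split and leaf nodes in the text dumps returned by `Booster.get_dump()`. The sample dumps are hand-written in XGBoost's text dump format for illustration; they are not output from the model in the question.

```python
import re


def summarize_dump(tree_dumps):
    """Return a (n_splits, n_leaves) tuple for each tree in a list of text dumps.

    Assumes XGBoost's plain-text dump format, where a split node looks like
    '0:[f3<1.5] yes=1,no=2,missing=1' and a leaf looks like '1:leaf=-0.2'.
    """
    summary = []
    for dump in tree_dumps:
        lines = [ln.strip() for ln in dump.splitlines() if ln.strip()]
        n_splits = sum(1 for ln in lines if re.match(r"^\d+:\[", ln))
        n_leaves = sum(1 for ln in lines if re.match(r"^\d+:leaf=", ln))
        summary.append((n_splits, n_leaves))
    return summary


# Hypothetical usage with the pickled booster from the question:
#   import pickle
#   bst = pickle.load(open("model.pickle.dat", "rb"))
#   print(summarize_dump(bst.get_dump()))

# Hand-written sample dumps (illustrative only, not real model output):
sample = [
    "0:leaf=0.5\n",                                                    # single-leaf tree
    "0:[f3<1.5] yes=1,no=2,missing=1\n\t1:leaf=-0.2\n\t2:leaf=0.4\n",  # one split, two leaves
]
print(summarize_dump(sample))  # -> [(0, 1), (1, 2)]
```

If every tree comes back as `(0, 1)`, the booster found no split worth making in any round, which would suggest checking the feature matrix and labels passed to `xgb.DMatrix` rather than the plotting code.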

https://stackoverflow.com/questions/65386722/why-does-my-xgboost-model-look-like-it-only-has-1-node December 21, 2020 at 10:06AM
