I am using the Learning API version of xgboost (xgb.train). I want to get the coefficients of a linear (gblinear) model trained this way, but accessing them results in AttributeError: 'Booster' object has no attribute 'coef_'. The Learning API documentation doesn't appear to address how to retrieve coefficients.
import xgboost as xgb

### xtrain, ytrain, xtest, ytest are numpy arrays
dtrain = xgb.DMatrix(xtrain, label=ytrain)
dtest = xgb.DMatrix(xtest, label=ytest)
param = {'eta': 0.3125, 'objective': 'binary:logistic', 'nthread': 8,
         'eval_metric': 'auc', 'booster': 'gblinear', 'max_depth': 12}
model = xgb.train(param, dtrain, 60, [(dtrain, 'train'), (dtest, 'eval')], verbose_eval = 5, early_stopping_rounds = 12)
print(model.coef_) #results in an error
I tried building an equivalent model with XGBRegressor, since it does have a coef_ attribute, but that model returns very different predictions. I looked at previous answers on this topic (1, 2), which seem to imply that n_estimators is effectively the same as num_boost_round and should therefore give the same predictions. Even after accounting for this, the predictions differ substantially with the parameters below, and the XGBRegressor model turns out to be extremely conservative. Also, from the documentation, nthread is the same as n_jobs. I don't see any other differences between the parameters of the two. (A sketch of how I compare the two sets of predictions follows the code below.)
from xgboost import XGBRegressor

model = XGBRegressor(n_estimators=60, learning_rate=0.3125, max_depth=12,
                     objective='binary:logistic', booster='gblinear', n_jobs=8)
model = model.fit(xtrain, ytrain, eval_metric = 'auc', early_stopping_rounds = 12, eval_set = [(xtest, ytest)])
predictions = model.predict(xtrain, ntree_limit = 0) # need to include ntree_limit because of bug associated with early_stopping_rounds for gblinear
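For reference, this is roughly how I compare the two sets of predictions (a simplified sketch; model_booster and model_sklearn are just my names here for the two fitted models above, since both are called model in the snippets):

import numpy as np

booster_preds = model_booster.predict(dtrain)                 # Booster from xgb.train, predicted probabilities
sklearn_preds = model_sklearn.predict(xtrain, ntree_limit=0)  # XGBRegressor predictions on the same rows
print(np.abs(booster_preds - sklearn_preds).max())            # largest disagreement between the two
print(np.corrcoef(booster_preds, sklearn_preds)[0, 1])        # how correlated the two sets of predictions are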
My questions are:
- Is there a way to get coefficients for a linear model built using xgb.train, and if so, how can I do it?
- If not, why is XGBRegressor giving me different results?
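For clarity, this is the kind of access I am trying to replicate on the Booster object; it works on the fitted sklearn wrapper from the second snippet (and I believe intercept_ is available there too):

print(model.coef_)       # per-feature weights of the gblinear model
print(model.intercept_)  # bias term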