I trained a logistic regression model for multiclass classification on text data. I wanted to generate a sample prediction from the model, but I am getting this error:

    ValueError: X has 30 features per sample; expecting 100000

Here is the code that vectorizes the text data:
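
    from sklearn.compose import ColumnTransformer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import Pipeline

    # TF-IDF over 1- to 3-grams, capped at 50000 features per text column
    tfidf_pipeline = Pipeline([
        ('tfidf', TfidfVectorizer(
            max_features=50000,
            ngram_range=(1, 3),
            stop_words='english',
            strip_accents='ascii',
        )),
    ])

    # Vectorize each text column separately and concatenate the results
    preprocessor_pipeline = ColumnTransformer(
        transformers=[
            ('short_description', tfidf_pipeline, 'short_description'),
            ('details', tfidf_pipeline, 'details'),
        ])

Here is the code I am trying to run, which produces the error above: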

    import pandas as pd

    d = {
        'short_description': ['[mitigated] [ubl5] ssd slam station not working'],
        'details': ['ssd slam station not working, unable to take slam from the station.'],
    }
    df_test = pd.DataFrame(data=d)
    X = df_test[['short_description', 'details']]
    X_prep = preprocessor_pipeline.fit_transform(X)
    y_p = lr.predict(X_prep)

https://stackoverflow.com/questions/67223461/calling-predict-on-an-example-from-an-already-trained-logistic-regression-model
April 23, 2021 at 11:08AM
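
A likely cause of the error in the question: calling `fit_transform` on the single test row refits both TF-IDF vectorizers, so the vocabulary shrinks to the handful of n-grams in that one row (30 features), while `lr` still expects the width learned during training (up to 50000 per column, hence 100000). The sketch below is a minimal illustration of fitting the preprocessor once on training data and calling only `transform` at prediction time. The training frame `train` and labels `y_train` are hypothetical toy data, since the question does not show the real training set, and the `LogisticRegression` settings are assumptions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy training data; the real training set is not shown in the question.
train = pd.DataFrame({
    "short_description": [
        "ssd slam station not working",
        "printer out of paper",
        "slam station offline again",
        "printer jammed on floor two",
    ],
    "details": [
        "unable to take slam from the station",
        "the printer needs paper refilled",
        "slam station does not power on",
        "paper jam in the main printer",
    ],
})
y_train = ["station", "printer", "station", "printer"]

# Same preprocessing as in the question.
tfidf_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=50000, ngram_range=(1, 3),
                              stop_words="english", strip_accents="ascii")),
])
preprocessor_pipeline = ColumnTransformer(transformers=[
    ("short_description", tfidf_pipeline, "short_description"),
    ("details", tfidf_pipeline, "details"),
])

# Fit the vocabulary once, on the training data only.
X_train = preprocessor_pipeline.fit_transform(train)
lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The sample from the question.
df_test = pd.DataFrame({
    "short_description": ["[mitigated] [ubl5] ssd slam station not working"],
    "details": ["ssd slam station not working, unable to take slam from the station."],
})

# transform (not fit_transform) reuses the training vocabulary,
# so the test matrix has the same number of columns as X_train.
X_prep = preprocessor_pipeline.transform(df_test[["short_description", "details"]])
y_p = lr.predict(X_prep)
```

This only works if the fitted `preprocessor_pipeline` from training is still available (for example, kept in memory or saved alongside the model). Another option is to wrap the preprocessor and the classifier in a single `Pipeline`, so that `predict` applies the already-fitted transformation automatically.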