Sunday, April 4, 2021

How to set test options when running weka using python-weka-wrapper?

I'm exploring Python's Weka wrapper (python-weka-wrapper) with the JRip classifier. I loaded the dataset, built a model, and extracted the rules without any major problems.

Now, as far as I know, cross-validation with 10 folds is the default option when using Weka Explorer, as shown in the image below.

[Image: Weka Explorer]

So I assume that if I run the same JRip classifier from Python, the default mode would also be 10-fold cross-validation, but I'm not sure. My code is as follows:

    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.classifiers import Classifier, Evaluation
    from random import randint

    # Start the JVM before using any Weka classes
    jvm.start()

    # Load the CSV file and use the last attribute as the class
    url = 'C:/Data/train_dataset.csv'
    loader = Loader(classname='weka.core.converters.CSVLoader')
    data = loader.load_file(url)
    data.class_is_last()

    # randint needs integer bounds; 99e6 is a float, so convert it
    seed = randint(1, int(99e6))
    optimizations = 15
    options = f'-F 3 -N 2.0 -O {optimizations} -S {seed}'.split()

    # Train JRip on the full dataset
    jrip = Classifier(classname='weka.classifiers.rules.JRip', options=options)
    jrip.build_classifier(data)

    # Print each learned rule
    ruleset = jrip.jwrapper.getRuleset()
    for i in range(ruleset.size()):
        rule = ruleset.get(i)
        print(rule.toString(data.class_attribute.jobject))

The code is pretty standard, taken mostly from examples on Weka's website (I have never used Weka from Python before).

I also read about the Evaluation class, which has a crossvalidate_model method, but I'm not sure whether this is what I'm looking for or how to use it correctly.
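For reference, here is a minimal sketch of how crossvalidate_model is typically called. It assumes the python-weka-wrapper3 API (Evaluation in weka.classifiers, Random in weka.core.classes). The function name cross_validate_jrip and its default arguments are hypothetical, chosen only for illustration. Note that build_classifier in the script above only trains on the data it is given and performs no evaluation, so cross-validation has to be requested explicitly.

```python
def cross_validate_jrip(data, folds=10, seed=1):
    """Run k-fold cross-validation for JRip and return the Evaluation object.

    `data` is assumed to be a weka Instances object with its class attribute
    already set. The function name and defaults are illustrative only.
    """
    # Imported here so the function can be defined before the JVM is started
    from weka.classifiers import Classifier, Evaluation
    from weka.core.classes import Random

    # An untrained classifier: crossvalidate_model trains a copy on each fold
    jrip = Classifier(classname="weka.classifiers.rules.JRip")

    evaluation = Evaluation(data)
    evaluation.crossvalidate_model(jrip, data, folds, Random(seed))
    return evaluation


# Usage (needs a running JVM and `data` loaded as in the script above):
#     evaluation = cross_validate_jrip(data, folds=10)
#     print(evaluation.summary())
```

Changing the folds argument is how the number of folds would be increased in this sketch.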

How can I build a model and apply different settings, or find out which settings I'm actually using, in the Python script? For example, what if I want to increase the number of folds or use other test options such as Percentage split, Supplied test set, or Use training set?

https://stackoverflow.com/questions/66910918/how-to-set-test-options-when-running-weka-using-python-weka-wrapper April 02, 2021 at 04:06AM
