I'm exploring python's weka wrapper with JRip classifier. I loaded the dataset, buildt a model and extracted the rules without any major problem.
Now, as far as I know, cross-validation with 10 folds is the default option when using Weka Explorer, as shown in the image below.
So I assume that if I run the same JRip classifier but written in python, the default mode would be a 10 fold Cross-Validation, but I'm not sure. My code is as follows:
import weka.core.jvm as jvm from weka.core.converters import Loader from weka.classifiers import Classifier,Evaluation from random import randint jvm.start() url = 'C:/Data/train_dataset.csv' loader = Loader(classname = 'weka.core.converters.CSVLoader') data = loader.load_file(url) data.class_is_last() seed = randint(1,99e6) optimizations = 15 options = f'-F 3 -N 2.0 -O {optimizations} -S {seed}'.split() jrip = Classifier(classname = 'weka.classifiers.rules.JRip',options=options) jrip.build_classifier(data) ruleset = jrip.jwrapper.getRuleset() for i in range(ruleset.size()): rule = ruleset.get(i) print(rule.toString(data.class_attribute.jobject))
The code is pretty standard, extracted from mostly examples from weka's website (never used Weka in Python before).
I also read about Evaluation class that has a crossvalidate_model method, but I'm not sure if this is what I'm looking for and how to correctly use it.
How can I build a model and apply different settings, or know which settings I'm actually using, in the python script? For example, if I want to increase the number of folds or use other settings like Percentage split
, Supplied test
set or Use training set
?
没有评论:
发表评论