I am a fastai beginner trying to build a model by following Using RoBERTa with fast.ai for NLP.
I was trying to customize the tokenizer (see the code below):
from fastai.text import *
from fastai.metrics import *
from transformers import RobertaTokenizer

class FastAiRobertaTokenizer(BaseTokenizer):
    """Wrapper around RobertaTokenizer to be compatible with fastai"""
    def __init__(self, tokenizer: RobertaTokenizer, max_seq_len: int = 128, **kwargs):
        self._pretrained_tokenizer = tokenizer
        self.max_seq_len = max_seq_len

    def __call__(self, *args, **kwargs):
        return self

    def tokenizer(self, t: str) -> List[str]:
        """Adds Roberta bos and eos tokens and limits the maximum sequence length"""
        return [config.start_tok] + self._pretrained_tokenizer.tokenize(t)[:self.max_seq_len - 2] + [config.end_tok]
But I got an error message:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-41070aae72d1> in <module>
----> 1 class FastAiRobertaTokenizer(BaseTokenizer):
      2     """Wrapper around RobertaTokenizer to be compatible with fastai"""
      3     def __init__(self, tokenizer: RobertaTokenizer, max_seq_len: int=128, **kwargs):
      4         self._pretrained_tokenizer = tokenizer
      5         self.max_seq_len = max_seq_len

NameError: name 'BaseTokenizer' is not defined
- fastai version: 2.1.8
- torch version: 1.7.1
- transformers version: 3.4.0
Has anyone run into this issue before?
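For context: `BaseTokenizer` is part of the fastai 1.x API (`fastai.text`), while the version listed above is fastai 2.1.8, whose reorganized tokenizer API no longer exports that name, so the `class ... (BaseTokenizer)` statement fails at definition time with a NameError. A minimal sketch of a guard, assuming only that `BaseTokenizer` in fastai 1.x was a thin wrapper exposing `tokenizer` and `add_special_cases`; the fallback class below is a hypothetical stand-in for illustration, not fastai's implementation:

```python
try:
    # Resolves on fastai 1.x; raises ImportError on fastai 2.x,
    # where the name was removed from fastai.text
    from fastai.text import BaseTokenizer
except ImportError:
    class BaseTokenizer:
        """Hypothetical stand-in mirroring the fastai 1.x interface (assumption)."""
        def __init__(self, lang: str = "en"):
            self.lang = lang

        def tokenizer(self, t: str):
            # Naive whitespace split; subclasses override this method
            return t.split(" ")

        def add_special_cases(self, toks):
            # fastai 1.x used this hook to register special tokens; no-op here
            pass
```

Alternatively, pinning the version the tutorial was written against (e.g. `pip install "fastai<2.0"`) keeps the original snippet unchanged; which route is right depends on whether the rest of the notebook relies on the v1 API.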
https://stackoverflow.com/questions/65373407/fastai-text-nameerror-name-basetokenizer-is-not-defined December 20, 2020 at 02:59AM