In the latest version of spaCy, the RoBERTa transformer cannot be initialized due to piece_encoder #13801
Unanswered
smamidik asked this question in Help: Coding & Implementations
Replies: 0 comments
Hi, I'm training a transformer+ner pipeline. I'm only interested in an English-language transformer, i.e. RoBERTa.
So I took the example config from spacy-curated-transformers, changed the transformer section to match the config in en_core_web_trf, and ran into the error below: it complains that a list cannot be split. I haven't dug into the code to figure out what the vocab_merges object is or what merges implies. I'm not supplying any vocab myself.
For the transformer model (RoBERTa v1 and v2), I have tried every combination of piece_encoder and piece-encoder loader, and none of them helped; a sketch of the sections I mean follows below.
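For reference, the sections I'm describing look roughly like this. This is a sketch based on the spacy-curated-transformers documentation rather than my exact config, with hyperparameters omitted and the registry/key names as I understand them:

```ini
[components.transformer]
factory = "curated_transformer"

[components.transformer.model]
@architectures = "spacy-curated-transformers.RobertaTransformer.v1"
# vocab_size, hidden_width and the other hyperparameters are omitted in this sketch

[components.transformer.model.piece_encoder]
@architectures = "spacy-curated-transformers.ByteBpeEncoder.v1"

[initialize.components.transformer.encoder_loader]
@model_loaders = "spacy-curated-transformers.HFTransformerEncoderLoader.v1"
name = "roberta-base"
revision = "main"

[initialize.components.transformer.piecer_loader]
@model_loaders = "spacy-curated-transformers.HFPieceEncoderLoader.v1"
name = "roberta-base"
revision = "main"
```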
Any help in debugging this issue is appreciated.
On a side note, en_core_web_trf 3.8 has compatibility issues and will not let me install the latest spacy-curated-transformers.
PACKAGES:
CONFIG:
ERROR:
UPDATE:
Given the code below in hf_loader, I checked the tokenizer: merges is a list of strings, so I'm not sure where the problem is.
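To be concrete about how I checked, this is roughly the inspection I did, as a standalone sketch against the Hugging Face tokenizer rather than through spaCy's own loader (the "roberta-base" name is just the model I'm targeting):

```python
# Standalone inspection sketch (not spaCy's hf_loader itself): dump the serialized
# fast-tokenizer state and look at how "merges" is represented.
import json
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
state = json.loads(tok.backend_tokenizer.to_str())  # full tokenizer.json contents
merges = state["model"]["merges"]
print(type(merges[0]), merges[:3])

# Older `tokenizers` releases serialize each merge as a single "a b" string,
# which str.split(" ") handles; newer releases can emit ["a", "b"] pairs instead,
# which would explain a "list cannot be split" style error in a loader.
```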