How to use bertin-project/bertin-base-ner-conll2002-es with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="bertin-project/bertin-base-ner-conll2002-es")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bertin-project/bertin-base-ner-conll2002-es")
model = AutoModelForTokenClassification.from_pretrained("bertin-project/bertin-base-ner-conll2002-es")
```
This checkpoint has been trained for the NER task using the CoNLL2002-es dataset.
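As a rough illustration of running the checkpoint on a Spanish sentence, the sketch below wraps the Transformers `pipeline` with `aggregation_strategy="simple"` so that sub-word tokens are merged into whole entities. The helper names (`load_ner`, `summarize_entities`, `main`), the example sentence, and the `min_score` threshold are assumptions made for illustration and are not part of the original card.

```python
def load_ner(model_id="bertin-project/bertin-base-ner-conll2002-es"):
    """Build a token-classification pipeline that groups sub-word tokens."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(entities, min_score=0.5):
    """Reduce pipeline output to (text, label) pairs above a score threshold.

    `min_score` is a hypothetical cut-off chosen for this example.
    """
    return [
        (entity["word"].strip(), entity["entity_group"])
        for entity in entities
        if entity["score"] >= min_score
    ]


def main():
    # Downloads the checkpoint on first use.
    ner = load_ner()
    text = "Gabriel García Márquez nació en Aracataca, Colombia."
    for word, label in summarize_entities(ner(text)):
        print(f"{label}\t{word}")
```

Calling `main()` prints one line per detected entity. The labels should follow the CoNLL-2002 scheme (person, location, organisation, miscellaneous), although the exact label strings depend on the checkpoint's configuration.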
This is a NER checkpoint created from Bertin Gaussian 512, a RoBERTa-base model trained from scratch on Spanish text. Information on the base model can be found on its own card, and in greater detail on the main project card.
The training dataset for the base model is mC4, with documents subsampled to a total of about 50 million examples. Sampling is biased towards average perplexity values (using a Gaussian function), so documents with very large perplexity (poor quality) or very small perplexity (short, repetitive texts) are discarded more often.
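The perplexity-biased sampling described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function names, the `mean`, `std`, and `factor` parameters, and the `"perplexity"` field on each document are all assumptions.

```python
import math
import random


def gaussian_keep_probability(perplexity, mean, std, factor=1.0):
    """Keep probability that peaks at `mean` and decays on both sides.

    `mean`, `std`, and `factor` are hypothetical tuning parameters.
    """
    z = (perplexity - mean) / std
    return min(1.0, factor * math.exp(-0.5 * z * z))


def subsample(docs, mean, std, factor=1.0, seed=0):
    """Keep each document with its Gaussian-weighted probability.

    Each document is assumed to be a dict with a precomputed
    "perplexity" value.
    """
    rng = random.Random(seed)
    return [
        doc
        for doc in docs
        if rng.random() < gaussian_keep_probability(
            doc["perplexity"], mean, std, factor
        )
    ]
```

Under these assumptions, documents with perplexity close to `mean` are almost always kept, while documents far above or far below it are dropped with increasing probability.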
This checkpoint is part of the Flax/JAX Community Week, organised by Hugging Face, with TPU usage sponsored by Google.
Team members