This model was pretrained on the BookCorpus dataset using knowledge distillation.
The model uses the same architecture as BERT but with a hidden size of 240. Because it keeps 12 attention heads, each head has a size of 20 (240 / 12), compared with 64 (768 / 12) in the BERT base model.
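The head-size arithmetic above can be checked with a small sketch. The function name `attention_head_size` is hypothetical and only for illustration; it assumes the usual BERT-style convention that the hidden size is split evenly across the attention heads.

```python
def attention_head_size(hidden_size: int, num_attention_heads: int) -> int:
    """Per-head dimension in a BERT-style multi-head attention layer."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return hidden_size // num_attention_heads


# This model: hidden size 240, 12 heads.
small_head = attention_head_size(240, 12)
# BERT base: hidden size 768, 12 heads.
base_head = attention_head_size(768, 12)
print(small_head, base_head)  # prints "20 64"
```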
The knowledge distillation was performed using multiple loss functions.
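This card does not say which loss functions were combined. As a generic illustration only, and not the author's actual recipe, the sketch below mixes two losses that are commonly used in distillation: a temperature-softened KL term between teacher and student distributions, and a standard cross-entropy term on the true token. The names `soft_target_loss` and `combined_loss`, and the `alpha` and `temperature` values, are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * (math.log(p) - math.log(q))
        for p, q in zip(p_teacher, p_student)
        if p > 0
    )
    return kl * temperature ** 2


def combined_loss(student_logits, teacher_logits, target_index,
                  alpha=0.5, temperature=2.0):
    """Weighted sum of the soft-target loss and the hard-label cross-entropy."""
    hard = -math.log(softmax(student_logits)[target_index])
    soft = soft_target_loss(student_logits, teacher_logits, temperature)
    return alpha * soft + (1 - alpha) * hard


# Toy logits over a 4-token vocabulary (hypothetical values).
teacher = [2.0, 0.5, -1.0, 0.1]
student = [1.5, 0.7, -0.5, 0.0]
print(combined_loss(student, teacher, target_index=0))
```

Setting `alpha` to 0 reduces this to plain cross-entropy, and setting it to 1 trains the student on the teacher's distribution alone.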
The weights of the model were initialized from scratch.
Note: the tokenizer is the same as the one used by bert-base-uncased.
To load the model and tokenizer:

```python
from transformers import AutoModelForMaskedLM, BertTokenizer

model_name = "eli4s/Bert-L12-h240-A12"
model = AutoModelForMaskedLM.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)
```
To use it as a masked language model:

```python
import torch

sentence = "Let's have a [MASK]."
model.eval()  # disable dropout for inference
inputs = tokenizer([sentence], padding='longest', return_tensors='pt')
with torch.no_grad():  # gradients are not needed for prediction
    output = model(inputs['input_ids'], attention_mask=inputs['attention_mask'])
# Locate the [MASK] position via the tokenizer instead of hard-coding ID 103
mask_index = inputs['input_ids'].tolist()[0].index(tokenizer.mask_token_id)
masked_token = output['logits'][0][mask_index].argmax(dim=-1).item()
predicted_token = tokenizer.decode([masked_token])
print(predicted_token)
```
To get the n most likely tokens for the masked position instead:

```python
top_n = 5
# torch.topk returns the largest logits and their vocabulary indices
top_tokens = torch.topk(output['logits'][0][mask_index], k=top_n).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_tokens))
```
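Raw logits are hard to compare across sentences, so it can help to report a probability next to each candidate. The sketch below applies a softmax to a list of logits and returns the top n tokens with their probabilities. It runs on a toy five-word vocabulary with made-up logits; `top_n_with_probs`, `toy_vocab`, and `toy_logits` are hypothetical names for this example. With the real model, the logits would come from `output['logits'][0][mask_index].tolist()` and the token strings from the tokenizer.

```python
import math


def top_n_with_probs(logits, id_to_token, n=5):
    """Return the n highest-scoring tokens with their softmax probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    total = sum(exps)
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:n]
    return [(id_to_token[i], exps[i] / total) for i in ranked]


# Toy data standing in for the model's vocabulary and mask-position logits.
toy_vocab = ["break", "drink", "party", "look", "seat"]
toy_logits = [2.0, 1.0, 0.5, 3.0, -1.0]

for token, prob in top_n_with_probs(toy_logits, toy_vocab, n=3):
    print(f"{token}: {prob:.3f}")
```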