Transformers provides APIs to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs and save you the time and resources required to train a model from scratch.

These models support common tasks in different modalities, such as text classification, named entity recognition, question answering, and language modeling. This gives you the flexibility to train a model in three lines of code in one framework and load it for inference in another.

Text classification is a common NLP task that assigns a label or class to text. One of the most popular forms of text classification is sentiment analysis, which assigns a label such as positive or negative to a sequence of text.

This guide will show you how to:

  1. Finetune DistilBERT on the IMDb dataset to determine whether a movie review is positive or negative.
  2. Save the fine-tuned model to local disk and load it back.
  3. Use your saved fine-tuned model for inference.

Load IMDb dataset

Start by loading the IMDb dataset from the Hugging Face Datasets library:

!pip install transformers datasets evaluate

from datasets import load_dataset

imdb = load_dataset("imdb")

Load Transformers Model

Transformers provides a simple and unified way to load pre-trained model instances through its AutoModel classes. For this task, text (sequence) classification, you should load AutoModelForSequenceClassification:

from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

# Map label ids to human-readable names and back
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
label2id = {"NEGATIVE": 0, "POSITIVE": 1}

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
)

An AutoClass is a shortcut that automatically retrieves the architecture of a pre-trained model from its name or path. You only need to select the appropriate AutoClass for your task.

Tokenizer

Before you can train a model on a dataset, it needs to be preprocessed into the expected model input format. Whether your data is text, images, or audio, it needs to be converted and assembled into batches of tensors. Transformers provides a set of preprocessing classes to help prepare your data for the model.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

A tokenizer is responsible for preprocessing text into an array of numbers as inputs to a model. The most important thing to remember is that you need to instantiate the tokenizer with the same model name to ensure you’re using the same tokenization rules the model was trained with.

from transformers import DataCollatorWithPadding

def preprocess_function(examples):
    # Tokenize the reviews, truncating sequences longer than the model's maximum length
    return tokenizer(examples["text"], truncation=True)


# Apply the preprocessing function over the entire dataset in batches
tokenized_imdb = imdb.map(preprocess_function, batched=True)

# Dynamically pad each batch to the length of its longest example
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

The main tool for preprocessing textual data is a tokenizer. A tokenizer splits text into tokens according to a set of rules. The tokens are converted into numbers and then tensors, which become the model inputs. Any additional inputs required by the model are added by the tokenizer.
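
For a quick sanity check, you can call the tokenizer on a single string (the sentence below is just an illustration) and inspect what it returns:

encoded = tokenizer("This movie was surprisingly good!")
print(list(encoded.keys()))
# ['input_ids', 'attention_mask']

The input_ids are the numeric token ids, and the attention_mask tells the model which positions contain real tokens rather than padding.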

Hyperparameters

TrainingArguments contains the training hyperparameters you can tune, such as the learning rate, batch size, and number of epochs to train for. The default values are used for anything you don’t specify:

training_args = TrainingArguments(
    output_dir="my_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=False,
)

The output_dir is where the model predictions and checkpoints will be saved.

Training model

Transformers provides a Trainer class for PyTorch, which contains the basic training loop and adds functionality for features like distributed training, mixed precision, and more.
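
The Trainer below is passed a compute_metrics function so that accuracy is reported at each evaluation. Transformers does not define this function for you; here is a minimal sketch built on the evaluate package installed earlier (accuracy is just one reasonable choice of metric):

import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # The Trainer passes a tuple of (logits, labels) for the evaluation set
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy.compute(predictions=predictions, references=labels)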

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_imdb["train"],
    eval_dataset=tokenized_imdb["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()
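
Once training finishes, you can run a final evaluation pass over the test split. trainer.evaluate() returns a dictionary containing the evaluation loss and any metrics produced by compute_metrics:

results = trainer.evaluate()
print(results)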

Save the fine-tuned model to local disk

A Transformers model consists of multiple components: the configuration, which specifies the architecture (what layers the model contains and how they’re connected), the model weights, and the tokenizer’s vocabulary and preprocessing rules.

The Transformers API makes it possible to save all of these pieces to a local directory at once:

  • The configuration and tokenizer files are saved as JSON (plus vocabulary) files.
  • The weights are saved in a separate binary file.

Once your model is fine-tuned, you can save it with its tokenizer using PreTrainedModel.save_pretrained():

pt_save_directory = "/content/transformer/pt_save_pretrained"
tokenizer.save_pretrained(pt_save_directory)
model.save_pretrained(pt_save_directory)

Load the saved fine-tuned model from local disk

Because the model, its configuration, and the tokenizer were saved to a directory, they can be reloaded with the from_pretrained() class method:

pt_model = AutoModelForSequenceClassification.from_pretrained("/content/transformer/pt_save_pretrained")

Inference

The pipeline() is the easiest way to use a pre-trained model for inference. Start by creating an instance of pipeline() and specifying the task you want to use it for. You can point pipeline() at the directory of your previously saved fine-tuned model.

from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="/content/transformer/pt_save_pretrained")
classifier("We are very happy to show you the Transformers library.")

Related Post

Run this code in Google Colab