Recently I have been looking into Transformer-based machine learning models for natural language tasks. The field of NLP has changed tremendously in the last few years, and I have been fascinated by the new architectures and tools that keep coming out. Transformer models are one such architecture.

As the frameworks and tools for building transformer models keep evolving, documentation often becomes stale and blog posts can be confusing. For any one topic, you may find multiple approaches, which can confuse beginners.

So as I learn these models, I plan to document the steps for a few essential tasks in the simplest way possible. This should help beginners like me pick up transformer models.

In this two-part series, I discuss how to train a simple model for email spam classification using the pre-trained BERT transformer model. This is the second post in the series, where I discuss fine-tuning the model for spam detection. You can read all the posts in the series here.

Data Preparation and Tokenization

Please make sure you have gone through the first part of the series, where we discussed how to prepare our data using BERT tokenization. You can find it at the link below.

Email Spam Detection using Pre-Trained BERT Model: Part 1 - Introduction and Tokenization.

Model Fine Tuning

Once the tokenization is done, we are now ready to fine-tune the model.

A pre-trained model comes with a body and a head. The body holds the pre-trained weights, while the head is a small task-specific layer on top. In most use cases, we keep the body and only train the head for our task, which is why we call it fine-tuning rather than retraining. You can read more about the head and body of a transformer model at the link below.

1. Download Model

As we did with the tokenizer, we will download the model using the Hugging Face transformers library.

from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

The above downloads the pre-trained BERT weights and adds a randomly initialized sequence classification head with two output labels, which needs to be tuned with our data.
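By default, every parameter of the loaded model stays trainable, so training updates the body as well as the head. To train only the head, one option is to freeze the body first. The sketch below assumes the model exposes its body through the base_model attribute, as Hugging Face PreTrainedModel classes do; freeze_body is a hypothetical helper and not part of the original code.

```python
def freeze_body(model):
    """Turn off gradient updates for the pre-trained body.

    Returns the number of parameter tensors that were frozen.
    """
    frozen = 0
    # base_model is the body (the BERT encoder here);
    # the classification head lives outside it and stays trainable.
    for param in model.base_model.parameters():
        param.requires_grad = False
        frozen += 1
    return frozen
```

Calling freeze_body(model) before creating the trainer would leave only the classification head trainable. Whether that helps or hurts accuracy depends on the dataset, so treat it as an option to experiment with.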

2. Training Arguments

Training arguments are where you set the various options for a training run. For simplicity, we are going to use the defaults and only specify the output directory.

from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(output_dir="test_trainer")
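If you later want to move away from the defaults, it can help to write the most common options out explicitly first. The sketch below is an illustration: the values are what I believe the library defaults to be (3 epochs, a per-device batch size of 8, a learning rate of 5e-5), so check them against your installed transformers version. merged_options and build_training_args are hypothetical helper names.

```python
# Values believed to match the library defaults; verify for your version.
EXPLICIT_OPTIONS = {
    "output_dir": "test_trainer",
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
    "learning_rate": 5e-5,
}

def merged_options(**overrides):
    """Return the explicit options with any overrides applied on top."""
    return {**EXPLICIT_OPTIONS, **overrides}

def build_training_args(**overrides):
    """Create a TrainingArguments object from the merged options."""
    # Imported here so the option handling above works without transformers.
    from transformers import TrainingArguments
    return TrainingArguments(**merged_options(**overrides))
```

For example, build_training_args(num_train_epochs=1) would keep everything else the same and train for a single epoch.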

3. Evaluation Metrics

For our training, we are going to use accuracy as the evaluation metric. The code below defines a function that computes it from the model's predictions.

import numpy as np
import evaluate
metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

In the above code, the np.argmax line converts the logits returned by the model into predicted labels, so that they can be compared with the actual labels.
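To make the argmax step concrete, here is a small plain-Python sketch of the same calculation. The logits and labels are made-up values for illustration, and argmax_last_axis and accuracy are hypothetical stand-ins for np.argmax(logits, axis=-1) and the accuracy metric, not the library implementations.

```python
def argmax_last_axis(logits):
    """For each row, return the index of the largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]

def accuracy(predictions, references):
    """Fraction of predictions that match the reference labels."""
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)

# One row per email, one column per label (made-up numbers).
logits = [[2.1, -1.3], [0.2, 1.7], [-0.5, 0.4], [1.0, 0.9]]
labels = [0, 1, 0, 0]

predictions = argmax_last_axis(logits)  # [0, 1, 1, 0]
score = accuracy(predictions, labels)   # 0.75, since 3 of 4 match
```

Each row of logits holds one score per label, and the label with the highest score becomes the prediction.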

4. Trainer

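Let’s create the trainer with the code below. The name tokenizer_datasets_train is assumed here for the tokenized training split from Part 1, by analogy with the tokenizer_datasets_test split used later in this post.

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenizer_datasets_train,  # assumed name for the training split
    eval_dataset=tokenizer_datasets_test,
    compute_metrics=compute_metrics,
)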

The Trainer API from Hugging Face handles all the batching and looping needed to fine-tune the model.
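To get a feel for what that batching and looping involves, here is a deliberately simplified plain-Python sketch. It is not how Trainer is implemented: it leaves out shuffling, gradient computation, optimizer steps, and checkpointing. iterate_batches, training_loop, and train_step are hypothetical names.

```python
def iterate_batches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

def training_loop(examples, num_epochs, batch_size, train_step):
    """Call train_step once per batch in every epoch; return the step count."""
    steps = 0
    for _ in range(num_epochs):
        for batch in iterate_batches(examples, batch_size):
            train_step(batch)  # in real training: forward pass, loss, update
            steps += 1
    return steps

# 20 examples with a batch size of 8 gives batches of 8, 8 and 4 per epoch.
batch_sizes = [len(b) for b in iterate_batches(list(range(20)), 8)]
total_steps = training_loop(list(range(20)), 3, 8, lambda batch: None)
```

With 20 examples, a batch size of 8, and 3 epochs, the loop runs 9 steps in total. Trainer takes care of this bookkeeping so that we do not have to write it ourselves.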

5. Run the Training

Once the trainer object is created, we can train the model by calling its train method.

trainer.train()

Find Accuracy on Testing Dataset

Once the model is trained, we can check how well it performs by measuring its accuracy on the test dataset.

predictions_output = trainer.predict(tokenizer_datasets_test)

In the above code, we use the trainer.predict method to run predictions on our test dataset.

accuracy_score = compute_metrics((predictions_output.predictions, tokenizer_datasets_test['label']))

Then we compute the accuracy score using the same function we defined for training. The output will be

{'accuracy': 0.97}

As you can see, we are getting 97% accuracy on the test dataset, which is really good.


The complete code for this post is in the Google Colab notebook below.

You can also access the Python notebook on GitHub.


In this post, we saw how to fine-tune a pre-trained model using the Hugging Face API. Together, these two posts give you the end-to-end flow of fine-tuning a transformer model.