In this tutorial, we will build a language model that predicts the next word in a sequence from the previous word. The tutorial demonstrates how to do this with eager execution in the TensorFlow Keras API.

Download and Prepare the Data

In this tutorial, we will use the Shakespeare dataset, but you can use any other dataset you like. Our model is simple: given one word from a sequence as input, it learns to predict the next word in that sequence.

For example, in the line “to be or not to be”, the input word “to” should lead the model to predict “be”.

The first step is to assign a unique integer to each word in the sequence and convert the sequences of words to sequences of integers. Keras provides the Tokenizer API, which can be used to encode sequences. First, the Tokenizer is fit on the source text to develop the mapping from words to unique integers. Then sequences of text can be converted to sequences of integers by calling the texts_to_sequences() function.
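A minimal sketch of this step is shown below; the one-line corpus here is an illustrative stand-in for the Shakespeare dataset.

```python
from tensorflow.keras.preprocessing.text import Tokenizer

corpus = ["to be or not to be that is the question"]  # stand-in corpus

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)                 # build the word -> integer mapping
encoded = tokenizer.texts_to_sequences(corpus)[0]

print(tokenizer.word_index)   # e.g. {'to': 1, 'be': 2, ...}
print(encoded)                # the sentence as a sequence of integers

# +1 because index 0 is reserved for padding and never assigned to a word
vocab_size = len(tokenizer.word_index) + 1
```

Note that the Tokenizer assigns lower integers to more frequent words, and index 0 is reserved, so the vocabulary size is one larger than the number of distinct words.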

Next, we need to create sequences of words to train the model, with one word as input and one word as output.
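This amounts to sliding a window of length two over the encoded text. A small sketch, where `encoded` stands in for the integer sequence produced by the Tokenizer:

```python
# Turn an encoded text into (input word, output word) training pairs.
encoded = [1, 2, 3, 4, 1, 2]  # stand-in for e.g. "to be or not to be"

sequences = [encoded[i - 1:i + 1] for i in range(1, len(encoded))]
print(sequences)  # [[1, 2], [2, 3], [3, 4], [4, 1], [1, 2]]
```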

Then we split the sequences into the input elements “X” and output elements “Y”. This is straightforward because we only have two columns in the data.
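With the pairs in a NumPy array, the split is a single slice per column:

```python
import numpy as np

# Stand-in for the (input, output) pairs created above.
sequences = np.array([[1, 2], [2, 3], [3, 4], [4, 1], [1, 2]])

X, y = sequences[:, 0], sequences[:, 1]  # first column in, second column out
print(X)  # [1 2 3 4 1]
print(y)  # [2 3 4 1 2]
```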

Create Input Pipelines

In this tutorial, we will use the TensorFlow Dataset API to feed data into the model. It lets you build complex input pipelines from simple, reusable pieces.
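A sketch of the pipeline, where `X`, `y`, and the batch size are illustrative stand-ins for the values prepared above:

```python
import tensorflow as tf

X = [1, 2, 3, 4, 1]   # stand-in inputs
y = [2, 3, 4, 1, 2]   # stand-in targets
BATCH_SIZE = 2        # illustrative; a real run would use a larger batch

# Slice the arrays into (input, target) pairs, shuffle, and batch them.
dataset = (tf.data.Dataset.from_tensor_slices((X, y))
           .shuffle(buffer_size=len(X))
           .batch(BATCH_SIZE, drop_remainder=True))

for inp, target in dataset:
    print(inp.numpy(), target.numpy())
```

`drop_remainder=True` keeps every batch the same size, which matters later because the RNN hidden state is allocated per batch.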

Create the Model

Keras provides the functional API for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers. Here, however, we use the Model Subclassing API, which gives us full flexibility to create the model and change it however we like. The model is built from an Embedding layer, a GRU layer, and a fully connected (Dense) layer.
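A sketch of the subclassed model. The class name, the `init_hidden` helper, and the layer sizes are illustrative choices, not fixed by the tutorial:

```python
import tensorflow as tf

class NextWordModel(tf.keras.Model):
    """Embedding -> GRU -> Dense, with an explicit hidden state."""

    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.rnn_units = rnn_units
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units,
                                       return_sequences=True,
                                       return_state=True)
        self.fc = tf.keras.layers.Dense(vocab_size)

    def call(self, x, hidden):
        # x has shape (batch_size, 1): one input word per example.
        x = self.embedding(x)
        output, state = self.gru(x, initial_state=hidden)
        return self.fc(output), state  # logits over the vocabulary + new state

    def init_hidden(self, batch_size):
        # Fresh hidden state of zeros, shape (batch_size, rnn_units).
        return tf.zeros((batch_size, self.rnn_units))
```

Returning the GRU state from `call` lets the training loop carry the hidden state from one batch to the next.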

Save checkpoints during training

You can save checkpoints during and after training. This means a model can resume where it left off and avoid long training times: you can use a trained model without having to retrain it, or pick up training where you left off in case the training process was interrupted.
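One way to do this is with `tf.train.Checkpoint` and a `CheckpointManager`; the tiny model, optimizer, and directory path below are stand-ins for the real ones:

```python
import tensorflow as tf

# Stand-ins for the model and optimizer defined elsewhere in the tutorial.
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
model.build(input_shape=(None, 2))
optimizer = tf.keras.optimizers.Adam()

checkpoint_dir = "./training_checkpoints"  # illustrative path
checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
manager = tf.train.CheckpointManager(checkpoint, checkpoint_dir, max_to_keep=3)

save_path = manager.save()                     # call this e.g. once per epoch
checkpoint.restore(manager.latest_checkpoint)  # resume from the latest save
```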

Train the model

We will use a custom training loop with the help of tf.GradientTape. We initialize the hidden state of the model with zeros of shape (batch_size, rnn_units), by calling the function defined when creating the model.

Next, we iterate over the dataset (batch by batch) and calculate the predictions and the hidden state associated with each input.
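The steps above can be sketched as one epoch of a custom loop. The tiny inline model, data, and sizes are stand-ins so the snippet runs on its own:

```python
import tensorflow as tf

vocab_size, embedding_dim, rnn_units, batch_size = 10, 8, 16, 2

# Minimal stand-in for the subclassed model built earlier.
class Model(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units, return_sequences=True,
                                       return_state=True)
        self.fc = tf.keras.layers.Dense(vocab_size)

    def call(self, x, hidden):
        output, state = self.gru(self.embedding(x), initial_state=hidden)
        return self.fc(output), state

model = Model()
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Stand-in (input word, target word) dataset.
dataset = tf.data.Dataset.from_tensor_slices(
    ([[1], [2], [3], [4]], [[2], [3], [4], [1]])).batch(batch_size)

# Initialize the hidden state with zeros, shape (batch_size, rnn_units).
hidden = tf.zeros((batch_size, rnn_units))

for inp, target in dataset:
    with tf.GradientTape() as tape:
        # Forward pass: predictions and the new hidden state for this batch.
        predictions, hidden = model(inp, hidden)
        loss = loss_fn(target, predictions)
    # Backward pass: compute and apply gradients.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    print("batch loss:", float(loss))
```

In a full run this loop would be wrapped in an outer loop over epochs, resetting the hidden state at the start of each epoch.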

Predict next Word

Finally, it is time to predict the next word using our trained model.
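A sketch of greedy prediction: feed one word in, take the most likely word out. The tokenizer mapping and the (untrained) model here are illustrative stand-ins, so the predicted word is not meaningful until real weights are loaded:

```python
import tensorflow as tf

# Stand-in word <-> integer mappings from the Tokenizer.
word_index = {"to": 1, "be": 2, "or": 3, "not": 4}
index_word = {i: w for w, i in word_index.items()}
vocab_size = len(word_index) + 1

# Stand-in for the trained model (same Embedding -> GRU -> Dense shape).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 8),
    tf.keras.layers.GRU(16, return_sequences=True),
    tf.keras.layers.Dense(vocab_size),
])

inp = tf.constant([[word_index["to"]]])       # shape (1, 1): one input word
logits = model(inp)                           # shape (1, 1, vocab_size)
predicted_id = int(tf.argmax(logits[0, -1]))  # most likely next-word id
print("next word:", index_word.get(predicted_id, "<unknown>"))
```

Instead of `argmax`, sampling from the output distribution (e.g. with `tf.random.categorical`) gives more varied generated text.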
