You no longer need to write a training routine by hand unless you really have to. You can define your training as
from pytorch_lightning import Trainer
trainer = Trainer(
    gpus=1,
    logger=[logger],
    max_epochs=5
)
trainer.fit(model)
The job of a Trainer is to run your training routine.
Notice the logger variable there? You can use TensorBoard to manage your logs, and I recommend that you do. Run pip install tensorboard before using it on your local machine.
Here, I defined the logger as follows:
from pytorch_lightning.loggers import TensorBoardLogger
logger = TensorBoardLogger('tb_logs', name='my_model')
PyTorch Lightning will create a log directory named tb_logs, and you can point TensorBoard at that directory (if you are running TensorBoard separately from your Jupyter notebook).
tensorboard --logdir tb_logs/
Besides the constructor and forward, you can define more functions:
configure_optimizers. This function is expected to return a PyTorch optimizer, such as one from torch.optim.
def configure_optimizers(self):
    return Adam(self.parameters(), lr=0.01)
training_step. Given a batch and its batch index, define how the input is fed to the model.
def training_step(self, batch, batch_idx):
    # batch.text is a (word_ids, lengths) tuple because include_lengths=True
    x, y = batch.text[0].T, batch.label
    y_hat = self(x)
    loss = self.loss_function(y_hat, y)
    return dict(
        loss=loss,
        log=dict(train_loss=loss)
    )
In this example, notice that I apply a small transformation using a transpose. You can apply all kinds of transformations before feeding the input into the model, but I suggest doing the heavy transformations outside this function to keep it clean.
I have also defined the loss_function as part of the model and "hardcoded" it to use cross entropy. If you do not want that, you can import torch.nn.functional as F and then call the functional version of your loss directly. Another option is to let the model constructor accept the loss function as a parameter.
train_dataloader. Define how you want to load your training data. A PyTorch DataLoader is an API that helps you batch the input. To my knowledge, though, PyTorch Lightning will run something like
for batch_idx, batch in enumerate(train_dataloader)
(not exactly like this, but similar). This means you are free to return anything here that is iterable.
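To illustrate why any iterable works, here is a simplified sketch of the kind of loop described above. It is not PyTorch Lightning's actual implementation; run_epoch and toy_step are hypothetical names used only for this illustration.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_epoch(dataloader: Iterable[Any],
              step: Callable[[Any, int], Dict[str, float]]) -> List[Dict[str, float]]:
    """Hypothetical, simplified stand-in for the loop a trainer runs."""
    outputs = []
    for batch_idx, batch in enumerate(dataloader):
        outputs.append(step(batch, batch_idx))
    return outputs


def toy_step(batch: List[float], batch_idx: int) -> Dict[str, float]:
    # Pretend the "loss" is simply the mean of the numbers in the batch.
    return {"loss": sum(batch) / len(batch)}


# A plain list of lists and a generator are both valid "dataloaders" here.
list_loader = [[1.0, 3.0], [2.0, 4.0]]
generator_loader = (batch for batch in list_loader)

list_outputs = run_epoch(list_loader, toy_step)
generator_outputs = run_epoch(generator_loader, toy_step)
```

Both calls produce the same per-batch outputs, because the loop only relies on being able to iterate over whatever the loader is.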
test_step. Given a batch and its batch index, define how the input is fed to the model for testing. Note that we do not have to compute the loss in this step, because testing runs without gradients.
test_dataloader. Define how you want to load your test data.
test_epoch_end. Given all test outputs, define what you want to do with them. You can leave this undefined, but a warning will be shown if you have defined test_dataloader, because you would then be doing nothing with your test data.
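As a rough idea of what such an end-of-epoch function might do, the sketch below combines per-batch results into one accuracy figure. The function name summarize_test_outputs and the "correct"/"total" keys are assumptions made for this illustration; they are not part of PyTorch Lightning, and your test step can return whatever structure suits you.

```python
from typing import Dict, List


def summarize_test_outputs(outputs: List[Dict[str, int]]) -> Dict[str, float]:
    """Hypothetical helper: combine per-batch counts into overall accuracy."""
    correct = sum(o["correct"] for o in outputs)
    total = sum(o["total"] for o in outputs)
    # Guard against an empty test set to avoid dividing by zero.
    return {"test_accuracy": correct / total if total else 0.0}


# Assumed shape of what each test step returned for its batch.
batch_outputs = [
    {"correct": 28, "total": 32},
    {"correct": 30, "total": 32},
    {"correct": 14, "total": 16},
]
summary = summarize_test_outputs(batch_outputs)
```

Summing counts before dividing weights each example equally, which matters when the last batch is smaller than the others.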
Previously, I described my exploration of torchtext. Now I want to further improve my productivity in the experiment part, which includes training, testing, validating, and metric logging. All of these can be achieved with PyTorch Lightning.
I will use the IMDB sentiment classification dataset, which is available in the Torchtext package.
The IMDB sentiment classification dataset is a text classification task: given a review text, predict whether it is a positive or a negative review. There is an official short tutorial from torchtext; however, that tutorial does not cover the training part. I will use some of the tutorial code and connect it with training using PyTorch Lightning.
This dataset contains 3 classes: unknown, positive (labeled as "pos"), and negative (labeled as "neg"). So we know that we need to define an output that can predict 3 classes. Since this is a classification task, I will use the cross entropy loss.
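For readers who want to see what the cross entropy loss computes for a 3-class output, here is a minimal pure-Python sketch for a single example. It only illustrates the formula (the log of the softmax denominator minus the target's logit); in practice you would use PyTorch's built-in loss, and the function name and sample logits below are made up for this illustration.

```python
import math
from typing import List


def cross_entropy(logits: List[float], target: int) -> float:
    """Cross entropy for one example: log-sum-exp of logits minus target logit."""
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# Three classes, as in the label vocabulary described above.
uniform_loss = cross_entropy([0.0, 0.0, 0.0], target=1)    # model has no preference
confident_loss = cross_entropy([0.0, 5.0, 0.0], target=1)  # model favors the target
```

With equal logits the loss equals the natural log of the number of classes, and it shrinks as the model puts more weight on the correct class.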
Now, to load the data, you can do
from torchtext.data import Field
from torchtext.datasets import IMDB
text_field = Field(sequential=True, include_lengths=True, fix_length=200)
label_field = Field(sequential=False)
train, test = IMDB.splits(text_field, label_field)
Since IMDB reviews are not of uniform length, using the fix_length parameter will pad or trim the sequence data for you.
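The effect of a fixed length can be sketched in a few lines of plain Python. This is only an illustration of the idea, not torchtext's code; pad_or_trim is a hypothetical helper, and it assumes padding is appended at the end and that overly long sequences keep their first tokens.

```python
from typing import List

PAD_TOKEN = "<pad>"  # assumed padding token for this sketch


def pad_or_trim(tokens: List[str], fix_length: int,
                pad_token: str = PAD_TOKEN) -> List[str]:
    """Return exactly fix_length tokens by trimming or right-padding."""
    trimmed = tokens[:fix_length]
    return trimmed + [pad_token] * (fix_length - len(trimmed))


short_review = ["great", "movie"]
long_review = ["this", "film", "was", "far", "too", "long"]

padded = pad_or_trim(short_review, fix_length=4)
trimmed = pad_or_trim(long_review, fix_length=4)
```

Every review ends up with the same number of tokens, which is what lets a batch be stored as one rectangular tensor.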
You can access individual samples to peek at what is inside the train and test variables.
A pre-trained word embedding is usually trained on data different from ours. Thus, its "encoding" from token to integer will differ from the one we currently have.
build_vocab will re-map the integer encoding that comes from the current dataset (in this case, the IMDB dataset) to the pre-trained encoding. For example, if a token has one index in our vocabulary but a different index in the pre-trained word embedding, it will automatically be mapped to the correct one.
from torchtext.vocab import FastText
text_field.build_vocab(train, vectors=FastText('simple'))
label_field.build_vocab(train)
The label field in the IMDB dataset holds string labels (such as "pos" and "neg"), so it still needs to build its own vocab, but without word embeddings.
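The re-mapping idea can be sketched without any library: look up each token of our vocabulary by its string in the pre-trained table, and place the vector at our own index. Everything below (the tiny 2-dimensional vectors, the token lists, and the align_vectors helper) is made up for illustration, and using zeros for tokens missing from the pre-trained table is an assumption of this sketch.

```python
from typing import Dict, List

# Hypothetical pre-trained embedding with its own token order.
pretrained_index: Dict[str, int] = {"the": 0, "movie": 1, "good": 2, "bad": 3}
pretrained_vectors: List[List[float]] = [
    [0.1, 0.1],    # the
    [0.5, 0.2],    # movie
    [0.9, 0.8],    # good
    [-0.9, -0.7],  # bad
]

# Vocabulary built from our own dataset: different order, plus unseen tokens.
our_vocab: List[str] = ["<unk>", "good", "movie", "awful"]


def align_vectors(vocab: List[str], index: Dict[str, int],
                  vectors: List[List[float]], dim: int = 2) -> List[List[float]]:
    """Build a matrix whose row i is the pre-trained vector of vocab[i]."""
    aligned = []
    for token in vocab:
        if token in index:
            aligned.append(vectors[index[token]])
        else:
            aligned.append([0.0] * dim)  # token unknown to the pre-trained table
    return aligned


embedding_matrix = align_vectors(our_vocab, pretrained_index, pretrained_vectors)
```

After alignment, the model can index the matrix with our dataset's integers and still receive the right pre-trained vector for each token.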
An Iterator works a bit like a DataLoader: it helps with batching and iterating over the data in one epoch. We can use BucketIterator to iterate with a specific batch size and move all of those vectors to a device, where the device can be cpu or cuda.
import torch
from torchtext.data import BucketIterator
device = 'cuda' if torch.cuda.is_available() else 'cpu'
batch_size = 32
train_iter, test_iter = BucketIterator.splits(
    (train, test),
    batch_size=batch_size,
    device=device
)
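The general idea behind bucketing is to batch together examples of similar length so that less padding is needed. The sketch below shows only that idea in plain Python; it is not how BucketIterator is implemented (the real iterator also handles shuffling and device placement), and bucket_batches is a hypothetical name.

```python
from typing import List


def bucket_batches(examples: List[List[str]],
                   batch_size: int) -> List[List[List[str]]]:
    """Sort examples by length, then cut the sorted list into batches."""
    ordered = sorted(examples, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Four toy "reviews" of lengths 5, 2, 6 and 1.
reviews = [["a"] * 5, ["b"] * 2, ["c"] * 6, ["d"] * 1]

batches = bucket_batches(reviews, batch_size=2)
batch_lengths = [[len(review) for review in batch] for batch in batches]
```

Short reviews end up together and long reviews end up together, so each batch needs only a little padding to become rectangular.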
Now we are ready to define our model.
Defining the model with PyTorch Lightning is as easy as William has explained.
Before doing the full training, it is a good idea to check that your model accepts the input correctly, like this.
sample_batch = next(iter(train_iter))
model(sample_batch.text[0].T)
Let me explain why I did the transformations.
Each batch object from an iterator has text and label fields. The text field is actually a tuple of the word vector and the actual length vector of the reviews, because the field was created with include_lengths=True. The word vector has size fixed_length x batch_size, while the actual length vector has size batch_size. To feed the model with the word vector, I need to take the first element of the tuple and transpose it so that it has size batch_size x fixed_length.
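The shape change described above can be checked on a toy batch using nested lists in place of tensors. The numbers and the transpose helper are made up for this illustration; with real tensors the same step is the indexing followed by a transpose.

```python
from typing import List, Tuple


def transpose(matrix: List[List[int]]) -> List[List[int]]:
    """Swap rows and columns of a rectangular nested list."""
    return [list(row) for row in zip(*matrix)]


# Toy batch with fixed_length = 3 and batch_size = 2.
# Each column holds the token ids of one review.
word_ids = [
    [11, 21],
    [12, 22],
    [13, 23],
]
lengths = [3, 2]  # actual (unpadded) length of each review
batch_text: Tuple[List[List[int]], List[int]] = (word_ids, lengths)

# Take the first element of the tuple, then transpose it.
model_input = transpose(batch_text[0])
```

After the transpose, each row of model_input holds one complete review, which is the batch-first layout the model expects here.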
We are now ready to train our model!
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger
model = MyModel(text_field.vocab.vectors)
logger = TensorBoardLogger('tb_logs', name='my_model')
trainer = Trainer(
    gpus=1,
    logger=logger,
    max_epochs=3
)
trainer.fit(model)
And it's done! It shows the progress bar automatically, so you no longer have to wrap your loop in tqdm like this:
for batch_idx, batch in tqdm(enumerate(train_loader)):
After training, you can run testing in one line by calling trainer.test().
If you are wondering why this test method only returns one object, you are probably thinking of scikit-learn's train and test split. In PyTorch, the "test" part is usually defined as "validation". So you might want to define the validation hooks (such as validation_step and val_dataloader) as well.
In my opinion, using PyTorch Lightning and Torchtext does improve my productivity when experimenting with NLP deep learning models. The aspects that I think make this library compelling are its backward compatibility with PyTorch, its friendliness to Torchtext, and its use of TensorBoard.
If you are hesitant because you think a new library will add overhead, do not worry! You can install it first, use the LightningModule, and write your usual PyTorch code. It will still work, because this library does not cause any additional headaches.
It was fairly easy to use Torchtext along with PyTorch Lightning. Both libraries run on PyTorch and are highly compatible with native PyTorch. Each has features that do not overlap with the other's but complement them. For example, Torchtext has easy interfaces for loading datasets like IMDB or YelpReview. You can then use PyTorch Lightning to train whatever model you want to define and log to TensorBoard or MLFlow.
Using TensorBoard instead of manually printing losses and other metrics helps me avoid unnecessary errors when printing losses in the training loop. It also removes the need to plot loss versus epoch by hand at the end of training.
 Pytorch Lightning Documentation. https://pytorch-lightning.readthedocs.io/en/stable/introduction_guide.html
 Falcon, W. From PyTorch to PyTorch Lightning — A gentle introduction. https://towardsdatascience.com/from-pytorch-to-pytorch-lightning-a-gentle-introduction-b371b7caaf09
 Falcon, W. Pytorch Lightning vs PyTorch Ignite vs Fast.ai. https://towardsdatascience.com/pytorch-lightning-vs-pytorch-ignite-vs-fast-ai-61dc7480ad8a
 Sutiono, Arie P. Deep Learning For NLP with PyTorch and Torchtext. https://towardsdatascience.com/deep-learning-for-nlp-with-pytorch-and-torchtext-4f92d69052f
 Torchtext Datasets Documentation. https://pytorch.org/text/datasets.html