Tokenization-tutorial-code

This repository contains the code used in my tokenization tutorial video. Tokenization is the process of splitting a piece of text into smaller units called tokens. Splitting text into sentences is known as sentence tokenization, and splitting it into words is known as word tokenization.
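
To make the two kinds of tokenization concrete, here is a minimal sketch using only Python's standard `re` module. It is not necessarily the code from the video: the function names `sentence_tokenize` and `word_tokenize` and the regular expressions are assumptions chosen for illustration, and the splitting rules are deliberately naive (abbreviations such as "Dr." would be split incorrectly).

```python
import re


def sentence_tokenize(text):
    """Naive sentence tokenizer (illustrative assumption, not a library API).

    Splits on whitespace that follows '.', '!' or '?'.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_tokenize(text):
    """Naive word tokenizer (illustrative assumption, not a library API).

    Returns runs of word characters (keeping internal apostrophes, as in
    "don't") and each punctuation mark as its own token.
    """
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", text)


if __name__ == "__main__":
    sample = "Tokenization splits text. It can work on sentences or words!"
    print(sentence_tokenize(sample))
    print(word_tokenize("It can work on words!"))
```

Dedicated NLP libraries handle many edge cases that these regular expressions do not, so treat this only as a picture of what each kind of tokenizer returns: a list of sentences in one case, a list of words and punctuation marks in the other.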