Facebook's fastText is a library for text classification and text representation learning.
A summary of the General Language Understanding Evaluation (GLUE) benchmark dataset.
Knowledge distillation is a method for compressing a large model into a smaller one.
word2vec is a classic model for generating word embeddings, and it is important to understand how it works.