—> With what we've learned from T5 and GPT, it's clear that generation is becoming the dominant paradigm. So let's dig deep into text generation!
Evaluation of Text Generation: A Survey
—> We can cut down on manual annotation by generating additional training data through augmentation
Data Augmentation using Pre-trained Transformer Models
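A minimal sketch of the idea, assuming the HuggingFace `transformers` library with a fill-mask model (`distilroberta-base` is just an illustrative choice); the paper itself studies label-conditioned augmentation with several pre-trained models, so treat this as an illustration of the concept rather than its exact recipe.

```python
# Sketch: mask a random token in each labeled example and let a pre-trained
# masked LM propose a replacement, keeping the original label.
import random

from transformers import pipeline  # assumes HuggingFace transformers is installed

fill_mask = pipeline("fill-mask", model="distilroberta-base")

def augment(text, n_variants=2):
    """Return up to n_variants paraphrase-like copies of `text`."""
    tokens = text.split()
    variants = []
    for _ in range(n_variants):
        i = random.randrange(len(tokens))
        masked = " ".join(tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:])
        # take the model's top prediction for the masked slot
        variants.append(fill_mask(masked, top_k=1)[0]["sequence"])
    return variants

labeled = [("the movie was great", "positive")]
augmented = [(v, y) for x, y in labeled for v in augment(x)]
print(augmented)
```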
—> Distillation is now a standard part of the pipeline, giving us smaller and still-robust models
Knowledge Distillation: A Survey
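The survey covers many flavours of distillation; below is a minimal PyTorch sketch of the classic response-based (soft-label) loss, with `temperature` and `alpha` as illustrative hyperparameters rather than recommended values.

```python
# Sketch: the student matches the teacher's temperature-softened distribution
# while still fitting the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # soft targets: KL between temperature-softened teacher and student
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale gradients as in the original formulation
    # hard targets: usual cross-entropy against the gold labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# toy usage
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
distillation_loss(student_logits, teacher_logits, labels).backward()
```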
—> We now have recipes for handling very long text with O(n) transformers; a lot of real-world text runs well beyond a few lines, which is a problem for O(n^2) attention
BigBird: Transformers for Longer Sequences
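A toy sketch (not BigBird's actual implementation, which also adds random attention blocks) of why window-plus-global sparse attention scales linearly: each token attends to a constant number of positions, so the attended pairs grow with n instead of n^2.

```python
import numpy as np

def sparse_attention_mask(n, window=3, n_global=2):
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        mask[i, lo:hi] = True      # local sliding window
    mask[:, :n_global] = True      # every token attends to the global tokens
    mask[:n_global, :] = True      # global tokens attend to every token
    return mask

for n in (128, 1024, 4096):
    sparse = int(sparse_attention_mask(n).sum())
    print(f"n={n:5d}  sparse pairs={sparse:9d}  full pairs={n * n:10d}")
# sparse pairs grow roughly linearly in n, full pairs grow quadratically
```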
—> Pre-training LMs as discriminators (replaced-token detection) is faster and more compute-efficient than plain masked-token prediction
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
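A simplified sketch of the replaced-token-detection objective. ELECTRA itself samples replacements from a small, jointly-trained masked-LM generator; here random vocabulary items stand in for it so the example stays self-contained. The key point survives: the discriminator gets a learning signal from every position, not only the masked ~15%.

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 1000, 64, 32, 8

class TinyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)  # per-token original-vs-replaced logit

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids))).squeeze(-1)

def corrupt(ids, replace_prob=0.15):
    # swap some tokens for random ones; label = whether the token actually changed
    swap = torch.rand(ids.shape) < replace_prob
    random_ids = torch.randint(0, vocab_size, ids.shape)
    corrupted = torch.where(swap, random_ids, ids)
    return corrupted, (corrupted != ids).float()

disc = TinyDiscriminator()
opt = torch.optim.Adam(disc.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

ids = torch.randint(0, vocab_size, (batch, seq_len))  # stand-in for real token ids
corrupted, is_replaced = corrupt(ids)
loss = loss_fn(disc(corrupted), is_replaced)           # loss over every position
loss.backward()
opt.step()
```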
—> Better evaluation methods for NLP will lead to more robust models
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
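A hand-rolled sketch of one CheckList test type, an invariance (INV) test: the prediction should not change when we only swap a person's name. `predict` here is a hypothetical stand-in for whatever classifier you want to probe; the official `checklist` library offers much richer templating and perturbations.

```python
from typing import Callable, List

def predict(texts: List[str]) -> List[str]:
    # hypothetical toy model; replace with your real classifier
    return ["positive" if "great" in t else "negative" for t in texts]

NAMES = ["Anna", "Pratik", "Maria", "Wei"]
TEMPLATE = "{name} said the service was great."

def invariance_test(predict_fn: Callable[[List[str]], List[str]]) -> bool:
    texts = [TEMPLATE.format(name=n) for n in NAMES]
    preds = predict_fn(texts)
    for t, p in zip(texts, preds):
        print(f"{p:9s}  {t}")
    return len(set(preds)) == 1  # all predictions must agree to pass

print("INV test passed:", invariance_test(predict))
```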
—> The ability to do sample-efficient NLP is a superpower
Revisiting Few-sample BERT Fine-tuning
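One of the paper's concrete recommendations is to re-initialize the topmost pre-trained layers before fine-tuning on small datasets (alongside using Adam with bias correction and training longer). A sketch assuming HuggingFace `transformers`; the 0.02 init std follows BERT's default, and re-initializing 2 layers is just an illustrative choice.

```python
import torch.nn as nn
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def reinit_layer(module):
    # BERT-style re-initialization for Linear and LayerNorm submodules
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=0.02)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.LayerNorm):
        module.weight.data.fill_(1.0)
        module.bias.data.zero_()

n_reinit = 2  # how many top encoder layers to wipe before fine-tuning
for layer in model.bert.encoder.layer[-n_reinit:]:
    layer.apply(reinit_layer)

# from here, fine-tune as usual on the few-sample task
```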
—> Low-resource NLP on minority languages needs to be prioritized
These are surprisingly short and precise summaries.
Thanks Pratik.