NLP Model Selection v1.1

Making model selection easy

A lot has happened in NLP since I released the first version of NLP Model Selection.

I received quite positive feedback and have decided to keep it updated.

Here's what got added recently 👻

Models suitable for mobile

  • MobileBERT (5.5x faster than BERT-base)

  • SqueezeBERT (4.3x faster than BERT-base on a Pixel 3)


Translation

  • Helsinki-NLP/opus

Long text (>512 tokens)

  • BigBird (compute scales linearly with sequence length, thanks to sparse attention)
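To make the "linear compute" point concrete, here is a rough back-of-the-envelope sketch. It only counts query-key pairs and is not BigBird's actual block-sparse implementation. The function names and the `window`, `num_global` and `num_random` values are illustrative assumptions, not numbers from the paper or from any library.

```python
def full_attention_pairs(seq_len: int) -> int:
    """Every token attends to every token: cost grows quadratically."""
    return seq_len * seq_len


def sparse_attention_pairs(seq_len: int, window: int = 192,
                           num_global: int = 128, num_random: int = 192) -> int:
    """Each token attends to a fixed budget of keys (local window + global
    tokens + random tokens), so cost grows linearly with sequence length.
    The budget values are hypothetical, chosen only for illustration."""
    per_query = min(seq_len, window + num_global + num_random)
    return seq_len * per_query


for n in (512, 1024, 2048, 4096):
    full = full_attention_pairs(n)
    sparse = sparse_attention_pairs(n)
    print(f"{n:>5} tokens: full={full:>12,}  sparse={sparse:>12,}  "
          f"ratio={full / sparse:.1f}x")
```

Doubling the sequence length doubles the sparse count but quadruples the full count, which is why sparse attention makes inputs far beyond 512 tokens practical.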

Semantic Search

  • LaBSE (supports 109 languages)

  • LASER (supports 93 languages)

  • Dense Passage Retrieval + Transformer Reader


Summarisation

  • PEGASUS (SOTA on summarisation)

Domain models

  • Replaced BioBERT with ouBioBERT, which outperforms it on medical tasks

Have suggestions? Create an issue @


Poor Man's BERT

This paper finds that if we simply drop the top 6 layers of BERT, we get performance on par with DistilBERT. No need for fancy distillation. Both are the same size, 66M parameters. This clearly means we love over-engineering. We over-engineered BERT. Then we over-engineered ways to compress BERT 🤫
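As a sanity check on the size claim, here is a small parameter-count sketch. It assumes the standard BERT-base configuration (30,522-token vocabulary, hidden size 768, feed-forward size 3,072, 512 positions, 2 token types). The helper names are made up for this example and do not come from the paper or any library.

```python
def embedding_params(vocab=30522, hidden=768, max_pos=512, type_vocab=2):
    # Word, position and token-type embeddings, plus one LayerNorm (weight + bias).
    return (vocab + max_pos + type_vocab) * hidden + 2 * hidden


def layer_params(hidden=768, intermediate=3072):
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two feed-forward projections, each with a bias.
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
    # Two LayerNorms per layer, each with weight + bias.
    layer_norms = 2 * (2 * hidden)
    return attention + feed_forward + layer_norms


def pooler_params(hidden=768):
    return hidden * hidden + hidden


def bert_params(num_layers):
    return embedding_params() + num_layers * layer_params() + pooler_params()


full = bert_params(12)
pruned = bert_params(6)
print(f"12 layers: {full / 1e6:.1f}M parameters")
print(f" 6 layers: {pruned / 1e6:.1f}M parameters")
print(f"reduction: {1 - pruned / full:.0%}")
```

Under these assumptions the 6-layer model lands at roughly 67M parameters, in line with the 66M figure above; the exact number depends on whether the pooler and token-type embeddings are counted. If you want to try this with Hugging Face transformers, one plausible route is to keep only the first six entries of the encoder's layer list (`model.encoder.layer` on a `BertModel`) and then fine-tune as usual.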

