A lot of good stuff happened in Indic NLP this year.

Other multilingual papers

MuRIL shows a large gain over mBERT on the Tatoeba dataset, so it looks like a strong choice for neural search in Indic languages. (I am not yet clear on how LaBSE compares with MuRIL on Indian-language data.)

Possible applications in India

  • News search

    • Indic websites

    • English websites searched with Indic-language queries

  • FAQ chatbots in Indic languages

    • Customer support for commercial websites

    • Customer support for government websites

  • Zero-shot article classification via similarity between title and categories

  • Unsupervised recommendation engine via neural search

    • News articles

    • Social content

      • Twitter

      • ShareChat
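To make the zero-shot classification idea concrete, here is a minimal sketch of picking a category by similarity between the article title and the category names. Everything in it is an assumption for illustration: `embed` is a toy character n-gram counter standing in for a real sentence encoder such as MuRIL or LaBSE, and `rank_by_similarity` and `classify_title` are hypothetical helper names. The toy encoder only captures spelling overlap, so it will not match across languages or synonyms the way a trained multilingual model should.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy stand-in for a sentence encoder (an assumption, not MuRIL/LaBSE).

    Returns a sparse vector of character n-gram counts. It works on any
    Unicode script, but only reflects surface overlap, not meaning.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (score, candidate) pairs sorted from most to least similar."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def classify_title(title, categories):
    """Zero-shot classification: pick the category closest to the title."""
    return rank_by_similarity(title, categories)[0][1]


if __name__ == "__main__":
    categories = ["क्रिकेट", "राजनीति", "मौसम"]  # cricket, politics, weather
    print(classify_title("क्रिकेट विश्व कप फाइनल", categories))
```

The same `rank_by_similarity` step is the core of the search and recommendation use cases too: rank documents against a query, or rank other articles against the one being read. With a real encoder, only `embed` (and the matching similarity function for dense vectors) would need to change.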

These models can be improved further by fine-tuning on domain- and task-specific data.

There were also some Indic talks at the recent Forum for Information Retrieval Evaluation (FIRE 2020); see the schedule by IDRBT, Hyderabad.

