The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems. Organizations developing natural language processing models can use it to measure how well their models perform. Until recently, Microsoft’s MT-DNN-SMART model topped the GLUE leaderboard, followed by Google’s T5. Now China’s Baidu has overtaken both Microsoft and Google with its ERNIE (Enhanced Representation through kNowledge IntEgration) model, which scored a record 90.1.
“While language understanding remains a difficult challenge, our results on GLUE indicate that pre-training language models with continual training and multi-task learning are a promising direction for NLP research. We will keep improving the performance of the ERNIE model via the continual pre-training framework,” wrote the Baidu Research team.
Baidu is already using the ERNIE model in real-world applications. For example, ERNIE powers the question-answering feature in Baidu’s search engine, which has led to a 16% improvement in user satisfaction with search results.