Libro Library Management System
Bibliographic record

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Authors
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
Publication year
2019
OA status
unknown

Need access?

Ask circulation staff for physical copies or request digital delivery via Ask a Librarian.

Abstract

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results highlight the importance of previously overlooked design choices, and raise questions about the source of recently reported improvements. We release our models and code.

Copies & availability

Real-time status across circulation, reserve, and Filipiniana sections.

Self-checkout (no login required)

  • Enter your student ID, system ID, or full name directly in the table so we can match your patron record.
  • Choose Self-checkout to send the request; circulation staff are notified instantly.
Barcode | Location | Material type | Status | Action
No holdings recorded.

Digital files

Preview digitized copies where embargo terms permit.

  • No digital files uploaded yet.

Links & eResources

Access licensed or open resources connected to this record.