Happy to share our new paper, “Language model acceptability judgements are not always robust to context”: https://arxiv.org/abs/2212.08979 We prepend several kinds of context to minimal linguistic #acceptability test pairs and find that #LMs (#OPT, #GPT2) can still achieve strong performance on BLiMP & SyntaxGym, except in some interesting cases. 🧵 [1/7]