Can a pretrained neural language model still benefit from linguistic symbol structure? Some upper and some lower bounds

In this presentation I introduce one way in which deep-learning-based language models (LMs) and symbolic linguistics can potentially be reconciled. The contributions I discuss include: an almost stupidly simple vector encoding of labeled and unlabeled linguistic structure, which is much faster than, and just as effective as, established methods; a comparison of different linguistic representations on the task of next-word prediction; and an analysis of robustness against noise. I conclude that if we had human-like linguistic knowledge resources for large amounts of data, we could indeed achieve drastic improvements in LM perplexity, and these improvements are even robust to certain types of "well-behaved" errors. However, it remains unclear whether automatic parsers can be good enough to produce only well-behaved errors and avoid bad ones, and, if so, whether the effort is worth it. This is joint work with Emmanuele Chersoni, Nathan Schneider, and Lingpeng Kong.