Nils Rethmeier

Learning, evaluating and explaining transferable text representations from grumpy Big Small Data.

The pretraining elephant in the room.

Modern NLP relies heavily on Big Data pretraining for self-supervised to supervised transfer learning. This approach follows a "bigger compute and data beats efficient algorithms" mentality, coined "The Bitter Lesson" by R. Sutton. Ironically, progress in compute scale-up is itself achieved through miniaturization and more effective resource usage: because we are acutely aware of our world's physical limits, clever solutions, rather than brute force, dominate there. So instead, why not experiment with questions like: "Will we ever have enough data, and do we need to … pre-learn (on) everything?", "If not, how do we model and pretrain from small or insufficient data?", "How can we evaluate pretraining and transfer in more depth to better understand neural learning?", and "Can we explain or measure neural transfer without probes and labels?". The talk will feed the discussion via the following subjects:

- Small-scale pretraining and its effects on few-shot, zero-shot and long-tail transfer.
- Interpretable AI (IAI) for measuring zero-shot and self-supervised-to-supervised transfer, without probes.
- IAI-guided transfer pruning and model component importance or simple redundancy analysis.
- Low-resource zero-shot domain (de-)adaptation.

The talk concludes by encouraging discussion of how green AI and AI inclusion relate to these issues.


Nils is a second-year PhD student at the University of Copenhagen and the German Research Center for Artificial Intelligence (DFKI) in Berlin. His research interests include self-supervised representations, low-resource learning, XAI, generalization, (continual) transfer learning, and medical AI.

Presentation Materials

Talk Video
Slides