I entered the NAACL conference venue, just a couple blocks away from my place in Seattle, with the absolute lowest of expectations. I have attended two other conferences this year entirely online: ACL and LREC. At both, I had minimal interactions with other human beings beyond “standing” at my poster, and most of that time no one was even there to ask about my work. Hence, zero expectations.
NAACL was the best conference experience I’ve ever had, no doubt about it. The posters and talks were fascinating, I learned a lot from the tutorials, the socials were both fun and a great way to meet people, and there was great food. I can easily say that I met more people at NAACL than I did in all my previous conferences combined. More than the paper talks or presentations, the value of a conference is in being able to easily meet new people who offer a diversity of perspectives. It was also great to experience how approachable even the biggest names in NLP are.
A big theme of the conference seemed to be the uncertain role of symbolic systems, often inspired by linguistics, in an overwhelmingly neural field. There was a somewhat terse back-and-forth at the panel between Emily Bender and Chris Manning which struck at some of the fundamental points of debate in the field: what are we trying to do as computational linguists by training big language models, and could those things be done in a different way? We’ve seen a big convergence in AI in the past couple years on the topic of model architecture; every subfield has benefited from training transformer models on massive amounts of unlabelled data and then fine-tuning them on (or prompting them for) specific, data-limited tasks. But maybe this convergence comes at the cost of dismissing alternatives, symbolic approaches among them, before they have been fully explored.
Probably the most important thing I took away is that NLP is much bigger than computational linguistics. I consider myself a linguist; I find language absolutely fascinating and its existence and development pretty much insane. Understanding how the complex and intricate system that makes up language works is satisfying. But linguistics is a very confused field: the more I learn about it, the less I understand what it’s trying to say, or whether its story of language is even coherent. It’s hard to say what its place is at the cutting edge of NLP, where unsupervised systems now reign supreme. Seeing the wealth of development in e.g. multimodal ML, which was barely a thing two years ago in my view, has convinced me that looking beyond linguistics is exciting too.
Anyway, I don’t think my narrative of this conference is going to form a coherent whole either, so I’ll just note down random experiences that I remember.
- Eating pho with Justus Mattern (differential privacy), one of the few undergrads at NAACL, before the conference. The two of us and Rohan Pandey (semantic parsing/linguistics in NLP) ended up attending much of the conference and the socials together.
- Walking aimlessly around Seattle with Machel Reid (low-resource MT, LM things), also before the conference.
- Talking to Dzmitry Bahdanau (first author on the paper that introduced attention; a real fanboy moment) at the MoPOP social. I’m proud that I managed to hold a surprisingly high-level conversation, and I learned a lot from it. One thing I’ll keep thinking about is his mention of the “high-risk, high-reward” challenge of beating transformers.
- Talking to Venelin Kovatchev about Bulgarian for half an hour, and randomly more throughout the conference.
- Having lunch at a Sichuanese restaurant with Eleanor Jiang and some others. This was funny to me because I ended up meeting more ETH-affiliated people at NAACL than I did while actually at ETH, thanks to the work-from-home policy there last summer.
- Learning about embeddings in hyperbolic spaces from Jun Takeuchi’s excellent poster at the SRW. This is an idea I could never even have known to look for, and the presentation was very well done.
- Meeting Matyáš Boháček, a 10th-grader (!!) who does research in NLP and ML broadly. As someone who also got into research early, though not that early, I find this kind of thing very humbling.
- Talking to Paul Smolensky (a co-creator of Optimality Theory; again, a fanboy moment) about the panel, and randomly chatting like normal people throughout the conference.
- Talking to Shaily Bhatt during the workshops. In general, it’s cool to see the exciting new work and opportunities in India, especially on NLP for Indian languages. It was also really great to finally meet Monojit Choudhury in person at the social.
- Seeing Ruibo Liu’s paper on AI alignment. I have thoughts™ on alignment as a research programme, but anyway this made me think a lot about how we evaluate generative language models on free-text generation.
- Visiting the AI2 social (which we somehow got into despite not being able to RSVP) and joining random circles of people and talking about everything. I really enjoyed talking to Ashish Sabharwal and Zhijing Jin. AI2 also had a great selection of cheeses.
- Playing some game whose name I do not know (it was a bit like curling on a table?) at the DADC rooftop social, with great views of Seattle. I talked to Martijn Bartelds (Dutch dialectology) and a bunch of UCSC people.
- The talk for “Do Prompt-Based Models Really Understand the Meaning of their Prompts?” by Albert Webson and Ellie Pavlick. I had seen an earlier version of this at MASC-SLL. It affirms, to a degree, the intuitive suspicion that prompting works for reasons other than models genuinely understanding their prompts. But of course, we now have work pointing the other way too, with chain-of-thought prompting.
In general, the conference left me with a lot of new ideas and uncertainties. It definitely affirmed my choice to stick with research. It was exciting to see all the new things happening at NAACL! There is a unique wonder to being at the cutting edge of a rapidly maturing field. There is a dizzying array of directions to work towards, in uncharted territory, even for the least experienced researchers. At the same time, the results of new work are immediately apparent thanks to the inherent scalability of software. To me, this is the best of both worlds.
On the other hand, I am increasingly disappointed by the place of linguistics in this new work. Language is a beautiful and complex system, grounded in social interaction, but it does not feel like the direction NLP is heading really benefits from the work of linguists. My disillusionment is not with NLP, though: NLP is successful, and computational linguistics is successfully using its engineering breakthroughs to better understand language. But this progress is happening in a way that is totally divorced from the field of linguistics, which makes linguistics feel like a bit of a dead end.
Anyway, NAACL was great, and these are exciting times in the field.