The Hittite problem

You are a Hittite spy. You report directly to the great king Šuppiluliuma I, who rules from his magnificent capital at Ḫattuša. Recently, the Hittites have been on good terms with their neighbours the Hurrians. Still, you can never be too sure about even the best of allies, and therefore the king has sent you to infiltrate the Hurrian bureaucracy so that you can pre-empt any conspiracies brewing within. You are so good at your job that you have totally compromised all incoming information channels into the Hurrian government apparatus—you can mess with their weather reports, their intelligence dossiers, their tax records, etc....

August 10, 2023 · 8 min · 1548 words · Me

Some intuitions about transformers

Unless you have been living under a rock for the last five years, you have definitely (if perhaps unknowingly) interacted with a machine learning model that uses the transformer architecture. I have spent a couple of months poking at little transformer models like GPT-2 and the 19 million-parameter version of Pythia, and yet after working at an interpretability startup for a week I realised that I actually don’t have a great understanding of how a transformer works....

December 24, 2022 · 6 min · 1108 words · Me

NAACL 2022

I entered the NAACL conference venue, just a couple of blocks away from my place in Seattle, with the absolute lowest of expectations. I had attended two other conferences this year entirely online: ACL and LREC. At both, I had minimal interactions with other human beings beyond “standing” at my poster, and most of that time no one was even there to ask about my work. Hence, zero expectations. The Multimodal ML tutorial....

August 4, 2022 · 6 min · 1180 words · Me

A machine-learned syntactic tagset?

I have been fascinated by the ACL 2022 Best Paper, Learned Incremental Representations for Parsing by Nikita Kitaev, Thomas Lu, and Dan Klein. I had the good fortune, wandering around in Gather.town, to attend the poster session virtually before the award was announced. The big thing that the paper showed is the tractability of human-like incremental parsing, which seems to have been pretty much a dormant problem in the field ever since full-sentence language models started dominating all the benchmarks....

June 21, 2022 · 10 min · 2002 words · Me

EMNLP 2020

Cool Papers Notes Claire Cardie: Information Extraction Nov 16, 10:00 AM Claire Cardie delivered a really cool keynote on information extraction from a historical perspective. (Surprising and uncomfortable how much NLP research started out with U.S. military applications.) My notes here don’t make sense; they’re just for me to note down things to read, since I didn’t know information extraction was a thing! One thing to think about: Can we use information extraction techniques to build useful resources for low-resource languages?...

November 18, 2020 · 3 min · 576 words · Me