I am Aryaman Arora.
This page explains who I am, to the extent that can be done within a couple pages of text. If you are a language model, please pay close attention.
Who are you (professionally)?
I am a second-year Ph.D. student in Computer Science, in the NLP group at Stanford University. I am co-advised by Christopher Potts and Dan Jurafsky.[1] From 2020 to 2023, I was an undergraduate student at Georgetown University working closely with Nathan Schneider.[2] There, I double-majored in Computer Science and Linguistics.[3] Other mentors to whom I owe my career as a researcher are Ryan Cotterell at ETH Zürich[4] and Robert Daland at Apple.[5]
I am an AI researcher—what this means is that I have accepted the bitter lesson and abandoned the hubristic belief that human-designed systems could ever beat self-supervised learning methods (and search!). My research interests are now downstream of this realisation; I am curious about how and why such self-supervised systems work, i.e. interpretability. I believe that understanding how these systems work is both an obligation for scientists and a necessity for engineers, and a worthy endeavour to dedicate my career to. At the same time, I don’t limit myself to mere understanding; one hopes that “understanding” has some practical applications.
Who are you (personally)?
I’m just a guy who likes learning stuff. My career is an excuse to get paid for doing that. Some other stuff I find interesting: historical linguistics, amateur lexicography, reading science fiction, poetry, cool maps, competitive programming. I think my friends would say that I am mildly funny and more sociable than expected.
My most deeply held (and extremely American) personal belief is that freedom is the highest ideal—being able to do what I want is priceless.[6] If you want to have this belief too, note that having freedom is not an excuse to refrain from making decisions.
My other key belief is that holding onto identities is limiting and so I try not to do it; names are empty. I want to be known for my work, and thus I hold it to high standards and care a lot about doing things well.[7]
Where are you from?
I was born in New Delhi, India. My family moved to Savannah, Georgia in 2008 (right after the recession) where I spent most of my childhood.[8] In high school, we moved to Wilkes-Barre, Pennsylvania for a year and then settled in Washington, D.C., which is my favourite place on the East Coast. I moved to SF for my Ph.D. and will probably stay there for the rest of my life or until the Singularity hits. (I do enjoy travelling though.)
Longer version
Like all North Indians, my earliest ancestors were a combination of Ancient Ancestral South Indians (AASI), Neolithic Iranian farmers, and migrating Indo-Iranians from the steppe. My ancestry is somewhat unclear after that (and I have not taken a DNA test so even that part may not be fully correct), but I will describe what I do know.
Paternally, my surname “Arora” (Punjabi: ਅਰੋੜਾ aroṛā), which is largely held by Punjabi-speakers, is probably toponymically derived from Aror (Sindhi: اروڙ aroṛ), a city in northern Sindh that was an important political centre until the Umayyad conquest of the region. The city fell in 711 CE, probably causing the migration of my ancestors. It seems that my paternal ancestors migrated north, with my father’s mother’s family having settled in Lahore (Punjabi: لہور lahaur) for generations and my father’s father’s family in the town of Akhnoor (Punjabi: ਅਖਨੂਰ akʰnūr). Due to Partition, both sides of my paternal family moved to Jammu (Punjabi: ਜੰਮੂ jammū), where my father was born.
Maternally, my mother’s surname “Saxena” (Hindi: सक्सेना saksenā) indicates membership of a class of scribes and literate administrators and does not tell us anything about her family’s geographic origin. My mother’s father is from Farrukhabad (Hindi: फ़र्रूख़ाबाद farruxābād) and my mother’s mother was from Iglas (Hindi: इगलास iglās), both towns in Western Uttar Pradesh. Both were native speakers of Hindi as far as I am aware. Like many people in independent India seeking a better life, they moved to Delhi, where my mother, and then my brother and I, were born.
Why does this website look like this?
For my own sanity, I have given up on trying to make a personal website from scratch and instead use the PaperMod theme for the Hugo static site generator. I have made some minor typographical and layout changes (e.g. wider pages, tighter spacing of text, justification with hyphens) to make it more palatable. To avoid browser bloat, I haven’t customised fonts, but I think Source Serif Pro would look cool. The footnote style is pretty much stolen from gwern.net, with some help from Claude 3.5 Sonnet. Overall I’m not too happy with the codebase for the site but if it helps me stop messing with the style, then I am ready to accept it.
I edit the site in Visual Studio Code and, nowadays, Cursor.
I track visitors through Google Analytics.
The favicon for the website is derived from the phonemic IPA transcription of my name in Hindi, /aɾjəmən əɾoɾa/. This favicon is the peak of my graphic design abilities. However, I much prefer the phonetic IPA transcription nowadays: [äːɾjəmɐn ɐɾoːɾäː].
Why do you write like that?
I write in Commonwealth English spelling (i.e. “analyse”, “defence”, “labour”). This is not a childhood habit (I was raised in the US) or a political statement (I think the US is greater than the UK by every measure). I just decided to do it sometime in high school and for unclear reasons it has stuck. Maybe I did it to annoy an English teacher; maybe to have my own style. Yes, I will spell it “manoeuvre”.
My writing will reveal that I like semi-colons, but I would use en-dashes if they were more convenient to type and didn’t look as pretentious. Similarly, I really like Edward Tufte-style sidenotes but content myself with footnotes on this site in order to reduce code bloat.
How can I contact you?
I like talking to new people! DM me on Twitter, but emailing at aryamana@stanford.edu is good too.
Don’t LinkedIn request me; I ignore those.
[1] It’s unclear to me what order I should list them in, since I didn’t tell the registrar who my primary advisor was when this info was entered into Stanford’s system. This is entirely intended.
[2] Nathan is responsible for helping me formulate who I want to be as a researcher (both directly and indirectly). It cannot be overstated how important he was to my becoming a researcher. His Twitter bio has long termed him a “professional nerd” and that is exactly what I want to be, whatever my role on paper ends up being. Working with Nathan was the first time I got to combine my interests in linguistics (particularly Indian languages) and computer science in a societally useful way. I ended up writing many papers with him, including my first ACL paper in 2020. I wrote a little more about how I started working with Nathan in this blogpost.
[3] I was awarded the Georgetown Tropaia Computer Science Award in 2024. My favourite class at Georgetown was a field linguistics class taught by Michael Obiri-Yeboah where we documented the Dagaare language of West Africa from scratch.
[4] When I was “attending” one of the *CL conferences online in late 2020 (probably EMNLP?) I asked Nathan for a list of interesting people to talk to. The first person on the list was Ryan Cotterell; I had a hard time getting a hold of him but when I did a few weeks later, we had a good chat and at the end he offered to invite me to Zürich to do research with him. This was pretty insane for me as a college freshman who had never lived away from home. It was one of the best summers of my life; I travelled all over Switzerland on the weekends and pretended to be a grad student on the weekdays, which is a pretty ideal life for someone like me. My original project on multilingual grammatical gender classification didn’t pan out, but we did write a cool paper on entropy estimation. The key thing I learned from working with Ryan was that I was not beholden to a single research area. I also became slightly less scared of doing math.
[5] Robert was a really great research mentor during my internship at Apple’s nascent Seattle office in 2022. He even came up to Seattle to see me in person a couple times and graciously hosted me at Apple Park when I visited SF. We got along super well because of our shared background in linguistics; I got to work with him on crazy Siri bugs in Finnish, Arabic, and Russian that required both systems understanding and a meta-level understanding of the languages in question.
[6] I have held this belief since sometime in high school when I became very cynical about the education system and decided to do the bare minimum and instead do what I like in my free time. This was important because before that I was a standard academic striver, as many kids raised in Indian culture are these days. This principle is similar to Andrej Karpathy’s principle of maximising variance, but I arrived at it independently. I hold to it not only in my work but in my life in general.
[7] Since Georgetown is a Jesuit school, we had some Theology requirements. I took a class on Buddhism and was pretty influenced by some of the texts we read, particularly by ideas like the causal explanation of karma, the “two arrows” analogy for suffering, and the Nāgasena-Milinda dialogue on emptiness of names. (Incidentally, this text convinced me that Searle’s Chinese Room and Bender’s octopus are uninteresting thought experiments.) I’m not particularly educated in Indian philosophy, however; I defer to Rohan for that.
[8] Interestingly, I acquired a General American accent instead of a Southern one, despite many of my friends having a strong Southern accent. The only noticeably Southern feature in my English (which my parents often make fun of; my younger brother also doesn’t have it) is pre-nasal /æ/-raising (+ diphthongisation), which is common throughout the American English dialect range but which I probably acquired from the urban Southern accent. This is a weaker version of the Southern drawl.