vibing

Aryaman Arora

Second-year Ph.D. student at Stanford NLP.

About

At Stanford, I’m advised by Dan Jurafsky and Christopher Potts. My research focuses on interpretability.

I want to understand how neural networks (particularly language models) work. To that end, I’m excited about causal approaches and insights from linguistics. That said, I’m not committed to any particular line of work and am always happy to try new things!

I completed my B.S. in Computer Science and Linguistics at Georgetown University, where I worked with Nathan Schneider. I interned at ETH Zürich with Ryan Cotterell working on information theory, as well as at Apple and Redwood Research. (See my CV for more.)

Hit me up on Twitter or at aryamana [at] stanford [dot] edu.

Greatest hits

AxBench: Steering LLMs? Even simple baselines outperform sparse autoencoders.
Zhengxuan Wu*, Aryaman Arora*, Atticus Geiger, Zheng Wang, Jing Huang, Dan Jurafsky, Christopher D. Manning, Christopher Potts.
ICML Spotlight, 2025
ReFT: Representation finetuning for language models.
Zhengxuan Wu*, Aryaman Arora*, Zheng Wang, Atticus Geiger, Dan Jurafsky, Christopher D. Manning, Christopher Potts.
NeurIPS Spotlight, 2024
CausalGym: Benchmarking causal interpretability methods on linguistic tasks.
Aryaman Arora, Dan Jurafsky, Christopher Potts.
ACL Outstanding Paper Senior Area Chair Award, 2024
Estimating the entropy of linguistic distributions.
Aryaman Arora, Clara Meister, Ryan Cotterell.
ACL, 2022

More...

News

  • 2025-05-01 AxBench will be a spotlight paper at ICML 2025 🇨🇦 (this will be the second time Zen and I are presenting a spotlight paper together in Vancouver…)
  • 2024-11-04 New paper: Bayesian scaling laws for in-context learning, w/ Noah Goodman
  • 2024-09-26 ReFT will be a spotlight paper at NeurIPS 2024 🇨🇦
  • 2024-08-16 CausalGym won an outstanding paper award at ACL 2024 🇹🇭
  • 2024-06-21 Presented pyvene and IruMozhi at NAACL 2024 🇲🇽
  • 2024-04-05 New interp-inspired ultra-efficient finetuning method out: ReFT (repo, tweet).
  • 2024-03-13 We released the paper for pyvene, a new library for intervening on the internal states of neural networks!
  • 2024-02-19 My first lead-author project as a Ph.D. student is out: CausalGym: Benchmarking causal interpretability methods on linguistic tasks.
  • 2023-09-14 Moved to the San Francisco Bay Area 🌉 to start my Ph.D. 🫡
  • 2023-07-31 Back from the Leiden University Summer School in Languages and Linguistics in the Netherlands!
  • 2023-02-08 Accepted to the Ph.D. program at Stanford CS!