Aryaman Arora » Papers
  
  These are all of my publications. I also like to keep track of and read papers I'm acknowledged in.
  2025
  Mechanistic evaluation of Transformers and state space models
    Aryaman Arora, Neil Rathi, Nikil Roashan Selvam, Róbert Csordás, Dan Jurafsky, Christopher Potts
    arXiv:2505.15105, 2025 [paper] [code]
  
  Improved representation steering for language models
    Zhengxuan Wu*, Qinan Yu*, Aryaman Arora, Christopher D. Manning, Christopher Potts
    NeurIPS, 2025 Spotlight [paper] [code]
  
  Detecting foreign content in self-generated text: A recognition study of large language models
    Shengyu Zhu, Tamika Bassman, Dat Tran, Aryaman Arora
    NeurIPS LLM Evaluation Workshop, 2025
  
  Bayesian scaling laws for in-context learning
    Aryaman Arora, Dan Jurafsky, Christopher Potts, Noah D. Goodman
    COLM, 2025 [paper] [code]
  
  AxBench: Steering LLMs? Even simple baselines outperform sparse autoencoders
    Zhengxuan Wu*, Aryaman Arora*, Atticus Geiger, Zheng Wang, Jing Huang, Dan Jurafsky, Christopher D. Manning, Christopher Potts
    ICML, 2025 Spotlight [paper] [code]
  
  2024
  Causal abstraction: A theoretical foundation for mechanistic interpretability
    Atticus Geiger, Duligur Ibeling, Amir Zur, Maheep Chaudhary, Sonakshi Chauhan, Jing Huang, Aryaman Arora, Zhengxuan Wu, Noah Goodman, Christopher Potts, Thomas Icard
    JMLR, 2025 [paper]
    
  ReFT: Representation finetuning for language models
  Zhengxuan Wu*, Aryaman Arora*, Zheng Wang, Atticus Geiger, Dan Jurafsky, Christopher D. Manning, Christopher Potts
  NeurIPS, 2024 Spotlight [paper] [code]
  
  CausalGym: Benchmarking causal interpretability methods on linguistic tasks
  Aryaman Arora, Dan Jurafsky, Christopher Potts
  ACL, 2024 Outstanding Paper Award, Senior Area Chair Award [paper] [code]
  
  pyvene: A library for understanding and improving PyTorch models via interventions
  Zhengxuan Wu, Atticus Geiger, Aryaman Arora, Jing Huang, Zheng Wang, Noah D. Goodman, Christopher D. Manning, Christopher Potts
  NAACL: System Demonstrations, 2024 [paper] [code]
  
  IruMozhi: Automatically classifying diglossia in Tamil
  Kabilan Prasanna, Aryaman Arora
  NAACL: Findings, 2024 [paper] [code]
  
  Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
  Nay San, Georgios Paraskevopoulos, Aryaman Arora, Xiluo He, Prabhjot Kaur, Oliver Adams, Dan Jurafsky
  SIGTYP, 2024 [paper] [code]
  
  A reply to Makelov et al. (2023)’s “interpretability illusion” arguments
  Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman
  arXiv:2401.12631, 2024 [paper] [code]
  
  2023
  Towards vision-language mechanistic interpretability: A causal tracing tool for BLIP
  Vedant Palit*, Rohan Pandey*, Aryaman Arora, Paul Pu Liang
  5th Workshop on Closing the Loop Between Vision and Language (CLVL), 2024 [paper] [code]
  
  SIGMORPHON–UniMorph 2023 Shared Task 0: Typologically diverse morphological inflection
  Omer Goldman, Khuyagbaatar Batsuren, Salam Khalifa, Aryaman Arora, Garrett Nicolai, Reut Tsarfaty, Ekaterina Vylomova
  SIGMORPHON, 2023 [paper] [code]
  
  Jambu: A historical linguistic database for South Asian languages
  Aryaman Arora, Adam Farris, Samopriya Basu, Suresh Kolichala
  SIGMORPHON, 2023 [paper] [code]
  
  Unified syntactic annotation of English in the CGEL framework
  Brett Reynolds, Aryaman Arora, Nathan Schneider
  LAW, 2023 [paper] [code]
  
  CGELBank Annotation Manual v1.0
  Brett Reynolds, Nathan Schneider, Aryaman Arora
  arXiv:2305.17347, 2023 [paper] [code]
  
  Investigating induction heads in a small transformer language model
  Aryaman Arora
  MASC-SLL, 2023 [paper] [code]
  
  Localizing model behavior with path patching
  Nicholas Goldowsky-Dill, Chris MacLeod, Lucas Sato, Aryaman Arora
  arXiv:2304.05969, 2023 [paper] [code]
  
  2022
  Information theory in linguistics: Methods and applications
  Ryan Cotterell, Richard Futrell, Kyle Mahowald, Clara Meister, Tiago Pimentel, Adina Williams, Aryaman Arora
  COLING: Tutorials, 2022 [paper]
  
  CGELBank: CGEL as a framework for English syntax annotation
  Brett Reynolds, Aryaman Arora, Nathan Schneider
  arXiv:2210.00394, 2022 [paper] [code]
  
  SIGMORPHON–UniMorph 2022 Shared Task 0: Generalization and typologically diverse morphological inflection
  Jordan Kodner, ..., Aryaman Arora, ..., Ekaterina Vylomova
  SIGMORPHON, 2022 [paper] [code]
  
  The SIGMORPHON 2022 Shared Task on Morpheme Segmentation
  Khuyagbaatar Batsuren, Gábor Bella, Aryaman Arora, ..., Ryan Cotterell, Ekaterina Vylomova
  SIGMORPHON, 2022 [paper] [code]
  
  Universal Dependencies for Punjabi
  Aryaman Arora
  LREC, 2022 [paper] [code]
  
  MASALA: Modelling and analysing the semantics of adpositions in linguistic annotation of Hindi
  Aryaman Arora, Nitin Venkateswaran, Nathan Schneider
  LREC, 2022 [paper] [code]
  
  UniMorph 4.0: Universal Morphology
  Khuyagbaatar Batsuren, Omer Goldman, ..., Aryaman Arora, ..., Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova
  LREC, 2022 [paper] [code]
  
  A CGEL-formalism English treebank
  Aryaman Arora, Nathan Schneider, Brett Reynolds
  MASC-SLL, 2022 [code]
  
  Estimating the entropy of linguistic distributions
  Aryaman Arora, Clara Meister, Ryan Cotterell
  ACL, 2022 [paper] [code]
  
  Computational historical linguistics and language diversity in South Asia
  Aryaman Arora, Adam Farris, Samopriya Basu, Suresh Kolichala
  ACL, 2022 [paper]
  
  DIPI: Dependency parsing for Ashokan Prakrit historical dialectology
  Adam Farris*, Aryaman Arora*
  Towards a comparative historical dialectology: evidence from morphology and syntax, Deutsche Gesellschaft für Sprachwissenschaft, 2022 [code]
  
  2021
  For the purpose of curry: A UD Treebank for Ashokan Prakrit
  Adam Farris*, Aryaman Arora*
  UDW, SyntaxFest, 2021 [paper] [code]
  
  Bhāṣācitra: Visualising the dialect geography of South Asia
  Aryaman Arora, Adam Farris, Gopalakrishnan R, Samopriya Basu
  LChange, 2021 [paper] [code]
  
  Kholosi Dictionary
  Aryaman Arora, Ahmed Etebari
  Zenodo, 2021 [paper] [code]
  
  Adposition and case supersenses v1.0: Guidelines for Hindi–Urdu
  Aryaman Arora, Nitin Venkateswaran, Nathan Schneider
  arXiv:2103.01399, 2021 [paper] [code]
  
  SNACS annotation of case markers and adpositions in Hindi
  Aryaman Arora, Nitin Venkateswaran, Nathan Schneider
  SCiL, 2021 [paper] [code]
  
  2020
  PASTRIE: A corpus of prepositions annotated with supersense tags in Reddit International English
  Michael Kranzlein, Emma Manning, Siyao Peng, Shira Wein, Aryaman Arora, Nathan Schneider
  LAW, 2020 [paper] [code]
  
  SNACS annotation of case markers and adpositions in Hindi
  Aryaman Arora, Nathan Schneider
  SIGTYP, 2020 [paper] [code]
  
  Supervised grapheme-to-phoneme conversion of orthographic schwas in Hindi and Punjabi
  Aryaman Arora, Luke Gessler, Nathan Schneider
  ACL, 2020 [paper] [code]
  
  2019
  Quasi-passive lower and upper extremity robotic exoskeleton for strengthening human locomotion
  Aryaman Arora, John R. McIntyre
  Sustainable Innovation, 2019 [paper]