A machine-learned syntactic tagset?

I have been fascinated by the ACL 2022 Best Paper, Learned Incremental Representations for Parsing by Nikita Kitaev, Thomas Lu, and Dan Klein. I had the good fortune, wandering around in Gather.town, to attend the poster session virtually before the award was announced. The big thing that the paper showed is the tractability of human-like incremental parsing, which seems to have been pretty much a dormant problem in the field ever since full-sentence language models started dominating all the benchmarks....

June 21, 2022 · 10 min · 2002 words · Me

Secret verb forms in Hindi

Hindi is the best-studied language in South Asia. It would not be the worst thing if every linguist working on Hindi decided to take a break and pick any other language of the region to study. Nonetheless, Hindi does not set a particularly high bar for linguistic investigation when compared to other languages of the world; there is plenty that simply hasn't been described in any work by a linguist, let alone analysed or explained....

May 23, 2022 · 4 min · 726 words · Me

The -kk- verbal extension in Indo-Aryan

After the fragmentation of Sanskrit, one of the innovative features that developed across the Indo-Aryan language family is the set of "pleonastic" suffixes, including (but not limited to) -kk-, -ḍ-, -r-, -l(l)-, and nominal diminutives -ka- (m.) and -ikā- (f.). Pleonastic means serving no semantic purpose; basically, the consensus has been that these suffixes merely served as phonological extensions to distinguish words after the collapse of many phonotactic distinctions from Sanskrit to Middle Indo-Aryan....

May 3, 2022 · 3 min · 496 words · Me

*ll in Indo-Aryan

ṭ ~ ṭh ~ l ~ ll: aṅkōṭa, aṅkōṭha, aṅkōla, aṅkōlla (113) ‘the small tree Alangium hexapetalum’
l ~ ll: *avala, *avalla (819) ‘contrary’; avalīyatē, *ullīyatē (833) ‘stoops; hides oneself; sticks to’
(r)dr ~ ll: ārdrá, *ālla (1340) ‘wet’; ārdraka, *āllaka (1341) ‘ginger’; *āllabhr̥ṣṭa (1408) ‘moist crop of maize’
Other: *allaḍa (724) ‘childish’; *allā (725) ‘name of a tree or plant’
cullī: *apa-cullī (420a) ‘side or secondary stove’; *ā-cullī (1075) ‘small or secondary stove’
*lulla: *ā-lulla (1391) ‘maimed’
vallī: amlavallī (583) ‘the plant Pythonium bulbiferum Schott’

February 21, 2022 · 1 min · 74 words · Me

Reflexive causatives in Hindi

One of the first unfamiliar distinctions that a learner of Sanskrit will encounter is parasmaipada vs. ātmanepada verbs. They have two different sets of morphological endings—a pain on top of the different endings for the 10 root classes—but often no obvious difference in meaning. Does it actually matter whether I use parasmaipada or ātmanepada forms? Why not stick to one to halve the number of endings I need to learn?...

January 12, 2022 · 6 min · 1207 words · Me