Cool Papers


Claire Cardie: Information Extraction

Nov 16, 10:00 AM

Claire Cardie delivered a really cool keynote on information extraction from a historical perspective. (It's surprising and uncomfortable how much NLP research started out with U.S. military applications.) My notes here won't make much sense; they're just a list of things for me to read, since I didn't know information extraction was a thing!

One thing to think about: Can we use information extraction techniques to build useful resources for low-resource languages? I'm thinking of extracting data from the currently unstructured DSAL dictionaries, for example, or of the recent effort to do the same with the Linguistic Survey of India.

  • NER: Akbik et al. (2018, 2019) [CoNLL 03]
  • relation extraction/classification: Soares et al. (2019)
  • Miwa & Bansal (2016), Zhang et al. (2017), Wang et al. (2018), Luan et al. (2019), Wadden et al. (2019)
  • event extraction (CNNs, RNNs)
    • ACE same sentence tho
  • Ralph Grishman

BOAF: Semantics

Nov 17, 4:00 PM
  • Siva Reddy, Dipanjan Das, Ellie Pavlick, Matt Gardner, Chris Potts.
  • The move to contextual representations is a better approximation of how linguistics thinks about language (Chris Potts). The use of explicit linguistic structures [I suppose things like POS tags, dependencies, etc., "explicit things like a parser"] is going to decline as we get better models that don't need that information and can learn it implicitly (Matt Gardner).
  • [I am reminded of Ethan A. Chi's work on extracting syntactic representations from mBERT].
  • Nathan Schneider: Can NLP work towards helping linguistics too? (Seems like we only talk about the other way).
  • Dipanjan Das: We have not yet fully explored the capabilities of transformers and other masked LMs; we should not discount them and say diminishing returns are inevitable. Example given of multi-digit arithmetic skills appearing with greater inputs. [But GPT-3 is huge! At what point do we draw the line? Humans do more with far less.]
  • Ellie Pavlick: We can't pick up world knowledge or even full human-level language just from masked LMs ("hints" on the internet can be picked up); there's got to be a more efficient [multimodal?] way. But it is very possible that an NN can have human-level language skill, just not with the current masked LM approach.
  • multi-hop reasoning
  • Why only text-in, text-out? "Fundamentally awkward" way to think about language (Pavlick). Language needs to be embodied in the world. [What the heck even is language? Why are we using text as a proxy? I wonder why NLP work doesn't deal with, like, speech directly (outside ASR, which is just a way to convert speech into text).]
  • "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators." Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
  • Linguistic vs. world knowledge: not a division actually reflected in masked language models, which learn both together. Could this lead to rethinking linguistics?
  • Potts: Deep learning isn't gonna replace linguistics!


  • Black-box explanation methods (LIME, SHAP, partial dependence) are for when you don't have the training data; glass-box models (EBM) are for when you do.
  • Accuracy vs. intelligibility tradeoff? Not necessarily the case any longer, thanks to explainable boosting machines (EBMs), whose accuracy is comparable to that of full-complexity models.
  • EBMs are a type of generalised additive model (GAM), i.e. a sum of functions of individual features, optionally plus functions of pairwise feature interactions.
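
To make the "sum of functions" idea concrete for myself, here is a minimal Python sketch of how a GAM with pairwise terms produces a prediction. Everything in it is an assumption for illustration: the names gam_predict, shape_fns and pair_fns are hypothetical, and the shape functions are hand-written toys, whereas a real EBM learns them from data. It is not the API of any EBM library.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical type aliases: one function per feature, one per feature pair.
ShapeFn = Callable[[float], float]
PairFn = Callable[[float, float], float]


def gam_predict(
    x: Sequence[float],
    intercept: float,
    shape_fns: Dict[int, ShapeFn],
    pair_fns: Dict[Tuple[int, int], PairFn],
) -> Tuple[float, Dict[str, float]]:
    """Return the additive score and the contribution of every term."""
    contributions: Dict[str, float] = {"intercept": intercept}
    # Main effects: each function sees exactly one feature.
    for j, f in shape_fns.items():
        contributions[f"f_{j}"] = f(x[j])
    # Pairwise interactions: each function sees exactly two features.
    for (i, j), f in pair_fns.items():
        contributions[f"f_{i},{j}"] = f(x[i], x[j])
    return sum(contributions.values()), contributions


# Toy, hand-written shape functions (assumed, not learned).
shape_fns = {
    0: lambda v: 2.0 * v,
    1: lambda v: 1.0 if v > 3 else -1.0,
}
pair_fns = {(0, 1): lambda a, b: 0.5 * a * b}

score, parts = gam_predict(
    [1.0, 4.0], intercept=0.25, shape_fns=shape_fns, pair_fns=pair_fns
)
print(score)
print(parts)
```

Because the score is just a sum, the per-term breakdown in parts is the explanation: each feature's effect can be read off (or plotted) on its own, which is what makes this kind of model "glass-box".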