Introduction to NLP with Maury Courtland (USC)!

2019-06-11
Organizer: Triangle Computer Vision and Machine Learning Group

Maury Courtland will give an introduction to key methods and developments in NLP, geared toward the ML practitioner. This will be an awesome kickoff to our 'Summer of Multi-modal', a series of meetings looking at multimodal AI. Description:

This month, I'll be presenting Natural Language Processing 101. This 30,000-foot fly-by of NLP will cover the theoretical background of dealing with language (and some speech) data, some well-known problems that NLP aims to solve, some traditional techniques for addressing those problems, and of course, a selection of recent deep learning models (though not so many that there's nothing left to talk about in the future!). This introduction will lay the foundation for, and gauge interest in, several subfields of NLP that may be explored (among others) in future meetings.

Theoretical Background:
A whirlwind tour of linguistics, from sampling the speech signal to modeling meaning and propositions with text data. In the interest of time, this will be at an extremely high level, but it should provide enough of a starting point to think about and discuss ways to process language data. How do we process information that unfolds over time? How do we define information? How do we obtain continuous, fluid relationships from discrete, static entities (letters, words, phrases, etc.)?

Some Common Problems:
Machine Translation - Training or programming a machine to translate between two languages.
Sentiment Analysis and Emotion Detection - Determining how people feel about the topic they are discussing.
Grammar Parsing, Grammar Checking, and Part-of-speech Tagging - Assigning grammatical structure and word categories to a language sample, and evaluating whether it is a valid sequence in a target language.
Word Sense Disambiguation - Which meaning of "bank" (money storage vs. river edge), "plant" (production facility vs. flora), "bass" (fish vs. singer), etc. does the user want to convey?

Some Traditional Approaches:
Statistical Translation and Phrase Tables: https://members.loria.fr/EGalbrun/resources/Gal09_phrase.pdf
Naive Bayes Classifiers: https://en.wikipedia.org/wiki/Naive_Bayes_classifier
Chart Parsing: https://en.wikipedia.org/wiki/Chart_parser
Knowledge Bases such as WordNet: https://wordnet.princeton.edu/
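
As a small taste of the traditional approaches listed above, here is a minimal sketch of a Naive Bayes classifier applied to sentiment analysis. It is not taken from the talk: the function names (train, predict), the toy reviews, whitespace tokenization, and add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenization
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: how common this label is in the training data.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Log likelihood with add-one (Laplace) smoothing.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented purely for this example.
TOY_REVIEWS = [
    ("great talk and great pizza", "positive"),
    ("loved the clear examples", "positive"),
    ("boring slides and bad audio", "negative"),
    ("the demo was bad and confusing", "negative"),
]

model = train(TOY_REVIEWS)
print(predict(model, "great examples"))
print(predict(model, "bad audio"))
```

The "naive" part is the assumption that words are independent given the label, which is why the score is just a sum of per-word log-probabilities. That ignores word order entirely, one of the limitations that motivates the sequence models below.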

Deep Learning Approaches:
LSTMs: https://www.mitpressjournals.org/doi/10.1162/neco.1997.9.8.1735
GRUs: https://arxiv.org/pdf/1406.1078.pdf
Attention Redux (see last month): https://arxiv.org/pdf/1409.0473.pdf, https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf
Pre-trained Language Models: https://arxiv.org/pdf/1810.04805.pdf, https://arxiv.org/pdf/1802.05365.pdf, https://arxiv.org/pdf/1801.06146.pdf

And of course, there will be plenty of time for Q&A, NLP project advice, etc., so come with your curiosity and questions. We hope to see you at our NLP 101 event!

Pizza will be provided by LifeOmic!

• Important to know

Ground floor, first door on the right as you come in the main entrance of 3800 Paramount Pkwy, Morrisville.

----

Poster: triangletech