IBM Watson

  • IBM Watson running.
  • Works offline, based on indexed web documents.
  • Does something similar to a Google search, then applies statistical analysis to find the most likely association (as on Jeopardy).
  • Very broad knowledge, not necessarily deep "knowledge".
  • Jeopardy: https://www.youtube.com/watch?v=i-vMW_Ce51w
  • Skips voice recognition (the clues are supplied to it as text).

Parts-of-speech tagging

  • Use WordNet for POS and synonyms.
  • Ambiguity: parse and check the legality of the sentence, e.g. in "a dog bit me", "bit" could be a verb or a noun.
  • NLP: how do we adapt to new words like 'twerk' and other new slang?
    • Can possibly use bootstrapping to acquire new knowledge when the processor finds words that it does not know.
    • Many sentences may not be grammatically correct in the first place. Maybe infer from context instead?
  • Ambiguity example: I saw the man on the hill with the telescope.
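
The "check for legality" idea can be sketched without any library. The lexicon and the set of legal tag sequences below are hypothetical toy data invented for this illustration; a real system would use a grammar and a much larger dictionary such as WordNet. The sketch enumerates every possible tagging of a sentence and keeps only those whose tag sequence is legal.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to the tags it could take.
LEXICON = {
    "the": ["DT"],
    "a": ["DT"],
    "dog": ["NN"],
    "bit": ["NN", "VBD"],  # "a bit of food" vs. "the dog bit me"
    "me": ["PRP"],
}

# Hypothetical set of tag sequences treated as legal sentences.
LEGAL_PATTERNS = {
    ("DT", "NN", "VBD", "PRP"),
    ("DT", "NN", "VBD", "DT", "NN"),
}


def legal_taggings(sentence):
    """Return every tagging of the sentence whose tag sequence is legal."""
    words = sentence.lower().split()
    # Words missing from the toy lexicon raise KeyError in this sketch.
    candidates = product(*(LEXICON[w] for w in words))
    return [list(zip(words, tags)) for tags in candidates if tags in LEGAL_PATTERNS]


print(legal_taggings("a dog bit me"))
# [[('a', 'DT'), ('dog', 'NN'), ('bit', 'VBD'), ('me', 'PRP')]]
```

Only the reading with "bit" as a verb survives, because the sequence DT NN NN PRP is not in the legal set.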


In [5]:
import nltk

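# Requires the NLTK tokenizer and tagger models (fetch them with nltk.download).
sentence = "the dog bit me"
tokens = nltk.word_tokenize(sentence)  # split the sentence into words
tagged = nltk.pos_tag(tokens)          # assign a POS tag to each word
tagged                                 # display the (word, tag) pairs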
[('the', 'DT'), ('dog', 'NN'), ('bit', 'NN'), ('me', 'PRP')]
In [8]:
import nltk

# This sentence is prone to multiple interpretations.
sentence = "I saw the man on the hill with the telescope"
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
tagged
[('I', 'PRP'),
 ('saw', 'VBD'),
 ('the', 'DT'),
 ('man', 'NN'),
 ('on', 'IN'),
 ('the', 'DT'),
 ('hill', 'NN'),
 ('with', 'IN'),
 ('the', 'DT'),
 ('telescope', 'NN')]
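
The tags above are the same for every reading of the sentence, so the ambiguity lies in how the prepositional phrases attach, not in the parts of speech. As a rough sketch, the number of readings can be counted with a CKY chart. The grammar and lexicon below are a toy set of rules made up for this example (not taken from NLTK or any treebank), written in Chomsky normal form in pure Python.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form, invented for this illustration.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # a PP can modify a noun phrase
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # or it can modify the verb phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "hill": ["N"], "telescope": ["N"],
    "on": ["P"], "with": ["P"],
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees of the sentence with a CKY chart."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][label] = number of ways label can span words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for label in LEXICON.get(word, []):
            chart[(i, i + 1)][label] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)][start]


print(count_parses("I saw the man with the telescope"))              # 2
print(count_parses("I saw the man on the hill with the telescope"))  # 5
```

Under this toy grammar, each extra prepositional phrase multiplies the attachment choices, which is why the longer sentence has five structurally different readings.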

Other NLP Systems

  • ELIZA: plays a specific role, which helps it appear convincing to the user. (http://web.stanford.edu/class/linguist238/p36-weizenabaum.pdf)
  • Source code: http://www.kurzweilai.net/forums/topic/eliza-chatbot-source-code
    • Has fallback responses for when the sentence contains no keyword.
    • Has responses for user sign-on, topic changes, and repetition by the user; transposes "I am" -> "You are", etc.
    • Time-based prompt for when the user is unresponsive.
    • Many response templates with placeholders, filled in with text from the user's previous input.
    • ELIZA isn't really an NLP system in the modern sense.
  • An early formal study of NLP: SHRDLU by Terry Winograd (1973)
    • http://www.csee.ogi.edu/~gormanky/courses/CS662/PDFs/winograd_1973.pdf
    • Assumes, for example, a system that operates on blocks of different colors on a table.
      • E.g. "Pick up a big red block", "grasp the pyramid" and so on.
      • Some commands may be ambiguous, so the system may not be able to execute them.
      • May ask the user questions in order to disambiguate blocks.
    • Differs in that it uses scene-specific data and, in some sense, demonstrates more understanding than the statistical method used by Watson.
  • Siri
    • Does pattern-based recognition and forwards some parts of the query to Wolfram Alpha.
    • Users tend to adapt to the system and it's not necessary to have deep language understanding to build useful systems.
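
The ELIZA behaviours listed above (keyword rules, pronoun transposition, placeholder templates, and a "no keyword" fallback) can be sketched in a few lines. This is not the code from the linked source; the rules, templates, and function names below are hypothetical and only illustrate the general pattern-and-template approach.

```python
import random
import re

# Pronoun swaps applied to the user's words before echoing them back.
REFLECTIONS = {
    "i": "you", "am": "are", "my": "your", "me": "you",
    "you": "I", "your": "my", "are": "am",
}

# Hypothetical keyword rules: a regex plus response templates with a placeholder.
RULES = [
    (re.compile(r"\bi am (.+)", re.I),
     ["Why do you say you are {0}?", "How long have you been {0}?"]),
    (re.compile(r"\bi feel (.+)", re.I), ["Why do you feel {0}?"]),
    (re.compile(r"\bmy (.+)", re.I), ["Tell me more about your {0}."]),
]

# Used when no keyword matches.
FALLBACKS = ["Please go on.", "Can you say more about that?"]


def reflect(fragment):
    """Transpose first and second person, e.g. 'my dog' -> 'your dog'."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(w, w) for w in words)


def respond(text, rng=random):
    """Return a canned response, filling the placeholder with reflected user text."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, templates in RULES:
        match = pattern.search(cleaned)
        if match:
            return rng.choice(templates).format(reflect(match.group(1)))
    return rng.choice(FALLBACKS)


print(respond("I feel sad about my job."))
# Why do you feel sad about your job?
```

Nothing here models meaning: the reply is assembled from the user's own words, which fits the note that ELIZA is not an NLP system in the modern sense.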