CSCE 470 Lecture 34


Sunday, November 17, 2013


Exam Preparation

Big-8 Topics:

  1. Vector Space Retrieval (TF-IDF + Cosine)
  2. Evaluation (Precision, Recall, F-Measure, NDCG)
  3. Statistical Properties of Text (Heaps' Law, Zipf's Law)
  4. Link Analysis (PageRank, Hubs and Authorities)
  5. Clustering (K-Means, maybe hierarchical)
  6. Classification (Rocchio, KNN, Naive Bayes)
  7. Recommenders (Content-Based, Collaborative Filtering)
  8. Extra (Learning to Rank, Location and Geo)


Question Answering

Dan Jurafsky

Google gives you 10 blue links to navigate through. Most of the time, we just want a single answer.

Example:


Complex Questions

  • In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
  • What do scholars think about Jefferson's position on dealing with pirates?


"Factoid" Questions

  • Who wrote "The Universal Declaration of Human Rights"?
  • How many calories are there in two slices of apple pie?
  • What is the average age of the onset of autism?
  • Where is Apple Computer based?

Two approaches:

  1. IR (relevant to our class)
    • TREC
    • IBM Watson
    • Google
  2. Knowledge-based (more of an AI topic)
    • Siri
    • Evi


IR Factoid Q/A

  1. Find a set of relevant documents
  2. Pull out passage windows containing the query terms
  3. Keep the top-ranked words that occur close to the query terms as candidate answers
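
The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not the method used by TREC systems, Watson, or Google: the stopword list, the term-count document scoring, the window radius, and all function names (`retrieve`, `passage_windows`, `answer_candidates`) are assumptions made for the example.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list (hypothetical, for illustration only).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to", "and", "by",
             "who", "what", "where", "when", "how"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query_terms, docs, k=2):
    """Step 1: keep the k documents with the most query-term occurrences."""
    scored = [(sum(1 for t in tokenize(d) if t in query_terms), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def passage_windows(query_terms, doc, radius=4):
    """Step 2: yield a window of tokens around every query-term hit."""
    tokens = tokenize(doc)
    for i, tok in enumerate(tokens):
        if tok in query_terms:
            yield tokens[max(0, i - radius): i + radius + 1]

def answer_candidates(question, docs, radius=4):
    """Step 3: count non-query words that fall inside the windows."""
    query_terms = {t for t in tokenize(question) if t not in STOPWORDS}
    counts = Counter()
    for doc in retrieve(query_terms, docs):
        for window in passage_windows(query_terms, doc, radius):
            for tok in window:
                if tok not in query_terms and tok not in STOPWORDS:
                    counts[tok] += 1
    return counts.most_common()

docs = [
    "Apple Computer is based in Cupertino, California.",
    "Apple is headquartered in Cupertino.",
    "Bananas are yellow.",
]
candidates = answer_candidates("Where is Apple Computer based?", docs)
print(candidates[0][0])  # top-ranked candidate for this toy corpus: cupertino
```

Words that land in many windows are, by construction, words that sit close to many query-term occurrences, which is the intuition behind step 3. A real system would replace the term-count scoring with a proper retrieval model such as TF-IDF with cosine similarity.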


Further analysis: classify each question by the type of answer it expects.

  • coarse topics
    1. description
    2. location
    3. abbreviation
    4. entity
    5. numeric
    6. human
  • finer topics underneath each coarse topic
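
A rough sketch of how a question might be mapped to one of the six coarse topics listed above. The regular-expression rules here are hypothetical hand-written heuristics chosen for illustration; they are not taken from the lecture, and a real system would more likely use a trained classifier and then pick a finer topic underneath the coarse one.

```python
import re

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    (re.compile(r"\b(stand for|abbreviation|acronym)\b"), "abbreviation"),
    (re.compile(r"^(who|whom|whose)\b"), "human"),
    (re.compile(r"^where\b"), "location"),
    (re.compile(r"^(when|how (many|much|old|long|far))\b"), "numeric"),
    (re.compile(r"^what is the (average|number|age|population)\b"), "numeric"),
    (re.compile(r"^(why|how)\b"), "description"),
]

def coarse_answer_type(question):
    """Return one of the six coarse topics; fall back to 'entity'."""
    q = question.strip().lower()
    for pattern, label in RULES:
        if pattern.search(q):
            return label
    return "entity"

for q in [
    'Who wrote "The Universal Declaration of Human Rights"?',
    "How many calories are there in two slices of apple pie?",
    "What is the average age of the onset of autism?",
    "Where is Apple Computer based?",
]:
    print(coarse_answer_type(q), "<-", q)
```

Knowing the expected answer type lets the system discard candidate answers of the wrong kind, for example keeping only place names for a "where" question.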