CSCE 470 Lecture 34
« previous | Sunday, November 17, 2013 | next »
Exam Preparation
Big-8 Topics:
- Vector Space Retrieval (TF-IDF + Cosine)
- Evaluation (Precision, Recall, F-Measure, NDCG)
- Statistical Properties of Text (Heaps' Law, Zipf's Law)
- Link Analysis (PageRank, Hubs and Authorities)
- Clustering (K-Means, maybe hierarchical)
- Classification (Rocchio, KNN, Naive Bayes)
- Recommenders (Content-Based, Collaborative Filtering)
- Extra (Learning to Rank, Location and Geo)
Question Answering
Dan Jurafsky
Google gives you 10 blue links to navigate through. Most of the time, we just want a single answer.
Examples:
- Siri
- WolframAlpha
Complex Questions
- In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
- What do scholars think about Jefferson's position on dealing with pirates?
"Factoid" Questions
- Who wrote "The Universal Declaration of Human Rights"?
- How many calories are there in two slices of apple pie?
- What is the average age of the onset of autism?
- Where is Apple Computer based?
Two Approaches
- IR-based (relevant to our class)
  - TREC
  - IBM Watson
- Knowledge-based (more of an AI topic)
  - Siri
  - Evi
IR Factoid Q/A
- Retrieve a set of relevant documents
- Pull out passage windows containing query terms
- Filter for the top candidate words that appear close to the query terms
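The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by TREC systems or Watson: the stopword list, the fixed window size, and the function names `rank_passages` and `candidate_answers` are all hypothetical, and passages are scored simply by how many distinct query terms they contain.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "is", "a", "an", "of", "in", "where", "who", "what", "how"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_terms(question):
    """Content words of the question (tokens minus stopwords)."""
    return set(tokenize(question)) - STOPWORDS


def rank_passages(documents, question, size=8):
    """Slide a fixed-size window over each document and keep the windows
    that contain at least one query term, best-scoring first."""
    terms = query_terms(question)
    scored = []
    for doc_id, text in enumerate(documents):
        tokens = tokenize(text)
        last_start = max(0, len(tokens) - size)
        for start in range(last_start + 1):
            window = tokens[start:start + size]
            matched = terms & set(window)
            if matched:
                scored.append((len(matched), doc_id, start, window))
    # Most matched query terms first; ties broken by document and position.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scored


def candidate_answers(window, question):
    """Words in the passage window that are neither query terms nor stopwords."""
    terms = query_terms(question)
    return [t for t in window if t not in terms and t not in STOPWORDS]


if __name__ == "__main__":
    docs = [
        "Apple Computer is based in Cupertino, California.",
        "An apple pie has many calories.",
    ]
    question = "Where is Apple Computer based?"
    best = rank_passages(docs, question)[0]
    print(candidate_answers(best[3], question))
```

A real system would replace the first step with an actual retrieval engine and would filter the candidates by the expected answer type rather than returning every nearby word.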
Further analysis:
- coarse topics
  - description
  - location
  - abbreviation
  - entity
  - numeric
  - human
- finer topics underneath each coarse topic
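As a rough illustration of assigning a question to one of the six coarse topics listed above, here is a minimal rule-based sketch. The regular expressions and the name `coarse_type` are assumptions made up for this example, not rules from the lecture. Real systems typically learn this classifier from labeled questions, and complex questions are out of scope here.

```python
import re

# Hypothetical hand-written rules, checked in order; the first match wins.
COARSE_RULES = [
    ("abbreviation", r"\b(stand for|abbreviation|acronym)\b"),
    ("numeric", r"^(how (many|much|old|long|far)|when)\b|\b(average age|what year)\b"),
    ("location", r"^where\b"),
    ("human", r"^(who|whom|whose)\b"),
    ("description", r"^(why|how)\b"),
]


def coarse_type(question):
    """Return a coarse answer type for a factoid question.

    Falls back to "entity" when no rule matches (an arbitrary default
    chosen for this sketch).
    """
    q = question.lower().strip()
    for label, pattern in COARSE_RULES:
        if re.search(pattern, q):
            return label
    return "entity"


if __name__ == "__main__":
    for q in [
        'Who wrote "The Universal Declaration of Human Rights"?',
        "How many calories are there in two slices of apple pie?",
        "What is the average age of the onset of autism?",
        "Where is Apple Computer based?",
    ]:
        print(coarse_type(q), "|", q)
```

The predicted type can then be used to filter candidate words from the retrieved passages, for example keeping only place names when the type is location.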