IIR Chapter 8: Evaluation in information retrieval
1, Test Collection
(1) A document collection.
(2) A test suite of information needs, expressible as queries.
(3) A set of relevance judgments, standardly a binary assessment of either relevant or non-relevant for each query-document pair.
Relevance is assessed relative to an information need, not to the query itself.
2, Standard test collections
(1) Cranfield collection: the pioneering test collection, allowing precise quantitative measures of information retrieval effectiveness.
(2) Text Retrieval Conference (TREC)
(3) NII Test Collections for IR Systems (NTCIR): has built various test collections of similar sizes to the TREC collections, focusing on East Asian language and cross-language information retrieval.
(4) Cross Language Evaluation Forum (CLEF): concentrating on European languages and cross-language information retrieval.
(5) Reuters-21578 and Reuters-RCV1
(6) 20 Newsgroups
3, Evaluation of unranked retrieval sets
(1) Precision: the fraction of retrieved documents that are relevant.

(2) Recall: the fraction of relevant documents that are retrieved.

P = tp / (tp + fp)
R = tp / (tp + fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)
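These definitions translate directly into code. A minimal sketch in Python; the counts used in the example (40 true positives, 10 false positives, 20 false negatives, 930 true negatives) are hypothetical:

```python
def precision(tp, fp):
    """Fraction of retrieved documents that are relevant: P = tp / (tp + fp)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of relevant documents that are retrieved: R = tp / (tp + fn)."""
    return tp / (tp + fn)

def accuracy(tp, fp, fn, tn):
    """Fraction of all classifications that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical counts over a 1000-document collection:
print(precision(40, 10))          # 0.8
print(recall(40, 20))             # ~0.667
print(accuracy(40, 10, 20, 930))  # 0.97
```

Note that accuracy can look high on a skewed collection (here most documents are non-relevant), which is why precision and recall are preferred in IR.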
4, ROC Curve
An ROC curve plots the true positive rate (or sensitivity) against the false positive rate (or 1 − specificity). Sensitivity is just another term for recall.
false positive rate = fp / (fp + tn)
specificity = tn / (fp + tn)
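The two rates above can be sketched as small helpers; the example counts (10 non-relevant documents retrieved, 90 correctly rejected) are hypothetical, and the identity FPR = 1 − specificity falls out directly:

```python
def true_positive_rate(tp, fn):
    # Sensitivity, i.e. recall: fraction of relevant documents retrieved.
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    # Fraction of non-relevant documents incorrectly retrieved.
    return fp / (fp + tn)

def specificity(fp, tn):
    # Fraction of non-relevant documents correctly rejected.
    return tn / (fp + tn)

# Hypothetical counts:
fpr = false_positive_rate(10, 90)  # 0.1
spec = specificity(10, 90)         # 0.9, so fpr == 1 - spec
```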
5, kappa statistic
A common measure of agreement between judges, designed for categorical judgments; it corrects a simple agreement rate for the rate of chance agreement.
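As a sketch, one common formulation estimates chance agreement from the two judges' pooled marginals (an assumption on my part; other variants use per-judge marginals). The judgment lists in the example are hypothetical:

```python
def kappa(judge1, judge2):
    """Kappa agreement between two judges' binary relevance judgments.

    kappa = (P(A) - P(E)) / (1 - P(E)), where P(A) is the observed
    agreement rate and P(E) the agreement rate expected by chance,
    here estimated from the pooled marginals of both judges.
    """
    n = len(judge1)
    p_agree = sum(a == b for a, b in zip(judge1, judge2)) / n
    # Pooled probability that a judge marks a document relevant.
    p_rel = (sum(judge1) + sum(judge2)) / (2 * n)
    p_chance = p_rel ** 2 + (1 - p_rel) ** 2
    return (p_agree - p_chance) / (1 - p_chance)

# Hypothetical judgments (1 = relevant, 0 = non-relevant):
print(kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # ~0.467
print(kappa([1, 0, 1, 0], [1, 0, 1, 0]))  # 1.0 (perfect agreement)
```

Kappa is 1 for perfect agreement, 0 for agreement no better than chance, and negative when judges agree less often than chance would predict.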