"Approximate Recall Confidence Intervals": code and data

This page contains code and data for the paper "Approximate Recall Confidence Intervals", by William Webber (ACM Transactions on Information Systems, Volume 31, Issue 1, January 2013, pages 2:1--33).

The recall confidence intervals described in the paper have been implemented as an R package, recci. The version of the recci package corresponding to the published paper is recci version 0.2. Later versions have also been released; the full version history, newest first, is:

0.5.0 2014-12-09 Confidence intervals on eRecall
0.4.1 2013-04-26 One-sided confidence intervals
0.4 2013-04-23 Normal approximation interval on F1 (derivation in this tech report)
0.3 2012-10-19 Bayesian intervals on F1
0.2 2012-03-14 Original version as used in Approximate Recall Confidence Intervals

To generate the figures and tables in the paper, install the reccipub R package. Execute the reccipub:::DoPubFig() function for figures, and the reccipub:::DoPubTbl() function for tables.
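
In an R session, the steps above might look like the following sketch. The helper name make_pub_outputs and its tarball argument are illustrative assumptions, not part of reccipub; pass the path of the source package file you actually downloaded, or leave it as NULL if reccipub is already installed.

```r
# Hypothetical helper (not part of reccipub): optionally install the
# package from a local source tarball, then regenerate figures and tables.
make_pub_outputs <- function(tarball = NULL) {
  if (!is.null(tarball)) {
    # repos = NULL tells R to install from the local file
    install.packages(tarball, repos = NULL, type = "source")
  }
  if (!requireNamespace("reccipub", quietly = TRUE)) {
    stop("package 'reccipub' is not installed")
  }
  # The ':::' operator is used as on this page; it also reaches
  # functions that a package does not export.
  reccipub:::DoPubFig()  # figures
  reccipub:::DoPubTbl()  # tables
  invisible(TRUE)
}
```

Calling make_pub_outputs() with no arguments simply runs the two functions against an existing installation.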

To re-run the experiments performed in the paper, install the recciexp R package, and execute the recciexp:::DoGenData() function. Note that this will take around 60 hours to run on a modern single processor, though it can take advantage of multiple cores using the doMC package. (I have also run these experiments successfully using the doMPI package, though it takes some scripting to get working; contact me if you need help.) If you simply want to test the experimental functions, with small numbers of samples and simulations, then run recciexp:::DoGenData(size="test").
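
A multicore run could be set up roughly as follows. This is a sketch under assumptions: it presumes that recciexp uses whatever foreach backend has been registered, which is the usual way doMC is used, and the helper name run_recci_experiments and its cores and test parameters are illustrative, not part of the package.

```r
# Hypothetical helper (not part of recciexp): register a doMC backend,
# then run either the small test configuration or the full experiments.
run_recci_experiments <- function(cores = 2, test = TRUE) {
  for (pkg in c("doMC", "recciexp")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("package '", pkg, "' is not installed")
    }
  }
  # Register the multicore backend with the requested number of workers
  doMC::registerDoMC(cores = cores)
  if (test) {
    # Small numbers of samples and simulations, as described above
    recciexp:::DoGenData(size = "test")
  } else {
    # Full run: around 60 hours on a single processor
    recciexp:::DoGenData()
  }
}
```

Starting with run_recci_experiments(cores = 2) is a cheap way to confirm that the setup works before committing to the full run with test = FALSE; parallel::detectCores() reports how many cores the machine has.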

William Webber, william@williamwebber.com.