Assume we are asked to predict a real-valued variable $y_t$ based on certain characteristics $x_t$, and on a database consisting of $(x_i, y_i)$ for $i = 1, \dots, n$. Analogical reasoning suggests combining past observations of $x$ and $y$ with the current values of $x$ to generate an assessment of $y$ by similarity-weighted averaging. Specifically, the predicted value of $y_t$ is the weighted average of all previously observed values $y_i$, where the weight of $y_i$, for every $i = 1, \dots, n$, is the similarity between the vector $x_t$, associated with $y_t$, and the previously observed vector $x_i$. The "empirical similarity" approach suggests estimating the similarity function from past data. We discuss this approach as a statistical method of prediction, study its relationship to the statistical literature, and extend it to the estimation of probabilities and of density functions.
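As a rough illustration of similarity-weighted averaging, the following minimal sketch computes a prediction and picks a similarity parameter from past data. Several choices here are assumptions made only for the example and are not stated in the text above: the exponential similarity form with weight vector `w`, the leave-one-out squared-error criterion, the restriction to one common weight chosen from a small grid, and all function names.

```python
import math


def similarity(x, z, w):
    """Assumed similarity: exp of minus the w-weighted squared distance.

    The functional form is an illustrative assumption, not taken from the text.
    """
    return math.exp(-sum(wj * (xj - zj) ** 2 for wj, xj, zj in zip(w, x, z)))


def predict(x_t, xs, ys, w):
    """Similarity-weighted average of the past y_i, weighted by s(x_i, x_t)."""
    weights = [similarity(x_i, x_t, w) for x_i in xs]
    total = sum(weights)
    if total == 0.0:
        # All similarities underflowed to zero; no meaningful average exists.
        raise ValueError("all similarity weights are zero")
    return sum(s * y for s, y in zip(weights, ys)) / total


def loo_sse(xs, ys, w):
    """Sum of squared leave-one-out prediction errors for a given w.

    Leave-one-out is an assumed fitting criterion for this sketch.
    """
    err = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        err += (ys[i] - predict(xs[i], rest_x, rest_y, w)) ** 2
    return err


def fit_common_weight(xs, ys, grid):
    """Choose one common weight for all characteristics from a grid."""
    m = len(xs[0])
    return min(grid, key=lambda c: loo_sse(xs, ys, [c] * m))


# Hypothetical toy database of (x_i, y_i) pairs with one characteristic.
xs = [(0.0,), (1.0,), (2.0,), (3.0,)]
ys = [0.0, 1.0, 2.0, 3.0]

grid = [0.01, 0.1, 1.0, 10.0]
w_hat = fit_common_weight(xs, ys, grid)
y_hat = predict((1.5,), xs, ys, [w_hat])
```

With all weights set to zero every past observation is equally similar, so the prediction reduces to the plain sample mean; larger weights make the average increasingly local to the observations closest to `x_t`.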