P_NEAR
Name
P_NEAR
Synopsis
Returns records where the listed terms
or subqueries occur within a specified number of words of each other.
Arguments
P-Value
Maximum distance between the words
List of terms and subqueries
Ranking Scheme
Ranks using the p-norm method. Conceptually
you can consider each term a unique direction in n-space (where
n is the number of terms). Each term's weight represents the
distance in that direction. The p-value (the exponent p in the p-norm
formula) determines whether the final calculation behaves more like the
maximum weight or more like the total distance of all vectors in n-space.
For p = 2 the p-norm is identical to a vector space model. As
p approaches infinity the returned weight becomes equal to the
largest weight among the terms.
Picking an appropriate p-value can be tricky
and is often the result of experimentation. A simplified way
of thinking about the calculation is that it varies between a
kind of average weight and the maximum weight. Generally a p-value
of 2.5 or 3 is effective.
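The "average to maximum" behavior can be illustrated with a small sketch. It assumes the normalized p-norm (power mean) of the term weights, which may differ in detail from the formula the engine actually uses; the function name p_norm and the sample weights are hypothetical and are not part of the query language.

```python
def p_norm(weights, p):
    """Normalized p-norm (power mean) of non-negative term weights.

    Assumption: this is a generic power mean, used only to show how the
    p-value moves the result between an average and the maximum weight.
    """
    if not weights:
        return 0.0
    if p == float("inf"):
        # Limiting case: the largest weight dominates completely.
        return max(weights)
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


# Hypothetical weights for two terms in one record.
weights = [0.2, 0.9]
for p in (1, 2, 2.5, 3, 10, float("inf")):
    print(p, round(p_norm(weights, p), 4))
```

Under this assumption, p = 1 gives the plain average of the weights, and increasing p pulls the result steadily toward the largest weight.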
The p-norm weighting method is one
of the most efficient methods available, especially if you've
been careful in choosing how you weight your terms.
Note that the p-norm calculation
for P_NEAR is the same as for P_AND.
Comments
P_NEAR returns the hits (locations) for each
term or subquery that are within the specified number of words
of each other.
Note that while we say, "within n
words," what counts as a word is specified by the indexing
process. Each call to prIndexWord increments the word count.
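As a rough illustration of the proximity test, the sketch below treats each word's position as the value of the word count at indexing time and accepts a combination of hits when the difference between the highest and lowest position is at most the maximum distance. That distance rule is an assumption, and near_hits and the sample text are hypothetical; the sketch does not call prIndexWord or any engine API.

```python
from itertools import product


def near_hits(positions_per_term, max_distance):
    """Return position tuples (one position per term) whose span fits.

    Assumption: terms are "near" when max position minus min position
    is less than or equal to max_distance.
    """
    hits = []
    for combo in product(*positions_per_term):
        if max(combo) - min(combo) <= max_distance:
            hits.append(combo)
    return hits


# Simulate an indexer that increments the word count once per word.
doc = "the blue sky over a green field and a blue green sea".split()
positions = {}
for count, word in enumerate(doc):
    positions.setdefault(word, []).append(count)

# Hits for 'blue' and 'green' within 3 words of each other.
print(near_hits([positions["blue"], positions["green"]], 3))
```

In this sample, only the adjacent "blue green" pair near the end satisfies a distance of 3; the earlier "blue" and "green" are 4 positions apart and are not returned.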
Example
P_NEAR(2.5, 3, 'blue', 'green');
See Also
near, r_near, p_near,
v_near, ordered_near,
r_ordered_near, p_ordered_near,
v_ordered_near