P_ORDERED_NEAR
Name
P_ORDERED_NEAR
Synopsis
Returns records where the listed terms
or subqueries are within a specified number of words and are
in the same order as listed.
Arguments
p-value used for ranking
Maximum distance between the words
List of terms and subqueries
Ranking Scheme
Ranks using the p-norm method. Conceptually,
you can consider each term a unique direction in n-space (where
n is the number of terms). Each term's weight represents the
distance in that direction. The p-value controls whether the
final calculation behaves more like the maximum weight or more
like the total distance of all vectors in n-space.
For p = 2, the p-norm is identical to a vector space model. As
p approaches infinity, the returned weight becomes equal to the
largest weight among the terms.
Picking an appropriate p-value can be tricky
and is often the result of experimentation. A simplified way
of thinking about the calculation is that it varies from a
kind of average weight to the maximum weight. Generally, a p-value
of 2.5 or 3 is effective.
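The effect of the p-value can be seen with a small numeric sketch.
The exact formula used for ranking is not reproduced in this text, so
the sketch assumes the generalized-mean form (sum(w^p) / n)^(1/p),
which matches the description above: an average at p = 1 and a value
approaching the largest weight as p grows. The function name p_norm
and the sample weights are illustrative only.

```python
def p_norm(weights, p):
    """Assumed form of the p-norm: (sum(w**p) / n) ** (1/p).

    This is a sketch, not the engine's actual ranking code.
    """
    if not weights:
        return 0.0
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


# Two hypothetical term weights: one weak match, one strong match.
weights = [0.2, 0.9]

# As p grows, the result moves from the plain average toward the maximum.
for p in (1, 2, 2.5, 3, 10, 100):
    print(p, round(p_norm(weights, p), 4))
```

With these assumptions, p = 2.5 or 3 gives a score between the average
and the maximum, leaning toward the strongest term without ignoring the
weaker one.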
The p-norm weighting method is one
of the most efficient methods available, especially if you've
been careful in choosing how you weight your terms.
Note that the p-norm calculation
for P_ORDERED_NEAR is the same as for P_AND.
Comments
P_ORDERED_NEAR combines the features of
NEAR and PHRASE. It is like PHRASE in that the terms and subqueries
must occur in the order in which they are listed. It differs in that
they need not be adjacent. They must, however, be within the specified
number of words.
Note that while we say, "within n
words," what counts as a word is specified by the indexing
process. Each call to prIndexWord increments the word count.
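The matching rule can be sketched in a few lines. This is only an
illustration, not the engine's implementation: it assumes that word
positions are simple integer counters (one per indexed word) and that
"within n words" means the distance from the first matched term to the
last is at most n. The names word_positions and ordered_near are
hypothetical.

```python
def word_positions(words, term):
    """Return the word-count positions at which term occurs."""
    return [i for i, w in enumerate(words) if w == term]


def ordered_near(positions, max_distance):
    """Check an ordered proximity match.

    positions holds one list of word positions per term, in query order.
    Returns True if one position can be picked per term so that the
    positions strictly increase and last - first <= max_distance.
    """
    if not positions or any(not plist for plist in positions):
        return False
    for start in positions[0]:
        prev = start
        matched = True
        for plist in positions[1:]:
            # Earliest occurrence after the previous term keeps the span minimal.
            nxt = min((x for x in plist if x > prev), default=None)
            if nxt is None or nxt - start > max_distance:
                matched = False
                break
            prev = nxt
        if matched:
            return True
    return False


words = "the blue and dark green sea".split()
blue = word_positions(words, "blue")
green = word_positions(words, "green")

print(ordered_near([blue, green], 3))  # terms in listed order
print(ordered_near([green, blue], 3))  # same terms, reversed order
```

Under these assumptions the first call matches and the second does not,
because the terms appear in the text in the opposite order from the
second query.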
Example
P_ORDERED_NEAR( 2.5, 3, 'blue', 'green' );
See Also
near, r_near,
p_near, v_near, ordered_near, r_ordered_near, p_ordered_near, v_ordered_near