Profiling Engine SDK
Operator Reference

P_ATLEAST

Name

P_ATLEAST

Synopsis

Returns records in which any of the listed terms or subqueries is present with at least the specified weight.

Arguments

P-Value
Minimum required weight
List of terms and subqueries

Ranking Scheme

Ranks using the p-norm method. Conceptually, you can treat each term as a distinct direction in n-space (where n is the number of terms), with the term's weight as the distance along that direction. The p-value (the exponent p in the p-norm formula) controls how closely the final calculation resembles either the maximum weight or the total distance of all vectors in n-space. For p = 2, the p-norm is identical to a vector space model. As p approaches infinity, the returned weight approaches the largest weight among the terms.

Picking an appropriate p-value can be tricky and usually comes down to experimentation. A simplified way of thinking about the calculation is that it varies between a kind of average weight and the maximum weight. Generally, a p-value of 2.5 or 3 is effective.
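To get a feel for how the p-value shifts the result, the following is a minimal sketch in Python that uses the textbook p-norm, (sum of w^p)^(1/p). This is an assumption for illustration only: the engine's exact formula is not given in this section, and it may normalize the sum (for example, dividing by the number of terms, which makes low p-values behave like an average). The function name `p_norm` and the sample weights are hypothetical.

```python
import math

def p_norm(weights, p):
    """Textbook p-norm of non-negative weights: (sum of w**p) ** (1/p).

    Assumption: this is the plain mathematical p-norm, not necessarily
    the exact formula used by the Profiling Engine.
    """
    if math.isinf(p):
        return max(weights)  # limiting case as p grows without bound
    return sum(w ** p for w in weights) ** (1.0 / p)

# Made-up term weights for a single record.
weights = [0.8, 0.6, 0.3]

# Larger p-values pull the result toward the largest weight (0.8 here).
for p in (1, 2, 2.5, 3, 10, math.inf):
    print(f"p = {p}: {p_norm(weights, p):.4f}")
```

Running the loop shows the value shrinking toward the largest single weight as p grows, which is one way to experiment before settling on a p-value.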

The p-norm weighting method is one of the most efficient methods available, especially if you've been careful in choosing how you weight your terms.

The only terms included in the p-norm calculation for P_ATLEAST are those found in each record with the required weight. Thus, if you've indexed several documents as separate records, the same term can carry a different weight in each one.
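The filtering step described above can be sketched as follows. This is an illustrative Python model, not the SDK's API: the function `p_atleast_score`, the dictionary of per-record weights, and the use of the plain p-norm are all assumptions made for the example.

```python
import math

def p_atleast_score(record_weights, terms, p, min_weight):
    """Hypothetical model of P_ATLEAST scoring for one record.

    record_weights: dict mapping term -> weight in this record (made up).
    Only terms whose weight meets min_weight take part in the p-norm.
    Returns None when no term qualifies (the record is not returned).
    """
    kept = [record_weights[t] for t in terms
            if record_weights.get(t, 0.0) >= min_weight]
    if not kept:
        return None
    # Assumption: plain p-norm; the engine's formula may differ.
    return sum(w ** p for w in kept) ** (1.0 / p)

# 'green' falls below the 0.6 threshold, so only 'blue' and 'red' count.
record = {"blue": 0.8, "green": 0.3, "red": 0.6}
score = p_atleast_score(record, ["blue", "green", "red"], p=2, min_weight=0.6)
print(score)
```

Because weights are looked up per record, the same query can keep different terms, and so produce different scores, for each record.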

Comments

P_ATLEAST requires that any returned term or subquery have a weight in the record of at least the specified weight.

Example

P_ATLEAST( 2, 0.6, 'blue', 'green', 'red' );

See Also

atleast, atmost, r_atleast, r_atmost, p_atleast, p_atmost, or, value, r_or, p_or, m_or