Profiling Engine SDK
Operator Reference

P_OR

Name

P_OR

Synopsis

Returns records where any of the terms or subqueries are present.

Arguments

P-value
List of terms and subqueries

Ranking Scheme

Ranks using the p-norm method. Conceptually, you can consider each term a unique direction in n-space (where n is the number of terms). Each term's weight represents the distance in that direction. The p-value (the exponent p in the p-norm formula) determines whether the final calculation behaves more like the maximum weight or more like the total distance of all vectors in n-space. For p = 2, the p-norm is identical to a vector space model. As p approaches infinity, the returned weight becomes equal to the largest weight among the terms.

Picking an appropriate p-value can be tricky and is often the result of experimentation. A simplified way of thinking about the calculation is that it varies from a kind of average weight to the maximum weight. Generally a p-value of 2.5 or 3 is effective.
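To get a feel for how the p-value moves the result between an average and the maximum, the following Python sketch may help. It assumes the textbook power-mean form of the p-norm OR from the extended Boolean model, ((w1^p + ... + wn^p) / n)^(1/p). The exact formula used inside the Profiling Engine is not shown here, so treat the function name p_norm_or and the sample weights as illustrative assumptions rather than the SDK's implementation.

```python
import math

def p_norm_or(weights, p):
    """Illustrative p-norm OR score (assumed power-mean form, not the SDK's code).

    weights: non-negative term weights; p: p-value, p >= 1 (math.inf allowed).
    """
    if not weights:
        return 0.0
    if math.isinf(p):
        # Limiting case: the score equals the largest weight.
        return max(weights)
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)

# Hypothetical term weights, chosen only for demonstration.
weights = [0.2, 0.5, 0.9]
for p in (1, 2, 2.5, 3, 10, 100):
    print(p, round(p_norm_or(weights, p), 4))
```

With p = 1 the function returns the plain average of the weights, and as p grows the result climbs toward the largest weight, which matches the "average to maximum" intuition described above.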

The p-norm weighting method is one of the most efficient methods available, especially if you've been careful in choosing how you weight your terms.

The only terms included in the p-norm calculation for P_OR are those found in each record. Thus if you've indexed several documents as unique records you can have different weights for each one, depending upon which terms were present.
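The per-record behavior can be sketched in Python as follows. This is a simplified model under stated assumptions: the term weights, record contents, and the helper name score_record are hypothetical, and the power-mean form of the p-norm is the textbook one rather than a confirmed description of the engine's internals.

```python
def score_record(record_terms, query_weights, p=2.5):
    """Score one record using only the query terms it actually contains.

    record_terms: set of terms found in the record.
    query_weights: dict mapping query term -> non-negative weight (hypothetical).
    Returns None when no query term is present (the record is not a hit).
    """
    present = [w for term, w in query_weights.items() if term in record_terms]
    if not present:
        return None
    # Assumed power-mean form of the p-norm over the terms that were found.
    return (sum(w ** p for w in present) / len(present)) ** (1.0 / p)

# Hypothetical weights and records, for illustration only.
query_weights = {"blue": 0.8, "green": 0.4, "red": 0.6}
records = {
    "doc1": {"blue", "red"},
    "doc2": {"green"},
    "doc3": {"yellow"},
}

for name, terms in records.items():
    print(name, score_record(terms, query_weights))
```

In this sketch each record gets its own score because a different subset of the query terms enters the calculation, and a record containing none of the terms is simply not returned.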

Comments

P_OR returns any hit for the terms or subqueries.

Example

P_OR( 'blue', 'green', 'red', 'orange', 'violet' );

See Also

any, or, value, r_or, p_or, m_or, atleast, atmost, r_atleast, r_atmost, p_atleast, p_atmost