Profiling Engine SDK
The Lextek Profiling Engine SDK consists of two parts: the API and the query language. The API (application programming interface) is the most important part of the SDK, because it is what you use to integrate the SDK into your program. The API consists of a few function calls that fall into three main categories. The first category deals with creating and deleting a profiling object. You can think of this object as the profiler itself. You use this object to control the profiling process by passing it to all other functions. The second category contains the calls that index your documents. These functions are what put information into the profiler. The final category is for executing queries and then manipulating the results of those queries. This manual lists all of the functions by category and gives a brief explanation of each category. For further information, consult the documentation for each individual function or the Tutorial for using the Profiling Engine SDK.
Functions for Creating and Deleting the Profiling Object
The main component of the Profiling Engine is a profiling object. This object is a variable that is declared to be of type ProfilingEngineT. The object is used to coordinate all your calls to the Profiling Engine and is passed to nearly every function. Before you can use the Profiling Engine you must create one of these objects. When you have finished using the Profiling Engine you must delete the object.
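The create, use, delete life cycle described here can be pictured with a small stand-alone sketch. The names ToyEngine, ToyEngineT, toy_create and toy_delete are hypothetical and exist only for illustration; they are not the SDK's own create and delete calls, whose names and signatures are given in the individual function documentation.

```c
#include <stdlib.h>

/* Hypothetical toy engine state; NOT the SDK's ProfilingEngineT. */
typedef struct ToyEngine {
    int current_record;  /* record that would receive new words */
    size_t word_count;   /* number of words indexed so far */
} ToyEngine;

/* Handle type passed to every toy call, mirroring how one object
   coordinates all calls to the profiler. */
typedef ToyEngine *ToyEngineT;

/* Create the object before any other call. Returns NULL on failure. */
static ToyEngineT toy_create(void) {
    return calloc(1, sizeof(ToyEngine));  /* zero-initialized state */
}

/* Delete the object when finished and clear the caller's handle
   so that it cannot be reused by accident. */
static void toy_delete(ToyEngineT *engine) {
    if (engine != NULL && *engine != NULL) {
        free(*engine);
        *engine = NULL;
    }
}
```

Taking the handle by address in toy_delete is a design choice of this sketch only; it lets the function reset the caller's variable after freeing it.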
Functions for Indexing Documents
To index a document you call prIndexWord repeatedly, once for each word. You determine which words are sent to the indexer. This lets you eliminate stop-words or even index only those words that you use in your queries, and it gives you as much control over the indexing process as you wish. The Profiling Engine breaks every index up into a series of records. Records are the default area over which all query operators work. For instance, you can find two words that occur in the same record. To start adding words to a new record you simply call prIncrementRecord. Where you put record breaks is at your discretion. While the most common use of the Profiling Engine is to start a new record with each new document, many people break on document sections or paragraphs. When you have finished with a collection of records you can reset the profiler and start adding new documents.
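The indexing pattern described above (send only the words you choose, then break to a new record) can be sketched with a toy in-memory index. Everything prefixed with toy_ or Toy is a hypothetical stand-in written for this illustration: toy_index_word and toy_increment_record play the roles of prIndexWord and prIncrementRecord, but their signatures are assumptions and are not taken from the SDK. The stop-word list and the choice to number the first record 0 are likewise arbitrary.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_WORDS 256

/* Hypothetical stand-in for the profiling object. */
typedef struct {
    const char *words[TOY_MAX_WORDS];  /* indexed words, in order */
    int record_of[TOY_MAX_WORDS];      /* record number of each word */
    size_t count;                      /* number of indexed words */
    int current_record;                /* record that receives new words */
} ToyProfiler;

static void toy_init(ToyProfiler *p) {
    p->count = 0;
    p->current_record = 0;
}

/* Stand-in for prIndexWord: add one word to the current record.
   Returns 0 on success, -1 if the toy index is full. */
static int toy_index_word(ToyProfiler *p, const char *word) {
    if (p->count >= TOY_MAX_WORDS) return -1;
    p->words[p->count] = word;
    p->record_of[p->count] = p->current_record;
    p->count++;
    return 0;
}

/* Stand-in for prIncrementRecord: later words go to a new record. */
static void toy_increment_record(ToyProfiler *p) {
    p->current_record++;
}

/* The caller, not the engine, decides which words are indexed. */
static int is_stop_word(const char *word) {
    static const char *stop[] = {"the", "a", "of", "and"};
    for (size_t i = 0; i < sizeof stop / sizeof stop[0]; i++) {
        if (strcmp(word, stop[i]) == 0) return 1;
    }
    return 0;
}

/* Index one document as one record, skipping stop-words, then
   break so that the next document starts a new record. */
static void toy_index_document(ToyProfiler *p, const char **words, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (!is_stop_word(words[i])) {
            toy_index_word(p, words[i]);
        }
    }
    toy_increment_record(p);
}
```

To break on paragraphs instead of whole documents, the same sketch would call toy_increment_record at each paragraph boundary rather than once per document.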
Functions for Analyzing Documents
After you've added words to the index you can begin to analyze your results. You perform a query by passing a string to prProcessQuery. This returns a list of "hits", called a vector, in an object of type OnixQueryVectorT. Each hit consists of the location of the hit (its word number and record number) and the weight or rank of that hit. You can iterate through all the hits in a vector with a few simple function calls. Because queries can define persistent named queries called functions, there is also a function for clearing all the information in the query processor. The typical use of the Profiling Engine is to load large queries that define categories prior to indexing any document. As you load documents you then process queries that make use of those categories, in effect testing each document against these preloaded queries. This is rather different from most indexes, where you load all the documents first and then test queries against the documents.
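The "categories first, documents second" workflow can be illustrated with a toy classifier. This is only a sketch under simplifying assumptions: a category here is a single keyword rather than a query in the SDK's query language, every hit has a weight of 1.0, and all names (ToyHit, ToyCategory, toy_process_query, toy_classify) are hypothetical. None of this reproduces prProcessQuery or OnixQueryVectorT; it only mirrors the idea that each hit carries a record number, a word number and a weight, and that each incoming document is tested against queries loaded beforehand.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HITS 64

/* One hit: where the match occurred and how strongly it ranks. */
typedef struct {
    int record;     /* record number of the hit */
    int word;       /* word number of the hit within the document */
    double weight;  /* rank of the hit; always 1.0 in this toy */
} ToyHit;

/* Hypothetical preloaded category: a name and one keyword. */
typedef struct {
    const char *name;
    const char *keyword;
} ToyCategory;

/* Toy query: collect a hit for each occurrence of the category
   keyword in one document. Returns the number of hits stored. */
static size_t toy_process_query(const ToyCategory *cat,
                                const char **words, size_t n, int record,
                                ToyHit *hits, size_t max_hits) {
    size_t found = 0;
    for (size_t i = 0; i < n && found < max_hits; i++) {
        if (strcmp(words[i], cat->keyword) == 0) {
            hits[found].record = record;
            hits[found].word = (int)i;
            hits[found].weight = 1.0;
            found++;
        }
    }
    return found;
}

/* Test one document against every preloaded category and return
   the index of the category with the highest total hit weight,
   or -1 if no category matched. */
static int toy_classify(const ToyCategory *cats, size_t ncats,
                        const char **words, size_t n, int record) {
    ToyHit hits[TOY_MAX_HITS];
    int best = -1;
    double best_score = 0.0;
    for (size_t c = 0; c < ncats; c++) {
        size_t nhits = toy_process_query(&cats[c], words, n, record,
                                         hits, TOY_MAX_HITS);
        double score = 0.0;
        for (size_t h = 0; h < nhits; h++) {
            score += hits[h].weight;  /* iterate through the hit vector */
        }
        if (score > best_score) {
            best_score = score;
            best = (int)c;
        }
    }
    return best;
}
```

In this sketch the categories are fixed before any document arrives and each document is classified as it is loaded, which is the reverse of the usual pattern of building a full index and querying it afterwards.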