Language Identifier SDK
For many applications,
it is important to be able to correctly identify the language
that a document or piece of text is written in. The Lextek Language
Identifier enables you to do this. Since some languages may be
written in several character encodings, the Lextek Language Identifier
also automatically identifies which character encoding the text
was written in.
With support for 260 different languages and character encodings,
the Lextek Language Identifier gives you the ability to automatically
recognize more languages and encodings than any other language
identifier available.
We are adding more languages all the time and work closely with
our customers to ensure that their language recognition needs
are fully supported.
Uses for the Lextek Language Identifier
Because of the growth of
the Internet, we now live in a worldwide society, communicating
and doing business with people who use a wide variety of languages.
Today's software therefore needs to be able to automatically
identify what language a piece of text is written in. A few of
the many uses for a language identifier are as follows:
- Natural language processing
- Multilingual spell checking
- E-mail routing and filtering
- Content-based and language-specific web crawlers and search engines
- Information retrieval
- Text mining applications
- Document filtering systems
- Translation service bureaus
- Identification of the language or encoding of WWW pages
- Anywhere you might need to work with more than one language seamlessly
- Working in conjunction with stemmers or morphological analyzers
- Knowledge management systems
The Lextek Language Identifier
doesn't just support more languages than any other language identifier.
It also has impressive recognition accuracy. Documents do not
always provide much text to work with. The Lextek Language
Identifier's high sensitivity still lets it make good estimates
of the correct language for these short documents. It can provide
estimates with as few as a dozen characters. As the document
size goes up, so does the accuracy. It takes little text to achieve
an almost 100% recognition rate.
The Lextek Language Identifier
has been designed from the ground up to recognize languages quickly
and efficiently. It utilizes a highly optimized engine. We spent the
time to optimize the language identifier so that your application
can spend its time doing what it does best.
Easy To Use
Language identification should be a simple job. Unlike some
language identifiers, we have kept ours as simple as it should be.
Adding automatic language identification to your applications is
quick and straightforward. It involves only the addition of a few
calls to the Language Identifier SDK.
Simply start up the language
identification library, open the language modules you are interested
in, and pass in some text to be recognized. The Lextek Language
Identifier will tell you which language it is. It is as simple
as that.
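The three steps described above (start up, open language modules, pass in text) can be illustrated with a small toy sketch. This is not the Lextek API, whose actual calls are documented in the SDK manual. The class `ToyIdentifier`, its methods `open_module` and `identify`, and the character-trigram matching are all hypothetical stand-ins, chosen only to show the shape of the workflow.

```python
from collections import Counter


def trigrams(text):
    """Count overlapping character trigrams in lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


class ToyIdentifier:
    """Hypothetical stand-in for a language identifier; NOT the Lextek API."""

    def __init__(self):
        # Step 1: "start up" the library with no language modules loaded.
        self.modules = {}

    def open_module(self, language, sample_text):
        # Step 2: "open" a language module. Here a module is just a
        # trigram profile built from a short sample (an assumption).
        self.modules[language] = trigrams(sample_text)

    def identify(self, text):
        # Step 3: pass in text and return the best-matching language.
        if not self.modules:
            raise RuntimeError("no language modules are open")
        query = trigrams(text)

        def overlap(profile):
            # Shared trigram mass between the query and a module profile.
            return sum(min(count, profile[gram]) for gram, count in query.items())

        return max(self.modules, key=lambda lang: overlap(self.modules[lang]))


if __name__ == "__main__":
    identifier = ToyIdentifier()
    identifier.open_module(
        "english",
        "the quick brown fox jumps over the lazy dog and the cat is in the house",
    )
    identifier.open_module(
        "spanish",
        "el rápido zorro marrón salta sobre el perro perezoso y el gato está en la casa",
    )
    print(identifier.identify("the dog is in the house"))
```

A real identifier would ship trained language modules and handle character encodings. The sketch only mirrors the order of calls an application would make.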
Currently, the language
identifier is available on Windows, Linux, Mac OS X, Mac OS 9, and
Solaris. We designed the language identifier's code to be
cross-platform, so we can port it to any additional
platform(s) you may need. In addition, support is provided for
a variety of different programming environments and compilers.
For further information
about the Lextek Language Identifier SDK or for an evaluation
copy, please contact firstname.lastname@example.org.
Please specify which operating system and compiler you plan
to use for your evaluation.
In addition, the manual
is available at: http://www.lextek.com/manuals/langid/
and a free end user application is available from: http://www.languageidentifier.com/
1051 E. Fir Ave
Provo, UT 84604
United States of America