
Have you ever come across documents or websites and not known what language they were written in? The Lextek Language Identifier can identify not only the language of a text but also its character encoding and which other languages are most similar. You will no longer have to wonder which languages you are working with or looking at.

Free Download! Free For Non-Commercial Use!

The Lextek Language Identifier is free for personal use on your non-commercial projects. Please feel free to share it with your friends and those you work with. The Language Identifier is free to download and free to share with others for non-commercial purposes.

Easy To Use

An easy-to-use interface lets you quickly identify the language of your documents. Simply paste the text in question into the Language Identifier's text window and press the Identify button. The Language Identifier will analyze the text you have provided and tell you which language matches and which other languages most closely match. Quick, simple, and efficient.
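To give a feel for what "best match plus closest other languages" can mean, here is a minimal sketch of one common approach: comparing character trigram counts against per-language profiles. This is not Lextek's method, which this page does not describe. The sample sentences, the `SAMPLES` and `PROFILES` tables, and the `rank_languages` function are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Tiny made-up samples; a real identifier would be trained on far more text.
SAMPLES = {
    "English": "the quick brown fox jumps over the lazy dog and this is "
               "the language that we are working with",
    "German": "der schnelle braune fuchs springt über den faulen hund und "
              "das ist die sprache mit der wir arbeiten",
    "Spanish": "el rápido zorro marrón salta sobre el perro perezoso y este "
               "es el idioma con el que trabajamos",
}


def trigrams(text):
    """Count overlapping character trigrams in lowercased, space-padded text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# One trigram profile per language, built from the samples above.
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def rank_languages(text):
    """Return (language, score) pairs, best match first.

    The score is the cosine similarity between trigram count vectors.
    """
    query = trigrams(text)
    query_sq = sum(c * c for c in query.values())
    results = []
    for lang, profile in PROFILES.items():
        dot = sum(count * profile[gram] for gram, count in query.items())
        norm = (query_sq * sum(c * c for c in profile.values())) ** 0.5
        results.append((lang, dot / norm if norm else 0.0))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_languages("das ist der hund")` returns all three languages ordered by similarity, so the first entry plays the role of the matching language and the rest are the closest alternatives. With profiles this small, the scores are only meaningful for text that resembles the samples.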

More Languages!

The Lextek Language Identifier offers more language and encoding modules than any other language identifier. Currently, there are 260 language and encoding modules for you to use in your analysis. We challenge you to find any other language identifier that supports such a wide range of languages and encodings. More languages are being added all the time, and if you need a custom language module built, please contact us. (This can often be done as a free service.)

Identify the Encoding / Character Set

Not only is the language identifier able to identify which language it is analyzing, it is also capable of identifying the character encoding. This is important because of the wide variety of character encodings that are currently in use. Even Unicode -- the "master" character encoding standard -- has a variety of different encodings and formats (UTF-7, UTF-8, UTF-16, etc.) that are in use.
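As a small illustration of why the Unicode encodings need to be told apart, the sketch below guesses among them using byte order marks (BOMs) and a UTF-8 validity check. It is an assumption-laden example, not how the Lextek Language Identifier works: the function name `guess_unicode_encoding` is hypothetical, and the check handles neither UTF-7 nor UTF-16/UTF-32 text that lacks a BOM.

```python
import codecs

# Byte order marks, longest first so that the UTF-32 LE mark (FF FE 00 00)
# is not mistaken for the UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_unicode_encoding(data: bytes):
    """Return a likely encoding name, or None if the bytes are not valid UTF-8."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # probably a legacy 8-bit encoding, or BOM-less UTF-16/32
    # Pure 7-bit data is valid ASCII as well as valid UTF-8.
    return "ascii" if all(byte < 0x80 for byte in data) else "utf-8"
```

A `None` result is where a statistical identifier becomes necessary: legacy 8-bit encodings carry no marker, so they can only be distinguished by analyzing the byte patterns of the text itself.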

  Download the Lextek Language Identifier Now!


Software Developers Version Available

If you are a software developer and are writing software that must efficiently deal with multiple languages, the Lextek Language Identifier development kit will make your life much easier. The easy-to-use API and interface will make integration into your projects a simple and straightforward process. In a world where software must deal efficiently with multiple languages and character encodings, the Lextek Language Identifier will help your software know how to treat the text and documents it must deal with.

Sample Language Identifier Applications:

  • Identification of the language and encoding of WWW pages
  • Language specific web crawling applications
  • Information retrieval applications
  • Natural language processing applications
  • Text Mining
  • Translation service bureau software
  • Spell checking software
  • For use in conjunction with stemming or morphological analyzers
  • Knowledge management systems