Language Detection

The language detection model identifies which natural language the input text is in.

For example, language detection can make customer support interactions faster and more personal. A customer service chatbot can detect the language of a customer's input text and respond in that language. If a customer needs help with a product, the chatbot server can retrieve the product manual in the corresponding language, or transfer the customer to the call center for that language.
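
The routing described above can be sketched as follows. This is a minimal illustration, not part of the service: the `MANUALS` and `CALL_CENTERS` tables, the URLs, the queue names, and the `route_customer` function are all hypothetical, and the language code is assumed to come from a detection response.

```python
# Hypothetical lookup tables; a real deployment would load these from configuration.
MANUALS = {
    "en": "https://example.com/manuals/en/product.pdf",
    "fr": "https://example.com/manuals/fr/product.pdf",
}
CALL_CENTERS = {"en": "queue-english", "fr": "queue-french"}
DEFAULT_LANGUAGE = "en"


def route_customer(language_code: str, wants_agent: bool) -> dict:
    """Pick a manual or a call-center queue for a detected language code."""
    # Fall back to the default language when no resource exists for the detected one.
    code = language_code if language_code in MANUALS else DEFAULT_LANGUAGE
    if wants_agent:
        return {"action": "transfer", "target": CALL_CENTERS[code]}
    return {"action": "send_manual", "target": MANUALS[code]}
```

Falling back to a default language keeps the chatbot usable when the detected language has no manual or call center of its own.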

Supported Languages

Language Code Language
af Afrikaans
sq Albanian
am Amharic
ar Arabic
hy Armenian
as Assamese
az Azerbaijani
ba Bashkir
eu Basque
be Belarusian
bn Bengali
ber Berber
bs Bosnian
bg Bulgarian
my Burmese
ca Catalan
ceb Cebuano
km Central Khmer
ce Chechen
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
cv Chuvash
co Corsican
hr Croatian
cs Czech
da Danish
dv Dhivehi
nl Dutch
mhr Eastern Mari
en English
eo Esperanto
et Estonian
fi Finnish
fr French
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian
ha Hausa
he Hebrew
hi Hindi
hu Hungarian
is Icelandic
io Ido
ig Igbo
ilo Iloko
id Indonesian
ga Irish
it Italian
ja Japanese
jv Javanese
kab Kabyle
kn Kannada
kk Kazakh
ky Kirghiz / Kyrgyz
ko Korean
ku Kurdish
lo Lao
la Latin
lv Latvian
lt Lithuanian
jbo Lojban
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mr Marathi
mn Mongolian
ne Nepali
no Norwegian (Bokmal)
nn Norwegian (Nynorsk)
or Oriya
fa Persian
pl Polish
pt Portuguese
pa Punjabi/Panjabi
ps Pushto
qu Quechua
ro Romanian
ru Russian
sr Serbian
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tl Tagalog
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
bo Tibetan
tr Turkish
tk Turkmen
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
war Waray
cy Welsh
fy Western Frisian
sah Yakut
yi Yiddish
yo Yoruba

Examples

Input text:
OCI recently added new services to the existing
compliance program including SOC, HIPAA, and ISO, to enable
our customers to solve their use cases. We also released new
white papers and guidance documents related to Object Storage,
the Australian Prudential Regulation Authority (APRA), and
the Central Bank of Brazil. These resources help regulated
customers better understand how OCI supports their regional
and industry-specific compliance requirements. Not only are
we expanding our number of compliance offerings and regulatory
alignments, we continue to add regions and services at
a faster rate.
Language and score: English 0.9999

Input text:
«Нос» - сатирический рассказ Николая
Гоголя, написанный во время его жизни в
Санкт-Петербурге. В это время в
творчестве Гоголя основное внимание
уделялось сюрреализму и гротеску с
романтическим оттенком; Предлагаемый
здесь рассказ «Нос» является примером.
Рассказ Николая Гоголя «Нос» был
написан между 1832 и 1833 годами и
завершен в 1834 году, подвергался
различным пересмотрам и модификациям
Н. Гоголем, в основном из-за
непрерывного вмешательства цензуры.
Language and score: Russian 0.9999

The JSON for the first example is:

Sample Request
POST https://<region-url>/20210101/actions/batchDetectDominantLanguage
API Request format:
{
  "documents": [
    {
      "key": "doc1",
      "text": "OCI recently added new services to the existing compliance program including SOC, HIPAA, and ISO, to enable our customers to solve their use cases. We also released new white papers and guidance documents related to Object Storage, the Australian Prudential Regulation Authority (APRA), and the Central Bank of Brazil. These resources help regulated customers better understand how OCI supports their regional and industry-specific compliance requirements. Not only are we expanding our number of compliance offerings and regulatory alignments, we continue to add regions and services at a faster rate."
    }
  ]
}
Response JSON:
{
    "documents": [
        {
            "key": "doc1",
            "languages": [
                {
                    "code": "en",
                    "name": "English",
                    "score": 0.9999840921009815
                }
            ]
        }
    ],
    "errors": []
}
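
The request and response shapes above can be handled with a few lines of Python. This sketch only builds the request body and reads a response body. It does not send the HTTP call, which requires authentication that is outside the scope of this example. The helper names `build_request` and `dominant_language` are illustrative and not part of any SDK.

```python
import json


def build_request(texts: dict) -> str:
    """Serialize {key: text} pairs into the request body format shown above."""
    documents = [{"key": key, "text": text} for key, text in texts.items()]
    # json.dumps escapes line breaks inside the text, so the body stays valid JSON.
    return json.dumps({"documents": documents})


def dominant_language(response_json: str) -> dict:
    """Map each document key to (code, name, score) from a response body."""
    response = json.loads(response_json)
    results = {}
    for document in response["documents"]:
        # Only one language is returned per document, so take the first entry.
        language = document["languages"][0]
        results[document["key"]] = (
            language["code"],
            language["name"],
            language["score"],
        )
    return results
```

Keying the results by document key makes it straightforward to match each detected language back to the text that was submitted.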

Limitations

  • Only one language is returned. In cases where the input is multilingual, the dominant language is returned.
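
One possible way to work within this limitation is to split multilingual input into smaller pieces and submit each piece as its own document, so that each piece gets its own dominant language. The sketch below is an assumption about how a caller might prepare such a request, not a documented feature of the service. Splitting on blank lines and the `part<N>` key scheme are arbitrary choices, and very short pieces may be detected less reliably.

```python
def split_into_documents(text: str, key_prefix: str = "part") -> dict:
    """Split text on blank lines and wrap each paragraph as a separate document."""
    # Drop empty paragraphs and surrounding whitespace before building documents.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    documents = [
        {"key": f"{key_prefix}{index}", "text": paragraph}
        for index, paragraph in enumerate(paragraphs, start=1)
    ]
    return {"documents": documents}
```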