Language Detection
The language detection model identifies which natural language the input text is in.
For example, language detection can make customer support interactions faster and more personal. A customer service chatbot can detect the language of a customer's input text and respond in that language. If a customer needs help with a product, the chatbot server can serve the product manual in the matching language, or transfer the customer to the call center for that language.
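The routing idea in the chatbot scenario can be sketched as a lookup keyed on the detected language code. This is only an illustration: the routing table, the file names, the queue names, and the `route_customer` function are hypothetical and are not part of the service.

```python
# Hypothetical routing table: detected language code -> support resources.
# The manual file names and queue names are made up for illustration.
SUPPORT_ROUTES = {
    "en": {"manual": "manual_en.pdf", "queue": "call-center-en"},
    "es": {"manual": "manual_es.pdf", "queue": "call-center-es"},
    "fr": {"manual": "manual_fr.pdf", "queue": "call-center-fr"},
}

# Assumed fallback when the detected language has no dedicated route.
DEFAULT_LANGUAGE = "en"


def route_customer(detected_code: str) -> dict:
    """Return the manual and call-center queue for a detected language code."""
    return SUPPORT_ROUTES.get(detected_code, SUPPORT_ROUTES[DEFAULT_LANGUAGE])


print(route_customer("es"))
print(route_customer("yo"))  # no dedicated route, so the default is used
```

Falling back to a default language is one possible design choice; a real deployment might instead ask the customer to pick a language.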
Supported Languages
Language Code | Language |
---|---|
af | Afrikaans |
sq | Albanian |
am | Amharic |
ar | Arabic |
hy | Armenian |
as | Assamese |
az | Azerbaijani |
ba | Bashkir |
eu | Basque |
be | Belarusian |
bn | Bengali |
ber | Berber |
bs | Bosnian |
bg | Bulgarian |
my | Burmese |
ca | Catalan |
ceb | Cebuano |
km | Central Khmer |
ce | Chechen |
zh-CN | Chinese (Simplified) |
zh-TW | Chinese (Traditional) |
cv | Chuvash |
co | Corsican |
hr | Croatian |
cs | Czech |
da | Danish |
dv | Dhivehi |
nl | Dutch |
mhr | Eastern Mari |
en | English |
eo | Esperanto |
et | Estonian |
fi | Finnish |
fr | French |
gl | Galician |
ka | Georgian |
de | German |
el | Greek |
gu | Gujarati |
ht | Haitian |
ha | Hausa |
he | Hebrew |
hi | Hindi |
hu | Hungarian |
is | Icelandic |
io | Ido |
ig | Igbo |
ilo | Iloko |
id | Indonesian |
ga | Irish |
it | Italian |
ja | Japanese |
jv | Javanese |
kab | Kabyle |
kn | Kannada |
kk | Kazakh |
ky | Kirghiz / Kyrgyz |
ko | Korean |
ku | Kurdish |
lo | Lao |
la | Latin |
lv | Latvian |
lt | Lithuanian |
jbo | Lojban |
lb | Luxembourgish |
mk | Macedonian |
mg | Malagasy |
ms | Malay |
ml | Malayalam |
mt | Maltese |
mr | Marathi |
mn | Mongolian |
ne | Nepali |
no | Norwegian (Bokmal) |
nn | Norwegian (Nynorsk) |
or | Oriya |
fa | Persian |
pl | Polish |
pt | Portuguese |
pa | Punjabi/Panjabi |
ps | Pushto |
qu | Quechua |
ro | Romanian |
ru | Russian |
sr | Serbian |
sd | Sindhi |
si | Sinhala |
sk | Slovak |
sl | Slovenian |
so | Somali |
es | Spanish |
su | Sundanese |
sw | Swahili |
sv | Swedish |
tl | Tagalog |
tg | Tajik |
ta | Tamil |
tt | Tatar |
te | Telugu |
th | Thai |
bo | Tibetan |
tr | Turkish |
tk | Turkmen |
ug | Uighur |
uk | Ukrainian |
ur | Urdu |
uz | Uzbek |
vi | Vietnamese |
war | Waray |
cy | Welsh |
fy | Western Frisian |
sah | Yakut |
yi | Yiddish |
yo | Yoruba |
Examples
Input Text | Language and Scores |
---|---|
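| OCI recently added new services to existing compliance program including SOC, HIPAA, and ISO to enable our customers to solve their use cases. We also released new white papers and guidance documents related to Object Storage, the Australian Prudential Regulation Authority (APRA), and the Central Bank of Brazil. These resources help regulated customers better understand how OCI supports their regional and industry-specific compliance requirements. Not only are we expanding our number of compliance offerings and regulatory alignments, we continue to add regions and services at a faster clip. | English 0.9999 |
| | Russian 0.9999 |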
The JSON for the first example is:
- Sample Request:
  POST https://<region-url>/20210101/actions/batchDetectDominantLanguage
- API Request format:
  { "documents": [ { "key": "doc1", "text": "OCI recently added new services to existing compliance program including SOC, HIPAA, and ISO to enable our customers to solve their use cases. We also released new white papers and guidance documents related to Object Storage, the Australian Prudential Regulation Authority (APRA), and the Central Bank of Brazil. These resources help regulated customers better understand how OCI supports their regional and industry-specific compliance requirements. Not only are we expanding our number of compliance offerings and regulatory alignments, we continue to add regions and services at a faster clip." } ] }
- Response JSON:
  { "documents": [ { "key": "doc1", "languages": [ { "code": "en", "name": "English", "score": 0.9999840921009815 } ] } ], "errors": [] }
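As a rough sketch of how a client might work with these payloads, the Python below builds a request body in the format shown and extracts the top-scoring language from a response. It makes no network call and does not use any SDK; the helper names `build_request` and `dominant_languages` are hypothetical, and the sample input text is made up. The sample response copies the structure of the response shown above, with the document key assumed to echo the request key.

```python
import json
from typing import Dict, Tuple


def build_request(texts: Dict[str, str]) -> dict:
    """Build a request body with one entry per document key."""
    return {"documents": [{"key": key, "text": text} for key, text in texts.items()]}


def dominant_languages(response: dict) -> Dict[str, Tuple[str, float]]:
    """Map each document key to the (code, score) of its top-scoring language."""
    results = {}
    for doc in response.get("documents", []):
        languages = doc.get("languages", [])
        if not languages:
            continue  # skip documents with no detected language
        top = max(languages, key=lambda lang: lang["score"])
        results[doc["key"]] = (top["code"], top["score"])
    return results


# Hypothetical input text, used only to show the request shape.
body = build_request({"doc1": "Hello, how can I help you today?"})
print(json.dumps(body, indent=2))

# Sample response with the same structure as the response JSON above.
sample_response = json.loads(
    '{"documents": [{"key": "doc1", "languages": '
    '[{"code": "en", "name": "English", "score": 0.9999840921009815}]}], '
    '"errors": []}'
)
print(dominant_languages(sample_response))
```

Taking the maximum score is a defensive choice in case a response lists more than one language; with a single entry per document, it simply returns that entry.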
Limitations
- Only one language is returned. If the input is multilingual, the dominant language is returned.