Named Entity Recognition (NER) detects named entities in text.
The NER model uses natural language processing to find a variety of named entities. For each entity extracted, NER also returns the location of the entity extracted (offset and length), and a confidence score, which is a value 0 to 1.
Supported Languages for Input Text
English
Spanish
Use Cases
You could use the NER endpoint effectively in these scenarios:
Classifying content for news providers
It can be difficult to classify and categorize news article content. The NER model can automatically scan articles to identify the major people, organizations, and places in them. The extracted entities can be saved as tags with the related articles. Knowing the relevant tags for each article helps with automatically categorizing the articles in defined hierarchies, and content discovery.
Customer support
Recognizing relevant entities in customer complaints and feedback,
product specifications, department details, or company branch details,
helps to classify the feedback appropriately. The entities can then be
forwarded to the person responsible for the identified product.
Similarly, there could be feedback tweets where you can categorize them
all based on their locations, and the products mentioned.
Efficient search algorithms
You could use NER to extract entities that are then searched against the
query, instead of searching for a query across the millions of articles
and websites online. When run on articles, all the relevant entities
associated with each article are extracted and stored separately. This
separation could speed up the search process considerably. The search
term is only matched with a small list of entities in each article,
leading to quick and efficient searches.
It can be used for searching content from millions of research papers,
Wikipedia articles, blogs, articles, and so on.
Content recommendations
Extracting entities from a particular article, and recommending the other
articles that have the most similar entities mentioned in them is
possible with NER. For example, it can be used effectively to develop
content recommendations for a media industry client. It enables the
extraction of the entities associated with historical content or
previous activities. NER compares them with the label assigned to other
unseen content to filter relevant entities.
Automatically summarizing job candidates
The NER model could facilitate the evaluation of job candidates, by
simplifying the effort required to shortlist candidates with numerous
applications. Recruiters could filter and categorize them based on
identified entities like location, college degrees, employers, skills,
designations, certifications, and patents.
Supported Entities 🔗
The following table describes the different entities that NER can extract. The entity
type and subtype depends on the API that you call
(detectDominantLanguageEntities or
batchDetectDominantLanguageEntities).
Note
To maintain backward compatibility, the
detectDominantLanguageEntities wasn't modified when we
introduced the concept of subtype. We recommend that you use the
batchDetectDominantLanguageEntities endpoint because the
service uses types and subtypes. The isPii property was dropped
to introduce the batching API so you can compute it with the supported entity
types as in the following table.
Entity (Full Name)
Entity Type (In Prediction)
Entity Subtype (In prediction)
Single Record API / Batch API (if blank, both APIs are
consistent)
Is PII
Description
DATE
DATE
Single record
X
Absolute or relative dates, periods, and date range.
Examples:
"10th of June",
"third Friday in August"
"the first week of March"
DATETIME
DATE
Batch
EMAIL
EMAIL
√
EVENT
EVENT
Χ
Named hurricanes, sports events, and so on.
FACILITY
FACILITY
Single record
Χ
Buildings, airports, highways, bridges, and so
on.
LOCATION
FACILITY
Batch
GEOPOLITICAL ENTITY
GPE
Single record
Χ
Countries, cities, and states.
LOCATION
GPE
Batch
IP ADDRESS
IPADDRESS
√
IP address according to IPv4 and IPv6 standards.
LANGUAGE
LANGUAGE
Χ
Any named language.
LOCATION
LOCATION
Χ
Non-GPE locations, mountain ranges, bodies of water.
CURRENCY
MONEY
Single record
X
Monetary values, including the unit.
QUANTITY
CURRENCY
Batch
NATIONALITIES,
RELIGIOUS and
POLITICAL GROUPS
NORP
Χ
Nationalities, religious or political groups.
ORGANIZATION
ORG
Χ
Companies, agencies, institutions, and so on.
PERCENTAGE
PERCENT
Single record
Χ
Percentage.
QUANTITY
PERCENTAGE
Batch
PERSON
PERSON
√
People, including fictional characters.
PHONENUMBER
PHONE_NUMBER
√
Supported phone numbers:
("GB") - United Kingdom
("AU") - Australia
("NZ") - New Zealand
("SG") - Singapore
("IN") - India
("US") - United States
PRODUCT
PRODUCT
Χ
Vehicles, tools, foods, and so on (not services).
NUMBER
QUANTITY
Single record
Χ
Measurements, as weight or distance.
QUANTITY
NUMBER
Batch
X
TIME
TIME
Single record
Χ
Anything less than 24 hours (time, duration, and so
on).
DATETIME
TIME
Batch
URL
URL
√
URL.
Examples 🔗
Input Text
Entities and Scores
Red Bull Racing Honda, the four-time Formula-1 World
Champion team, has chosen Oracle Cloud Infrastructure
(OCI) as their infrastructure partner.
Red Bull Racing Honda [ORG] 1.0000
four-time [QUANTITY/NUMBER] 1.0000
Formula-1 World [EVENT] 0.9705
Oracle Cloud Infrastructure (OCI [ORG] 0.9811
OCI recently added new services to the existing
compliance program including SOC, HIPAA, and ISO, to enable our customers
to solve their use cases. We also released new technical papers and
guidance documents related to Object Storage, the Australian Prudential
Regulation Authority (APRA), and the Central Bank of Brazil. These
resources help regulated customers better understand how OCI
supports their regional and industry-specific compliance requirements.
Not only are we expanding our number of compliance offerings and
regulatory alignments, we continue to add regions and services at
a faster rate.
OCI [ORG] 1.0000
SOC [ORG] 1.0000
HIPAA [ORG] 1.0000
ISO [ORG] 1.0000
Australian Prudential Regulation Authority [ORG] 1.0000
Central Bank of Brazil [ORG] 0.9998
OCI [ORG] 1.0000
The JSON for the first example is:
Sample Request
Copy
POST https://<region-url>/20210101/actions/batchDetectLanguageEntities
API Request format:
Copy
"{
"documents": [
{ "key": "doc1", "text": " Red Bull Racing Honda, the four-time Formula-1 World Champion team, has chosen Oracle Cloud Infrastructure (OCI) as their infrastructure partner." }
]
}"
Sometimes, entities might not be separated or combined as you expect.
NER uses the context of the sentence to identify entities. If the context isn't present in the text processed, entities might not be extracted as you expect.
Malformed text (structure and semantics) might reduce the performance.
Age isn't a separate entity so age-related periods might be identified as a
date entity.