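Exporting a Dataset
Export Data Labeling datasets in various text and image formats, and snapshot them to JSONL files.
You can export Data Labeling datasets to an Object Storage location in your tenancy. This lets you keep versions of a dataset or use it elsewhere, for example as input to machine learning model development. The output file location is included in the export panel. After the export, the destination is available in the associated work request. It is also shown on the Dataset Details page, but only for as long as the work request exists.
For documents, you can export to a JSONL file.
For images, you can export to the following file formats: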
- JSONL
- YOLO V5
- COCO
- PASCAL VOC
For text, you can export to the following file formats:
- JSONL
- JSONL Compact Plus Content
- spaCy
- CoNLL V2003
Note
When you export text in CoNLL format, recursive and overlapping entities are ignored.
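Because overlapping entities are left out of CoNLL output, it can help to check a record's entities before exporting. The following is a minimal sketch, assuming each entity carries a `textSpan` with `offset` and `length` as in the exported text record examples on this page. The function name `find_overlapping_spans` is hypothetical, and the service's exact rule for what counts as overlapping or recursive is not stated here, so treat the interval test as an approximation.

```python
from itertools import combinations


def find_overlapping_spans(entities):
    """Return pairs of entities whose text spans share at least one character.

    Assumes each entity is a dict with a "textSpan" holding "offset" and
    "length" (an assumption based on the exported record examples).
    """
    overlaps = []
    for a, b in combinations(entities, 2):
        a_start = a["textSpan"]["offset"]
        a_end = a_start + a["textSpan"]["length"]
        b_start = b["textSpan"]["offset"]
        b_end = b_start + b["textSpan"]["length"]
        # Half-open intervals [start, end) overlap when each one
        # starts before the other one ends.
        if a_start < b_end and b_start < a_end:
            overlaps.append((a, b))
    return overlaps
```

An empty result suggests that no entity in the record would be dropped for overlapping another.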
Note
For CSV, the only option is to export to JSONL.
This task isn't available in the CLI.
This task isn't available in the API.
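Examples of Exported Document, Image, and Text Datasets
Examples of the JSON files that Data Labeling creates when a dataset is exported.
Example of an Exported Consolidated JSON File
An example of an exported consolidated JSON file, in which the dataset metadata object is followed by the record objects.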
{
"id": "ocid1.datalabelingdatasetdev.oc1.iad.amaaaaaazaehrjyag7jcbu3xnpw4dcn3tmniarzorpxbtegnipsw5oleeauq",
"compartmentId": "ocid1.compartment.oc1..aaaaaaaaihdqc5z4zq4sqt7t4c7vbwc6lbf5dr6mky2phcpvdlh7c3p5mtuq",
"displayName": "test-check",
"description": "test check",
"labelsSet": [{
"name": "location"
}, {
"name": "university"
}],
"annotationFormat": "ENTITY_EXTRACTION",
"datasetSourceDetails": {
"namespace": "idrcdhfxwqwa",
"bucket": "test-sachin-cucket"
},
"datasetFormatDetails": {
"formatType": "TEXT"
}
} {
"id": "ocid1.datalabelingrecord.oc1.iad.amaaaaaazaehrjyahykmu6hvdksayw64a3wmur7mk2366hgitlypk6u2soea",
"timeCreated": "2021-10-12 12:09:37",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "sample-text.txt"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyat64zcfbjviu3pttykthabv5jiuicva3dkv6oikstzd7q",
"timeCreated": "2021-10-12 12:16:51",
"createdBy": "ocid1.user.oc1..aaaaaaaaktqgvx2skco6bfyziwjzfjaxensoewscqbk7p44sjqyrxmz4qozq",
"entities": [{
"entityType": "TEXTSELECTION",
"labels": [{
"label_name": "university"
}],
"textSpan": {
"offset": 60,
"length": 11
}
}]
}]
}
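The consolidated example above places several JSON objects back to back, so a single `json.loads` call would fail on it. The following is a minimal sketch of one way to read such a file with the standard library. The helper names `iter_json_objects` and `split_consolidated` are hypothetical, and the sketch assumes that the first object is the dataset metadata and every later object is a record, as in the example.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON object found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace (including newlines) between objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode parses one value and returns the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def split_consolidated(text):
    """Return (dataset_metadata, records) from a consolidated export.

    Assumption: the first object is the dataset, the rest are records.
    """
    objects = list(iter_json_objects(text))
    if not objects:
        raise ValueError("no JSON objects found")
    return objects[0], objects[1:]
```

Because the parser skips whitespace between objects, it handles both one-object-per-line JSONL and the pretty-printed layout shown above.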
Example of an Exported Document Dataset JSON File
An example of an exported document dataset JSON file.
{
"id":"ocid1.datalabelingdatasetint.oc1.iad.amaaaaaaniob46iafkiyw6a4uwgrnpy4lfxjoslocap7elaj257mxh4fzuwq",
"compartmentId":"ocid1.compartment.oc1..aaaaaaaajqiw27knoagxurhzjlihw7ijnoshsu4zi2uawdn5gfexdqwvu4vq",
"displayName":"Sep6_PDF",
"labelsSet":[
{
"name":"L1"
},
{
"name":"L"
},
{
"name":"23423"
}
],
"annotationFormat":"MULTI_LABEL",
"datasetSourceDetails":{
"namespace":"idgszs0xipmn",
"bucket":"Demo-bucket"
},
"datasetFormatDetails":{"formatType":"DOCUMENT"},
"recordFiles":[
{
"namespace":"idgszs0xipmn",
"bucket":"COVID_Dataset",
"path":"Snapshotsrecords_1632479104889.jsonl"
}
]
}
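The `recordFiles` array in the dataset file points to the Object Storage objects that hold the exported records. The following is a small sketch of how those locations might be collected before downloading them. The function name `record_file_locations` is hypothetical, and the sample metadata is trimmed from the example above.

```python
import json


def record_file_locations(dataset_metadata):
    """Return (namespace, bucket, path) for each entry under "recordFiles"."""
    return [
        (entry["namespace"], entry["bucket"], entry["path"])
        for entry in dataset_metadata.get("recordFiles", [])
    ]


# Trimmed copy of the document dataset example above.
metadata = json.loads("""
{
  "displayName": "Sep6_PDF",
  "recordFiles": [
    {"namespace": "idgszs0xipmn", "bucket": "COVID_Dataset",
     "path": "Snapshotsrecords_1632479104889.jsonl"}
  ]
}
""")

for namespace, bucket, path in record_file_locations(metadata):
    print(namespace, bucket, path)
```

A dataset file without a `recordFiles` key yields an empty list rather than an error.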
Example of an Exported Image Dataset JSON File
An example of an exported image dataset JSON file.
{
"id": "ocid1...",
"compartmentId": "",
"timeCreated":2020-12-15...,
"displayName":...,
"description":...,
"labelsSet": [
{"name":"germanshepherd"},
{"name":"americanshepherd"},
{"name":"australianshepherd"},
{"name":"irishwolfhound"}
],
"annotationFormat": "IMAGE_OBJECT_SELECTION",
"datasetSourceDetails": {
"sourceType": "OBJECT_STORAGE",
"namespace": "i235o3idk",
"bucket": "mytrainingdata",
"prefix": "puppyproject/"
},
"datasetFormatDetails": {
"formatType": "IMAGE" # image requires less metadata than delimited for example
},
"recordFiles": [
{
"namespace": "i235o3idk",
"bucket": "mylabels",
"path": "puppyproject/records1.json"
}
],
"definedTags": {},
"freeformTags": {}
}
Example of an Exported Text Dataset JSON File
An example of an exported text dataset JSON file.
{
"id":"ocid1.datalabelingdatasetdev.oc1.iad.amaaaaaazaehrjyamqjx733dhxd25zxcro2nftrewq7ltj34ua2cfapzsmjq",
"compartmentId":"ocid1.compartment.oc1..aaaaaaaagzh2kii2frktoc7bcvfydpzkxr7dbn6nf6jcyrxwgzen4pi5y4zq",
"displayName":"NER DEMO DATASET UNLABELLED",
"description":"NER DEMO DATASET UNLABELLED",
"labelsSet":[
{
"name":"Person"
},
{
"name":"Organization"
},
{
"name":"Event"
},
{
"name":"Place"
}
],
"annotationFormat":"ENTITY_EXTRACTION",
"datasetSourceDetails":{
"namespace":"idrcdhfxwqwa",
"bucket":"news-articles"
},
"datasetFormatDetails":{
},
"recordFiles":[
{
"namespace":"idrcdhfxwqwa",
"bucket":"snapshots",
"path":"forReview/records_1621847577526.jsonl"
}
]
}
Example of an Exported Document Record JSON File
An example of an exported document record JSON file.
{
"id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaqgpzhscdpdcgohg5ocp3obwmjjgju6m73bmyrt4aovhq",
"timeCreated":"2021-09-06 03:40:02",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"SampleDocs-sample-pdf-file copy 98.pdf"
},
"annotations":[
{
"id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaaniob46iatjg3p6hlszxrgmsj4y76b5tndddaedm6ardkoxbtt6mq",
"timeCreated":"2021-09-06 03:42:43",
"createdBy":"ocid1.user.oc1..aaaaaaaa6ynps4htdea6fqoerfhkedp3lih2ktureqhw3hmfojde6ukf3mpa",
"entities":[
{
"entityType":"GENERIC","labels":[
{
"label_name":"23423"
}
]
}
]
}
]
}{
"id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iasb5klulgaj4djn3acsgsd3cekx3ix46ftxjdip4tu23a",
"timeCreated":"2021-09-06 03:40:02",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"SampleDocs-sample-pdf-file copy 99.pdf"
},
"annotations":[
{
"id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaaniob46iav45mlpcleqjt7cnmhyogopszi2rfnilwjhd4xyxa7irq",
"timeCreated":"2021-09-06 03:42:47",
"createdBy":"ocid1.user.oc1..aaaaaaaa6ynps4htdea6fqoerfhkedp3lih2ktureqhw3hmfojde6ukf3mpa",
"entities":[
{
"entityType":"GENERIC","labels":[
{
"label_name":"L1"
}
]
}
]
}
]
}{
"id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaxhixolkqryomyu6i4jrrmzwcckw2tmgva47suylu5rzq",
"timeCreated":"2021-09-06 03:40:02",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"SampleDocs-sample-pdf-file copy 97.pdf"
}
}{
"id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iagymrjuem42kvzilxjd5hdrr3djznrl7aajvvcr6zc6sq",
"timeCreated":"2021-09-06 03:40:02",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"SampleDocs-sample-pdf-file copy 96.pdf"
}
}{
"id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaclpccpxn5hgmplesv3mt3g6hxkfaepzv6fuy7b6he3ca",
"timeCreated":"2021-09-06 03:40:02",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"SampleDocs-sample-pdf-file copy 2.pdf"
}
}
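In the example above, labeled records carry an `annotations` array while unlabeled records omit it. The following is a minimal sketch of how one might summarize label usage across parsed records. The function name `summarize_labels` is hypothetical, and the sketch assumes labels are stored under the `label_name` key as in the document and text examples.

```python
from collections import Counter


def summarize_labels(records):
    """Return (label_counts, unlabeled_count) for a list of record dicts."""
    label_counts = Counter()
    unlabeled = 0
    for record in records:
        # Unlabeled records in the example have no "annotations" key at all.
        annotations = record.get("annotations") or []
        if not annotations:
            unlabeled += 1
            continue
        for annotation in annotations:
            for entity in annotation.get("entities", []):
                for label in entity.get("labels", []):
                    label_counts[label["label_name"]] += 1
    return label_counts, unlabeled
```

A summary like this is a quick way to check class balance before using the export as model input.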
Example of an Exported Image Record JSON File
An example of an exported image record JSON file.
{
"id": "ocid1...",
"timeCreated": 2020-12-15...,
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "filename2.jpg"
},
"annotations": [
{
"id": "ocid1....",
"timeCreated": ...,
"createdBy": ...,
"entities": [
{
"entityType": "IMAGEOBJECTSELECTION",
"labels": [
{"name": "germanshepherd"}
],
"boundingPolygon": {
"normalizedVertices": [
{"x":0.2, "y":0.2},
{"x":0.3, "y":0.2},
{"x":0.3, "y":0.3},
{"x":0.2, "y":0.3}
]
}
},
{
"entityType": "BOUNDING_BOX",
"labels": [
{"name": "irishwolfhound"}
],
"boundingPolygon": {
"normalizedVertices": [
{"x":0.4, "y":0.4},
{"x":0.5, "y":0.4},
{"x":0.5, "y":0.5},
{"x":0.4, "y":0.5}
]
}
}
]
}
],
"freeformTags": {
"set": "validation" # optional, user defined convention used for reproducibility
}
}
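The `normalizedVertices` values in the example fall between 0 and 1, so they need to be scaled before they can be drawn on an image. The following is a minimal sketch of that conversion. It assumes that `x` is a fraction of the image width and `y` a fraction of the image height, measured from the top-left corner; that convention is an assumption, not something the example confirms. The function name `polygon_to_pixel_box` is hypothetical.

```python
def polygon_to_pixel_box(normalized_vertices, image_width, image_height):
    """Return (x_min, y_min, x_max, y_max) in pixels for a bounding polygon.

    Assumes x and y are fractions of the image width and height.
    """
    xs = [v["x"] * image_width for v in normalized_vertices]
    ys = [v["y"] * image_height for v in normalized_vertices]
    # Round to whole pixels; the enclosing box is the min/max on each axis.
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


# Vertices of the first entity in the example above.
vertices = [
    {"x": 0.2, "y": 0.2},
    {"x": 0.3, "y": 0.2},
    {"x": 0.3, "y": 0.3},
    {"x": 0.2, "y": 0.3},
]
box = polygon_to_pixel_box(vertices, image_width=1000, image_height=500)
```

The image size is not part of the record, so it has to come from the image file itself.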
Example of an Exported Text Record JSON File
An example of an exported text record JSON file.
{
"id":"ocid1.record.oc1.iad.UxxfPBMZVYfwZHZnjCPUGkhMwpWoTPMOnxDnrgXbBxwLKkrdeGwewdViOoUJ",
"timeCreated":"2021-06-21 09:06:01",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"article_3.txt"
},
"annotations":[
{
"id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyadghacojq3nmo2mtcbcmlo4rgslmpzxeboujhduft5nta",
"timeCreated":"2021-46-21 09:46:45",
"createdBy":"ocid1.user.oc1..aaaaaaaazjupiis2cu54smlzemiujpqxriz6i4wp3euuqrzffdugib73epbq",
"entities":[
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Event"
}
],
"textSpan":{
"offset":141,
"length":12
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Organization"
}
],
"textSpan":{
"offset":204,
"length":20
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Person"
}
],
"textSpan":{
"offset":254,
"length":15
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Organization"
}
],
"textSpan":{
"offset":402,
"length":3
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Place"
}
],
"textSpan":{
"offset":638,
"length":11
}
}
]
}
]
}{
"id":"ocid1.record.oc1.iad.AakCoDHvJpnZofzIYfRCfpZnFUqNmfiWNIuNysbXCSRZeTVqdwKGvYjJpMvh",
"timeCreated":"2021-06-21 09:06:01",
"sourceDetails":{
"sourceType":"OBJECT_STORAGE",
"path":"article_1.txt"
},
"annotations":[
{
"id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyafoed6oimxqxeyey6osjo3jp52vsyd75i5zspfvcfdz3q",
"timeCreated":"2021-30-21 03:30:10",
"createdBy":"ocid1.user.oc1..aaaaaaaazjupiis2cu54smlzemiujpqxriz6i4wp3euuqrzffdugib73epbq",
"entities":[
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Person"
}
],
"textSpan":{
"offset":36,
"length":8
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Person"
}
],
"textSpan":{
"offset":147,
"length":23
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Organization"
}
],
"textSpan":{
"offset":196,
"length":3
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Event"
}
],
"textSpan":{
"offset":311,
"length":22
}
},
{
"entityType":"TEXTSELECTION",
"labels":[
{
"label_name":"Place"
}
],
"textSpan":{
"offset":512,
"length":49
}
}
]
}
]
}
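Each `textSpan` in the example identifies the labeled text by `offset` and `length` rather than by the text itself. The following is a minimal sketch of how the selected strings might be recovered from the source file. It assumes that `offset` is a zero-based character position in the source text, which the example does not confirm (it could also be a byte offset). The function name `extract_entities` and the sample sentence are hypothetical.

```python
def extract_entities(text, record):
    """Return (label_name, selected_text) pairs for a text record.

    Assumes "offset" is a zero-based character index into text.
    """
    results = []
    for annotation in record.get("annotations", []):
        for entity in annotation.get("entities", []):
            span = entity.get("textSpan")
            if span is None:
                continue  # entity is not a text selection
            start = span["offset"]
            selected = text[start:start + span["length"]]
            for label in entity.get("labels", []):
                results.append((label["label_name"], selected))
    return results


# Hypothetical source text and record, shaped like the example above.
source_text = "Alice visited Paris."
record = {
    "annotations": [{
        "entities": [
            {"entityType": "TEXTSELECTION",
             "labels": [{"label_name": "Person"}],
             "textSpan": {"offset": 0, "length": 5}},
            {"entityType": "TEXTSELECTION",
             "labels": [{"label_name": "Place"}],
             "textSpan": {"offset": 14, "length": 5}},
        ]
    }]
}
pairs = extract_entities(source_text, record)
```

If the recovered strings look shifted for non-ASCII text, the offsets are probably not character counts and the slicing needs to be adapted.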
Example of an Exported CSV Text Dataset JSON File
An example of an exported CSV (text) dataset JSON file.
{
"id": "ocid1.datalabelingdatasetint.oc1.phx.amaaaaaaniob46iaxarhafiu42tbdm2d2nkxlkxwhnc76ohnwvpsdfccqw5q",
"compartmentId": "ocid1.compartment.oc1..aaaaaaaaundh4v2w4spnyt4hgy367qf54jonakpz6gh573bspmgzfoj2auga",
"displayName": "Text Classification CSV dataset",
"labelsSet": [{
"name": "positive"
}, {
"name": "neutral"
}, {
"name": "negative"
}],
"annotationFormat": "SINGLE_LABEL",
"datasetSourceDetails": {
"namespace": "idgszs0xipmn",
"bucket": "TEST",
"prefix": "languageteam/Text_Classification_Context_Oracle_advt.csv"
},
"datasetFormatDetails": {
"formatType": "TEXT",
"textFileTypeMetadata": {
"formatType": "DELIMITED",
"delimitedFileTypeMetaData": {
"columnIndex": 5,
"columnName": "CONTENT",
"columnDelimiter": ","
}
}
}
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iajx42mojwkktind744i3t2q3di6tdhwysw2wy4d42tseq",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/546"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iadsu6zpch4lvozx7ci3as5st23jqxjpjdcryp4jworala",
"timeCreated": "2022-06-05 05:40:48",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "neutral"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia7otgs2rb3kuh464sisfbjxxbbkb65sbg2icst3gquw3q",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/303"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iatfuceqzjb5nnh7quk5wupvwe74bfpn5oka57cz6gqv4a",
"timeCreated": "2022-06-05 05:41:30",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "neutral"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iab55fqcxlfb3xszlpp7qnpsthjdhzzb7nki65xqdvgceq",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/547"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iamosgunt72lci3g3mzyyx2sskjdje4e5zspts7mbnsl5q",
"timeCreated": "2022-06-05 05:41:36",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "neutral"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia45ave4zhtisvu2k7d6tbciskcge4ecm2imb6bvdqe4da",
"timeCreated": "2022-06-05 04:39:21",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/564"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iauqo6tlqil7vijetsayt6vsmpohxum5vmj6cde3wbfxua",
"timeCreated": "2022-06-05 05:40:44",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "positive"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iasymkpbstgjwmae7ar5ikgp5mtth2izcaaaruatpl45ma",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/545"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iatu6k7afdwirdtvv6bofrquc65m4ruet4hlfmhgzhqjxa",
"timeCreated": "2022-06-05 05:41:02",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "positive"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia6n4whohdhn257pmot7zlncawockthadosdhrp5so2nna",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/304"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iaslgb6s6h5ffce5mcgeidndp3vydcxzjya7yrbaj6pw5a",
"timeCreated": "2022-06-05 05:40:57",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "negative"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iamgsncrjarzujr6duaedmsjyrp67yi7dpe2uoi6h54c5a",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/548"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iabt3hwyc7mkaanez7q24k7vlfds3lisa6hdu53hntq2qq",
"timeCreated": "2022-06-05 05:42:55",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "positive"
}]
}]
}]
} {
"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iactsl4j7v633d2y2t67lkxawv2nyemz7wwarppjpxeofq",
"timeCreated": "2022-06-05 04:39:18",
"sourceDetails": {
"sourceType": "OBJECT_STORAGE",
"path": "/305"
},
"annotations": [{
"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46ia7xxg4ukky3ur56zzwaodvwrks4vqgvoug2z2moif274a",
"timeCreated": "2022-06-05 05:41:44",
"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
"entities": [{
"entityType": "GENERIC",
"labels": [{
"label_name": "negative"
}]
}]