Exporting the Dataset

You can export Data Labeling datasets in various text and image formats, or as snapshot JSONL files.

You can export datasets in Data Labeling to any Object Storage location in the tenancy. This lets you maintain versions of a dataset or use it elsewhere, for example, as input to machine learning model development. The export panel shows the output file location. After the export, the destination is available in the associated work request. The destination is also displayed on the Dataset Details page, but only while the work request exists.

For documents, you can export to JSONL files.

For images, you can export to the following file formats:
  • JSONL
  • YOLO V5
  • COCO
  • PASCAL VOC

For text, you can export to the following file formats:
  • JSONL
  • JSONL Compact Plus Content
  • spaCy
  • CoNLL V2003
    Note

    If you export text in the CoNLL format, recursive and overlapping entities are ignored.
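
Before choosing CoNLL, it can help to check whether a record contains overlapping entities, because those would be left out of the export. The following is a minimal sketch, not the service's own logic: it assumes each entity carries a `textSpan` with `offset` and `length`, as in the text record examples later in this topic, and the function name `find_overlapping_spans` is hypothetical. The exact rule the service applies to "recursive and overlapping" entities is not documented here, so treat any pair of spans that share a character as a candidate.

```python
from typing import Dict, List, Tuple


def find_overlapping_spans(entities: List[Dict]) -> List[Tuple[int, int]]:
    """Return index pairs of entities whose text spans share at least one character."""
    # Build (start, end, original index) tuples, with end exclusive.
    spans = sorted(
        (e["textSpan"]["offset"], e["textSpan"]["offset"] + e["textSpan"]["length"], i)
        for i, e in enumerate(entities)
    )
    overlaps = []
    for a in range(len(spans)):
        for b in range(a + 1, len(spans)):
            if spans[b][0] >= spans[a][1]:
                break  # Sorted by start, so no later span can overlap span a.
            overlaps.append((spans[a][2], spans[b][2]))
    return overlaps


entities = [
    {"textSpan": {"offset": 60, "length": 11}},
    {"textSpan": {"offset": 65, "length": 3}},   # nested inside the first span
    {"textSpan": {"offset": 100, "length": 5}},
]
print(find_overlapping_spans(entities))
```

An empty result suggests that the record has no overlapping spans under this definition.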
Note

For CSV, the only option is to export to JSONL.

    1. On the dataset details page, click Export to display the Export Dataset panel.
    2. Check the Namespace, which is read-only and shows where the JSON files are stored.
    3. Choose the export file format.
    4. (Optional) To change the compartment where the Object Storage bucket resides, click Change Compartment.
    5. Select the Bucket from the list.
    6. (Optional) Enter a Prefix.
      The exported dataset files are stored starting with this path prefix.
    7. (Optional) To export all records, including records that are not yet labeled, click Include unlabeled records to export.
    8. (Optional) To export the dataset and records as a single file, click Consolidate dataset and record file into a single file.
      Note

      This option is only available for JSONL files, and is disabled for other export formats.
    9. Click Export dataset.
  • This task is not available in the CLI.

  • This task is not available in the API.

Examples of Exported Document, Image, and Text Datasets

The following examples show the JSON files created when you export a dataset in Data Labeling.

Example of an Exported Consolidated JSON File

An example of an exported consolidated JSON file.

{
	"id": "ocid1.datalabelingdatasetdev.oc1.iad.amaaaaaazaehrjyag7jcbu3xnpw4dcn3tmniarzorpxbtegnipsw5oleeauq",
	"compartmentId": "ocid1.compartment.oc1..aaaaaaaaihdqc5z4zq4sqt7t4c7vbwc6lbf5dr6mky2phcpvdlh7c3p5mtuq",
	"displayName": "test-check",
	"description": "test  check",
	"labelsSet": [{
		"name": "location"
	}, {
		"name": "university"
	}],
	"annotationFormat": "ENTITY_EXTRACTION",
	"datasetSourceDetails": {
		"namespace": "idrcdhfxwqwa",
		"bucket": "test-sachin-cucket"
	},
	"datasetFormatDetails": {
		"formatType": "TEXT"
	}
}
{
	"id": "ocid1.datalabelingrecord.oc1.iad.amaaaaaazaehrjyahykmu6hvdksayw64a3wmur7mk2366hgitlypk6u2soea",
	"timeCreated": "2021-10-12 12:09:37",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "sample-text.txt"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyat64zcfbjviu3pttykthabv5jiuicva3dkv6oikstzd7q",
		"timeCreated": "2021-10-12 12:16:51",
		"createdBy": "ocid1.user.oc1..aaaaaaaaktqgvx2skco6bfyziwjzfjaxensoewscqbk7p44sjqyrxmz4qozq",
		"entities": [{
			"entityType": "TEXTSELECTION",
			"labels": [{
				"label_name": "university"
			}],
			"textSpan": {
				"offset": 60,
				"length": 11
			}
		}]
	}]
}
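
To read a consolidated file like the one above, one option is to decode the JSON objects one after another. This is a sketch under the assumption, taken from the example, that the first object is the dataset metadata and every following object is a record. The helper names `iter_json_objects` and `split_consolidated` are hypothetical. The approach uses the standard library's `json.JSONDecoder.raw_decode`, so it works whether each object is on a single line or pretty-printed.

```python
import json
from typing import Dict, Iterator, List, Tuple


def iter_json_objects(text: str) -> Iterator[Dict]:
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def split_consolidated(text: str) -> Tuple[Dict, List[Dict]]:
    """Return (dataset metadata, list of records) from a consolidated export."""
    objects = list(iter_json_objects(text))
    if not objects:
        raise ValueError("no JSON objects found")
    return objects[0], objects[1:]


sample = '{"displayName": "test-check"} {\n  "id": "record-1", "annotations": []\n}'
dataset, records = split_consolidated(sample)
print(dataset["displayName"], len(records))
```
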
Example of an Exported Document Dataset JSON File

An example of an exported document dataset JSON file.

{
   "id":"ocid1.datalabelingdatasetint.oc1.iad.amaaaaaaniob46iafkiyw6a4uwgrnpy4lfxjoslocap7elaj257mxh4fzuwq",
   "compartmentId":"ocid1.compartment.oc1..aaaaaaaajqiw27knoagxurhzjlihw7ijnoshsu4zi2uawdn5gfexdqwvu4vq",
   "displayName":"Sep6_PDF",
   "labelsSet":[
     {
          "name":"L1"
     },
     {
          "name":"L"
     },
     {
          "name":"23423"
     }
   ],
   "annotationFormat":"MULTI_LABEL",
   "datasetSourceDetails":{
      "namespace":"idgszs0xipmn",
      "bucket":"Demo-bucket"
   },
   "datasetFormatDetails":{"formatType":"DOCUMENT"},
   "recordFiles":[
      {
         "namespace":"idgszs0xipmn",
         "bucket":"COVID_Dataset",
         "path":"Snapshotsrecords_1632479104889.jsonl"
      }
   ]
}
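
In the example above, the record file sits in a different bucket from the dataset source. The following sketch lists the locations named in `recordFiles` so that they can be fetched separately. It assumes the key names shown in the example, and `record_file_locations` is a hypothetical helper. Downloading the objects is out of scope here.

```python
from typing import Dict, List, Tuple


def record_file_locations(dataset: Dict) -> List[Tuple[str, str, str]]:
    """List (namespace, bucket, path) for each record file the dataset references."""
    locations = []
    for entry in dataset.get("recordFiles", []):
        locations.append((entry["namespace"], entry["bucket"], entry["path"]))
    return locations


dataset_file = {
    "datasetSourceDetails": {"namespace": "idgszs0xipmn", "bucket": "Demo-bucket"},
    "recordFiles": [
        {
            "namespace": "idgszs0xipmn",
            "bucket": "COVID_Dataset",
            "path": "Snapshotsrecords_1632479104889.jsonl",
        }
    ],
}
for namespace, bucket, path in record_file_locations(dataset_file):
    print(f"{namespace}/{bucket}/{path}")
```
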

Example of an Exported Image Dataset JSON File

An example of an exported image dataset JSON file. Values shown as "..." are elided. Image datasets require less format metadata than, for example, delimited text datasets.

{
     "id": "ocid1...",
     "compartmentId": "",
     "timeCreated": "2020-12-15...",
     "displayName": "...",
     "description": "...",
     "labelsSet": [
          {"name": "germanshepherd"},
          {"name": "americanshepherd"},
          {"name": "australianshepherd"},
          {"name": "irishwolfhound"}
     ],
     "annotationFormat": "IMAGE_OBJECT_SELECTION",
     "datasetSourceDetails": {
          "sourceType": "OBJECT_STORAGE",
          "namespace": "i235o3idk",
          "bucket": "mytrainingdata",
          "prefix": "puppyproject/"
     },
     "datasetFormatDetails": {
          "formatType": "IMAGE"
     },
     "recordFiles": [
          {
               "namespace": "i235o3idk",
               "bucket": "mylabels",
               "path": "puppyproject/records1.json"
          }
     ],
     "definedTags": {},
     "freeformTags": {}
}

Example of an Exported Text Dataset JSON File

An example of an exported text dataset JSON file.

{
   "id":"ocid1.datalabelingdatasetdev.oc1.iad.amaaaaaazaehrjyamqjx733dhxd25zxcro2nftrewq7ltj34ua2cfapzsmjq",
   "compartmentId":"ocid1.compartment.oc1..aaaaaaaagzh2kii2frktoc7bcvfydpzkxr7dbn6nf6jcyrxwgzen4pi5y4zq",
   "displayName":"NER DEMO DATASET UNLABELLED",
   "description":"NER DEMO DATASET UNLABELLED",
   "labelsSet":[
      {
         "name":"Person"
      },
      {
         "name":"Organization"
      },
      {
         "name":"Event"
      },
      {
         "name":"Place"
      }
   ],
   "annotationFormat":"ENTITY_EXTRACTION",
   "datasetSourceDetails":{
      "namespace":"idrcdhfxwqwa",
      "bucket":"news-articles"
   },
   "datasetFormatDetails":{
       
   },
   "recordFiles":[
      {
         "namespace":"idrcdhfxwqwa",
         "bucket":"snapshots",
         "path":"forReview/records_1621847577526.jsonl"
      }
   ]
}

Example of an Exported Document Record JSON File

An example of an exported document record JSON file.

{
   "id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaqgpzhscdpdcgohg5ocp3obwmjjgju6m73bmyrt4aovhq",
   "timeCreated":"2021-09-06 03:40:02",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"SampleDocs-sample-pdf-file copy 98.pdf"
   },
   "annotations":[
      {
         "id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaaniob46iatjg3p6hlszxrgmsj4y76b5tndddaedm6ardkoxbtt6mq",
         "timeCreated":"2021-09-06 03:42:43",
         "createdBy":"ocid1.user.oc1..aaaaaaaa6ynps4htdea6fqoerfhkedp3lih2ktureqhw3hmfojde6ukf3mpa",
         "entities":[
            {
               "entityType":"GENERIC","labels":[
                  {
                     "label_name":"23423"
                  }
               ]
            }
         ]
      }
   ]
}
{
   "id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iasb5klulgaj4djn3acsgsd3cekx3ix46ftxjdip4tu23a",
   "timeCreated":"2021-09-06 03:40:02",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"SampleDocs-sample-pdf-file copy 99.pdf"
   },
   "annotations":[
      {
         "id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaaniob46iav45mlpcleqjt7cnmhyogopszi2rfnilwjhd4xyxa7irq",
         "timeCreated":"2021-09-06 03:42:47",
         "createdBy":"ocid1.user.oc1..aaaaaaaa6ynps4htdea6fqoerfhkedp3lih2ktureqhw3hmfojde6ukf3mpa",
         "entities":[
            {
               "entityType":"GENERIC","labels":[
                  {
                     "label_name":"L1"
                  }
               ]
            }
         ]
      }
   ]
}
{
   "id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaxhixolkqryomyu6i4jrrmzwcckw2tmgva47suylu5rzq",
   "timeCreated":"2021-09-06 03:40:02",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"SampleDocs-sample-pdf-file copy 97.pdf"
   }
}
{
   "id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iagymrjuem42kvzilxjd5hdrr3djznrl7aajvvcr6zc6sq",
   "timeCreated":"2021-09-06 03:40:02",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"SampleDocs-sample-pdf-file copy 96.pdf"
   }
}
{
   "id":"ocid1.datalabelingrecord.oc1.iad.amaaaaaaniob46iaclpccpxn5hgmplesv3mt3g6hxkfaepzv6fuy7b6he3ca",
   "timeCreated":"2021-09-06 03:40:02",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"SampleDocs-sample-pdf-file copy 2.pdf"
   }
}
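
The last three records above have no `annotations` key, which is how unlabeled records appear in this example. The following sketch tallies labels across records and counts the unlabeled ones. It assumes the `annotations`, `entities`, `labels`, and `label_name` keys shown in the example, and `label_counts` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List, Tuple


def label_counts(records: List[Dict]) -> Tuple[Counter, int]:
    """Return (label frequency, number of records without annotations)."""
    counts: Counter = Counter()
    unlabeled = 0
    for record in records:
        annotations = record.get("annotations") or []
        if not annotations:
            unlabeled += 1
            continue
        for annotation in annotations:
            for entity in annotation.get("entities", []):
                for label in entity.get("labels", []):
                    counts[label["label_name"]] += 1
    return counts, unlabeled


records = [
    {"annotations": [{"entities": [{"entityType": "GENERIC", "labels": [{"label_name": "23423"}]}]}]},
    {"annotations": [{"entities": [{"entityType": "GENERIC", "labels": [{"label_name": "L1"}]}]}]},
    {"sourceDetails": {"path": "SampleDocs-sample-pdf-file copy 97.pdf"}},
]
counts, unlabeled = label_counts(records)
print(dict(counts), unlabeled)
```
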
Example of an Exported Image Record JSON File

An example of an exported image record JSON file. Values shown as "..." are elided. The freeformTags entry is optional; here it follows a user-defined convention used for reproducibility.

{
    "id": "ocid1...",
    "timeCreated": "2020-12-15...",
    "sourceDetails": {
         "sourceType": "OBJECT_STORAGE",
         "path": "filename2.jpg"
    },
    "annotations": [
        {
            "id": "ocid1....",
            "timeCreated": "...",
            "createdBy": "...",
            "entities": [
                {
                    "entityType": "IMAGEOBJECTSELECTION",
                    "labels": [
                        {"name": "germanshepherd"}
                    ],
                    "boundingPolygon": {
                        "normalizedVertices": [
                            {"x":0.2, "y":0.2},
                            {"x":0.3, "y":0.2},
                            {"x":0.3, "y":0.3},
                            {"x":0.2, "y":0.3}
                        ]
                    }
                },
                {
                    "entityType": "BOUNDING_BOX",
                    "labels": [
                        {"name": "irishwolfhound"}
                    ],
                    "boundingPolygon": {
                        "normalizedVertices": [
                            {"x":0.4, "y":0.4},
                            {"x":0.5, "y":0.4},
                            {"x":0.5, "y":0.5},
                            {"x":0.4, "y":0.5}
                        ]
                    }
                }
            ]
        }
    ],
    "freeformTags": {
        "set": "validation"
    }
}

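
Training tools often expect pixel coordinates rather than normalized ones. The following sketch converts the `normalizedVertices` of one entity into a pixel bounding box. It assumes that `x` and `y` are fractions of the image width and height measured from the top-left corner, which this topic does not state explicitly. The helper name `polygon_to_pixel_box` and the image size used are hypothetical.

```python
from typing import Dict, List, Tuple


def polygon_to_pixel_box(
    vertices: List[Dict[str, float]], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) in pixels for normalized vertices."""
    xs = [v["x"] * width for v in vertices]
    ys = [v["y"] * height for v in vertices]
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


vertices = [
    {"x": 0.2, "y": 0.2},
    {"x": 0.3, "y": 0.2},
    {"x": 0.3, "y": 0.3},
    {"x": 0.2, "y": 0.3},
]
# Assumed image size of 1000 x 500 pixels, for illustration only.
print(polygon_to_pixel_box(vertices, width=1000, height=500))
```
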
Example of an Exported Text Record JSON File

An example of an exported text record JSON file.

{
   "id":"ocid1.record.oc1.iad.UxxfPBMZVYfwZHZnjCPUGkhMwpWoTPMOnxDnrgXbBxwLKkrdeGwewdViOoUJ",
   "timeCreated":"2021-06-21 09:06:01",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"article_3.txt"
   },
   "annotations":[
      {
         "id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyadghacojq3nmo2mtcbcmlo4rgslmpzxeboujhduft5nta",
         "timeCreated":"2021-46-21 09:46:45",
         "createdBy":"ocid1.user.oc1..aaaaaaaazjupiis2cu54smlzemiujpqxriz6i4wp3euuqrzffdugib73epbq",
         "entities":[
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Event"
                  }
               ],
               "textSpan":{
                  "offset":141,
                  "length":12
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Organization"
                  }
               ],
               "textSpan":{
                  "offset":204,
                  "length":20
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Person"
                  }
               ],
               "textSpan":{
                  "offset":254,
                  "length":15
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Organization"
                  }
               ],
               "textSpan":{
                  "offset":402,
                  "length":3
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Place"
                  }
               ],
               "textSpan":{
                  "offset":638,
                  "length":11
               }
            }
         ]
      }
   ]
}
{
   "id":"ocid1.record.oc1.iad.AakCoDHvJpnZofzIYfRCfpZnFUqNmfiWNIuNysbXCSRZeTVqdwKGvYjJpMvh",
   "timeCreated":"2021-06-21 09:06:01",
   "sourceDetails":{
      "sourceType":"OBJECT_STORAGE",
      "path":"article_1.txt"
   },
   "annotations":[
      {
         "id":"ocid1.datalabelingannotation.oc1.iad.amaaaaaazaehrjyafoed6oimxqxeyey6osjo3jp52vsyd75i5zspfvcfdz3q",
         "timeCreated":"2021-30-21 03:30:10",
         "createdBy":"ocid1.user.oc1..aaaaaaaazjupiis2cu54smlzemiujpqxriz6i4wp3euuqrzffdugib73epbq",
         "entities":[
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Person"
                  }
               ],
               "textSpan":{
                  "offset":36,
                  "length":8
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Person"
                  }
               ],
               "textSpan":{
                  "offset":147,
                  "length":23
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Organization"
                  }
               ],
               "textSpan":{
                  "offset":196,
                  "length":3
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Event"
                  }
               ],
               "textSpan":{
                  "offset":311,
                  "length":22
               }
            },
            {
               "entityType":"TEXTSELECTION",
               "labels":[
                  {
                     "label_name":"Place"
                  }
               ],
               "textSpan":{
                  "offset":512,
                  "length":49
               }
            }
         ]
      }
   ]
}
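
Each `textSpan` in the records above locates a labeled entity inside the source file. The following sketch recovers the labeled substrings once the source text is loaded. It assumes that `offset` and `length` count characters from the start of the text, which this topic does not confirm (they could be byte offsets for non-ASCII content). The helper name `labeled_spans` and the sample sentence are hypothetical.

```python
from typing import Dict, List, Tuple


def labeled_spans(record: Dict, text: str) -> List[Tuple[str, str]]:
    """Return (label name, substring) pairs for every text-span entity in a record."""
    results = []
    for annotation in record.get("annotations", []):
        for entity in annotation.get("entities", []):
            span = entity.get("textSpan")
            if span is None:
                continue  # Entities without a span carry no substring.
            start = span["offset"]
            end = start + span["length"]
            for label in entity.get("labels", []):
                results.append((label["label_name"], text[start:end]))
    return results


text = "Ada Lovelace visited Oracle."
record = {
    "annotations": [{
        "entities": [
            {"entityType": "TEXTSELECTION", "labels": [{"label_name": "Person"}],
             "textSpan": {"offset": 0, "length": 12}},
            {"entityType": "TEXTSELECTION", "labels": [{"label_name": "Organization"}],
             "textSpan": {"offset": 21, "length": 6}},
        ]
    }]
}
print(labeled_spans(record, text))
```
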
Example of an Exported CSV Text Dataset JSON File

An example of an exported CSV (text) dataset JSON file.

{
	"id": "ocid1.datalabelingdatasetint.oc1.phx.amaaaaaaniob46iaxarhafiu42tbdm2d2nkxlkxwhnc76ohnwvpsdfccqw5q",
	"compartmentId": "ocid1.compartment.oc1..aaaaaaaaundh4v2w4spnyt4hgy367qf54jonakpz6gh573bspmgzfoj2auga",
	"displayName": "Text Classification CSV dataset",
	"labelsSet": [{
		"name": "positive"
	}, {
		"name": "neutral"
	}, {
		"name": "negative"
	}],
	"annotationFormat": "SINGLE_LABEL",
	"datasetSourceDetails": {
		"namespace": "idgszs0xipmn",
		"bucket": "TEST",
		"prefix": "languageteam/Text_Classification_Context_Oracle_advt.csv"
	},
	"datasetFormatDetails": {
		"formatType": "TEXT",
		"textFileTypeMetadata": {
			"formatType": "DELIMITED",
			"delimitedFileTypeMetaData": {
				"columnIndex": 5,
				"columnName": "CONTENT",
				"columnDelimiter": ","
			}
		}
	}
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iajx42mojwkktind744i3t2q3di6tdhwysw2wy4d42tseq",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/546"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iadsu6zpch4lvozx7ci3as5st23jqxjpjdcryp4jworala",
		"timeCreated": "2022-06-05 05:40:48",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "neutral"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia7otgs2rb3kuh464sisfbjxxbbkb65sbg2icst3gquw3q",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/303"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iatfuceqzjb5nnh7quk5wupvwe74bfpn5oka57cz6gqv4a",
		"timeCreated": "2022-06-05 05:41:30",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "neutral"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iab55fqcxlfb3xszlpp7qnpsthjdhzzb7nki65xqdvgceq",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/547"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iamosgunt72lci3g3mzyyx2sskjdje4e5zspts7mbnsl5q",
		"timeCreated": "2022-06-05 05:41:36",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "neutral"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia45ave4zhtisvu2k7d6tbciskcge4ecm2imb6bvdqe4da",
	"timeCreated": "2022-06-05 04:39:21",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/564"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iauqo6tlqil7vijetsayt6vsmpohxum5vmj6cde3wbfxua",
		"timeCreated": "2022-06-05 05:40:44",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "positive"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iasymkpbstgjwmae7ar5ikgp5mtth2izcaaaruatpl45ma",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/545"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iatu6k7afdwirdtvv6bofrquc65m4ruet4hlfmhgzhqjxa",
		"timeCreated": "2022-06-05 05:41:02",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "positive"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46ia6n4whohdhn257pmot7zlncawockthadosdhrp5so2nna",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/304"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iaslgb6s6h5ffce5mcgeidndp3vydcxzjya7yrbaj6pw5a",
		"timeCreated": "2022-06-05 05:40:57",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "negative"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iamgsncrjarzujr6duaedmsjyrp67yi7dpe2uoi6h54c5a",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/548"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46iabt3hwyc7mkaanez7q24k7vlfds3lisa6hdu53hntq2qq",
		"timeCreated": "2022-06-05 05:42:55",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "positive"
			}]
		}]
	}]
}
{
	"id": "ocid1.datalabelingrecord.oc1.phx.amaaaaaaniob46iactsl4j7v633d2y2t67lkxawv2nyemz7wwarppjpxeofq",
	"timeCreated": "2022-06-05 04:39:18",
	"sourceDetails": {
		"sourceType": "OBJECT_STORAGE",
		"path": "/305"
	},
	"annotations": [{
		"id": "ocid1.datalabelingannotation.oc1.phx.amaaaaaaniob46ia7xxg4ukky3ur56zzwaodvwrks4vqgvoug2z2moif274a",
		"timeCreated": "2022-06-05 05:41:44",
		"createdBy": "ocid1.user.oc1..aaaaaaaaavjgmgh67ndbznlhnuxhzswfbwcpd5tlvugskeeqt7noudcu7xha",
		"entities": [{
			"entityType": "GENERIC",
			"labels": [{
				"label_name": "negative"
			}]
		}]