clarifai-9.7.0-py3-none-any.whl → clarifai-9.7.2-py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- clarifai/auth/__init__.py +6 -0
- clarifai/auth/helper.py +35 -36
- clarifai/auth/register.py +23 -0
- clarifai/{client → auth}/stub.py +10 -10
- clarifai/client/__init__.py +1 -4
- clarifai/client/app.py +483 -0
- clarifai/client/auth/__init__.py +4 -0
- clarifai/client/{abc.py → auth/abc.py} +2 -2
- clarifai/client/auth/helper.py +377 -0
- clarifai/client/auth/register.py +23 -0
- {clarifai_utils/client → clarifai/client/auth}/stub.py +10 -10
- clarifai/client/base.py +112 -0
- clarifai/client/dataset.py +290 -0
- clarifai/client/input.py +730 -0
- clarifai/client/lister.py +41 -0
- clarifai/client/model.py +218 -0
- clarifai/client/module.py +82 -0
- clarifai/client/user.py +125 -0
- clarifai/client/workflow.py +194 -0
- clarifai/datasets/upload/base.py +66 -0
- clarifai/datasets/upload/examples/README.md +31 -0
- clarifai/datasets/upload/examples/image_classification/cifar10/dataset.py +42 -0
- clarifai/datasets/upload/examples/image_classification/food-101/dataset.py +39 -0
- clarifai/datasets/upload/examples/text_classification/imdb_dataset/dataset.py +37 -0
- clarifai/{data_upload/datasets → datasets/upload}/features.py +4 -12
- clarifai/datasets/upload/image.py +156 -0
- clarifai/datasets/upload/loaders/README.md +49 -0
- clarifai/{data_upload/datasets/zoo → datasets/upload/loaders}/coco_captions.py +24 -21
- {clarifai_utils/data_upload/datasets/zoo → clarifai/datasets/upload/loaders}/coco_detection.py +46 -42
- clarifai/datasets/upload/loaders/coco_segmentation.py +166 -0
- clarifai/{data_upload/datasets/zoo → datasets/upload/loaders}/imagenet_classification.py +22 -12
- clarifai/{data_upload/datasets/zoo → datasets/upload/loaders}/xview_detection.py +44 -53
- clarifai/datasets/upload/text.py +50 -0
- clarifai/datasets/upload/utils.py +62 -0
- clarifai/errors.py +90 -0
- clarifai/urls/helper.py +16 -17
- clarifai/utils/logging.py +40 -0
- clarifai/utils/misc.py +33 -0
- clarifai/versions.py +6 -0
- {clarifai-9.7.0.dist-info → clarifai-9.7.2.dist-info}/LICENSE +1 -1
- clarifai-9.7.2.dist-info/METADATA +179 -0
- clarifai-9.7.2.dist-info/RECORD +350 -0
- clarifai_utils/auth/__init__.py +6 -0
- clarifai_utils/auth/helper.py +35 -36
- clarifai_utils/auth/register.py +23 -0
- clarifai_utils/auth/stub.py +127 -0
- clarifai_utils/client/__init__.py +1 -4
- clarifai_utils/client/app.py +483 -0
- clarifai_utils/client/auth/__init__.py +4 -0
- clarifai_utils/client/{abc.py → auth/abc.py} +2 -2
- clarifai_utils/client/auth/helper.py +377 -0
- clarifai_utils/client/auth/register.py +23 -0
- clarifai_utils/client/auth/stub.py +127 -0
- clarifai_utils/client/base.py +112 -0
- clarifai_utils/client/dataset.py +290 -0
- clarifai_utils/client/input.py +730 -0
- clarifai_utils/client/lister.py +41 -0
- clarifai_utils/client/model.py +218 -0
- clarifai_utils/client/module.py +82 -0
- clarifai_utils/client/user.py +125 -0
- clarifai_utils/client/workflow.py +194 -0
- clarifai_utils/datasets/upload/base.py +66 -0
- clarifai_utils/datasets/upload/examples/README.md +31 -0
- clarifai_utils/datasets/upload/examples/image_classification/cifar10/dataset.py +42 -0
- clarifai_utils/datasets/upload/examples/image_classification/food-101/dataset.py +39 -0
- clarifai_utils/datasets/upload/examples/text_classification/imdb_dataset/dataset.py +37 -0
- clarifai_utils/{data_upload/datasets → datasets/upload}/features.py +4 -12
- clarifai_utils/datasets/upload/image.py +156 -0
- clarifai_utils/datasets/upload/loaders/README.md +49 -0
- clarifai_utils/{data_upload/datasets/zoo → datasets/upload/loaders}/coco_captions.py +24 -21
- {clarifai/data_upload/datasets/zoo → clarifai_utils/datasets/upload/loaders}/coco_detection.py +46 -42
- clarifai_utils/datasets/upload/loaders/coco_segmentation.py +166 -0
- clarifai_utils/{data_upload/datasets/zoo → datasets/upload/loaders}/imagenet_classification.py +22 -12
- clarifai_utils/{data_upload/datasets/zoo → datasets/upload/loaders}/xview_detection.py +44 -53
- clarifai_utils/datasets/upload/text.py +50 -0
- clarifai_utils/datasets/upload/utils.py +62 -0
- clarifai_utils/errors.py +90 -0
- clarifai_utils/urls/helper.py +16 -17
- clarifai_utils/utils/logging.py +40 -0
- clarifai_utils/utils/misc.py +33 -0
- clarifai_utils/versions.py +6 -0
- clarifai/data_upload/README.md +0 -63
- clarifai/data_upload/convert_csv.py +0 -182
- clarifai/data_upload/datasets/base.py +0 -87
- clarifai/data_upload/datasets/image.py +0 -253
- clarifai/data_upload/datasets/text.py +0 -60
- clarifai/data_upload/datasets/zoo/README.md +0 -55
- clarifai/data_upload/datasets/zoo/coco_segmentation.py +0 -160
- clarifai/data_upload/examples/README.md +0 -5
- clarifai/data_upload/examples/image_classification/cifar10/dataset.py +0 -40
- clarifai/data_upload/examples/image_classification/food-101/dataset.py +0 -39
- clarifai/data_upload/examples/image_classification/food-101/images/beignets/1036242.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/beignets/1114182.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/beignets/2012944.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/beignets/2464389.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/beignets/478632.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/hamburger/1061270.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/hamburger/1202261.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/hamburger/1381751.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/hamburger/3289634.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/hamburger/862025.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/prime_rib/102197.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/prime_rib/2749372.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/prime_rib/2938268.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/prime_rib/3590861.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/prime_rib/746716.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/ramen/2955110.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/ramen/3208966.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/ramen/3270629.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/ramen/3424562.jpg +0 -0
- clarifai/data_upload/examples/image_classification/food-101/images/ramen/544680.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/annotations/2007_000464.xml +0 -39
- clarifai/data_upload/examples/image_detection/voc/annotations/2008_000853.xml +0 -28
- clarifai/data_upload/examples/image_detection/voc/annotations/2008_003182.xml +0 -54
- clarifai/data_upload/examples/image_detection/voc/annotations/2008_008526.xml +0 -67
- clarifai/data_upload/examples/image_detection/voc/annotations/2009_004315.xml +0 -28
- clarifai/data_upload/examples/image_detection/voc/annotations/2009_004382.xml +0 -28
- clarifai/data_upload/examples/image_detection/voc/annotations/2011_000430.xml +0 -28
- clarifai/data_upload/examples/image_detection/voc/annotations/2011_001610.xml +0 -46
- clarifai/data_upload/examples/image_detection/voc/annotations/2011_006412.xml +0 -99
- clarifai/data_upload/examples/image_detection/voc/annotations/2012_000690.xml +0 -43
- clarifai/data_upload/examples/image_detection/voc/dataset.py +0 -76
- clarifai/data_upload/examples/image_detection/voc/images/2007_000464.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2008_000853.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2008_003182.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2008_008526.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2009_004315.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2009_004382.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2011_000430.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2011_001610.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2011_006412.jpg +0 -0
- clarifai/data_upload/examples/image_detection/voc/images/2012_000690.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/annotations/instances_val2017_subset.json +0 -5342
- clarifai/data_upload/examples/image_segmentation/coco/dataset.py +0 -107
- clarifai/data_upload/examples/image_segmentation/coco/images/000000074646.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000086956.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000166563.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000176857.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000182202.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000193245.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000384850.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000409630.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000424349.jpg +0 -0
- clarifai/data_upload/examples/image_segmentation/coco/images/000000573008.jpg +0 -0
- clarifai/data_upload/examples/text_classification/imdb_dataset/dataset.py +0 -40
- clarifai/data_upload/examples.py +0 -17
- clarifai/data_upload/upload.py +0 -356
- clarifai/dataset_export/dataset_export_inputs.py +0 -205
- clarifai/listing/concepts.py +0 -37
- clarifai/listing/datasets.py +0 -37
- clarifai/listing/inputs.py +0 -111
- clarifai/listing/installed_module_versions.py +0 -40
- clarifai/listing/lister.py +0 -200
- clarifai/listing/models.py +0 -46
- clarifai/listing/module_versions.py +0 -42
- clarifai/listing/modules.py +0 -36
- clarifai/runners/base.py +0 -140
- clarifai/runners/example.py +0 -36
- clarifai-9.7.0.dist-info/METADATA +0 -99
- clarifai-9.7.0.dist-info/RECORD +0 -456
- clarifai_utils/data_upload/README.md +0 -63
- clarifai_utils/data_upload/convert_csv.py +0 -182
- clarifai_utils/data_upload/datasets/base.py +0 -87
- clarifai_utils/data_upload/datasets/image.py +0 -253
- clarifai_utils/data_upload/datasets/text.py +0 -60
- clarifai_utils/data_upload/datasets/zoo/README.md +0 -55
- clarifai_utils/data_upload/datasets/zoo/coco_segmentation.py +0 -160
- clarifai_utils/data_upload/examples/README.md +0 -5
- clarifai_utils/data_upload/examples/image_classification/cifar10/dataset.py +0 -40
- clarifai_utils/data_upload/examples/image_classification/food-101/dataset.py +0 -39
- clarifai_utils/data_upload/examples/image_classification/food-101/images/beignets/1036242.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/beignets/1114182.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/beignets/2012944.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/beignets/2464389.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/beignets/478632.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/hamburger/1061270.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/hamburger/1202261.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/hamburger/1381751.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/hamburger/3289634.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/hamburger/862025.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/prime_rib/102197.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/prime_rib/2749372.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/prime_rib/2938268.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/prime_rib/3590861.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/prime_rib/746716.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/ramen/2955110.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/ramen/3208966.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/ramen/3270629.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/ramen/3424562.jpg +0 -0
- clarifai_utils/data_upload/examples/image_classification/food-101/images/ramen/544680.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/__init__.py +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/__init__.py +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2007_000464.xml +0 -39
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2008_000853.xml +0 -28
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2008_003182.xml +0 -54
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2008_008526.xml +0 -67
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2009_004315.xml +0 -28
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2009_004382.xml +0 -28
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2011_000430.xml +0 -28
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2011_001610.xml +0 -46
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2011_006412.xml +0 -99
- clarifai_utils/data_upload/examples/image_detection/voc/annotations/2012_000690.xml +0 -43
- clarifai_utils/data_upload/examples/image_detection/voc/dataset.py +0 -76
- clarifai_utils/data_upload/examples/image_detection/voc/images/2007_000464.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2008_000853.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2008_003182.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2008_008526.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2009_004315.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2009_004382.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2011_000430.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2011_001610.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2011_006412.jpg +0 -0
- clarifai_utils/data_upload/examples/image_detection/voc/images/2012_000690.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/__init__.py +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/__init__.py +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/annotations/instances_val2017_subset.json +0 -5342
- clarifai_utils/data_upload/examples/image_segmentation/coco/dataset.py +0 -107
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000074646.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000086956.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000166563.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000176857.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000182202.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000193245.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000384850.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000409630.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000424349.jpg +0 -0
- clarifai_utils/data_upload/examples/image_segmentation/coco/images/000000573008.jpg +0 -0
- clarifai_utils/data_upload/examples/text_classification/__init__.py +0 -0
- clarifai_utils/data_upload/examples/text_classification/imdb_dataset/__init__.py +0 -0
- clarifai_utils/data_upload/examples/text_classification/imdb_dataset/dataset.py +0 -40
- clarifai_utils/data_upload/examples.py +0 -17
- clarifai_utils/data_upload/upload.py +0 -356
- clarifai_utils/dataset_export/dataset_export_inputs.py +0 -205
- clarifai_utils/listing/__init__.py +0 -0
- clarifai_utils/listing/concepts.py +0 -37
- clarifai_utils/listing/datasets.py +0 -37
- clarifai_utils/listing/inputs.py +0 -111
- clarifai_utils/listing/installed_module_versions.py +0 -40
- clarifai_utils/listing/lister.py +0 -200
- clarifai_utils/listing/models.py +0 -46
- clarifai_utils/listing/module_versions.py +0 -42
- clarifai_utils/listing/modules.py +0 -36
- clarifai_utils/runners/__init__.py +0 -0
- clarifai_utils/runners/base.py +0 -140
- clarifai_utils/runners/example.py +0 -36
- /clarifai/{data_upload/__init__.py → cli.py} +0 -0
- /clarifai/{data_upload/datasets → datasets}/__init__.py +0 -0
- /clarifai/{data_upload/datasets/zoo → datasets/upload}/__init__.py +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/__init__.py +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/__init__.py +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/cifar_small_test.csv +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/cifar_small_train.csv +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_700.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_701.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_702.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_703.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_704.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_705.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_706.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_707.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_708.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_709.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/__init__.py +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/1420783.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/3287885.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/3617075.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/38052.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/39147.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/139558.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/1636096.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/2480925.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/3385808.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/3647386.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/1826869.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/2243245.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/259212.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/2842688.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/3035414.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/1545393.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/2427642.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/3520891.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/377566.jpg +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/503504.jpg +0 -0
- /clarifai/{data_upload/examples/image_detection → datasets/upload/examples/text_classification}/__init__.py +0 -0
- /clarifai/{data_upload/examples/image_detection/voc → datasets/upload/examples/text_classification/imdb_dataset}/__init__.py +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/text_classification/imdb_dataset/test.csv +0 -0
- /clarifai/{data_upload → datasets/upload}/examples/text_classification/imdb_dataset/train.csv +0 -0
- /clarifai/{data_upload/examples/image_segmentation → datasets/upload/loaders}/__init__.py +0 -0
- /clarifai/{data_upload/examples/image_segmentation/coco → utils}/__init__.py +0 -0
- {clarifai-9.7.0.dist-info → clarifai-9.7.2.dist-info}/WHEEL +0 -0
- {clarifai-9.7.0.dist-info → clarifai-9.7.2.dist-info}/entry_points.txt +0 -0
- {clarifai-9.7.0.dist-info → clarifai-9.7.2.dist-info}/top_level.txt +0 -0
- /clarifai/data_upload/examples/text_classification/__init__.py → /clarifai_utils/cli.py +0 -0
- {clarifai/data_upload/examples/text_classification/imdb_dataset → clarifai_utils/datasets}/__init__.py +0 -0
- {clarifai/listing → clarifai_utils/datasets/upload}/__init__.py +0 -0
- {clarifai/runners → clarifai_utils/datasets/upload/examples/image_classification}/__init__.py +0 -0
- /clarifai_utils/{data_upload → datasets/upload/examples/image_classification/cifar10}/__init__.py +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/cifar_small_test.csv +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/cifar_small_train.csv +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_700.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_701.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_702.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_703.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_704.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_705.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_706.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_707.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_708.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/cifar10/images/test_batch_709.jpg +0 -0
- /clarifai_utils/{data_upload/datasets → datasets/upload/examples/image_classification/food-101}/__init__.py +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/1420783.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/3287885.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/3617075.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/38052.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/beignets/39147.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/139558.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/1636096.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/2480925.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/3385808.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/hamburger/3647386.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/1826869.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/2243245.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/259212.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/2842688.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/prime_rib/3035414.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/1545393.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/2427642.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/3520891.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/377566.jpg +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/image_classification/food-101/images/ramen/503504.jpg +0 -0
- /clarifai_utils/{data_upload/datasets/zoo → datasets/upload/examples/text_classification}/__init__.py +0 -0
- /clarifai_utils/{data_upload/examples/image_classification → datasets/upload/examples/text_classification/imdb_dataset}/__init__.py +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/text_classification/imdb_dataset/test.csv +0 -0
- /clarifai_utils/{data_upload → datasets/upload}/examples/text_classification/imdb_dataset/train.csv +0 -0
- /clarifai_utils/{data_upload/examples/image_classification/cifar10 → datasets/upload/loaders}/__init__.py +0 -0
- /clarifai_utils/{data_upload/examples/image_classification/food-101 → utils}/__init__.py +0 -0
```diff
--- a/clarifai/data_upload/convert_csv.py
+++ /dev/null
@@ -1,182 +0,0 @@
-import csv
-import os
-import sys
-from itertools import chain
-
-import pandas as pd
-from sklearn.model_selection import train_test_split
-
-
-class DataPreprocessor:
-
-  def __init__(self, filename, multi_val, separator):
-    self.filename = filename
-    self.multi_val = multi_val
-    self.separator = separator
-    self.df = pd.read_csv(filename)
-
-  def process_data(self):
-    text_col = self.get_column("Enter the number of the column that contains the text: ")
-    label_col = self.get_column(
-        "Enter the number of the column that contains the labels: ", exclude=text_col)
-
-    self.split_values(label_col)
-    self.df[label_col] = self.df[label_col].apply(lambda x: [x] if isinstance(x, str) else x)
-
-    # Use chain.from_iterable to expand the multi-values if applicable
-    unique_labels = list(set(chain.from_iterable(self.df[label_col].values)))
-
-    print(
-        "\nThe following unqiue labels have been found in the '{}' column and will be used in the dataset:".
-        format(label_col))
-    for i, label in enumerate(unique_labels, start=1):
-      print('{}) {}'.format(i, label))
-
-    self.convert_for_classification(text_col, label_col, unique_labels)
-
-  def get_column(self, prompt, exclude=None):
-    available_columns = self.df.columns.drop(exclude) if exclude else self.df.columns
-    if len(available_columns) == 1:
-      print(f'\nThe column named \'{available_columns[0]}\' will be used as the labels column.')
-      return available_columns[0]
-    else:
-      for i, col in enumerate(available_columns):
-        print(f'{i+1}) {col}')
-      col_index = int(input(prompt)) - 1
-      return available_columns[col_index]
-
-  def split_values(self, label_col):
-    if self.multi_val.lower() == 'y':
-      self.df[label_col] = self.df[label_col].apply(
-          lambda x: str(x).split(self.separator) if not isinstance(x, float) else x)
-
-  def convert_for_classification(self, text_col, label_col, unique_labels):
-    # Binary classification
-    if len(unique_labels) == 2:
-      print("Converting the CSV to be used with binary classification")
-      self.df['input.data.text.raw'] = self.df[text_col]
-      self.df['input.data.concepts[0].id'] = label_col
-      self.df['input.data.concepts[0].value'] = self.df[label_col].apply(
-          lambda x: 1 if unique_labels[0] in x else 0)
-      self.df = self.df[[
-          'input.data.text.raw', 'input.data.concepts[0].id', 'input.data.concepts[0].value'
-      ]]
-
-    # Multi-class classification
-    else:
-      print("Converting the CSV to be used with multi-class classification")
-      self.df['input.data.text.raw'] = self.df[text_col].apply(
-          lambda x: x[0] if isinstance(x, list) else x)
-      for i in range(len(unique_labels)):
-        self.df[f'input.data.concepts[{i}].id'] = self.df[label_col].apply(
-            lambda x: unique_labels[i] if unique_labels[i] in x else '')
-        self.df[f'input.data.concepts[{i}].value'] = self.df[label_col].apply(
-            lambda x: 1 if unique_labels[i] in x else '')
-
-      self.df = self.df[['input.data.text.raw'] +
-                        [f'input.data.concepts[{i}].id' for i in range(len(unique_labels))] +
-                        [f'input.data.concepts[{i}].value' for i in range(len(unique_labels))]]
-
-      # Reorder the columns
-      cols = self.df.columns.tolist()
-      new_cols = cols[:1]  # The first column 'input.data.text.raw'
-      pairs = [[cols[i], cols[i + len(unique_labels)]] for i in range(1, len(unique_labels) + 1)]
-      for pair in pairs:
-        new_cols.extend(pair)
-      self.df = self.df[new_cols]
-
-    # Remove special characters from column names
-    self.df.columns = self.df.columns.str.replace("^[\[]|[\]]$", "", regex=True)
-
-
-class DatasetSplitter:
-
-  def __init__(self, df, split_dataset, shuffle_dataset, seed=555):
-    self.df = df
-    self.split_dataset = split_dataset
-    self.shuffle_dataset = shuffle_dataset
-    self.seed = seed if seed != '' else 555
-
-  def split_and_save(self, filename_base):
-    if self.split_dataset.lower() == 'y':
-      split_type = self.get_split_type()
-
-      if split_type == 1:
-        train_pct = self.get_percentage(
-            'What percentage of the dataset should be used for training? Enter a number between 1 and 99: ',
-            99)
-        test_pct = 100 - train_pct
-        print(f'Data will be split {train_pct}% train, {test_pct}% test')  # Added print statement
-      elif split_type == 2:
-        train_pct = self.get_percentage(
-            'What percentage of the dataset should be used for training? Enter a number between 1 and 98: ',
-            98)
-        max_val_pct = 99 - train_pct  # Max percentage for validation is now reduced by 1
-        val_pct = self.get_percentage(
-            f'What percentage of the dataset should be used for validation? Enter a number between 1 and {max_val_pct}: ',
-            max_val_pct)
-        test_pct = 100 - train_pct - val_pct
-        print(f'Data will be split {train_pct}% train, {val_pct}% validation, {test_pct}% test'
-             )  # Added print statement
-
-      train_df, test_df = train_test_split(
-          self.df,
-          test_size=test_pct / 100,
-          random_state=self.seed,
-          shuffle=self.shuffle_dataset.lower() == 'y')
-      train_df.to_csv(filename_base + '-train.csv', index=False, quoting=csv.QUOTE_MINIMAL)
-      test_df.to_csv(filename_base + '-test.csv', index=False, quoting=csv.QUOTE_MINIMAL)
-
-      if split_type == 2:
-        train_df, val_df = train_test_split(
-            train_df,
-            test_size=val_pct / 100,
-            random_state=self.seed,
-            shuffle=self.shuffle_dataset.lower() == 'y')
```
|
|
136
|
-
val_df.to_csv(filename_base + '-validation.csv', index=False, quoting=csv.QUOTE_MINIMAL)
|
|
137
|
-
else:
|
|
138
|
-
self.df.to_csv(filename_base + '.csv', index=False, quoting=csv.QUOTE_MINIMAL)
|
|
139
|
-
|
|
140
|
-
def get_split_type(self):
|
|
141
|
-
split_type = int(
|
|
142
|
-
input(
|
|
143
|
-
'How would you like to split the dataset?\n1) Train and test datasets\n2) Train, validate, and test datasets\n'
|
|
144
|
-
))
|
|
145
|
-
while split_type not in [1, 2]:
|
|
146
|
-
split_type = int(
|
|
147
|
-
input(
|
|
148
|
-
'Invalid option. Enter 1 for "Train and test" or 2 for "Train, validate, and test": '
|
|
149
|
-
))
|
|
150
|
-
return split_type
|
|
151
|
-
|
|
152
|
-
def get_percentage(self, prompt, max_pct):
|
|
153
|
-
pct = int(input(prompt))
|
|
154
|
-
while not 1 <= pct <= max_pct:
|
|
155
|
-
pct = int(input(f'Invalid input. Please enter a number between 1 and {max_pct}: '))
|
|
156
|
-
return pct
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
def main():
|
|
160
|
-
filename = sys.argv[1]
|
|
161
|
-
multi_val = input('Do any columns have multiple values? (y/[n]) ')
|
|
162
|
-
separator = input('Enter the separator: ') if multi_val.lower() == 'y' else None
|
|
163
|
-
|
|
164
|
-
preprocessor = DataPreprocessor(filename, multi_val, separator)
|
|
165
|
-
preprocessor.process_data()
|
|
166
|
-
|
|
167
|
-
split_dataset = input('Would you like to split this dataset? (y/[n]) ')
|
|
168
|
-
shuffle_dataset = 'n'
|
|
169
|
-
seed = '555'
|
|
170
|
-
|
|
171
|
-
if split_dataset.lower() == 'y':
|
|
172
|
-
shuffle_dataset = input('Would you like to shuffle the dataset before splitting? (y/[n]) ')
|
|
173
|
-
if shuffle_dataset.lower() == 'y':
|
|
174
|
-
seed = input('Enter a seed integer or hit enter to use the default [555]: ')
|
|
175
|
-
|
|
176
|
-
splitter = DatasetSplitter(preprocessor.df, split_dataset, shuffle_dataset, seed)
|
|
177
|
-
splitter.split_and_save(os.path.splitext(filename)[0] + '-clarifai')
|
|
178
|
-
print("Done!")
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
if __name__ == "__main__":
|
|
182
|
-
main()
|
|
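The multi-class branch above interleaves each concept's `id` and `value` columns after the text column. A minimal pure-Python sketch of that reorder indexing (a hypothetical three-label example, no pandas involved):

```python
# Hypothetical three-label example; mirrors the column reorder logic above.
n = 3  # len(unique_labels)
cols = (["input.data.text.raw"]
        + [f"input.data.concepts[{i}].id" for i in range(n)]
        + [f"input.data.concepts[{i}].value" for i in range(n)])
new_cols = cols[:1]  # keep the text column first
pairs = [[cols[i], cols[i + n]] for i in range(1, n + 1)]
for pair in pairs:
    new_cols.extend(pair)
# each concept's id column is now followed directly by its value column
print(new_cols)
```

The pairing works because after the initial selection, column `i` (an id) and column `i + n` (the matching value) always refer to the same concept.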
@@ -1,87 +0,0 @@
-from collections import defaultdict
-from typing import Iterator, List, Tuple
-
-from clarifai_grpc.grpc.api import resources_pb2
-from google.protobuf.struct_pb2 import Struct
-
-
-class ClarifaiDataset:
-  """
-  Clarifai datasets base class.
-  """
-
-  def __init__(self, datagen_object: Iterator, dataset_id: str, split: str) -> None:
-    self.datagen_object = datagen_object
-    self.dataset_id = dataset_id
-    self.split = split
-    self.input_ids = []
-    self._all_input_protos = {}
-    self._all_annotation_protos = defaultdict(list)
-
-  def __len__(self) -> int:
-    """
-    Get the number of input protos.
-    """
-    return len(self._all_input_protos)
-
-  def _to_list(self, input_protos: Iterator) -> List:
-    """
-    Parse a protos iterator into a list.
-    """
-    return list(input_protos)
-
-  def create_input_protos(self, image_path: str, label: str, input_id: str, dataset_id: str,
-                          metadata: Struct) -> resources_pb2.Input:
-    """
-    Create input protos for each image, label input pair.
-    Args:
-      `image_path`: path to image.
-      `label`: image label
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-      `metadata`: input metadata
-    Returns:
-      An input proto representing a single row input
-    """
-    raise NotImplementedError()
-
-  def _extract_protos(self) -> None:
-    """
-    Create input image protos for each data generator item.
-    """
-    raise NotImplementedError()
-
-  def get_protos(self, input_ids: List[str]
-                ) -> Tuple[List[resources_pb2.Input], List[resources_pb2.Annotation]]:
-    """
-    Get input and annotation protos based on input_ids.
-    Args:
-      `input_ids`: List of input IDs to retrieve the protos for.
-    Returns:
-      Input and annotation protos for the specified input IDs.
-    """
-    input_protos = [self._all_input_protos.get(input_id) for input_id in input_ids]
-    annotation_protos = []
-    if len(self._all_annotation_protos) > 0:
-      annotation_protos = [self._all_annotation_protos.get(input_id) for input_id in input_ids]
-      annotation_protos = [
-          ann_proto for ann_protos in annotation_protos for ann_proto in ann_protos
-      ]
-
-    return input_protos, annotation_protos
-
-
-class Chunker:
-  """
-  Split an input sequence into small chunks.
-  """
-
-  def __init__(self, seq: List, size: int) -> None:
-    self.seq = seq
-    self.size = size
-
-  def chunk(self) -> List[List]:
-    """
-    Chunk the input sequence.
-    """
-    return [self.seq[pos:pos + self.size] for pos in range(0, len(self.seq), self.size)]
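The `Chunker` above is presumably used to batch input ids for upload. The slicing it performs can be seen standalone (hypothetical values):

```python
# Standalone illustration of Chunker.chunk(): slice a sequence into
# fixed-size batches, with a short final batch when the length is uneven.
seq = list(range(7))  # hypothetical input ids
size = 3
chunks = [seq[pos:pos + size] for pos in range(0, len(seq), size)]
print(chunks)  # → [[0, 1, 2], [3, 4, 5], [6]]
```

Because `range(0, len(seq), size)` steps by the chunk size and Python slicing clamps at the end of the list, no padding or length checks are needed.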
@@ -1,253 +0,0 @@
-import os
-from typing import Iterator, List, Union
-
-from clarifai_grpc.grpc.api import resources_pb2
-from google.protobuf.struct_pb2 import Struct
-from tqdm import tqdm
-
-from .base import ClarifaiDataset
-
-
-class VisualClassificationDataset(ClarifaiDataset):
-
-  def __init__(self, datagen_object: Iterator, dataset_id: str, split: str) -> None:
-    super().__init__(datagen_object, dataset_id, split)
-    self._extract_protos()
-
-  def create_input_protos(self, image_path: str, labels: List[Union[str, int]], input_id: str,
-                          dataset_id: str, geo_info: Union[List[float], None],
-                          metadata: Struct) -> resources_pb2.Input:
-    """
-    Create input protos for each image, label input pair.
-    Args:
-      `image_path`: image path.
-      `labels`: image label(s)
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-      `geo_info`: image longitude, latitude info
-      `metadata`: image metadata
-    Returns:
-      An input proto representing a single row input
-    """
-    geo_pb = resources_pb2.Geo(geo_point=resources_pb2.GeoPoint(
-        longitude=geo_info[0], latitude=geo_info[1])) if geo_info is not None else None
-
-    input_proto = resources_pb2.Input(
-        id=input_id,
-        dataset_ids=[dataset_id],
-        data=resources_pb2.Data(
-            image=resources_pb2.Image(base64=open(image_path, 'rb').read()),
-            geo=geo_pb,
-            concepts=[
-                resources_pb2.Concept(
-                    id=f"id-{''.join(_label.split(' '))}", name=_label, value=1.)
-                for _label in labels
-            ],
-            metadata=metadata))
-
-    return input_proto
-
-  def _extract_protos(self) -> None:
-    """
-    Create input image protos for each data generator item.
-    """
-    for i, item in tqdm(enumerate(self.datagen_object), desc="Creating input protos..."):
-      metadata = Struct()
-      image_path = item.image_path
-      label = item.label if isinstance(item.label, list) else [item.label]  # clarifai concept
-      input_id = f"{self.dataset_id}-{self.split}-{i}" if item.id is None else f"{self.split}-{str(item.id)}"
-      geo_info = item.geo_info
-      metadata.update({"filename": os.path.basename(image_path), "split": self.split})
-
-      self.input_ids.append(input_id)
-      input_proto = self.create_input_protos(image_path, label, input_id, self.dataset_id,
-                                             geo_info, metadata)
-      self._all_input_protos[input_id] = input_proto
-
-
-class VisualDetectionDataset(ClarifaiDataset):
-  """
-  Visual detection dataset proto class.
-  """
-
-  def __init__(self, datagen_object: Iterator, dataset_id: str, split: str) -> None:
-    super().__init__(datagen_object, dataset_id, split)
-    self._extract_protos()
-
-  def create_input_protos(self, image_path: str, input_id: str, dataset_id: str,
-                          geo_info: Union[List[float], None],
-                          metadata: Struct) -> resources_pb2.Input:
-    """
-    Create input protos for each image, label input pair.
-    Args:
-      `image_path`: file path to image
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-      `geo_info`: image longitude, latitude info
-      `metadata`: image metadata
-    Returns:
-      An input proto representing a single row input
-    """
-    geo_pb = resources_pb2.Geo(geo_point=resources_pb2.GeoPoint(
-        longitude=geo_info[0], latitude=geo_info[1])) if geo_info is not None else None
-    input_image_proto = resources_pb2.Input(
-        id=input_id,
-        dataset_ids=[dataset_id],
-        data=resources_pb2.Data(
-            image=resources_pb2.Image(base64=open(image_path, 'rb').read()),
-            geo=geo_pb,
-            metadata=metadata))
-
-    return input_image_proto
-
-  def create_annotation_proto(self, label: str, annotations: List, input_id: str,
-                              dataset_id: str) -> resources_pb2.Annotation:
-    """
-    Create an annotation proto for each bounding box, label input pair.
-    Args:
-      `label`: annotation label
-      `annotations`: a list of a single bbox's coordinates.
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-    Returns:
-      An annotation proto representing a single bounding box
-    """
-    input_annot_proto = resources_pb2.Annotation(
-        input_id=input_id,
-        data=resources_pb2.Data(regions=[
-            resources_pb2.Region(
-                region_info=resources_pb2.RegionInfo(bounding_box=resources_pb2.BoundingBox(
-                    # Annotations ordering: [xmin, ymin, xmax, ymax];
-                    # top_row must be less than bottom_row and
-                    # left_col must be less than right_col.
-                    top_row=annotations[1],  # y_min
-                    left_col=annotations[0],  # x_min
-                    bottom_row=annotations[3],  # y_max
-                    right_col=annotations[2]  # x_max
-                )),
-                data=resources_pb2.Data(concepts=[
-                    resources_pb2.Concept(
-                        id=f"id-{''.join(label.split(' '))}", name=label, value=1.)
-                ]))
-        ]))
-
-    return input_annot_proto
-
-  def _extract_protos(self) -> None:
-    """
-    Create input image protos for each data generator item.
-    """
-    for i, item in tqdm(enumerate(self.datagen_object), desc="Creating input protos..."):
-      metadata = Struct()
-      image = item.image_path
-      labels = item.classes  # list: [l1, ..., ln]
-      bboxes = item.bboxes  # [[xmin, ymin, xmax, ymax], ..., [xmin, ymin, xmax, ymax]]
-      input_id = f"{self.dataset_id}-{self.split}-{i}" if item.id is None else f"{self.split}-{str(item.id)}"
-      metadata.update({"filename": os.path.basename(image), "split": self.split})
-      geo_info = item.geo_info
-
-      self.input_ids.append(input_id)
-      input_image_proto = self.create_input_protos(image, input_id, self.dataset_id, geo_info,
-                                                   metadata)
-      self._all_input_protos[input_id] = input_image_proto
-
-      # Iterate over bboxes and classes:
-      # one input id can have more than one bbox and label.
-      for j in range(len(bboxes)):
-        input_annot_proto = self.create_annotation_proto(labels[j], bboxes[j], input_id,
-                                                         self.dataset_id)
-        self._all_annotation_protos[input_id].append(input_annot_proto)
-
-
-class VisualSegmentationDataset(ClarifaiDataset):
-  """
-  Visual segmentation dataset proto class.
-  """
-
-  def __init__(self, datagen_object: Iterator, dataset_id: str, split: str) -> None:
-    super().__init__(datagen_object, dataset_id, split)
-    self._extract_protos()
-
-  def create_input_protos(self, image_path: str, input_id: str, dataset_id: str,
-                          geo_info: Union[List[float], None],
-                          metadata: Struct) -> resources_pb2.Input:
-    """
-    Create input protos for each image, label input pair.
-    Args:
-      `image_path`: absolute image file path
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-      `geo_info`: image longitude, latitude info
-      `metadata`: image metadata
-    Returns:
-      An input proto representing a single input item
-    """
-    geo_pb = resources_pb2.Geo(geo_point=resources_pb2.GeoPoint(
-        longitude=geo_info[0], latitude=geo_info[1])) if geo_info is not None else None
-    input_image_proto = resources_pb2.Input(
-        id=input_id,
-        dataset_ids=[dataset_id],
-        data=resources_pb2.Data(
-            image=resources_pb2.Image(base64=open(image_path, 'rb').read()),
-            geo=geo_pb,
-            metadata=metadata))
-
-    return input_image_proto
-
-  def create_mask_proto(self, label: str, polygons: List[List[float]], input_id: str,
-                        dataset_id: str) -> resources_pb2.Annotation:
-    """
-    Create an annotation mask proto for an input polygon/mask and label.
-    Args:
-      `label`: image label
-      `polygons`: polygon x, y points iterable
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-    Returns:
-      An annotation proto corresponding to a single polygon
-    """
-    input_mask_proto = resources_pb2.Annotation(
-        input_id=input_id,
-        data=resources_pb2.Data(regions=[
-            resources_pb2.Region(
-                region_info=resources_pb2.RegionInfo(polygon=resources_pb2.Polygon(
-                    points=[
-                        resources_pb2.Point(
-                            row=_point[1],  # row is the y coordinate
-                            col=_point[0],  # col is the x coordinate
-                            visibility="VISIBLE") for _point in polygons
-                    ])),
-                data=resources_pb2.Data(concepts=[
-                    resources_pb2.Concept(
-                        id=f"id-{''.join(label.split(' '))}", name=label, value=1.)
-                ]))
-        ]))
-
-    return input_mask_proto
-
-  def _extract_protos(self) -> None:
-    """
-    Create input image and annotation protos for each data generator item.
-    """
-    for i, item in tqdm(enumerate(self.datagen_object), desc="Creating input protos..."):
-      metadata = Struct()
-      image = item.image_path  # image path
-      labels = item.classes  # list of class labels
-      _polygons = item.polygons  # list of polygons: [[[x, y], ..., [x, y]], ...]
-      input_id = f"{self.dataset_id}-{self.split}-{i}" if item.id is None else f"{self.split}-{str(item.id)}"
-      metadata.update({"filename": os.path.basename(image), "split": self.split})
-      geo_info = item.geo_info
-
-      self.input_ids.append(input_id)
-      input_image_proto = self.create_input_protos(image, input_id, self.dataset_id, geo_info,
-                                                   metadata)
-      self._all_input_protos[input_id] = input_image_proto
-
-      # Iterate over each polygon and create a mask proto for upload to Clarifai.
-      # The lengths of the polygons list and the labels list must be equal.
-      for j, _polygon in enumerate(_polygons):
-        try:
-          input_mask_proto = self.create_mask_proto(labels[j], _polygon, input_id, self.dataset_id)
-          self._all_annotation_protos[input_id].append(input_mask_proto)
-        except IndexError:
-          continue
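The bounding-box proto above takes `top_row`/`left_col`/`bottom_row`/`right_col` fields. If your source boxes are pixel-space `[xmin, ymin, xmax, ymax]`, a hypothetical conversion helper might look like this (it assumes Clarifai expects coordinates normalized to the 0-1 range; `to_region` and its output dict are illustrative, not part of the library):

```python
# Hypothetical helper: map a pixel-space [xmin, ymin, xmax, ymax] box to the
# relative top/left/bottom/right fields used by the BoundingBox proto above.
def to_region(bbox, width, height):
    xmin, ymin, xmax, ymax = bbox
    return {
        "top_row": ymin / height,     # y_min, normalized
        "left_col": xmin / width,     # x_min, normalized
        "bottom_row": ymax / height,  # y_max, normalized
        "right_col": xmax / width,    # x_max, normalized
    }

# e.g. a 100x200-pixel box at (10, 20) inside a 200x400 image
print(to_region([10, 20, 110, 220], 200, 400))
```

As the comments in the proto note, `top_row` must stay below `bottom_row` and `left_col` below `right_col`, which holds whenever `xmin < xmax` and `ymin < ymax` in the source box.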
@@ -1,60 +0,0 @@
-from typing import Iterator, List
-
-from clarifai_grpc.grpc.api import resources_pb2
-from google.protobuf.struct_pb2 import Struct
-from tqdm import tqdm
-
-from .base import ClarifaiDataset
-
-
-class TextClassificationDataset(ClarifaiDataset):
-  """
-  Upload text classification datasets to Clarifai datasets.
-  """
-
-  def __init__(self, datagen_object: Iterator, dataset_id: str, split: str) -> None:
-    super().__init__(datagen_object, dataset_id, split)
-    self._extract_protos()
-
-  def create_input_protos(self, text_input: str, labels: List[str], input_id: str, dataset_id: str,
-                          metadata: Struct) -> resources_pb2.Input:
-    """
-    Create input protos for each text, label input pair.
-    Args:
-      `text_input`: text string.
-      `labels`: text labels
-      `input_id`: unique input id
-      `dataset_id`: Clarifai dataset id
-      `metadata`: input metadata
-    Returns:
-      An input proto representing a single row input
-    """
-    input_proto = resources_pb2.Input(
-        id=input_id,
-        dataset_ids=[dataset_id],
-        data=resources_pb2.Data(
-            text=resources_pb2.Text(raw=text_input),
-            concepts=[
-                resources_pb2.Concept(
-                    id=f"id-{''.join(_label.split(' '))}", name=_label, value=1.)
-                for _label in labels
-            ],
-            metadata=metadata))
-
-    return input_proto
-
-  def _extract_protos(self) -> None:
-    """
-    Create input protos for each data generator item.
-    """
-    for i, item in tqdm(enumerate(self.datagen_object), desc="Loading text data"):
-      metadata = Struct()
-      text = item.text
-      labels = item.labels if isinstance(item.labels, list) else [item.labels]  # clarifai concept
-      input_id = f"{self.dataset_id}-{self.split}-{i}" if item.id is None else f"{self.split}-{str(item.id)}"
-      metadata.update({"split": self.split})
-
-      self.input_ids.append(input_id)
-      input_proto = self.create_input_protos(text, labels, input_id, self.dataset_id, metadata)
-
-      self._all_input_protos[input_id] = input_proto
@@ -1,55 +0,0 @@
-## Datasets Zoo
-
-A collection of data preprocessing modules for popular public datasets that allows compatible upload into Clarifai user app datasets.
-
-## Usage
-
-If a dataset module exists in the zoo, the dataset can be uploaded by creating a python script (or running from the command line) and specifying the dataset module name in the `from_zoo` parameter of the `UploadConfig` class, i.e.
-
-```python
-from clarifai.data_upload.upload import UploadConfig
-
-upload_obj = UploadConfig(
-    user_id="",
-    app_id="",
-    pat="",  # Clarifai user PAT (not Clarifai app PAT)
-    dataset_id="",
-    task="",
-    from_zoo="coco_detection",
-    split="val"  # train, val or test depending on the dataset
-)
-# execute data upload to Clarifai app dataset
-upload_obj.upload_to_clarifai()
-```
-
-## Zoo Datasets
-
-| dataset name | task | module name (.py) | splits |
-| --- | --- | --- | --- |
-| [COCO 2017](https://cocodataset.org/#download) | Detection | `coco_detection` | `train`, `val` |
-| | Segmentation | `coco_segmentation` | `train`, `val` |
-| | Captions | `coco_captions` | `train`, `val` |
-| [xVIEW](http://xviewdataset.org/) | Detection | `xview_detection` | `train` |
-| [ImageNet](https://www.image-net.org/) | Classification | `imagenet_classification` | `train` |
-
-## Contributing Modules
-
-A dataset (preprocessing) module is a python script containing a dataset class that implements data download (fetching the dataset from a source to a local disk directory), extraction, and dataloader methods.
-
-The class naming convention is `<datasetname>Dataset`. The dataset class must accept `split` as the only argument of the `__init__` method, and the `dataloader` method must be a generator that yields one of `VisualClassificationFeatures()`, `VisualDetectionFeatures()`, `VisualSegmentationFeatures()` or `TextFeatures()` as defined in [clarifai/data_upload/datasets/features.py](datasets/features.py). Other methods can be added as seen fit, but `dataloader()` is the main method and must strictly be named `dataloader`.
-The existing dataset modules in the zoo can serve as a reference during development.
-
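The conventions above can be sketched as a minimal module skeleton (only the `<datasetname>Dataset` naming, the `split` argument, and the `dataloader` generator are prescribed by the README; everything else here is illustrative):

```python
# Hypothetical zoo dataset module skeleton. The class name, `split` argument
# and `dataloader` generator follow the stated convention; the body is a stub.
class ExampleDataset:
  """Follows the `<datasetname>Dataset` naming convention."""

  def __init__(self, split: str = "train"):
    self.split = split
    # A real module would download/extract the raw data for `split` here.

  def dataloader(self):
    # Must be a generator yielding VisualClassificationFeatures(),
    # VisualDetectionFeatures(), VisualSegmentationFeatures() or
    # TextFeatures() instances (see clarifai/data_upload/datasets/features.py),
    # one per example.
    for _ in ():  # placeholder: iterate over the extracted examples
      yield None
```

The upload machinery then resolves the module via `from_zoo=` and drives `dataloader()` to build input protos, so any extra helper methods are free-form as long as `dataloader` keeps its name and generator contract.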
-## Notes
-
-* Zoo datasets by default first create a `data` directory in the local directory where the call to `UploadConfig(...).upload_to_clarifai()` is made, download the data into this `data` directory, preprocess it, and finally execute the upload to a Clarifai app dataset. For instance, with the COCO dataset modules above, the coco2017 dataset is by default downloaded into a `data` directory, extracted, preprocessed, and then uploaded to Clarifai.
-
-* Taking the above into consideration, to avoid the scripts re-downloading data you already have locally, create a `data` directory in the same directory where you'll make the call to `UploadConfig(...).upload_to_clarifai()` and move your extracted data there. **Ensure that the extracted folder/file names and file structure MATCH those produced when the downloaded zips are extracted.**
-
-* COCO format: to reuse the coco modules above on your own COCO-format data, first make sure the criteria in the two points above are met. Then pass the relevant coco module name to the `from_zoo=` parameter of `UploadConfig()` and invoke the `upload_to_clarifai()` method.
-
-* xVIEW dataset: to upload, you have to register and download the images and labels from [xviewdataset](http://xviewdataset.org/#dataset), then follow the steps above to place the extracted folder in the `data` directory. Finally, pass the xview module name to the `from_zoo=` parameter of `UploadConfig()` and invoke the `upload_to_clarifai()` method.
-
-* ImageNet dataset: ImageNet should be downloaded and placed in the `data` folder along with the [label mapping file](https://www.kaggle.com/competitions/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt):
-
-  <data>/
-  ├── train/
-  └── LOC_synset_mapping.txt