spark-nlp 5.3.3__py2.py3-none-any.whl → 5.4.0rc1__py2.py3-none-any.whl

Potentially problematic release.

spark_nlp-5.4.0rc1.dist-info/METADATA CHANGED
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: spark-nlp
3
- Version: 5.3.3
3
+ Version: 5.4.0rc1
4
4
  Summary: John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment.
5
5
  Home-page: https://github.com/JohnSnowLabs/spark-nlp
6
6
  Author: John Snow Labs
@@ -197,7 +197,7 @@ To use Spark NLP you need the following requirements:
197
197
 
198
198
  **GPU (optional):**
199
199
 
200
- Spark NLP 5.3.3 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. The minimum following NVIDIA® software are only required for GPU support:
200
+ Spark NLP 5.4.0-rc1 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. The minimum following NVIDIA® software are only required for GPU support:
201
201
 
202
202
  - NVIDIA® GPU drivers version 450.80.02 or higher
203
203
  - CUDA® Toolkit 11.2
@@ -213,7 +213,7 @@ $ java -version
213
213
  $ conda create -n sparknlp python=3.7 -y
214
214
  $ conda activate sparknlp
215
215
  # spark-nlp by default is based on pyspark 3.x
216
- $ pip install spark-nlp==5.3.3 pyspark==3.3.1
216
+ $ pip install spark-nlp==5.4.0-rc1 pyspark==3.3.1
217
217
  ```
218
218
 
219
219
  In Python console or Jupyter `Python3` kernel:
@@ -258,7 +258,7 @@ For more examples, you can visit our dedicated [examples](https://github.com/Joh
258
258
 
259
259
  ## Apache Spark Support
260
260
 
261
- Spark NLP *5.3.3* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
261
+ Spark NLP *5.4.0-rc1* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
262
262
 
263
263
  | Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
264
264
  |-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
@@ -302,7 +302,7 @@ Find out more about `Spark NLP` versions from our [release notes](https://github
302
302
 
303
303
  ## Databricks Support
304
304
 
305
- Spark NLP 5.3.3 has been tested and is compatible with the following runtimes:
305
+ Spark NLP 5.4.0-rc1 has been tested and is compatible with the following runtimes:
306
306
 
307
307
  **CPU:**
308
308
 
@@ -375,7 +375,7 @@ Spark NLP 5.3.3 has been tested and is compatible with the following runtimes:
375
375
 
376
376
  ## EMR Support
377
377
 
378
- Spark NLP 5.3.3 has been tested and is compatible with the following EMR releases:
378
+ Spark NLP 5.4.0-rc1 has been tested and is compatible with the following EMR releases:
379
379
 
380
380
  - emr-6.2.0
381
381
  - emr-6.3.0
@@ -425,11 +425,11 @@ Spark NLP supports all major releases of Apache Spark 3.0.x, Apache Spark 3.1.x,
425
425
  ```sh
426
426
  # CPU
427
427
 
428
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
428
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
429
429
 
430
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
430
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
431
431
 
432
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
432
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
433
433
  ```
434
434
 
435
435
  The `spark-nlp` has been published to
@@ -438,11 +438,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
438
438
  ```sh
439
439
  # GPU
440
440
 
441
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.3.3
441
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
442
442
 
443
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.3.3
443
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
444
444
 
445
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.3.3
445
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
446
446
 
447
447
  ```
448
448
 
@@ -452,11 +452,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
452
452
  ```sh
453
453
  # AArch64
454
454
 
455
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.3.3
455
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
456
456
 
457
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.3.3
457
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
458
458
 
459
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.3.3
459
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
460
460
 
461
461
  ```
462
462
 
@@ -466,11 +466,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
466
466
  ```sh
467
467
  # M1/M2 (Apple Silicon)
468
468
 
469
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.3.3
469
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
470
470
 
471
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.3.3
471
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
472
472
 
473
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.3.3
473
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
474
474
 
475
475
  ```
476
476
 
@@ -484,7 +484,7 @@ set in your SparkSession:
484
484
  spark-shell \
485
485
  --driver-memory 16g \
486
486
  --conf spark.kryoserializer.buffer.max=2000M \
487
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
487
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
488
488
  ```
489
489
 
490
490
  ## Scala
@@ -502,7 +502,7 @@ coordinates:
502
502
  <dependency>
503
503
  <groupId>com.johnsnowlabs.nlp</groupId>
504
504
  <artifactId>spark-nlp_2.12</artifactId>
505
- <version>5.3.3</version>
505
+ <version>5.4.0-rc1</version>
506
506
  </dependency>
507
507
  ```
508
508
 
@@ -513,7 +513,7 @@ coordinates:
513
513
  <dependency>
514
514
  <groupId>com.johnsnowlabs.nlp</groupId>
515
515
  <artifactId>spark-nlp-gpu_2.12</artifactId>
516
- <version>5.3.3</version>
516
+ <version>5.4.0-rc1</version>
517
517
  </dependency>
518
518
  ```
519
519
 
@@ -524,7 +524,7 @@ coordinates:
524
524
  <dependency>
525
525
  <groupId>com.johnsnowlabs.nlp</groupId>
526
526
  <artifactId>spark-nlp-aarch64_2.12</artifactId>
527
- <version>5.3.3</version>
527
+ <version>5.4.0-rc1</version>
528
528
  </dependency>
529
529
  ```
530
530
 
@@ -535,7 +535,7 @@ coordinates:
535
535
  <dependency>
536
536
  <groupId>com.johnsnowlabs.nlp</groupId>
537
537
  <artifactId>spark-nlp-silicon_2.12</artifactId>
538
- <version>5.3.3</version>
538
+ <version>5.4.0-rc1</version>
539
539
  </dependency>
540
540
  ```
541
541
 
@@ -545,28 +545,28 @@ coordinates:
545
545
 
546
546
  ```sbtshell
547
547
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp
548
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.3.3"
548
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0-rc1"
549
549
  ```
550
550
 
551
551
  **spark-nlp-gpu:**
552
552
 
553
553
  ```sbtshell
554
554
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu
555
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.3.3"
555
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0-rc1"
556
556
  ```
557
557
 
558
558
  **spark-nlp-aarch64:**
559
559
 
560
560
  ```sbtshell
561
561
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64
562
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.3.3"
562
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0-rc1"
563
563
  ```
564
564
 
565
565
  **spark-nlp-silicon:**
566
566
 
567
567
  ```sbtshell
568
568
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon
569
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.3.3"
569
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0-rc1"
570
570
  ```
571
571
 
572
572
  Maven
@@ -588,7 +588,7 @@ If you installed pyspark through pip/conda, you can install `spark-nlp` through
588
588
  Pip:
589
589
 
590
590
  ```bash
591
- pip install spark-nlp==5.3.3
591
+ pip install spark-nlp==5.4.0-rc1
592
592
  ```
593
593
 
594
594
  Conda:
@@ -617,7 +617,7 @@ spark = SparkSession.builder
617
617
  .config("spark.driver.memory", "16G")
618
618
  .config("spark.driver.maxResultSize", "0")
619
619
  .config("spark.kryoserializer.buffer.max", "2000M")
620
- .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3")
620
+ .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1")
621
621
  .getOrCreate()
622
622
  ```
623
623
 
@@ -688,7 +688,7 @@ Use either one of the following options
688
688
  - Add the following Maven Coordinates to the interpreter's library list
689
689
 
690
690
  ```bash
691
- com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
691
+ com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
692
692
  ```
693
693
 
694
694
  - Add a path to pre-built jar from [here](#compiled-jars) in the interpreter's library list making sure the jar is
@@ -699,7 +699,7 @@ com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
699
699
  Apart from the previous step, install the python module through pip
700
700
 
701
701
  ```bash
702
- pip install spark-nlp==5.3.3
702
+ pip install spark-nlp==5.4.0-rc1
703
703
  ```
704
704
 
705
705
  Or you can install `spark-nlp` from inside Zeppelin by using Conda:
@@ -727,7 +727,7 @@ launch the Jupyter from the same Python environment:
727
727
  $ conda create -n sparknlp python=3.8 -y
728
728
  $ conda activate sparknlp
729
729
  # spark-nlp by default is based on pyspark 3.x
730
- $ pip install spark-nlp==5.3.3 pyspark==3.3.1 jupyter
730
+ $ pip install spark-nlp==5.4.0-rc1 pyspark==3.3.1 jupyter
731
731
  $ jupyter notebook
732
732
  ```
733
733
 
@@ -744,7 +744,7 @@ export PYSPARK_PYTHON=python3
744
744
  export PYSPARK_DRIVER_PYTHON=jupyter
745
745
  export PYSPARK_DRIVER_PYTHON_OPTS=notebook
746
746
 
747
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
747
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
748
748
  ```
749
749
 
750
750
  Alternatively, you can mix in using `--jars` option for pyspark + `pip install spark-nlp`
@@ -771,7 +771,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
771
771
  # -s is for spark-nlp
772
772
  # -g will enable upgrading libcudnn8 to 8.1.0 on Google Colab for GPU usage
773
773
  # by default they are set to the latest
774
- !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.3.3
774
+ !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc1
775
775
  ```
776
776
 
777
777
  [Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb)
@@ -794,7 +794,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
794
794
  # -s is for spark-nlp
795
795
  # -g will enable upgrading libcudnn8 to 8.1.0 on Kaggle for GPU usage
796
796
  # by default they are set to the latest
797
- !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.3.3
797
+ !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc1
798
798
  ```
799
799
 
800
800
  [Spark NLP quick start on Kaggle Kernel](https://www.kaggle.com/mozzie/spark-nlp-named-entity-recognition) is a live
@@ -813,9 +813,9 @@ demo on Kaggle Kernel that performs named entity recognitions by using Spark NLP
813
813
 
814
814
  3. In `Libraries` tab inside your cluster you need to follow these steps:
815
815
 
816
- 3.1. Install New -> PyPI -> `spark-nlp==5.3.3` -> Install
816
+ 3.1. Install New -> PyPI -> `spark-nlp==5.4.0-rc1` -> Install
817
817
 
818
- 3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3` -> Install
818
+ 3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1` -> Install
819
819
 
820
820
  4. Now you can attach your notebook to the cluster and use Spark NLP!
821
821
 
@@ -866,7 +866,7 @@ A sample of your software configuration in JSON on S3 (must be public access):
866
866
  "spark.kryoserializer.buffer.max": "2000M",
867
867
  "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
868
868
  "spark.driver.maxResultSize": "0",
869
- "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3"
869
+ "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1"
870
870
  }
871
871
  }]
872
872
  ```
@@ -875,7 +875,7 @@ A sample of AWS CLI to launch EMR cluster:
875
875
 
876
876
  ```.sh
877
877
  aws emr create-cluster \
878
- --name "Spark NLP 5.3.3" \
878
+ --name "Spark NLP 5.4.0-rc1" \
879
879
  --release-label emr-6.2.0 \
880
880
  --applications Name=Hadoop Name=Spark Name=Hive \
881
881
  --instance-type m4.4xlarge \
@@ -939,7 +939,7 @@ gcloud dataproc clusters create ${CLUSTER_NAME} \
939
939
  --enable-component-gateway \
940
940
  --metadata 'PIP_PACKAGES=spark-nlp spark-nlp-display google-cloud-bigquery google-cloud-storage' \
941
941
  --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/python/pip-install.sh \
942
- --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
942
+ --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
943
943
  ```
944
944
 
945
945
  2. On an existing one, you need to install spark-nlp and spark-nlp-display packages from PyPI.
@@ -982,7 +982,7 @@ spark = SparkSession.builder
982
982
  .config("spark.kryoserializer.buffer.max", "2000m")
983
983
  .config("spark.jsl.settings.pretrained.cache_folder", "sample_data/pretrained")
984
984
  .config("spark.jsl.settings.storage.cluster_tmp_dir", "sample_data/storage")
985
- .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3")
985
+ .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1")
986
986
  .getOrCreate()
987
987
  ```
988
988
 
@@ -996,7 +996,7 @@ spark-shell \
996
996
  --conf spark.kryoserializer.buffer.max=2000M \
997
997
  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
998
998
  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
999
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
999
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
1000
1000
  ```
1001
1001
 
1002
1002
  **pyspark:**
@@ -1009,7 +1009,7 @@ pyspark \
1009
1009
  --conf spark.kryoserializer.buffer.max=2000M \
1010
1010
  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
1011
1011
  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
1012
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.3
1012
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
1013
1013
  ```
1014
1014
 
1015
1015
  **Databricks:**
@@ -1281,7 +1281,7 @@ spark = SparkSession.builder
1281
1281
  .config("spark.driver.memory", "16G")
1282
1282
  .config("spark.driver.maxResultSize", "0")
1283
1283
  .config("spark.kryoserializer.buffer.max", "2000M")
1284
- .config("spark.jars", "/tmp/spark-nlp-assembly-5.3.3.jar")
1284
+ .config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0-rc1.jar")
1285
1285
  .getOrCreate()
1286
1286
  ```
1287
1287
 
@@ -1290,7 +1290,7 @@ spark = SparkSession.builder
1290
1290
  version (3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x)
1291
1291
  - If you are local, you can load the Fat JAR from your local FileSystem, however, if you are in a cluster setup you need
1292
1292
  to put the Fat JAR on a distributed FileSystem such as HDFS, DBFS, S3, etc. (
1293
- i.e., `hdfs:///tmp/spark-nlp-assembly-5.3.3.jar`)
1293
+ i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0-rc1.jar`)
1294
1294
 
1295
1295
  Example of using pretrained Models and Pipelines in offline:
1296
1296
 
spark_nlp-5.4.0rc1.dist-info/RECORD CHANGED
@@ -1,7 +1,7 @@
1
1
  com/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
2
2
  com/johnsnowlabs/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
3
3
  com/johnsnowlabs/nlp/__init__.py,sha256=DPIVXtONO5xXyOk-HB0-sNiHAcco17NN13zPS_6Uw8c,294
4
- sparknlp/__init__.py,sha256=ZUkW_iY3tWQwa5XvLKprnbvY0_hTCOHJSYWb-KNrvmE,13588
4
+ sparknlp/__init__.py,sha256=LfjmvNVvUTsHxDW1JoM5qb3yWDjqz97vHf1iSybI1b4,13596
5
5
  sparknlp/annotation.py,sha256=I5zOxG5vV2RfPZfqN9enT1i4mo6oBcn3Lrzs37QiOiA,5635
6
6
  sparknlp/annotation_audio.py,sha256=iRV_InSVhgvAwSRe9NTbUH9v6OGvTM-FPCpSAKVu0mE,1917
7
7
  sparknlp/annotation_image.py,sha256=xhCe8Ko-77XqWVuuYHFrjKqF6zPd8Z-RY_rmZXNwCXU,2547
@@ -80,27 +80,27 @@ sparknlp/annotator/dependency/dependency_parser.py,sha256=SxyvHPp8Hs1Xnm5X1nLTMi
80
80
  sparknlp/annotator/dependency/typed_dependency_parser.py,sha256=60vPdYkbFk9MPGegg3m9Uik9cMXpMZd8tBvXG39gNww,12456
81
81
  sparknlp/annotator/embeddings/__init__.py,sha256=XQ6-UMsfvH54u3f0yceKiM8XJOAugIT3jwHE3ExoppI,2156
82
82
  sparknlp/annotator/embeddings/albert_embeddings.py,sha256=6Rd1LIn8oFIpq_ALcJh-RUjPEO7Ht8wsHY6JHSFyMkw,9995
83
- sparknlp/annotator/embeddings/bert_embeddings.py,sha256=uExpIlJNkQpuoZ3J_Zc2b2dV0hDNCRCAujNR4Lckly4,8369
84
- sparknlp/annotator/embeddings/bert_sentence_embeddings.py,sha256=XHls9qOkurwg9o6nDuwk77KSMNJmv1n4L5pcU22alWA,9054
83
+ sparknlp/annotator/embeddings/bert_embeddings.py,sha256=HVUjkg56kBcpGZCo-fmPG5uatMDF3swW_lnbpy1SgSI,8463
84
+ sparknlp/annotator/embeddings/bert_sentence_embeddings.py,sha256=NQy9KuXT9aKsTpYCR5RAeoFWI2YqEGorbdYrf_0KKmw,9148
85
85
  sparknlp/annotator/embeddings/bge_embeddings.py,sha256=FNmYxcynM1iLJvg5ZNmrZKkyIF0Gtr7G-CgZ72mrVyU,7842
86
86
  sparknlp/annotator/embeddings/camembert_embeddings.py,sha256=dBTXas-2Tas_JUR9Xt_GtHLcyqi_cdvT5EHRnyVrSSQ,8817
87
87
  sparknlp/annotator/embeddings/chunk_embeddings.py,sha256=WUmkJimSuFkdcLJnvcxOV0QlCLgGlhub29ZTrZb70WE,6052
88
88
  sparknlp/annotator/embeddings/deberta_embeddings.py,sha256=_b5nzLb7heFQNN-uT2oBNO6-YmM8bHmAdnGXg47HOWw,8649
89
89
  sparknlp/annotator/embeddings/distil_bert_embeddings.py,sha256=4pyMCsbvvXYeTGIMVUir9wCDKR_1f_HKtXZrTDO1Thc,9275
90
90
  sparknlp/annotator/embeddings/doc2vec.py,sha256=Xk3MdEkXatX9lRgbFbAdnIDrLgIxzUIGWFBZeo9BTq0,13226
91
- sparknlp/annotator/embeddings/e5_embeddings.py,sha256=dfPHCAYpayCUMxXtol0t68cDs8-JVu0M4EslimwNS0Q,7684
91
+ sparknlp/annotator/embeddings/e5_embeddings.py,sha256=_f5k-EDa_zSox4INeLBGS3gYO16WrVVKBsU0guVqxkk,7779
92
92
  sparknlp/annotator/embeddings/elmo_embeddings.py,sha256=KV-KPs0Pq_OpPaHsnqBz2k_S7VdzyFZ4632IeFNKqJ8,9858
93
93
  sparknlp/annotator/embeddings/instructor_embeddings.py,sha256=CTKmbuBOx_KBM4JM-Y1U5LyR-6rrnpoBGbgGE_axS1c,8670
94
94
  sparknlp/annotator/embeddings/longformer_embeddings.py,sha256=jS4fxB5O0-d9ta9VKv8ai-17n5YHt5rML8QxUw7K4Io,8754
95
95
  sparknlp/annotator/embeddings/mpnet_embeddings.py,sha256=2sabImn5spYGzfNwBSH2zUU90Wjqrm2btCVbDbtsqPg,7796
96
- sparknlp/annotator/embeddings/roberta_embeddings.py,sha256=V4HGDUK2YBHhAZd1ygJEGUmxDgul0MrpKDm1UQcNqTs,9135
96
+ sparknlp/annotator/embeddings/roberta_embeddings.py,sha256=q_WHby2lDcPc5bVHkGc6X_GwT3qyDUBLUVz5ZW4HCSY,9229
97
97
  sparknlp/annotator/embeddings/roberta_sentence_embeddings.py,sha256=KVrD4z_tIU-sphK6dmbbnHBBt8-Y89C_BFQAkN99kZo,8181
98
98
  sparknlp/annotator/embeddings/sentence_embeddings.py,sha256=azuA1FKMtTJ9suwJqTEHeWHumT6kYdfURTe_1fsqcB8,5402
99
99
  sparknlp/annotator/embeddings/uae_embeddings.py,sha256=sqTT67vcegVxcyoATISLPJSmOnA6J_otB6iREKOb6e4,8794
100
100
  sparknlp/annotator/embeddings/universal_sentence_encoder.py,sha256=_fTo-K78RjxiIKptpsI32mpW87RFCdXM16epHv4RVQY,8571
101
101
  sparknlp/annotator/embeddings/word2vec.py,sha256=UBhA4qUczQOx1t82Eu51lxx1-wJ_RLnCb__ncowSNhk,13229
102
102
  sparknlp/annotator/embeddings/word_embeddings.py,sha256=CQxjx2yDdmSM9s8D-bzsbUQhT8t1cqC4ynxlf9INpMU,15388
103
- sparknlp/annotator/embeddings/xlm_roberta_embeddings.py,sha256=t-Bg1bQcqI_fIqUWQbHt9rHK2_tyq0YXiq3uMw4xb94,9488
103
+ sparknlp/annotator/embeddings/xlm_roberta_embeddings.py,sha256=S2HHXOrSFXMAyloZUXJFNXL0-9wrZ32blsAhLB3Za1w,9582
104
104
  sparknlp/annotator/embeddings/xlm_roberta_sentence_embeddings.py,sha256=ojxD3H2VgDEn-RzDdCz0X485pojHBAFrlzsNemI05bY,8602
105
105
  sparknlp/annotator/embeddings/xlnet_embeddings.py,sha256=hJrlsJeO3D7uz54xiEiqqXEbq24YGuWz8U652PV9fNE,9336
106
106
  sparknlp/annotator/er/__init__.py,sha256=eF9Z-PanVfZWSVN2HSFbE7QjCDb6NYV5ESn6geYKlek,692
@@ -139,7 +139,7 @@ sparknlp/annotator/sentiment/vivekn_sentiment.py,sha256=4rpXWDgzU6ddnbrSCp9VdLb2
139
139
  sparknlp/annotator/seq2seq/__init__.py,sha256=UQK-_3wLkUdW1piGudCx1_k3Tg3tERZJYOBnfMRj8pA,1011
140
140
  sparknlp/annotator/seq2seq/bart_transformer.py,sha256=I1flM4yeCzEAKOdQllBC30XuedxVJ7ferkFhZ6gwEbE,18481
141
141
  sparknlp/annotator/seq2seq/gpt2_transformer.py,sha256=Oz95R_NRR4tWHu_bW6Ak2832ZILXycp3ify7LfRSi8o,15310
142
- sparknlp/annotator/seq2seq/llama2_transformer.py,sha256=YPge5f4qfv7XZY_LoH2HRzvbZ--XoTTY_BupxxYaCd8,13862
142
+ sparknlp/annotator/seq2seq/llama2_transformer.py,sha256=3LzTR0VerFdFmOizsrs2Q7HTnjELJ5WtfUgx5XnOqGM,13898
143
143
  sparknlp/annotator/seq2seq/m2m100_transformer.py,sha256=fTFGFWaFfJt5kaLvnYknf_23PVyjBuha48asFsE_NaE,16082
144
144
  sparknlp/annotator/seq2seq/marian_transformer.py,sha256=mQ4Ylh7ZzXAOue8f-x0gqzfS3vAz3XUdD7eQ2XhcEs4,13781
145
145
  sparknlp/annotator/seq2seq/t5_transformer.py,sha256=wDVxNLluIU1HGZFqaKKc4YTt4l-elPlAtQ7EEa0f5tg,17308
@@ -183,7 +183,7 @@ sparknlp/common/read_as.py,sha256=imxPGwV7jr4Li_acbo0OAHHRGCBbYv-akzEGaBWEfcY,12
183
183
  sparknlp/common/recursive_annotator_approach.py,sha256=vqugBw22cE3Ff7PIpRlnYFuOlchgL0nM26D8j-NdpqU,1449
184
184
  sparknlp/common/storage.py,sha256=D91H3p8EIjNspjqAYu6ephRpCUtdcAir4_PrAbkIQWE,4842
185
185
  sparknlp/common/utils.py,sha256=Yne6yYcwKxhOZC-U4qfYoDhWUP_6BIaAjI5X_P_df1E,1306
186
- sparknlp/internal/__init__.py,sha256=ymZxTXlIf6e_wWEBCVI727zq2EP4nD5z97BWmJDuKlo,26725
186
+ sparknlp/internal/__init__.py,sha256=7kGauV0ncpqnrPzagaXefApSKyzuxYZTO1myeVZ6LJ8,26929
187
187
  sparknlp/internal/annotator_java_ml.py,sha256=UGPoThG0rGXUOXGSQnDzEDW81Mu1s5RPF29v7DFyE3c,1187
188
188
  sparknlp/internal/annotator_transformer.py,sha256=fXmc2IWXGybqZpbEU9obmbdBYPc798y42zvSB4tqV9U,1448
189
189
  sparknlp/internal/extended_java_wrapper.py,sha256=hwP0133-hDiDf5sBF-P3MtUsuuDj1PpQbtGZQIRwzfk,2240
@@ -225,8 +225,8 @@ sparknlp/training/_tf_graph_builders_1x/ner_dl/dataset_encoder.py,sha256=R4yHFN3
225
225
  sparknlp/training/_tf_graph_builders_1x/ner_dl/ner_model.py,sha256=EoCSdcIjqQ3wv13MAuuWrKV8wyVBP0SbOEW41omHlR0,23189
226
226
  sparknlp/training/_tf_graph_builders_1x/ner_dl/ner_model_saver.py,sha256=k5CQ7gKV6HZbZMB8cKLUJuZxoZWlP_DFWdZ--aIDwsc,2356
227
227
  sparknlp/training/_tf_graph_builders_1x/ner_dl/sentence_grouper.py,sha256=pAxjWhjazSX8Vg0MFqJiuRVw1IbnQNSs-8Xp26L4nko,870
228
- spark_nlp-5.3.3.dist-info/.uuid,sha256=1f6hF51aIuv9yCvh31NU9lOpS34NE-h3a0Et7R9yR6A,36
229
- spark_nlp-5.3.3.dist-info/METADATA,sha256=YSJq8MiAoRizhOjb8zUeMBqNzNAL1rDEVW5MWy_Q37c,57087
230
- spark_nlp-5.3.3.dist-info/WHEEL,sha256=bb2Ot9scclHKMOLDEHY6B2sicWOgugjFKaJsT7vwMQo,110
231
- spark_nlp-5.3.3.dist-info/top_level.txt,sha256=uuytur4pyMRw2H_txNY2ZkaucZHUs22QF8-R03ch_-E,13
232
- spark_nlp-5.3.3.dist-info/RECORD,,
228
+ spark_nlp-5.4.0rc1.dist-info/.uuid,sha256=1f6hF51aIuv9yCvh31NU9lOpS34NE-h3a0Et7R9yR6A,36
229
+ spark_nlp-5.4.0rc1.dist-info/METADATA,sha256=cEBGxVSbWrCQInnlujccn9xgqznzSHsX1QiNjQJmobU,57266
230
+ spark_nlp-5.4.0rc1.dist-info/WHEEL,sha256=bb2Ot9scclHKMOLDEHY6B2sicWOgugjFKaJsT7vwMQo,110
231
+ spark_nlp-5.4.0rc1.dist-info/top_level.txt,sha256=uuytur4pyMRw2H_txNY2ZkaucZHUs22QF8-R03ch_-E,13
232
+ spark_nlp-5.4.0rc1.dist-info/RECORD,,
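
Note on the two version spellings: the wheel and its dist-info directory are named `5.4.0rc1`, while the README commands and `sparknlp.version()` use `5.4.0-rc1`. Under PEP 440 normalization both spellings denote the same release candidate, so a `pip install spark-nlp==5.4.0-rc1` is expected to match this wheel. The Maven coordinates are a separate matter and use `5.4.0-rc1` literally. The sketch below is only an illustration: `normalize_prerelease` is a hypothetical helper that covers the release-candidate rules and nothing else; for real work, the `packaging` library implements PEP 440 in full.

```python
import re

# Hypothetical helper, not part of spark-nlp or pip. It covers only the
# release-candidate spellings and separators from PEP 440, not the whole spec.
_RC = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)[-_.]?(?:rc|c|pre|preview)[-_.]?(?P<num>\d*)$",
    re.IGNORECASE,
)


def normalize_prerelease(version: str) -> str:
    """Return the PEP 440 normal form for an 'X.Y.Z-rcN' style version."""
    match = _RC.match(version.strip())
    if match is None:
        # Not a release-candidate spelling this sketch understands; leave as-is.
        return version.strip()
    number = match.group("num") or "0"  # an omitted number is read as 0
    return f"{match.group('release')}rc{int(number)}"


readme_version = "5.4.0-rc1"  # spelling used in the README and version()
wheel_version = "5.4.0rc1"    # spelling used in the wheel and dist-info names
same_release = normalize_prerelease(readme_version) == normalize_prerelease(wheel_version)
print(same_release)
```

A final release such as `5.3.3` passes through unchanged, because the pattern only matches when a release-candidate marker is present.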
sparknlp/__init__.py CHANGED
@@ -128,7 +128,7 @@ def start(gpu=False,
128
128
  The initiated Spark session.
129
129
 
130
130
  """
131
- current_version = "5.3.3"
131
+ current_version = "5.4.0-rc1"
132
132
 
133
133
  if params is None:
134
134
  params = {}
@@ -309,4 +309,4 @@ def version():
309
309
  str
310
310
  The current Spark NLP version.
311
311
  """
312
- return '5.3.3'
312
+ return '5.4.0-rc1'
sparknlp/annotator/embeddings/bert_embeddings.py CHANGED
@@ -164,7 +164,7 @@ class BertEmbeddings(AnnotatorModel,
164
164
  )
165
165
 
166
166
  @staticmethod
167
- def loadSavedModel(folder, spark_session):
167
+ def loadSavedModel(folder, spark_session, use_openvino=False):
168
168
  """Loads a locally saved model.
169
169
 
170
170
  Parameters
@@ -173,6 +173,8 @@ class BertEmbeddings(AnnotatorModel,
173
173
  Folder of the saved model
174
174
  spark_session : pyspark.sql.SparkSession
175
175
  The current SparkSession
176
+ use_openvino: bool
177
+ Use OpenVINO backend
176
178
 
177
179
  Returns
178
180
  -------
@@ -180,7 +182,7 @@ class BertEmbeddings(AnnotatorModel,
180
182
  The restored model
181
183
  """
182
184
  from sparknlp.internal import _BertLoader
183
- jModel = _BertLoader(folder, spark_session._jsparkSession)._java_obj
185
+ jModel = _BertLoader(folder, spark_session._jsparkSession, use_openvino)._java_obj
184
186
  return BertEmbeddings(java_model=jModel)
185
187
 
186
188
  @staticmethod
sparknlp/annotator/embeddings/bert_sentence_embeddings.py CHANGED
@@ -180,7 +180,7 @@ class BertSentenceEmbeddings(AnnotatorModel,
180
180
  )
181
181
 
182
182
  @staticmethod
183
- def loadSavedModel(folder, spark_session):
183
+ def loadSavedModel(folder, spark_session, use_openvino=False):
184
184
  """Loads a locally saved model.
185
185
 
186
186
  Parameters
@@ -189,6 +189,8 @@ class BertSentenceEmbeddings(AnnotatorModel,
189
189
  Folder of the saved model
190
190
  spark_session : pyspark.sql.SparkSession
191
191
  The current SparkSession
192
+ use_openvino: bool
193
+ Use OpenVINO backend
192
194
 
193
195
  Returns
194
196
  -------
@@ -196,7 +198,7 @@ class BertSentenceEmbeddings(AnnotatorModel,
196
198
  The restored model
197
199
  """
198
200
  from sparknlp.internal import _BertSentenceLoader
199
- jModel = _BertSentenceLoader(folder, spark_session._jsparkSession)._java_obj
201
+ jModel = _BertSentenceLoader(folder, spark_session._jsparkSession, use_openvino)._java_obj
200
202
  return BertSentenceEmbeddings(java_model=jModel)
201
203
 
202
204
  @staticmethod
sparknlp/annotator/embeddings/e5_embeddings.py CHANGED
@@ -149,7 +149,7 @@ class E5Embeddings(AnnotatorModel,
149
149
  )
150
150
 
151
151
  @staticmethod
152
- def loadSavedModel(folder, spark_session):
152
+ def loadSavedModel(folder, spark_session, use_openvino=False):
153
153
  """Loads a locally saved model.
154
154
 
155
155
  Parameters
@@ -158,6 +158,8 @@ class E5Embeddings(AnnotatorModel,
158
158
  Folder of the saved model
159
159
  spark_session : pyspark.sql.SparkSession
160
160
  The current SparkSession
161
+ use_openvino : bool
162
+ Use OpenVINO backend
161
163
 
162
164
  Returns
163
165
  -------
@@ -165,7 +167,7 @@ class E5Embeddings(AnnotatorModel,
165
167
  The restored model
166
168
  """
167
169
  from sparknlp.internal import _E5Loader
168
- jModel = _E5Loader(folder, spark_session._jsparkSession)._java_obj
170
+ jModel = _E5Loader(folder, spark_session._jsparkSession, use_openvino)._java_obj
169
171
  return E5Embeddings(java_model=jModel)
170
172
 
171
173
  @staticmethod
sparknlp/annotator/embeddings/roberta_embeddings.py CHANGED
@@ -181,7 +181,7 @@ class RoBertaEmbeddings(AnnotatorModel,
181
181
  )
182
182
 
183
183
  @staticmethod
184
- def loadSavedModel(folder, spark_session):
184
+ def loadSavedModel(folder, spark_session, use_openvino=False):
185
185
  """Loads a locally saved model.
186
186
 
187
187
  Parameters
@@ -190,6 +190,8 @@ class RoBertaEmbeddings(AnnotatorModel,
190
190
  Folder of the saved model
191
191
  spark_session : pyspark.sql.SparkSession
192
192
  The current SparkSession
193
+ use_openvino: bool
194
+ Use OpenVINO backend
193
195
 
194
196
  Returns
195
197
  -------
@@ -197,7 +199,7 @@ class RoBertaEmbeddings(AnnotatorModel,
197
199
  The restored model
198
200
  """
199
201
  from sparknlp.internal import _RoBertaLoader
200
- jModel = _RoBertaLoader(folder, spark_session._jsparkSession)._java_obj
202
+ jModel = _RoBertaLoader(folder, spark_session._jsparkSession, use_openvino)._java_obj
201
203
  return RoBertaEmbeddings(java_model=jModel)
202
204
 
203
205
  @staticmethod
sparknlp/annotator/embeddings/xlm_roberta_embeddings.py CHANGED
@@ -181,7 +181,7 @@ class XlmRoBertaEmbeddings(AnnotatorModel,
181
181
  )
182
182
 
183
183
  @staticmethod
184
- def loadSavedModel(folder, spark_session):
184
+ def loadSavedModel(folder, spark_session, use_openvino=False):
185
185
  """Loads a locally saved model.
186
186
 
187
187
  Parameters
@@ -190,6 +190,8 @@ class XlmRoBertaEmbeddings(AnnotatorModel,
190
190
  Folder of the saved model
191
191
  spark_session : pyspark.sql.SparkSession
192
192
  The current SparkSession
193
+ use_openvino: bool
194
+ Use OpenVINO backend
193
195
 
194
196
  Returns
195
197
  -------
@@ -197,7 +199,7 @@ class XlmRoBertaEmbeddings(AnnotatorModel,
197
199
  The restored model
198
200
  """
199
201
  from sparknlp.internal import _XlmRoBertaLoader
200
- jModel = _XlmRoBertaLoader(folder, spark_session._jsparkSession)._java_obj
202
+ jModel = _XlmRoBertaLoader(folder, spark_session._jsparkSession, use_openvino)._java_obj
201
203
  return XlmRoBertaEmbeddings(java_model=jModel)
202
204
 
203
205
  @staticmethod
sparknlp/annotator/seq2seq/llama2_transformer.py CHANGED
@@ -301,7 +301,7 @@ class LLAMA2Transformer(AnnotatorModel, HasBatchedAnnotate, HasEngine):
301
301
  )
302
302
 
303
303
  @staticmethod
304
- def loadSavedModel(folder, spark_session):
304
+ def loadSavedModel(folder, spark_session, use_openvino = False):
305
305
  """Loads a locally saved model.
306
306
 
307
307
  Parameters
@@ -317,7 +317,7 @@ class LLAMA2Transformer(AnnotatorModel, HasBatchedAnnotate, HasEngine):
317
317
  The restored model
318
318
  """
319
319
  from sparknlp.internal import _LLAMA2Loader
320
- jModel = _LLAMA2Loader(folder, spark_session._jsparkSession)._java_obj
320
+ jModel = _LLAMA2Loader(folder, spark_session._jsparkSession, use_openvino)._java_obj
321
321
  return LLAMA2Transformer(java_model=jModel)
322
322
 
323
323
  @staticmethod
sparknlp/internal/__init__.py CHANGED
@@ -49,14 +49,14 @@ class _AlbertQuestionAnsweringLoader(ExtendedJavaWrapper):
49
49
 
50
50
 
51
51
  class _BertLoader(ExtendedJavaWrapper):
52
- def __init__(self, path, jspark):
53
- super(_BertLoader, self).__init__("com.johnsnowlabs.nlp.embeddings.BertEmbeddings.loadSavedModel", path, jspark)
52
+ def __init__(self, path, jspark, use_openvino=False):
53
+ super(_BertLoader, self).__init__("com.johnsnowlabs.nlp.embeddings.BertEmbeddings.loadSavedModel", path, jspark, use_openvino)
54
54
 
55
55
 
56
56
  class _BertSentenceLoader(ExtendedJavaWrapper):
57
- def __init__(self, path, jspark):
57
+ def __init__(self, path, jspark, use_openvino=False):
58
58
  super(_BertSentenceLoader, self).__init__(
59
- "com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings.loadSavedModel", path, jspark)
59
+ "com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings.loadSavedModel", path, jspark, use_openvino)
60
60
 
61
61
 
62
62
  class _BertSequenceClassifierLoader(ExtendedJavaWrapper):
@@ -144,8 +144,8 @@ class _ElmoLoader(ExtendedJavaWrapper):
144
144
 
145
145
 
146
146
  class _E5Loader(ExtendedJavaWrapper):
147
- def __init__(self, path, jspark):
148
- super(_E5Loader, self).__init__("com.johnsnowlabs.nlp.embeddings.E5Embeddings.loadSavedModel", path, jspark)
147
+ def __init__(self, path, jspark, use_openvino=False):
148
+ super(_E5Loader, self).__init__("com.johnsnowlabs.nlp.embeddings.E5Embeddings.loadSavedModel", path, jspark, use_openvino)
149
149
 
150
150
 
151
151
  class _BGELoader(ExtendedJavaWrapper):
@@ -160,9 +160,9 @@ class _GPT2Loader(ExtendedJavaWrapper):
160
160
 
161
161
 
162
162
  class _LLAMA2Loader(ExtendedJavaWrapper):
163
- def __init__(self, path, jspark):
163
+ def __init__(self, path, jspark, use_openvino=False):
164
164
  super(_LLAMA2Loader, self).__init__(
165
- "com.johnsnowlabs.nlp.annotators.seq2seq.LLAMA2Transformer.loadSavedModel", path, jspark)
165
+ "com.johnsnowlabs.nlp.annotators.seq2seq.LLAMA2Transformer.loadSavedModel", path, jspark, use_openvino)
166
166
 
167
167
 
168
168
  class _LongformerLoader(ExtendedJavaWrapper):
@@ -212,9 +212,9 @@ class _MPNetLoader(ExtendedJavaWrapper):
212
212
 
213
213
 
214
214
  class _RoBertaLoader(ExtendedJavaWrapper):
215
- def __init__(self, path, jspark):
215
+ def __init__(self, path, jspark, use_openvino=False):
216
216
  super(_RoBertaLoader, self).__init__("com.johnsnowlabs.nlp.embeddings.RoBertaEmbeddings.loadSavedModel", path,
217
- jspark)
217
+ jspark, use_openvino)
218
218
 
219
219
 
220
220
  class _RoBertaSentenceLoader(ExtendedJavaWrapper):
@@ -261,9 +261,9 @@ class _USELoader(ExtendedJavaWrapper):
261
261
 
262
262
 
263
263
  class _XlmRoBertaLoader(ExtendedJavaWrapper):
264
- def __init__(self, path, jspark):
264
+ def __init__(self, path, jspark, use_openvino=False):
265
265
  super(_XlmRoBertaLoader, self).__init__("com.johnsnowlabs.nlp.embeddings.XlmRoBertaEmbeddings.loadSavedModel",
266
- path, jspark)
266
+ path, jspark, use_openvino)
267
267
 
268
268
 
269
269
  class _XlmRoBertaSentenceLoader(ExtendedJavaWrapper):
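
Note on the `use_openvino` change: the six annotators touched above (BertEmbeddings, BertSentenceEmbeddings, E5Embeddings, RoBertaEmbeddings, XlmRoBertaEmbeddings, LLAMA2Transformer) all follow one pattern. `loadSavedModel` gains an optional `use_openvino` argument defaulting to `False`, and the matching loader forwards it as a third positional argument. The sketch below reproduces that pattern with stand-in classes so it runs without Spark or a JVM. `FakeJavaWrapper`, `_DemoLoader`, `DemoEmbeddings`, and the `demo.` target string are hypothetical names for illustration and are not part of spark-nlp.

```python
class FakeJavaWrapper:
    """Hypothetical stand-in for ExtendedJavaWrapper.

    It records the target and arguments instead of calling into the JVM.
    """

    def __init__(self, java_target, *args):
        self.java_target = java_target
        self.args = args


class _DemoLoader(FakeJavaWrapper):
    # Mirrors the loader signature in the diff: the new flag defaults to False.
    def __init__(self, path, jspark, use_openvino=False):
        super().__init__("demo.DemoEmbeddings.loadSavedModel", path, jspark, use_openvino)


class DemoEmbeddings:
    def __init__(self, loader):
        self.loader = loader

    @staticmethod
    def loadSavedModel(folder, spark_session, use_openvino=False):
        # Forward the flag unchanged, as the diffed annotators do.
        return DemoEmbeddings(_DemoLoader(folder, spark_session, use_openvino))


# A two-argument call written against 5.3.3 still works and sends False.
old_style = DemoEmbeddings.loadSavedModel("/tmp/model", "session")
# Opting in is explicit.
new_style = DemoEmbeddings.loadSavedModel("/tmp/model", "session", use_openvino=True)
print(old_style.loader.args)
print(new_style.loader.args)
```

Because the default is `False`, existing callers are unaffected on the Python side. One thing worth checking before upgrading: the loaders now always send three arguments, so the Python package presumably needs a JAR whose `loadSavedModel` accepts the extra flag.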