spark_nlp-5.4.0rc1-py2.py3-none-any.whl → spark_nlp-5.4.0rc2-py2.py3-none-any.whl

This diff compares the contents of two publicly released versions of the package, as published to one of the supported registries. It is provided for informational purposes only.

Potentially problematic release.

spark_nlp-5.4.0rc1.dist-info/METADATA CHANGED
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: spark-nlp
- Version: 5.4.0rc1
+ Version: 5.4.0rc2
  Summary: John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment.
  Home-page: https://github.com/JohnSnowLabs/spark-nlp
  Author: John Snow Labs
@@ -197,7 +197,7 @@ To use Spark NLP you need the following requirements:
 
  **GPU (optional):**
 
- Spark NLP 5.4.0-rc1 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. The minimum following NVIDIA® software are only required for GPU support:
+ Spark NLP 5.4.0-rc2 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. The minimum following NVIDIA® software are only required for GPU support:
 
  - NVIDIA® GPU drivers version 450.80.02 or higher
  - CUDA® Toolkit 11.2
@@ -213,7 +213,7 @@ $ java -version
  $ conda create -n sparknlp python=3.7 -y
  $ conda activate sparknlp
  # spark-nlp by default is based on pyspark 3.x
- $ pip install spark-nlp==5.4.0-rc1 pyspark==3.3.1
+ $ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1
  ```
 
  In Python console or Jupyter `Python3` kernel:
@@ -258,7 +258,7 @@ For more examples, you can visit our dedicated [examples](https://github.com/Joh
 
  ## Apache Spark Support
 
- Spark NLP *5.4.0-rc1* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+ Spark NLP *5.4.0-rc2* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
 
  | Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
  |-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
@@ -302,7 +302,7 @@ Find out more about `Spark NLP` versions from our [release notes](https://github
 
  ## Databricks Support
 
- Spark NLP 5.4.0-rc1 has been tested and is compatible with the following runtimes:
+ Spark NLP 5.4.0-rc2 has been tested and is compatible with the following runtimes:
 
  **CPU:**
 
@@ -375,7 +375,7 @@ Spark NLP 5.4.0-rc1 has been tested and is compatible with the following runtime
 
  ## EMR Support
 
- Spark NLP 5.4.0-rc1 has been tested and is compatible with the following EMR releases:
+ Spark NLP 5.4.0-rc2 has been tested and is compatible with the following EMR releases:
 
  - emr-6.2.0
  - emr-6.3.0
@@ -425,11 +425,11 @@ Spark NLP supports all major releases of Apache Spark 3.0.x, Apache Spark 3.1.x,
  ```sh
  # CPU
 
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
 
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
 
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  The `spark-nlp` has been published to
@@ -438,11 +438,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
  ```sh
  # GPU
 
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc2
 
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc2
 
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc1
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0-rc2
 
  ```
 
@@ -452,11 +452,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
  ```sh
  # AArch64
 
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc2
 
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc2
 
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc1
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0-rc2
 
  ```
 
@@ -466,11 +466,11 @@ the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/s
  ```sh
  # M1/M2 (Apple Silicon)
 
- spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
+ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc2
 
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc2
 
- spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc1
+ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0-rc2
 
  ```
 
@@ -484,7 +484,7 @@ set in your SparkSession:
  spark-shell \
  --driver-memory 16g \
  --conf spark.kryoserializer.buffer.max=2000M \
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  ## Scala
@@ -502,7 +502,7 @@ coordinates:
  <dependency>
  <groupId>com.johnsnowlabs.nlp</groupId>
  <artifactId>spark-nlp_2.12</artifactId>
- <version>5.4.0-rc1</version>
+ <version>5.4.0-rc2</version>
  </dependency>
  ```
 
@@ -513,7 +513,7 @@ coordinates:
  <dependency>
  <groupId>com.johnsnowlabs.nlp</groupId>
  <artifactId>spark-nlp-gpu_2.12</artifactId>
- <version>5.4.0-rc1</version>
+ <version>5.4.0-rc2</version>
  </dependency>
  ```
 
@@ -524,7 +524,7 @@ coordinates:
  <dependency>
  <groupId>com.johnsnowlabs.nlp</groupId>
  <artifactId>spark-nlp-aarch64_2.12</artifactId>
- <version>5.4.0-rc1</version>
+ <version>5.4.0-rc2</version>
  </dependency>
  ```
 
@@ -535,7 +535,7 @@ coordinates:
  <dependency>
  <groupId>com.johnsnowlabs.nlp</groupId>
  <artifactId>spark-nlp-silicon_2.12</artifactId>
- <version>5.4.0-rc1</version>
+ <version>5.4.0-rc2</version>
  </dependency>
  ```
 
@@ -545,28 +545,28 @@ coordinates:
 
  ```sbtshell
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0-rc1"
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0-rc2"
  ```
 
  **spark-nlp-gpu:**
 
  ```sbtshell
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0-rc1"
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0-rc2"
  ```
 
  **spark-nlp-aarch64:**
 
  ```sbtshell
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0-rc1"
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0-rc2"
  ```
 
  **spark-nlp-silicon:**
 
  ```sbtshell
  // https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon
- libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0-rc1"
+ libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0-rc2"
  ```
 
  Maven
@@ -588,7 +588,7 @@ If you installed pyspark through pip/conda, you can install `spark-nlp` through
  Pip:
 
  ```bash
- pip install spark-nlp==5.4.0-rc1
+ pip install spark-nlp==5.4.0-rc2
  ```
 
  Conda:
@@ -617,7 +617,7 @@ spark = SparkSession.builder
  .config("spark.driver.memory", "16G")
  .config("spark.driver.maxResultSize", "0")
  .config("spark.kryoserializer.buffer.max", "2000M")
- .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1")
+ .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2")
  .getOrCreate()
  ```
 
@@ -688,7 +688,7 @@ Use either one of the following options
  - Add the following Maven Coordinates to the interpreter's library list
 
  ```bash
- com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  - Add a path to pre-built jar from [here](#compiled-jars) in the interpreter's library list making sure the jar is
@@ -699,7 +699,7 @@ com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
  Apart from the previous step, install the python module through pip
 
  ```bash
- pip install spark-nlp==5.4.0-rc1
+ pip install spark-nlp==5.4.0-rc2
  ```
 
  Or you can install `spark-nlp` from inside Zeppelin by using Conda:
@@ -727,7 +727,7 @@ launch the Jupyter from the same Python environment:
  $ conda create -n sparknlp python=3.8 -y
  $ conda activate sparknlp
  # spark-nlp by default is based on pyspark 3.x
- $ pip install spark-nlp==5.4.0-rc1 pyspark==3.3.1 jupyter
+ $ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1 jupyter
  $ jupyter notebook
  ```
 
@@ -744,7 +744,7 @@ export PYSPARK_PYTHON=python3
  export PYSPARK_DRIVER_PYTHON=jupyter
  export PYSPARK_DRIVER_PYTHON_OPTS=notebook
 
- pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  Alternatively, you can mix in using `--jars` option for pyspark + `pip install spark-nlp`
@@ -771,7 +771,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
  # -s is for spark-nlp
  # -g will enable upgrading libcudnn8 to 8.1.0 on Google Colab for GPU usage
  # by default they are set to the latest
- !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc1
+ !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc2
  ```
 
  [Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb)
@@ -794,7 +794,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
  # -s is for spark-nlp
  # -g will enable upgrading libcudnn8 to 8.1.0 on Kaggle for GPU usage
  # by default they are set to the latest
- !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc1
+ !wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc2
  ```
 
  [Spark NLP quick start on Kaggle Kernel](https://www.kaggle.com/mozzie/spark-nlp-named-entity-recognition) is a live
@@ -813,9 +813,9 @@ demo on Kaggle Kernel that performs named entity recognitions by using Spark NLP
 
  3. In `Libraries` tab inside your cluster you need to follow these steps:
 
- 3.1. Install New -> PyPI -> `spark-nlp==5.4.0-rc1` -> Install
+ 3.1. Install New -> PyPI -> `spark-nlp==5.4.0-rc2` -> Install
 
- 3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1` -> Install
+ 3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2` -> Install
 
  4. Now you can attach your notebook to the cluster and use Spark NLP!
 
@@ -866,7 +866,7 @@ A sample of your software configuration in JSON on S3 (must be public access):
  "spark.kryoserializer.buffer.max": "2000M",
  "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
  "spark.driver.maxResultSize": "0",
- "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1"
+ "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2"
  }
  }]
  ```
@@ -875,7 +875,7 @@ A sample of AWS CLI to launch EMR cluster:
 
  ```.sh
  aws emr create-cluster \
- --name "Spark NLP 5.4.0-rc1" \
+ --name "Spark NLP 5.4.0-rc2" \
  --release-label emr-6.2.0 \
  --applications Name=Hadoop Name=Spark Name=Hive \
  --instance-type m4.4xlarge \
@@ -939,7 +939,7 @@ gcloud dataproc clusters create ${CLUSTER_NAME} \
  --enable-component-gateway \
  --metadata 'PIP_PACKAGES=spark-nlp spark-nlp-display google-cloud-bigquery google-cloud-storage' \
  --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/python/pip-install.sh \
- --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  2. On an existing one, you need to install spark-nlp and spark-nlp-display packages from PyPI.
@@ -982,7 +982,7 @@ spark = SparkSession.builder
  .config("spark.kryoserializer.buffer.max", "2000m")
  .config("spark.jsl.settings.pretrained.cache_folder", "sample_data/pretrained")
  .config("spark.jsl.settings.storage.cluster_tmp_dir", "sample_data/storage")
- .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1")
+ .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2")
  .getOrCreate()
  ```
 
@@ -996,7 +996,7 @@ spark-shell \
  --conf spark.kryoserializer.buffer.max=2000M \
  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  **pyspark:**
@@ -1009,7 +1009,7 @@ pyspark \
  --conf spark.kryoserializer.buffer.max=2000M \
  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
- --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc1
+ --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
  ```
 
  **Databricks:**
@@ -1281,7 +1281,7 @@ spark = SparkSession.builder
  .config("spark.driver.memory", "16G")
  .config("spark.driver.maxResultSize", "0")
  .config("spark.kryoserializer.buffer.max", "2000M")
- .config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0-rc1.jar")
+ .config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0-rc2.jar")
  .getOrCreate()
  ```
 
@@ -1290,7 +1290,7 @@ spark = SparkSession.builder
  version (3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x)
  - If you are local, you can load the Fat JAR from your local FileSystem, however, if you are in a cluster setup you need
  to put the Fat JAR on a distributed FileSystem such as HDFS, DBFS, S3, etc. (
- i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0-rc1.jar`)
+ i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0-rc2.jar`)
 
  Example of using pretrained Models and Pipelines in offline:

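Every METADATA hunk above applies the same mechanical change: the 5.4.0-rc1 version string swapped for 5.4.0-rc2 across pip pins, Maven and sbt coordinates, and cluster configuration snippets. A quick way to confirm no rc1 reference survived the bump is to scan the shipped METADATA; a minimal sketch, assuming a locally extracted rc2 wheel (the file path is illustrative):

```python
import re

# Report any leftover rc1 references in the extracted METADATA file.
# After a clean version bump, this prints nothing.
with open("spark_nlp-5.4.0rc2.dist-info/METADATA", encoding="utf-8") as f:
    for lineno, line in enumerate(f, start=1):
        if re.search(r"5\.4\.0-?rc1", line):
            print(lineno, line.rstrip())
```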
spark_nlp-5.4.0rc1.dist-info/RECORD CHANGED
@@ -1,7 +1,7 @@
  com/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  com/johnsnowlabs/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  com/johnsnowlabs/nlp/__init__.py,sha256=DPIVXtONO5xXyOk-HB0-sNiHAcco17NN13zPS_6Uw8c,294
- sparknlp/__init__.py,sha256=LfjmvNVvUTsHxDW1JoM5qb3yWDjqz97vHf1iSybI1b4,13596
+ sparknlp/__init__.py,sha256=0vaMXy2bEKgeX3NK4XpFkj-uCvq07R2AWdTbYLtapNg,13596
  sparknlp/annotation.py,sha256=I5zOxG5vV2RfPZfqN9enT1i4mo6oBcn3Lrzs37QiOiA,5635
  sparknlp/annotation_audio.py,sha256=iRV_InSVhgvAwSRe9NTbUH9v6OGvTM-FPCpSAKVu0mE,1917
  sparknlp/annotation_image.py,sha256=xhCe8Ko-77XqWVuuYHFrjKqF6zPd8Z-RY_rmZXNwCXU,2547
@@ -28,7 +28,7 @@ sparknlp/annotator/audio/__init__.py,sha256=dXjtvi5c0aTZFq1Q_JciUd1uFTBVSJoUdcq0
  sparknlp/annotator/audio/hubert_for_ctc.py,sha256=76PfwPZZvOHU5kfDqLueCFbmqa4W8pMNRGoCvOqjsEA,7859
  sparknlp/annotator/audio/wav2vec2_for_ctc.py,sha256=K78P1U6vA4O1UufsLYzy0H7arsKNmwPcIV7kzDFsA5Q,6210
  sparknlp/annotator/audio/whisper_for_ctc.py,sha256=uII51umuohqwnAW0Q7VdxEFyr_j5LMnfpcRlf8TbetA,9800
- sparknlp/annotator/classifier_dl/__init__.py,sha256=tGg78A8LUgobZFre_3ySN51KGNyl0Zx0inxT9yfL_g8,3686
+ sparknlp/annotator/classifier_dl/__init__.py,sha256=dsbceLBdAsk0VlvgcCcGANHMcyFMKi7-sdyu-Eg41ws,3763
  sparknlp/annotator/classifier_dl/albert_for_question_answering.py,sha256=LG2dL6Fky1T35yXTUZBfIihIIGnkRFQ7ECQ3HRXXEG8,6517
  sparknlp/annotator/classifier_dl/albert_for_sequence_classification.py,sha256=kWx7f9pcKE2qw319gn8FN0Md5dX38gbmfeoY9gWCLNk,7842
  sparknlp/annotator/classifier_dl/albert_for_token_classification.py,sha256=5rdsjWnsAVmtP-idU7ATKJ8lkH2rtlKZLnpi4Mq27eI,6839
@@ -54,6 +54,7 @@ sparknlp/annotator/classifier_dl/longformer_for_sequence_classification.py,sha25
  sparknlp/annotator/classifier_dl/longformer_for_token_classification.py,sha256=RmiFuBRhIAoJoQ8Rgcu997-PxBK1hhWuLVlS1qztMyk,6848
  sparknlp/annotator/classifier_dl/mpnet_for_question_answering.py,sha256=w9hHLrQbDIUHAdCKiXNDneAbohMKopixAKU2wkYkqbs,5522
  sparknlp/annotator/classifier_dl/mpnet_for_sequence_classification.py,sha256=M__giFElL6Q3I88QD6OoXDzdQDk_Zp5sS__Kh_XpLdo,7308
+ sparknlp/annotator/classifier_dl/mpnet_for_token_classification.py,sha256=SgFAJcv7ZE3BmJOehK_CjAaueqaaK6PR33zA5aE9-Ww,6754
  sparknlp/annotator/classifier_dl/multi_classifier_dl.py,sha256=ylKQzS7ROyeKeiOF4BZiIkQV1sfrnfUUQ9LXFSFK_Vo,16045
  sparknlp/annotator/classifier_dl/roberta_for_question_answering.py,sha256=WRxu1uhXnY9C4UHdtJ8qiVGhPSX7sCdSaML0AWHOdJw,6471
  sparknlp/annotator/classifier_dl/roberta_for_sequence_classification.py,sha256=z97uH5WkG8kPX1Y9qtpLwD7egl0kzbVmxtq4xzZgNNI,7857
@@ -63,7 +64,7 @@ sparknlp/annotator/classifier_dl/sentiment_dl.py,sha256=6Z7X3-ykxoaUz6vz-YIXkv2m
  sparknlp/annotator/classifier_dl/tapas_for_question_answering.py,sha256=2YBODMDUZT-j5ceOFTixrEkOqrztIM1kU-tsW_wao18,6317
  sparknlp/annotator/classifier_dl/xlm_roberta_for_question_answering.py,sha256=t_zCnKGCjDccKNj_2mjRkysOaNCWNBMKXehbuFSphQc,6538
  sparknlp/annotator/classifier_dl/xlm_roberta_for_sequence_classification.py,sha256=sudgwa8_QZQzaYvEMSt6J1bDDwyK2Hp1VFhh98P08hY,7930
- sparknlp/annotator/classifier_dl/xlm_roberta_for_token_classification.py,sha256=pe4Y1XDxDMQs1q32bwhbPC5_oKcJ4n5JFu-dsofLdSA,6850
+ sparknlp/annotator/classifier_dl/xlm_roberta_for_token_classification.py,sha256=ub5mMiZYKP4eBmXRzjkjfv_FFFR8E01XJs0RC__RxPo,6808
  sparknlp/annotator/classifier_dl/xlm_roberta_for_zero_shot_classification.py,sha256=4dBzpPj-VJcZul5hGcyjYkVMQ1PiaXZEGwvEaob3rss,8899
  sparknlp/annotator/classifier_dl/xlnet_for_sequence_classification.py,sha256=CI9Ah2lyHkqwDHWGCbkk_gPbQd0NudpC7oXiHtWOucs,7811
  sparknlp/annotator/classifier_dl/xlnet_for_token_classification.py,sha256=SndQpIfslsSYEOX-myLjpUS6-wVIeDG8MOhJYcu2_7M,6739
@@ -82,17 +83,17 @@ sparknlp/annotator/embeddings/__init__.py,sha256=XQ6-UMsfvH54u3f0yceKiM8XJOAugIT
  sparknlp/annotator/embeddings/albert_embeddings.py,sha256=6Rd1LIn8oFIpq_ALcJh-RUjPEO7Ht8wsHY6JHSFyMkw,9995
  sparknlp/annotator/embeddings/bert_embeddings.py,sha256=HVUjkg56kBcpGZCo-fmPG5uatMDF3swW_lnbpy1SgSI,8463
  sparknlp/annotator/embeddings/bert_sentence_embeddings.py,sha256=NQy9KuXT9aKsTpYCR5RAeoFWI2YqEGorbdYrf_0KKmw,9148
- sparknlp/annotator/embeddings/bge_embeddings.py,sha256=FNmYxcynM1iLJvg5ZNmrZKkyIF0Gtr7G-CgZ72mrVyU,7842
+ sparknlp/annotator/embeddings/bge_embeddings.py,sha256=hXFFd9HOru1w2L9N5YGSZlaKyxqMsZccpaI4Z8-bNUE,7919
  sparknlp/annotator/embeddings/camembert_embeddings.py,sha256=dBTXas-2Tas_JUR9Xt_GtHLcyqi_cdvT5EHRnyVrSSQ,8817
  sparknlp/annotator/embeddings/chunk_embeddings.py,sha256=WUmkJimSuFkdcLJnvcxOV0QlCLgGlhub29ZTrZb70WE,6052
  sparknlp/annotator/embeddings/deberta_embeddings.py,sha256=_b5nzLb7heFQNN-uT2oBNO6-YmM8bHmAdnGXg47HOWw,8649
  sparknlp/annotator/embeddings/distil_bert_embeddings.py,sha256=4pyMCsbvvXYeTGIMVUir9wCDKR_1f_HKtXZrTDO1Thc,9275
  sparknlp/annotator/embeddings/doc2vec.py,sha256=Xk3MdEkXatX9lRgbFbAdnIDrLgIxzUIGWFBZeo9BTq0,13226
- sparknlp/annotator/embeddings/e5_embeddings.py,sha256=_f5k-EDa_zSox4INeLBGS3gYO16WrVVKBsU0guVqxkk,7779
+ sparknlp/annotator/embeddings/e5_embeddings.py,sha256=Esuvrq9JlogGaSSzFVVDkOFMwgYwFwr17I62ZiCDm0k,7858
  sparknlp/annotator/embeddings/elmo_embeddings.py,sha256=KV-KPs0Pq_OpPaHsnqBz2k_S7VdzyFZ4632IeFNKqJ8,9858
  sparknlp/annotator/embeddings/instructor_embeddings.py,sha256=CTKmbuBOx_KBM4JM-Y1U5LyR-6rrnpoBGbgGE_axS1c,8670
  sparknlp/annotator/embeddings/longformer_embeddings.py,sha256=jS4fxB5O0-d9ta9VKv8ai-17n5YHt5rML8QxUw7K4Io,8754
- sparknlp/annotator/embeddings/mpnet_embeddings.py,sha256=2sabImn5spYGzfNwBSH2zUU90Wjqrm2btCVbDbtsqPg,7796
+ sparknlp/annotator/embeddings/mpnet_embeddings.py,sha256=7d6E4lS7jjkppDPvty1UHNNrbykkriFiysrxZ_RzL0U,7875
  sparknlp/annotator/embeddings/roberta_embeddings.py,sha256=q_WHby2lDcPc5bVHkGc6X_GwT3qyDUBLUVz5ZW4HCSY,9229
  sparknlp/annotator/embeddings/roberta_sentence_embeddings.py,sha256=KVrD4z_tIU-sphK6dmbbnHBBt8-Y89C_BFQAkN99kZo,8181
  sparknlp/annotator/embeddings/sentence_embeddings.py,sha256=azuA1FKMtTJ9suwJqTEHeWHumT6kYdfURTe_1fsqcB8,5402
@@ -136,12 +137,14 @@ sparknlp/annotator/sentence/sentence_detector_dl.py,sha256=-Osj9Bm9KyZRTAWkOsK9c
  sparknlp/annotator/sentiment/__init__.py,sha256=Lq3vKaZS1YATLMg0VNXSVtkWL5q5G9taGBvdrvSwnfg,766
  sparknlp/annotator/sentiment/sentiment_detector.py,sha256=m545NGU0Xzg_PO6_qIfpli1uZj7JQcyFgqe9R6wAPFI,8154
  sparknlp/annotator/sentiment/vivekn_sentiment.py,sha256=4rpXWDgzU6ddnbrSCp9VdLb2epCc9oZ3c6XcqxEw8nk,9655
- sparknlp/annotator/seq2seq/__init__.py,sha256=UQK-_3wLkUdW1piGudCx1_k3Tg3tERZJYOBnfMRj8pA,1011
+ sparknlp/annotator/seq2seq/__init__.py,sha256=3pF-b9ubgAs8ofggiNyuc1NQseq_oe231UVjVkZWTmU,1130
  sparknlp/annotator/seq2seq/bart_transformer.py,sha256=I1flM4yeCzEAKOdQllBC30XuedxVJ7ferkFhZ6gwEbE,18481
  sparknlp/annotator/seq2seq/gpt2_transformer.py,sha256=Oz95R_NRR4tWHu_bW6Ak2832ZILXycp3ify7LfRSi8o,15310
  sparknlp/annotator/seq2seq/llama2_transformer.py,sha256=3LzTR0VerFdFmOizsrs2Q7HTnjELJ5WtfUgx5XnOqGM,13898
- sparknlp/annotator/seq2seq/m2m100_transformer.py,sha256=fTFGFWaFfJt5kaLvnYknf_23PVyjBuha48asFsE_NaE,16082
+ sparknlp/annotator/seq2seq/m2m100_transformer.py,sha256=uIL9RZuuryTIdAy9TbJf9wbz6RekhW8S079bJhaB6i4,16116
  sparknlp/annotator/seq2seq/marian_transformer.py,sha256=mQ4Ylh7ZzXAOue8f-x0gqzfS3vAz3XUdD7eQ2XhcEs4,13781
+ sparknlp/annotator/seq2seq/mistral_transformer.py,sha256=hq5-Emut7qYnwFolYQ6cFOEY4j5-8PdlPi2Vs72qCig,14254
+ sparknlp/annotator/seq2seq/phi2_transformer.py,sha256=YuqEcvJunKKZMmfqD3thXHR5FsPbqjjwbHFExWjbDWk,13796
  sparknlp/annotator/seq2seq/t5_transformer.py,sha256=wDVxNLluIU1HGZFqaKKc4YTt4l-elPlAtQ7EEa0f5tg,17308
  sparknlp/annotator/similarity/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  sparknlp/annotator/similarity/document_similarity_ranker.py,sha256=OFAXEBuALFJglwThsGK8YaJ_pgW1tcevB7jVq-8SyKM,14991
@@ -183,7 +186,7 @@ sparknlp/common/read_as.py,sha256=imxPGwV7jr4Li_acbo0OAHHRGCBbYv-akzEGaBWEfcY,12
  sparknlp/common/recursive_annotator_approach.py,sha256=vqugBw22cE3Ff7PIpRlnYFuOlchgL0nM26D8j-NdpqU,1449
  sparknlp/common/storage.py,sha256=D91H3p8EIjNspjqAYu6ephRpCUtdcAir4_PrAbkIQWE,4842
  sparknlp/common/utils.py,sha256=Yne6yYcwKxhOZC-U4qfYoDhWUP_6BIaAjI5X_P_df1E,1306
- sparknlp/internal/__init__.py,sha256=7kGauV0ncpqnrPzagaXefApSKyzuxYZTO1myeVZ6LJ8,26929
+ sparknlp/internal/__init__.py,sha256=X38S3vTHB0c4EkzczDv-J7hpJl0g6A9Xe_3u8jGJTCU,30239
  sparknlp/internal/annotator_java_ml.py,sha256=UGPoThG0rGXUOXGSQnDzEDW81Mu1s5RPF29v7DFyE3c,1187
  sparknlp/internal/annotator_transformer.py,sha256=fXmc2IWXGybqZpbEU9obmbdBYPc798y42zvSB4tqV9U,1448
  sparknlp/internal/extended_java_wrapper.py,sha256=hwP0133-hDiDf5sBF-P3MtUsuuDj1PpQbtGZQIRwzfk,2240
@@ -225,8 +228,8 @@ sparknlp/training/_tf_graph_builders_1x/ner_dl/dataset_encoder.py,sha256=R4yHFN3
  sparknlp/training/_tf_graph_builders_1x/ner_dl/ner_model.py,sha256=EoCSdcIjqQ3wv13MAuuWrKV8wyVBP0SbOEW41omHlR0,23189
  sparknlp/training/_tf_graph_builders_1x/ner_dl/ner_model_saver.py,sha256=k5CQ7gKV6HZbZMB8cKLUJuZxoZWlP_DFWdZ--aIDwsc,2356
  sparknlp/training/_tf_graph_builders_1x/ner_dl/sentence_grouper.py,sha256=pAxjWhjazSX8Vg0MFqJiuRVw1IbnQNSs-8Xp26L4nko,870
- spark_nlp-5.4.0rc1.dist-info/.uuid,sha256=1f6hF51aIuv9yCvh31NU9lOpS34NE-h3a0Et7R9yR6A,36
- spark_nlp-5.4.0rc1.dist-info/METADATA,sha256=cEBGxVSbWrCQInnlujccn9xgqznzSHsX1QiNjQJmobU,57266
- spark_nlp-5.4.0rc1.dist-info/WHEEL,sha256=bb2Ot9scclHKMOLDEHY6B2sicWOgugjFKaJsT7vwMQo,110
- spark_nlp-5.4.0rc1.dist-info/top_level.txt,sha256=uuytur4pyMRw2H_txNY2ZkaucZHUs22QF8-R03ch_-E,13
- spark_nlp-5.4.0rc1.dist-info/RECORD,,
+ spark_nlp-5.4.0rc2.dist-info/.uuid,sha256=1f6hF51aIuv9yCvh31NU9lOpS34NE-h3a0Et7R9yR6A,36
+ spark_nlp-5.4.0rc2.dist-info/METADATA,sha256=2nVIdNrRFI1jA81egkqysz0xNJ5W95-_fn8wX3wkaLE,57266
+ spark_nlp-5.4.0rc2.dist-info/WHEEL,sha256=bb2Ot9scclHKMOLDEHY6B2sicWOgugjFKaJsT7vwMQo,110
+ spark_nlp-5.4.0rc2.dist-info/top_level.txt,sha256=uuytur4pyMRw2H_txNY2ZkaucZHUs22QF8-R03ch_-E,13
+ spark_nlp-5.4.0rc2.dist-info/RECORD,,
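Beyond the dist-info renames and hash updates, the RECORD diff shows three modules new in rc2: mpnet_for_token_classification.py under classifier_dl, plus mistral_transformer.py and phi2_transformer.py under seq2seq. A minimal post-upgrade smoke test, assuming the rc2 wheel is installed (module paths are taken directly from the RECORD entries above):

```python
import importlib

# Import each module added in rc2; an ImportError here means the
# installed wheel is not the rc2 build.
for module in (
    "sparknlp.annotator.classifier_dl.mpnet_for_token_classification",
    "sparknlp.annotator.seq2seq.mistral_transformer",
    "sparknlp.annotator.seq2seq.phi2_transformer",
):
    importlib.import_module(module)
    print("ok:", module)
```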
sparknlp/__init__.py CHANGED
@@ -128,7 +128,7 @@ def start(gpu=False,
          The initiated Spark session.
 
      """
-     current_version = "5.4.0-rc1"
+     current_version = "5.4.0-rc2"
 
      if params is None:
          params = {}
@@ -309,4 +309,4 @@ def version():
      str
          The current Spark NLP version.
      """
-     return '5.4.0-rc1'
+     return '5.4.0-rc2'
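Both patched constants feed the public API, so the bump can be verified from a Python session; a sketch, assuming the rc2 wheel is installed (`sparknlp.version()` and `sparknlp.start()` are the functions shown in the two hunks above):

```python
import sparknlp

# version() returns the constant patched in the second hunk.
assert sparknlp.version() == "5.4.0-rc2"

# start() uses current_version (first hunk) to resolve the matching
# Spark package, so a wheel/JAR version mismatch usually surfaces here.
spark = sparknlp.start()
print(spark.version)
```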
sparknlp/annotator/classifier_dl/__init__.py CHANGED
@@ -51,3 +51,4 @@ from sparknlp.annotator.classifier_dl.bart_for_zero_shot_classification import *
  from sparknlp.annotator.classifier_dl.deberta_for_zero_shot_classification import *
  from sparknlp.annotator.classifier_dl.mpnet_for_sequence_classification import *
  from sparknlp.annotator.classifier_dl.mpnet_for_question_answering import *
+ from sparknlp.annotator.classifier_dl.mpnet_for_token_classification import *
sparknlp/annotator/classifier_dl/mpnet_for_token_classification.py ADDED
@@ -0,0 +1,173 @@
+ # Copyright 2017-2022 John Snow Labs
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ """Contains classes for MPNetForTokenClassification."""
+
+ from sparknlp.common import *
+
+
+ class MPNetForTokenClassification(AnnotatorModel,
+                                   HasCaseSensitiveProperties,
+                                   HasBatchedAnnotate,
+                                   HasEngine,
+                                   HasMaxSentenceLengthLimit):
+     """MPNetForTokenClassification can load XLM-RoBERTa Models with a token
+     classification head on top (a linear layer on top of the hidden-states
+     output) e.g. for Named-Entity-Recognition (NER) tasks.
+
+     Pretrained models can be loaded with :meth:`.pretrained` of the companion
+     object:
+
+     >>> token_classifier = MPNetForTokenClassification.pretrained() \\
+     ...     .setInputCols(["token", "document"]) \\
+     ...     .setOutputCol("label")
+     The default model is ``"mpnet_base_token_classifier"``, if no
+     name is provided.
+
+     For available pretrained models please see the `Models Hub
+     <https://sparknlp.org/models?task=Named+Entity+Recognition>`__.
+     To see which models are compatible and how to import them see
+     `Import Transformers into Spark NLP 🚀
+     <https://github.com/JohnSnowLabs/spark-nlp/discussions/5669>`_.
+
+     ====================== ======================
+     Input Annotation types Output Annotation type
+     ====================== ======================
+     ``DOCUMENT, TOKEN``    ``NAMED_ENTITY``
+     ====================== ======================
+
+     Parameters
+     ----------
+     batchSize
+         Batch size. Large values allows faster processing but requires more
+         memory, by default 8
+     caseSensitive
+         Whether to ignore case in tokens for embeddings matching, by default
+         True
+     configProtoBytes
+         ConfigProto from tensorflow, serialized into byte array.
+     maxSentenceLength
+         Max sentence length to process, by default 128
+
+     Examples
+     --------
+     >>> import sparknlp
+     >>> from sparknlp.base import *
+     >>> from sparknlp.annotator import *
+     >>> from pyspark.ml import Pipeline
+     >>> documentAssembler = DocumentAssembler() \\
+     ...     .setInputCol("text") \\
+     ...     .setOutputCol("document")
+     >>> tokenizer = Tokenizer() \\
+     ...     .setInputCols(["document"]) \\
+     ...     .setOutputCol("token")
+     >>> tokenClassifier = MPNetForTokenClassification.pretrained() \\
+     ...     .setInputCols(["token", "document"]) \\
+     ...     .setOutputCol("label") \\
+     ...     .setCaseSensitive(True)
+     >>> pipeline = Pipeline().setStages([
+     ...     documentAssembler,
+     ...     tokenizer,
+     ...     tokenClassifier
+     ... ])
+     >>> data = spark.createDataFrame([["John Lenon was born in London and lived in Paris. My name is Sarah and I live in London"]]).toDF("text")
+     >>> result = pipeline.fit(data).transform(data)
+     >>> result.select("label.result").show(truncate=False)
+     +------------------------------------------------------------------------------------+
+     |result                                                                              |
+     +------------------------------------------------------------------------------------+
+     |[B-PER, I-PER, O, O, O, B-LOC, O, O, O, B-LOC, O, O, O, O, B-PER, O, O, O, O, B-LOC]|
+     +------------------------------------------------------------------------------------+
+     """
+     name = "MPNetForTokenClassification"
+
+     inputAnnotatorTypes = [AnnotatorType.DOCUMENT, AnnotatorType.TOKEN]
+
+     outputAnnotatorType = AnnotatorType.NAMED_ENTITY
+
+     configProtoBytes = Param(Params._dummy(),
+                              "configProtoBytes",
+                              "ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()",
+                              TypeConverters.toListInt)
+
+     def getClasses(self):
+         """
+         Returns labels used to train this model
+         """
+         return self._call_java("getClasses")
+
+     def setConfigProtoBytes(self, b):
+         """Sets configProto from tensorflow, serialized into byte array.
+
+         Parameters
+         ----------
+         b : List[int]
+             ConfigProto from tensorflow, serialized into byte array
+         """
+         return self._set(configProtoBytes=b)
+
+     @keyword_only
+     def __init__(self, classname="com.johnsnowlabs.nlp.annotators.classifier.dl.MPNetForTokenClassification",
+                  java_model=None):
+         super(MPNetForTokenClassification, self).__init__(
+             classname=classname,
+             java_model=java_model
+         )
+         self._setDefault(
+             batchSize=8,
+             maxSentenceLength=128,
+             caseSensitive=True
+         )
+
+     @staticmethod
+     def loadSavedModel(folder, spark_session):
+         """Loads a locally saved model.
+
+         Parameters
+         ----------
+         folder : str
+             Folder of the saved model
+         spark_session : pyspark.sql.SparkSession
+             The current SparkSession
+
+         Returns
+         -------
+         XlmRoBertaForTokenClassification
+             The restored model
+         """
+         from sparknlp.internal import _MPNetForTokenClassifierLoader
+         jModel = _MPNetForTokenClassifierLoader(folder, spark_session._jsparkSession)._java_obj
+         return MPNetForTokenClassification(java_model=jModel)
+
+     @staticmethod
+     def pretrained(name="mpnet_base_token_classifier", lang="en", remote_loc=None):
+         """Downloads and loads a pretrained model.
+
+         Parameters
+         ----------
+         name : str, optional
+             Name of the pretrained model, by default
+             "mpnet_base_token_classifier"
+         lang : str, optional
+             Language of the pretrained model, by default "en"
+         remote_loc : str, optional
+             Optional remote address of the resource, by default None. Will use
+             Spark NLPs repositories otherwise.
+
+         Returns
+         -------
+         XlmRoBertaForTokenClassification
+             The restored model
+         """
+         from sparknlp.pretrained import ResourceDownloader
+         return ResourceDownloader.downloadModel(MPNetForTokenClassification, name, lang, remote_loc)
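The new annotator follows the same loading pattern as the other *ForTokenClassification classes, so besides the `pretrained()` route shown in its docstring, a locally exported model can be attached via `loadSavedModel`; a sketch, assuming such an export exists (the folder path is hypothetical):

```python
import sparknlp
from sparknlp.annotator import MPNetForTokenClassification

spark = sparknlp.start()

# "/models/mpnet_ner" is a hypothetical folder holding a locally
# exported MPNet token-classification model (see loadSavedModel above).
token_classifier = MPNetForTokenClassification.loadSavedModel("/models/mpnet_ner", spark) \
    .setInputCols(["token", "document"]) \
    .setOutputCol("label") \
    .setCaseSensitive(True)
```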
sparknlp/annotator/classifier_dl/xlm_roberta_for_token_classification.py CHANGED
@@ -31,7 +31,7 @@ class XlmRoBertaForTokenClassification(AnnotatorModel,
      >>> token_classifier = XlmRoBertaForTokenClassification.pretrained() \\
      ...     .setInputCols(["token", "document"]) \\
      ...     .setOutputCol("label")
-     The default model is ``"xlm_roberta_base_token_classifier_conll03"``, if no
+     The default model is ``"mpnet_base_token_classifier"``, if no
      name is provided.
 
      For available pretrained models please see the `Models Hub
@@ -150,14 +150,14 @@ class XlmRoBertaForTokenClassification(AnnotatorModel,
          return XlmRoBertaForTokenClassification(java_model=jModel)
 
      @staticmethod
-     def pretrained(name="xlm_roberta_base_token_classifier_conll03", lang="en", remote_loc=None):
+     def pretrained(name="mpnet_base_token_classifier", lang="en", remote_loc=None):
          """Downloads and loads a pretrained model.
 
          Parameters
          ----------
          name : str, optional
              Name of the pretrained model, by default
-             "xlm_roberta_base_token_classifier_conll03"
+             "mpnet_base_token_classifier"
          lang : str, optional
              Language of the pretrained model, by default "en"
          remote_loc : str, optional