@synsci/cli-darwin-x64 1.1.97 → 1.1.99
This diff compares publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- package/bin/synsc +0 -0
- package/package.json +1 -1
- package/bin/skills/accelerate/SKILL.md +0 -332
- package/bin/skills/accelerate/references/custom-plugins.md +0 -453
- package/bin/skills/accelerate/references/megatron-integration.md +0 -489
- package/bin/skills/accelerate/references/performance.md +0 -525
- package/bin/skills/adaptyv/SKILL.md +0 -114
- package/bin/skills/adaptyv/reference/api_reference.md +0 -308
- package/bin/skills/adaptyv/reference/examples.md +0 -913
- package/bin/skills/adaptyv/reference/experiments.md +0 -360
- package/bin/skills/adaptyv/reference/protein_optimization.md +0 -637
- package/bin/skills/aeon/SKILL.md +0 -374
- package/bin/skills/aeon/references/anomaly_detection.md +0 -154
- package/bin/skills/aeon/references/classification.md +0 -144
- package/bin/skills/aeon/references/clustering.md +0 -123
- package/bin/skills/aeon/references/datasets_benchmarking.md +0 -387
- package/bin/skills/aeon/references/distances.md +0 -256
- package/bin/skills/aeon/references/forecasting.md +0 -140
- package/bin/skills/aeon/references/networks.md +0 -289
- package/bin/skills/aeon/references/regression.md +0 -118
- package/bin/skills/aeon/references/segmentation.md +0 -163
- package/bin/skills/aeon/references/similarity_search.md +0 -187
- package/bin/skills/aeon/references/transformations.md +0 -246
- package/bin/skills/alphafold-database/SKILL.md +0 -513
- package/bin/skills/alphafold-database/references/api_reference.md +0 -423
- package/bin/skills/anndata/SKILL.md +0 -400
- package/bin/skills/anndata/references/best_practices.md +0 -525
- package/bin/skills/anndata/references/concatenation.md +0 -396
- package/bin/skills/anndata/references/data_structure.md +0 -314
- package/bin/skills/anndata/references/io_operations.md +0 -404
- package/bin/skills/anndata/references/manipulation.md +0 -516
- package/bin/skills/arboreto/SKILL.md +0 -243
- package/bin/skills/arboreto/references/algorithms.md +0 -138
- package/bin/skills/arboreto/references/basic_inference.md +0 -151
- package/bin/skills/arboreto/references/distributed_computing.md +0 -242
- package/bin/skills/arboreto/scripts/basic_grn_inference.py +0 -97
- package/bin/skills/astropy/SKILL.md +0 -331
- package/bin/skills/astropy/references/coordinates.md +0 -273
- package/bin/skills/astropy/references/cosmology.md +0 -307
- package/bin/skills/astropy/references/fits.md +0 -396
- package/bin/skills/astropy/references/tables.md +0 -489
- package/bin/skills/astropy/references/time.md +0 -404
- package/bin/skills/astropy/references/units.md +0 -178
- package/bin/skills/astropy/references/wcs_and_other_modules.md +0 -373
- package/bin/skills/audiocraft/SKILL.md +0 -564
- package/bin/skills/audiocraft/references/advanced-usage.md +0 -666
- package/bin/skills/audiocraft/references/troubleshooting.md +0 -504
- package/bin/skills/autogpt/SKILL.md +0 -403
- package/bin/skills/autogpt/references/advanced-usage.md +0 -535
- package/bin/skills/autogpt/references/troubleshooting.md +0 -420
- package/bin/skills/awq/SKILL.md +0 -310
- package/bin/skills/awq/references/advanced-usage.md +0 -324
- package/bin/skills/awq/references/troubleshooting.md +0 -344
- package/bin/skills/axolotl/SKILL.md +0 -158
- package/bin/skills/axolotl/references/api.md +0 -5548
- package/bin/skills/axolotl/references/dataset-formats.md +0 -1029
- package/bin/skills/axolotl/references/index.md +0 -15
- package/bin/skills/axolotl/references/other.md +0 -3563
- package/bin/skills/benchling-integration/SKILL.md +0 -480
- package/bin/skills/benchling-integration/references/api_endpoints.md +0 -883
- package/bin/skills/benchling-integration/references/authentication.md +0 -379
- package/bin/skills/benchling-integration/references/sdk_reference.md +0 -774
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +0 -405
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +0 -393
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +0 -424
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +0 -394
- package/bin/skills/biopython/SKILL.md +0 -443
- package/bin/skills/biopython/references/advanced.md +0 -577
- package/bin/skills/biopython/references/alignment.md +0 -362
- package/bin/skills/biopython/references/blast.md +0 -455
- package/bin/skills/biopython/references/databases.md +0 -484
- package/bin/skills/biopython/references/phylogenetics.md +0 -566
- package/bin/skills/biopython/references/sequence_io.md +0 -285
- package/bin/skills/biopython/references/structure.md +0 -564
- package/bin/skills/biorxiv-database/SKILL.md +0 -483
- package/bin/skills/biorxiv-database/references/api_reference.md +0 -280
- package/bin/skills/biorxiv-database/scripts/biorxiv_search.py +0 -445
- package/bin/skills/bioservices/SKILL.md +0 -361
- package/bin/skills/bioservices/references/identifier_mapping.md +0 -685
- package/bin/skills/bioservices/references/services_reference.md +0 -636
- package/bin/skills/bioservices/references/workflow_patterns.md +0 -811
- package/bin/skills/bioservices/scripts/batch_id_converter.py +0 -347
- package/bin/skills/bioservices/scripts/compound_cross_reference.py +0 -378
- package/bin/skills/bioservices/scripts/pathway_analysis.py +0 -309
- package/bin/skills/bioservices/scripts/protein_analysis_workflow.py +0 -408
- package/bin/skills/bitsandbytes/SKILL.md +0 -411
- package/bin/skills/bitsandbytes/references/memory-optimization.md +0 -521
- package/bin/skills/bitsandbytes/references/qlora-training.md +0 -521
- package/bin/skills/bitsandbytes/references/quantization-formats.md +0 -447
- package/bin/skills/blip-2/SKILL.md +0 -564
- package/bin/skills/blip-2/references/advanced-usage.md +0 -680
- package/bin/skills/blip-2/references/troubleshooting.md +0 -526
- package/bin/skills/brenda-database/SKILL.md +0 -719
- package/bin/skills/brenda-database/references/api_reference.md +0 -537
- package/bin/skills/brenda-database/scripts/brenda_queries.py +0 -844
- package/bin/skills/brenda-database/scripts/brenda_visualization.py +0 -772
- package/bin/skills/brenda-database/scripts/enzyme_pathway_builder.py +0 -1053
- package/bin/skills/cellxgene-census/SKILL.md +0 -511
- package/bin/skills/cellxgene-census/references/census_schema.md +0 -182
- package/bin/skills/cellxgene-census/references/common_patterns.md +0 -351
- package/bin/skills/chembl-database/SKILL.md +0 -389
- package/bin/skills/chembl-database/references/api_reference.md +0 -272
- package/bin/skills/chembl-database/scripts/example_queries.py +0 -278
- package/bin/skills/chroma/SKILL.md +0 -406
- package/bin/skills/chroma/references/integration.md +0 -38
- package/bin/skills/cirq/SKILL.md +0 -346
- package/bin/skills/cirq/references/building.md +0 -307
- package/bin/skills/cirq/references/experiments.md +0 -572
- package/bin/skills/cirq/references/hardware.md +0 -515
- package/bin/skills/cirq/references/noise.md +0 -515
- package/bin/skills/cirq/references/simulation.md +0 -350
- package/bin/skills/cirq/references/transformation.md +0 -416
- package/bin/skills/citation-management/SKILL.md +0 -1109
- package/bin/skills/citation-management/assets/bibtex_template.bib +0 -264
- package/bin/skills/citation-management/assets/citation_checklist.md +0 -386
- package/bin/skills/citation-management/references/bibtex_formatting.md +0 -908
- package/bin/skills/citation-management/references/citation_validation.md +0 -794
- package/bin/skills/citation-management/references/google_scholar_search.md +0 -725
- package/bin/skills/citation-management/references/metadata_extraction.md +0 -870
- package/bin/skills/citation-management/references/pubmed_search.md +0 -839
- package/bin/skills/citation-management/scripts/doi_to_bibtex.py +0 -182
- package/bin/skills/citation-management/scripts/extract_metadata.py +0 -570
- package/bin/skills/citation-management/scripts/format_bibtex.py +0 -349
- package/bin/skills/citation-management/scripts/search_google_scholar.py +0 -251
- package/bin/skills/citation-management/scripts/search_pubmed.py +0 -348
- package/bin/skills/citation-management/scripts/validate_citations.py +0 -494
- package/bin/skills/clinical-decision-support/README.md +0 -129
- package/bin/skills/clinical-decision-support/SKILL.md +0 -506
- package/bin/skills/clinical-decision-support/assets/biomarker_report_template.tex +0 -380
- package/bin/skills/clinical-decision-support/assets/clinical_pathway_template.tex +0 -222
- package/bin/skills/clinical-decision-support/assets/cohort_analysis_template.tex +0 -359
- package/bin/skills/clinical-decision-support/assets/color_schemes.tex +0 -149
- package/bin/skills/clinical-decision-support/assets/example_gbm_cohort.md +0 -208
- package/bin/skills/clinical-decision-support/assets/recommendation_strength_guide.md +0 -328
- package/bin/skills/clinical-decision-support/assets/treatment_recommendation_template.tex +0 -529
- package/bin/skills/clinical-decision-support/references/biomarker_classification.md +0 -719
- package/bin/skills/clinical-decision-support/references/clinical_decision_algorithms.md +0 -604
- package/bin/skills/clinical-decision-support/references/evidence_synthesis.md +0 -840
- package/bin/skills/clinical-decision-support/references/outcome_analysis.md +0 -640
- package/bin/skills/clinical-decision-support/references/patient_cohort_analysis.md +0 -427
- package/bin/skills/clinical-decision-support/references/treatment_recommendations.md +0 -521
- package/bin/skills/clinical-decision-support/scripts/biomarker_classifier.py +0 -383
- package/bin/skills/clinical-decision-support/scripts/build_decision_tree.py +0 -417
- package/bin/skills/clinical-decision-support/scripts/create_cohort_tables.py +0 -509
- package/bin/skills/clinical-decision-support/scripts/generate_survival_analysis.py +0 -441
- package/bin/skills/clinical-decision-support/scripts/validate_cds_document.py +0 -326
- package/bin/skills/clinical-reports/IMPLEMENTATION_SUMMARY.md +0 -641
- package/bin/skills/clinical-reports/README.md +0 -236
- package/bin/skills/clinical-reports/SKILL.md +0 -1127
- package/bin/skills/clinical-reports/assets/case_report_template.md +0 -352
- package/bin/skills/clinical-reports/assets/clinical_trial_csr_template.md +0 -353
- package/bin/skills/clinical-reports/assets/clinical_trial_sae_template.md +0 -359
- package/bin/skills/clinical-reports/assets/consult_note_template.md +0 -305
- package/bin/skills/clinical-reports/assets/discharge_summary_template.md +0 -453
- package/bin/skills/clinical-reports/assets/hipaa_compliance_checklist.md +0 -395
- package/bin/skills/clinical-reports/assets/history_physical_template.md +0 -305
- package/bin/skills/clinical-reports/assets/lab_report_template.md +0 -309
- package/bin/skills/clinical-reports/assets/pathology_report_template.md +0 -249
- package/bin/skills/clinical-reports/assets/quality_checklist.md +0 -338
- package/bin/skills/clinical-reports/assets/radiology_report_template.md +0 -318
- package/bin/skills/clinical-reports/assets/soap_note_template.md +0 -253
- package/bin/skills/clinical-reports/references/case_report_guidelines.md +0 -570
- package/bin/skills/clinical-reports/references/clinical_trial_reporting.md +0 -693
- package/bin/skills/clinical-reports/references/data_presentation.md +0 -530
- package/bin/skills/clinical-reports/references/diagnostic_reports_standards.md +0 -629
- package/bin/skills/clinical-reports/references/medical_terminology.md +0 -588
- package/bin/skills/clinical-reports/references/patient_documentation.md +0 -744
- package/bin/skills/clinical-reports/references/peer_review_standards.md +0 -585
- package/bin/skills/clinical-reports/references/regulatory_compliance.md +0 -577
- package/bin/skills/clinical-reports/scripts/check_deidentification.py +0 -332
- package/bin/skills/clinical-reports/scripts/compliance_checker.py +0 -78
- package/bin/skills/clinical-reports/scripts/extract_clinical_data.py +0 -97
- package/bin/skills/clinical-reports/scripts/format_adverse_events.py +0 -97
- package/bin/skills/clinical-reports/scripts/generate_report_template.py +0 -149
- package/bin/skills/clinical-reports/scripts/terminology_validator.py +0 -126
- package/bin/skills/clinical-reports/scripts/validate_case_report.py +0 -323
- package/bin/skills/clinical-reports/scripts/validate_trial_report.py +0 -88
- package/bin/skills/clinicaltrials-database/SKILL.md +0 -507
- package/bin/skills/clinicaltrials-database/references/api_reference.md +0 -358
- package/bin/skills/clinicaltrials-database/scripts/query_clinicaltrials.py +0 -215
- package/bin/skills/clinpgx-database/SKILL.md +0 -638
- package/bin/skills/clinpgx-database/references/api_reference.md +0 -757
- package/bin/skills/clinpgx-database/scripts/query_clinpgx.py +0 -518
- package/bin/skills/clinvar-database/SKILL.md +0 -362
- package/bin/skills/clinvar-database/references/api_reference.md +0 -227
- package/bin/skills/clinvar-database/references/clinical_significance.md +0 -218
- package/bin/skills/clinvar-database/references/data_formats.md +0 -358
- package/bin/skills/clip/SKILL.md +0 -253
- package/bin/skills/clip/references/applications.md +0 -207
- package/bin/skills/cobrapy/SKILL.md +0 -463
- package/bin/skills/cobrapy/references/api_quick_reference.md +0 -655
- package/bin/skills/cobrapy/references/workflows.md +0 -593
- package/bin/skills/colab-finetuning/SKILL.md +0 -153
- package/bin/skills/colab-finetuning/references/bridge-setup.md +0 -68
- package/bin/skills/colab-finetuning/references/gpu-tiers.md +0 -54
- package/bin/skills/colab-finetuning/references/troubleshooting.md +0 -79
- package/bin/skills/constitutional-ai/SKILL.md +0 -290
- package/bin/skills/cosmic-database/SKILL.md +0 -336
- package/bin/skills/cosmic-database/references/cosmic_data_reference.md +0 -220
- package/bin/skills/cosmic-database/scripts/download_cosmic.py +0 -231
- package/bin/skills/crewai/SKILL.md +0 -498
- package/bin/skills/crewai/references/flows.md +0 -438
- package/bin/skills/crewai/references/tools.md +0 -429
- package/bin/skills/crewai/references/troubleshooting.md +0 -480
- package/bin/skills/dask/SKILL.md +0 -456
- package/bin/skills/dask/references/arrays.md +0 -497
- package/bin/skills/dask/references/bags.md +0 -468
- package/bin/skills/dask/references/best-practices.md +0 -277
- package/bin/skills/dask/references/dataframes.md +0 -368
- package/bin/skills/dask/references/futures.md +0 -541
- package/bin/skills/dask/references/schedulers.md +0 -504
- package/bin/skills/datacommons-client/SKILL.md +0 -255
- package/bin/skills/datacommons-client/references/getting_started.md +0 -417
- package/bin/skills/datacommons-client/references/node.md +0 -250
- package/bin/skills/datacommons-client/references/observation.md +0 -185
- package/bin/skills/datacommons-client/references/resolve.md +0 -246
- package/bin/skills/datamol/SKILL.md +0 -706
- package/bin/skills/datamol/references/conformers_module.md +0 -131
- package/bin/skills/datamol/references/core_api.md +0 -130
- package/bin/skills/datamol/references/descriptors_viz.md +0 -195
- package/bin/skills/datamol/references/fragments_scaffolds.md +0 -174
- package/bin/skills/datamol/references/io_module.md +0 -109
- package/bin/skills/datamol/references/reactions_data.md +0 -218
- package/bin/skills/deepchem/SKILL.md +0 -597
- package/bin/skills/deepchem/references/api_reference.md +0 -303
- package/bin/skills/deepchem/references/workflows.md +0 -491
- package/bin/skills/deepchem/scripts/graph_neural_network.py +0 -338
- package/bin/skills/deepchem/scripts/predict_solubility.py +0 -224
- package/bin/skills/deepchem/scripts/transfer_learning.py +0 -375
- package/bin/skills/deepspeed/SKILL.md +0 -141
- package/bin/skills/deepspeed/references/08.md +0 -17
- package/bin/skills/deepspeed/references/09.md +0 -173
- package/bin/skills/deepspeed/references/2020.md +0 -378
- package/bin/skills/deepspeed/references/2023.md +0 -279
- package/bin/skills/deepspeed/references/assets.md +0 -179
- package/bin/skills/deepspeed/references/index.md +0 -35
- package/bin/skills/deepspeed/references/mii.md +0 -118
- package/bin/skills/deepspeed/references/other.md +0 -1191
- package/bin/skills/deepspeed/references/tutorials.md +0 -6554
- package/bin/skills/deeptools/SKILL.md +0 -531
- package/bin/skills/deeptools/assets/quick_reference.md +0 -58
- package/bin/skills/deeptools/references/effective_genome_sizes.md +0 -116
- package/bin/skills/deeptools/references/normalization_methods.md +0 -410
- package/bin/skills/deeptools/references/tools_reference.md +0 -533
- package/bin/skills/deeptools/references/workflows.md +0 -474
- package/bin/skills/deeptools/scripts/validate_files.py +0 -195
- package/bin/skills/deeptools/scripts/workflow_generator.py +0 -454
- package/bin/skills/denario/SKILL.md +0 -215
- package/bin/skills/denario/references/examples.md +0 -494
- package/bin/skills/denario/references/installation.md +0 -213
- package/bin/skills/denario/references/llm_configuration.md +0 -265
- package/bin/skills/denario/references/research_pipeline.md +0 -471
- package/bin/skills/diffdock/SKILL.md +0 -483
- package/bin/skills/diffdock/assets/batch_template.csv +0 -4
- package/bin/skills/diffdock/assets/custom_inference_config.yaml +0 -90
- package/bin/skills/diffdock/references/confidence_and_limitations.md +0 -182
- package/bin/skills/diffdock/references/parameters_reference.md +0 -163
- package/bin/skills/diffdock/references/workflows_examples.md +0 -392
- package/bin/skills/diffdock/scripts/analyze_results.py +0 -334
- package/bin/skills/diffdock/scripts/prepare_batch_csv.py +0 -254
- package/bin/skills/diffdock/scripts/setup_check.py +0 -278
- package/bin/skills/dnanexus-integration/SKILL.md +0 -383
- package/bin/skills/dnanexus-integration/references/app-development.md +0 -247
- package/bin/skills/dnanexus-integration/references/configuration.md +0 -646
- package/bin/skills/dnanexus-integration/references/data-operations.md +0 -400
- package/bin/skills/dnanexus-integration/references/job-execution.md +0 -412
- package/bin/skills/dnanexus-integration/references/python-sdk.md +0 -523
- package/bin/skills/document-skills/docx/LICENSE.txt +0 -30
- package/bin/skills/document-skills/docx/SKILL.md +0 -233
- package/bin/skills/document-skills/docx/docx-js.md +0 -350
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +0 -1499
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +0 -146
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +0 -1085
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +0 -11
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +0 -3081
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +0 -23
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +0 -185
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +0 -287
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +0 -1676
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +0 -28
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +0 -144
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +0 -174
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +0 -25
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +0 -18
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +0 -59
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +0 -56
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +0 -195
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +0 -582
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +0 -25
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +0 -4439
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +0 -570
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +0 -509
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +0 -12
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +0 -108
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +0 -96
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +0 -3646
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +0 -116
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +0 -42
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +0 -50
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +0 -49
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +0 -33
- package/bin/skills/document-skills/docx/ooxml/schemas/mce/mc.xsd +0 -75
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2010.xsd +0 -560
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2012.xsd +0 -67
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2018.xsd +0 -14
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cex-2018.xsd +0 -20
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cid-2016.xsd +0 -13
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +0 -4
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-symex-2015.xsd +0 -8
- package/bin/skills/document-skills/docx/ooxml/scripts/pack.py +0 -159
- package/bin/skills/document-skills/docx/ooxml/scripts/unpack.py +0 -29
- package/bin/skills/document-skills/docx/ooxml/scripts/validate.py +0 -69
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/__init__.py +0 -15
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/base.py +0 -951
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/docx.py +0 -274
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/pptx.py +0 -315
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/redlining.py +0 -279
- package/bin/skills/document-skills/docx/ooxml.md +0 -610
- package/bin/skills/document-skills/docx/scripts/__init__.py +0 -1
- package/bin/skills/document-skills/docx/scripts/document.py +0 -1276
- package/bin/skills/document-skills/docx/scripts/templates/comments.xml +0 -3
- package/bin/skills/document-skills/docx/scripts/templates/commentsExtended.xml +0 -3
- package/bin/skills/document-skills/docx/scripts/templates/commentsExtensible.xml +0 -3
- package/bin/skills/document-skills/docx/scripts/templates/commentsIds.xml +0 -3
- package/bin/skills/document-skills/docx/scripts/templates/people.xml +0 -3
- package/bin/skills/document-skills/docx/scripts/utilities.py +0 -374
- package/bin/skills/document-skills/pdf/LICENSE.txt +0 -30
- package/bin/skills/document-skills/pdf/SKILL.md +0 -330
- package/bin/skills/document-skills/pdf/forms.md +0 -205
- package/bin/skills/document-skills/pdf/reference.md +0 -612
- package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes.py +0 -70
- package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes_test.py +0 -226
- package/bin/skills/document-skills/pdf/scripts/check_fillable_fields.py +0 -12
- package/bin/skills/document-skills/pdf/scripts/convert_pdf_to_images.py +0 -35
- package/bin/skills/document-skills/pdf/scripts/create_validation_image.py +0 -41
- package/bin/skills/document-skills/pdf/scripts/extract_form_field_info.py +0 -152
- package/bin/skills/document-skills/pdf/scripts/fill_fillable_fields.py +0 -114
- package/bin/skills/document-skills/pdf/scripts/fill_pdf_form_with_annotations.py +0 -108
- package/bin/skills/document-skills/pptx/LICENSE.txt +0 -30
- package/bin/skills/document-skills/pptx/SKILL.md +0 -520
- package/bin/skills/document-skills/pptx/html2pptx.md +0 -625
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +0 -1499
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +0 -146
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +0 -1085
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +0 -11
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +0 -3081
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +0 -23
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +0 -185
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +0 -287
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +0 -1676
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +0 -28
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +0 -144
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +0 -174
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +0 -25
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +0 -18
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +0 -59
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +0 -56
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +0 -195
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +0 -582
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +0 -25
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +0 -4439
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +0 -570
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +0 -509
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +0 -12
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +0 -108
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +0 -96
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +0 -3646
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +0 -116
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +0 -42
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +0 -50
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +0 -49
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +0 -33
- package/bin/skills/document-skills/pptx/ooxml/schemas/mce/mc.xsd +0 -75
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2010.xsd +0 -560
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2012.xsd +0 -67
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2018.xsd +0 -14
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cex-2018.xsd +0 -20
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cid-2016.xsd +0 -13
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +0 -4
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-symex-2015.xsd +0 -8
- package/bin/skills/document-skills/pptx/ooxml/scripts/pack.py +0 -159
- package/bin/skills/document-skills/pptx/ooxml/scripts/unpack.py +0 -29
- package/bin/skills/document-skills/pptx/ooxml/scripts/validate.py +0 -69
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/__init__.py +0 -15
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/base.py +0 -951
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/docx.py +0 -274
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/pptx.py +0 -315
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/redlining.py +0 -279
- package/bin/skills/document-skills/pptx/ooxml.md +0 -427
- package/bin/skills/document-skills/pptx/scripts/html2pptx.js +0 -979
- package/bin/skills/document-skills/pptx/scripts/inventory.py +0 -1020
- package/bin/skills/document-skills/pptx/scripts/rearrange.py +0 -231
- package/bin/skills/document-skills/pptx/scripts/replace.py +0 -385
- package/bin/skills/document-skills/pptx/scripts/thumbnail.py +0 -450
- package/bin/skills/document-skills/xlsx/LICENSE.txt +0 -30
- package/bin/skills/document-skills/xlsx/SKILL.md +0 -325
- package/bin/skills/document-skills/xlsx/recalc.py +0 -178
- package/bin/skills/drugbank-database/SKILL.md +0 -190
- package/bin/skills/drugbank-database/references/chemical-analysis.md +0 -590
- package/bin/skills/drugbank-database/references/data-access.md +0 -242
- package/bin/skills/drugbank-database/references/drug-queries.md +0 -386
- package/bin/skills/drugbank-database/references/interactions.md +0 -425
- package/bin/skills/drugbank-database/references/targets-pathways.md +0 -518
- package/bin/skills/drugbank-database/scripts/drugbank_helper.py +0 -350
- package/bin/skills/dspy/SKILL.md +0 -590
- package/bin/skills/dspy/references/examples.md +0 -663
- package/bin/skills/dspy/references/modules.md +0 -475
- package/bin/skills/dspy/references/optimizers.md +0 -566
- package/bin/skills/ena-database/SKILL.md +0 -204
- package/bin/skills/ena-database/references/api_reference.md +0 -490
- package/bin/skills/ensembl-database/SKILL.md +0 -311
- package/bin/skills/ensembl-database/references/api_endpoints.md +0 -346
- package/bin/skills/ensembl-database/scripts/ensembl_query.py +0 -427
- package/bin/skills/esm/SKILL.md +0 -306
- package/bin/skills/esm/references/esm-c-api.md +0 -583
- package/bin/skills/esm/references/esm3-api.md +0 -452
- package/bin/skills/esm/references/forge-api.md +0 -657
- package/bin/skills/esm/references/workflows.md +0 -685
- package/bin/skills/etetoolkit/SKILL.md +0 -623
- package/bin/skills/etetoolkit/references/api_reference.md +0 -583
- package/bin/skills/etetoolkit/references/visualization.md +0 -783
- package/bin/skills/etetoolkit/references/workflows.md +0 -774
- package/bin/skills/etetoolkit/scripts/quick_visualize.py +0 -214
- package/bin/skills/etetoolkit/scripts/tree_operations.py +0 -229
- package/bin/skills/exploratory-data-analysis/SKILL.md +0 -446
- package/bin/skills/exploratory-data-analysis/assets/report_template.md +0 -196
- package/bin/skills/exploratory-data-analysis/references/bioinformatics_genomics_formats.md +0 -664
- package/bin/skills/exploratory-data-analysis/references/chemistry_molecular_formats.md +0 -664
- package/bin/skills/exploratory-data-analysis/references/general_scientific_formats.md +0 -518
- package/bin/skills/exploratory-data-analysis/references/microscopy_imaging_formats.md +0 -620
- package/bin/skills/exploratory-data-analysis/references/proteomics_metabolomics_formats.md +0 -517
- package/bin/skills/exploratory-data-analysis/references/spectroscopy_analytical_formats.md +0 -633
- package/bin/skills/exploratory-data-analysis/scripts/eda_analyzer.py +0 -547
- package/bin/skills/faiss/SKILL.md +0 -221
- package/bin/skills/faiss/references/index_types.md +0 -280
- package/bin/skills/fda-database/SKILL.md +0 -518
- package/bin/skills/fda-database/references/animal_veterinary.md +0 -377
- package/bin/skills/fda-database/references/api_basics.md +0 -687
- package/bin/skills/fda-database/references/devices.md +0 -632
- package/bin/skills/fda-database/references/drugs.md +0 -468
- package/bin/skills/fda-database/references/foods.md +0 -374
- package/bin/skills/fda-database/references/other.md +0 -472
- package/bin/skills/fda-database/scripts/fda_examples.py +0 -335
- package/bin/skills/fda-database/scripts/fda_query.py +0 -440
- package/bin/skills/fireworks-ai/SKILL.md +0 -665
- package/bin/skills/flash-attention/SKILL.md +0 -367
- package/bin/skills/flash-attention/references/benchmarks.md +0 -215
- package/bin/skills/flash-attention/references/transformers-integration.md +0 -293
- package/bin/skills/flowio/SKILL.md +0 -608
- package/bin/skills/flowio/references/api_reference.md +0 -372
- package/bin/skills/fluidsim/SKILL.md +0 -349
- package/bin/skills/fluidsim/references/advanced_features.md +0 -398
- package/bin/skills/fluidsim/references/installation.md +0 -68
- package/bin/skills/fluidsim/references/output_analysis.md +0 -283
- package/bin/skills/fluidsim/references/parameters.md +0 -198
- package/bin/skills/fluidsim/references/simulation_workflow.md +0 -172
- package/bin/skills/fluidsim/references/solvers.md +0 -94
- package/bin/skills/fred-economic-data/SKILL.md +0 -433
- package/bin/skills/fred-economic-data/references/api_basics.md +0 -212
- package/bin/skills/fred-economic-data/references/categories.md +0 -442
- package/bin/skills/fred-economic-data/references/geofred.md +0 -588
- package/bin/skills/fred-economic-data/references/releases.md +0 -642
- package/bin/skills/fred-economic-data/references/series.md +0 -584
- package/bin/skills/fred-economic-data/references/sources.md +0 -423
- package/bin/skills/fred-economic-data/references/tags.md +0 -485
- package/bin/skills/fred-economic-data/scripts/fred_examples.py +0 -354
- package/bin/skills/fred-economic-data/scripts/fred_query.py +0 -590
- package/bin/skills/gene-database/SKILL.md +0 -179
- package/bin/skills/gene-database/references/api_reference.md +0 -404
- package/bin/skills/gene-database/references/common_workflows.md +0 -428
- package/bin/skills/gene-database/scripts/batch_gene_lookup.py +0 -298
- package/bin/skills/gene-database/scripts/fetch_gene_data.py +0 -277
- package/bin/skills/gene-database/scripts/query_gene.py +0 -251
- package/bin/skills/generate-image/SKILL.md +0 -178
- package/bin/skills/generate-image/scripts/generate_image.py +0 -254
- package/bin/skills/geniml/SKILL.md +0 -318
- package/bin/skills/geniml/references/bedspace.md +0 -127
- package/bin/skills/geniml/references/consensus_peaks.md +0 -238
- package/bin/skills/geniml/references/region2vec.md +0 -90
- package/bin/skills/geniml/references/scembed.md +0 -197
- package/bin/skills/geniml/references/utilities.md +0 -385
- package/bin/skills/geo-database/SKILL.md +0 -815
- package/bin/skills/geo-database/references/geo_reference.md +0 -829
- package/bin/skills/geopandas/SKILL.md +0 -251
- package/bin/skills/geopandas/references/crs-management.md +0 -243
- package/bin/skills/geopandas/references/data-io.md +0 -165
- package/bin/skills/geopandas/references/data-structures.md +0 -70
- package/bin/skills/geopandas/references/geometric-operations.md +0 -221
- package/bin/skills/geopandas/references/spatial-analysis.md +0 -184
- package/bin/skills/geopandas/references/visualization.md +0 -243
- package/bin/skills/get-available-resources/SKILL.md +0 -277
- package/bin/skills/get-available-resources/scripts/detect_resources.py +0 -401
- package/bin/skills/gget/SKILL.md +0 -871
- package/bin/skills/gget/references/database_info.md +0 -300
- package/bin/skills/gget/references/module_reference.md +0 -467
- package/bin/skills/gget/references/workflows.md +0 -814
- package/bin/skills/gget/scripts/batch_sequence_analysis.py +0 -191
- package/bin/skills/gget/scripts/enrichment_pipeline.py +0 -235
- package/bin/skills/gget/scripts/gene_analysis.py +0 -161
- package/bin/skills/gguf/SKILL.md +0 -427
- package/bin/skills/gguf/references/advanced-usage.md +0 -504
- package/bin/skills/gguf/references/troubleshooting.md +0 -442
- package/bin/skills/gptq/SKILL.md +0 -450
- package/bin/skills/gptq/references/calibration.md +0 -337
- package/bin/skills/gptq/references/integration.md +0 -129
- package/bin/skills/gptq/references/troubleshooting.md +0 -95
- package/bin/skills/groq/SKILL.md +0 -347
- package/bin/skills/grpo-rl-training/README.md +0 -97
- package/bin/skills/grpo-rl-training/SKILL.md +0 -572
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +0 -393
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +0 -228
- package/bin/skills/gtars/SKILL.md +0 -285
- package/bin/skills/gtars/references/cli.md +0 -222
- package/bin/skills/gtars/references/coverage.md +0 -172
- package/bin/skills/gtars/references/overlap.md +0 -156
- package/bin/skills/gtars/references/python-api.md +0 -211
- package/bin/skills/gtars/references/refget.md +0 -147
- package/bin/skills/gtars/references/tokenizers.md +0 -103
- package/bin/skills/guidance/SKILL.md +0 -572
- package/bin/skills/guidance/references/backends.md +0 -554
- package/bin/skills/guidance/references/constraints.md +0 -674
- package/bin/skills/guidance/references/examples.md +0 -767
- package/bin/skills/gwas-database/SKILL.md +0 -608
- package/bin/skills/gwas-database/references/api_reference.md +0 -793
- package/bin/skills/histolab/SKILL.md +0 -678
- package/bin/skills/histolab/references/filters_preprocessing.md +0 -514
- package/bin/skills/histolab/references/slide_management.md +0 -172
- package/bin/skills/histolab/references/tile_extraction.md +0 -421
- package/bin/skills/histolab/references/tissue_masks.md +0 -251
- package/bin/skills/histolab/references/visualization.md +0 -547
- package/bin/skills/hmdb-database/SKILL.md +0 -196
- package/bin/skills/hmdb-database/references/hmdb_data_fields.md +0 -267
- package/bin/skills/hqq/SKILL.md +0 -445
- package/bin/skills/hqq/references/advanced-usage.md +0 -528
- package/bin/skills/hqq/references/troubleshooting.md +0 -503
- package/bin/skills/hugging-face-cli/SKILL.md +0 -191
- package/bin/skills/hugging-face-cli/references/commands.md +0 -954
- package/bin/skills/hugging-face-cli/references/examples.md +0 -374
- package/bin/skills/hugging-face-datasets/SKILL.md +0 -547
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +0 -239
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +0 -196
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +0 -176
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +0 -522
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +0 -844
- package/bin/skills/hugging-face-datasets/templates/chat.json +0 -55
- package/bin/skills/hugging-face-datasets/templates/classification.json +0 -62
- package/bin/skills/hugging-face-datasets/templates/completion.json +0 -51
- package/bin/skills/hugging-face-datasets/templates/custom.json +0 -75
- package/bin/skills/hugging-face-datasets/templates/qa.json +0 -54
- package/bin/skills/hugging-face-datasets/templates/tabular.json +0 -81
- package/bin/skills/hugging-face-evaluation/SKILL.md +0 -656
- package/bin/skills/hugging-face-evaluation/examples/.env.example +0 -7
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +0 -382
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +0 -141
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +0 -135
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +0 -50
- package/bin/skills/hugging-face-evaluation/requirements.txt +0 -20
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +0 -1374
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +0 -104
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +0 -317
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +0 -303
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +0 -98
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +0 -331
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +0 -206
- package/bin/skills/hugging-face-jobs/SKILL.md +0 -1040
- package/bin/skills/hugging-face-jobs/index.html +0 -216
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +0 -336
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +0 -352
- package/bin/skills/hugging-face-jobs/references/token_usage.md +0 -546
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +0 -475
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +0 -718
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +0 -546
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +0 -587
- package/bin/skills/hugging-face-model-trainer/SKILL.md +0 -710
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +0 -296
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +0 -283
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +0 -364
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +0 -371
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +0 -189
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +0 -150
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +0 -203
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +0 -282
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +0 -424
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +0 -417
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +0 -150
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +0 -106
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +0 -89
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +0 -122
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +0 -627
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +0 -327
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +0 -216
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +0 -508
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +0 -299
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +0 -358
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +0 -319
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +0 -201
- package/bin/skills/hugging-face-tool-builder/SKILL.md +0 -115
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +0 -57
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +0 -40
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +0 -57
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +0 -230
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +0 -96
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +0 -188
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +0 -171
- package/bin/skills/hugging-face-trackio/.claude-plugin/plugin.json +0 -19
- package/bin/skills/hugging-face-trackio/SKILL.md +0 -65
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +0 -206
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +0 -223
- package/bin/skills/huggingface-tokenizers/SKILL.md +0 -516
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +0 -653
- package/bin/skills/huggingface-tokenizers/references/integration.md +0 -637
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +0 -723
- package/bin/skills/huggingface-tokenizers/references/training.md +0 -565
- package/bin/skills/hypogenic/SKILL.md +0 -655
- package/bin/skills/hypogenic/references/config_template.yaml +0 -150
- package/bin/skills/hypothesis-generation/SKILL.md +0 -293
- package/bin/skills/hypothesis-generation/assets/FORMATTING_GUIDE.md +0 -672
- package/bin/skills/hypothesis-generation/assets/hypothesis_generation.sty +0 -307
- package/bin/skills/hypothesis-generation/assets/hypothesis_report_template.tex +0 -572
- package/bin/skills/hypothesis-generation/references/experimental_design_patterns.md +0 -329
- package/bin/skills/hypothesis-generation/references/hypothesis_quality_criteria.md +0 -198
- package/bin/skills/hypothesis-generation/references/literature_search_strategies.md +0 -622
- package/bin/skills/imaging-data-commons/SKILL.md +0 -1182
- package/bin/skills/imaging-data-commons/references/bigquery_guide.md +0 -556
- package/bin/skills/imaging-data-commons/references/cli_guide.md +0 -272
- package/bin/skills/imaging-data-commons/references/cloud_storage_guide.md +0 -333
- package/bin/skills/imaging-data-commons/references/dicomweb_guide.md +0 -399
- package/bin/skills/infographics/SKILL.md +0 -563
- package/bin/skills/infographics/references/color_palettes.md +0 -496
- package/bin/skills/infographics/references/design_principles.md +0 -636
- package/bin/skills/infographics/references/infographic_types.md +0 -907
- package/bin/skills/infographics/scripts/generate_infographic.py +0 -234
- package/bin/skills/infographics/scripts/generate_infographic_ai.py +0 -1290
- package/bin/skills/instructor/SKILL.md +0 -740
- package/bin/skills/instructor/references/examples.md +0 -107
- package/bin/skills/instructor/references/providers.md +0 -70
- package/bin/skills/instructor/references/validation.md +0 -606
- package/bin/skills/iso-13485-certification/SKILL.md +0 -680
- package/bin/skills/iso-13485-certification/assets/templates/procedures/CAPA-procedure-template.md +0 -453
- package/bin/skills/iso-13485-certification/assets/templates/procedures/document-control-procedure-template.md +0 -567
- package/bin/skills/iso-13485-certification/assets/templates/quality-manual-template.md +0 -521
- package/bin/skills/iso-13485-certification/references/gap-analysis-checklist.md +0 -568
- package/bin/skills/iso-13485-certification/references/iso-13485-requirements.md +0 -610
- package/bin/skills/iso-13485-certification/references/mandatory-documents.md +0 -606
- package/bin/skills/iso-13485-certification/references/quality-manual-guide.md +0 -688
- package/bin/skills/iso-13485-certification/scripts/gap_analyzer.py +0 -440
- package/bin/skills/kegg-database/SKILL.md +0 -377
- package/bin/skills/kegg-database/references/kegg_reference.md +0 -326
- package/bin/skills/kegg-database/scripts/kegg_api.py +0 -251
- package/bin/skills/knowledge-distillation/SKILL.md +0 -458
- package/bin/skills/knowledge-distillation/references/minillm.md +0 -334
- package/bin/skills/labarchive-integration/SKILL.md +0 -268
- package/bin/skills/labarchive-integration/references/api_reference.md +0 -342
- package/bin/skills/labarchive-integration/references/authentication_guide.md +0 -357
- package/bin/skills/labarchive-integration/references/integrations.md +0 -425
- package/bin/skills/labarchive-integration/scripts/entry_operations.py +0 -334
- package/bin/skills/labarchive-integration/scripts/notebook_operations.py +0 -269
- package/bin/skills/labarchive-integration/scripts/setup_config.py +0 -205
- package/bin/skills/lambda-labs/SKILL.md +0 -545
- package/bin/skills/lambda-labs/references/advanced-usage.md +0 -611
- package/bin/skills/lambda-labs/references/troubleshooting.md +0 -530
- package/bin/skills/lamindb/SKILL.md +0 -390
- package/bin/skills/lamindb/references/annotation-validation.md +0 -513
- package/bin/skills/lamindb/references/core-concepts.md +0 -380
- package/bin/skills/lamindb/references/data-management.md +0 -433
- package/bin/skills/lamindb/references/integrations.md +0 -642
- package/bin/skills/lamindb/references/ontologies.md +0 -497
- package/bin/skills/lamindb/references/setup-deployment.md +0 -733
- package/bin/skills/langchain/SKILL.md +0 -480
- package/bin/skills/langchain/references/agents.md +0 -499
- package/bin/skills/langchain/references/integration.md +0 -562
- package/bin/skills/langchain/references/rag.md +0 -600
- package/bin/skills/langsmith/SKILL.md +0 -422
- package/bin/skills/langsmith/references/advanced-usage.md +0 -548
- package/bin/skills/langsmith/references/troubleshooting.md +0 -537
- package/bin/skills/latchbio-integration/SKILL.md +0 -353
- package/bin/skills/latchbio-integration/references/data-management.md +0 -427
- package/bin/skills/latchbio-integration/references/resource-configuration.md +0 -429
- package/bin/skills/latchbio-integration/references/verified-workflows.md +0 -487
- package/bin/skills/latchbio-integration/references/workflow-creation.md +0 -254
- package/bin/skills/latex-posters/README.md +0 -417
- package/bin/skills/latex-posters/SKILL.md +0 -1602
- package/bin/skills/latex-posters/assets/baposter_template.tex +0 -257
- package/bin/skills/latex-posters/assets/beamerposter_template.tex +0 -244
- package/bin/skills/latex-posters/assets/poster_quality_checklist.md +0 -358
- package/bin/skills/latex-posters/assets/tikzposter_template.tex +0 -251
- package/bin/skills/latex-posters/references/latex_poster_packages.md +0 -745
- package/bin/skills/latex-posters/references/poster_content_guide.md +0 -748
- package/bin/skills/latex-posters/references/poster_design_principles.md +0 -806
- package/bin/skills/latex-posters/references/poster_layout_design.md +0 -900
- package/bin/skills/latex-posters/scripts/review_poster.sh +0 -214
- package/bin/skills/literature-review/SKILL.md +0 -641
- package/bin/skills/literature-review/assets/review_template.md +0 -412
- package/bin/skills/literature-review/references/citation_styles.md +0 -166
- package/bin/skills/literature-review/references/database_strategies.md +0 -455
- package/bin/skills/literature-review/scripts/generate_pdf.py +0 -184
- package/bin/skills/literature-review/scripts/search_databases.py +0 -310
- package/bin/skills/literature-review/scripts/verify_citations.py +0 -218
- package/bin/skills/litgpt/SKILL.md +0 -469
- package/bin/skills/litgpt/references/custom-models.md +0 -568
- package/bin/skills/litgpt/references/distributed-training.md +0 -451
- package/bin/skills/litgpt/references/supported-models.md +0 -336
- package/bin/skills/litgpt/references/training-recipes.md +0 -619
- package/bin/skills/llama-cpp/SKILL.md +0 -258
- package/bin/skills/llama-cpp/references/optimization.md +0 -89
- package/bin/skills/llama-cpp/references/quantization.md +0 -213
- package/bin/skills/llama-cpp/references/server.md +0 -125
- package/bin/skills/llama-factory/SKILL.md +0 -80
- package/bin/skills/llama-factory/references/_images.md +0 -23
- package/bin/skills/llama-factory/references/advanced.md +0 -1055
- package/bin/skills/llama-factory/references/getting_started.md +0 -349
- package/bin/skills/llama-factory/references/index.md +0 -19
- package/bin/skills/llama-factory/references/other.md +0 -31
- package/bin/skills/llamaguard/SKILL.md +0 -337
- package/bin/skills/llamaindex/SKILL.md +0 -569
- package/bin/skills/llamaindex/references/agents.md +0 -83
- package/bin/skills/llamaindex/references/data_connectors.md +0 -108
- package/bin/skills/llamaindex/references/query_engines.md +0 -406
- package/bin/skills/llava/SKILL.md +0 -304
- package/bin/skills/llava/references/training.md +0 -197
- package/bin/skills/llm-as-judge-evaluation/SKILL.md +0 -385
- package/bin/skills/llm-as-judge-evaluation/references/pairwise-comparison.md +0 -95
- package/bin/skills/llm-as-judge-evaluation/references/scoring-rubrics.md +0 -169
- package/bin/skills/lm-evaluation-harness/SKILL.md +0 -490
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +0 -490
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +0 -488
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +0 -602
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +0 -519
- package/bin/skills/long-context/SKILL.md +0 -536
- package/bin/skills/long-context/references/extension_methods.md +0 -468
- package/bin/skills/long-context/references/fine_tuning.md +0 -611
- package/bin/skills/long-context/references/rope.md +0 -402
- package/bin/skills/mamba/SKILL.md +0 -260
- package/bin/skills/mamba/references/architecture-details.md +0 -206
- package/bin/skills/mamba/references/benchmarks.md +0 -255
- package/bin/skills/mamba/references/training-guide.md +0 -388
- package/bin/skills/market-research-reports/SKILL.md +0 -904
- package/bin/skills/market-research-reports/assets/FORMATTING_GUIDE.md +0 -428
- package/bin/skills/market-research-reports/assets/market_report_template.tex +0 -1380
- package/bin/skills/market-research-reports/assets/market_research.sty +0 -564
- package/bin/skills/market-research-reports/references/data_analysis_patterns.md +0 -548
- package/bin/skills/market-research-reports/references/report_structure_guide.md +0 -999
- package/bin/skills/market-research-reports/references/visual_generation_guide.md +0 -1077
- package/bin/skills/market-research-reports/scripts/generate_market_visuals.py +0 -472
- package/bin/skills/markitdown/INSTALLATION_GUIDE.md +0 -318
- package/bin/skills/markitdown/LICENSE.txt +0 -22
- package/bin/skills/markitdown/OPENROUTER_INTEGRATION.md +0 -359
- package/bin/skills/markitdown/QUICK_REFERENCE.md +0 -309
- package/bin/skills/markitdown/README.md +0 -184
- package/bin/skills/markitdown/SKILL.md +0 -486
- package/bin/skills/markitdown/SKILL_SUMMARY.md +0 -307
- package/bin/skills/markitdown/assets/example_usage.md +0 -463
- package/bin/skills/markitdown/references/api_reference.md +0 -399
- package/bin/skills/markitdown/references/file_formats.md +0 -542
- package/bin/skills/markitdown/scripts/batch_convert.py +0 -195
- package/bin/skills/markitdown/scripts/convert_literature.py +0 -262
- package/bin/skills/markitdown/scripts/convert_with_ai.py +0 -224
- package/bin/skills/matchms/SKILL.md +0 -203
- package/bin/skills/matchms/references/filtering.md +0 -288
- package/bin/skills/matchms/references/importing_exporting.md +0 -416
- package/bin/skills/matchms/references/similarity.md +0 -380
- package/bin/skills/matchms/references/workflows.md +0 -647
- package/bin/skills/matlab/SKILL.md +0 -376
- package/bin/skills/matlab/references/data-import-export.md +0 -479
- package/bin/skills/matlab/references/executing-scripts.md +0 -444
- package/bin/skills/matlab/references/graphics-visualization.md +0 -579
- package/bin/skills/matlab/references/mathematics.md +0 -553
- package/bin/skills/matlab/references/matrices-arrays.md +0 -349
- package/bin/skills/matlab/references/octave-compatibility.md +0 -544
- package/bin/skills/matlab/references/programming.md +0 -672
- package/bin/skills/matlab/references/python-integration.md +0 -433
- package/bin/skills/matplotlib/SKILL.md +0 -361
- package/bin/skills/matplotlib/references/api_reference.md +0 -412
- package/bin/skills/matplotlib/references/common_issues.md +0 -563
- package/bin/skills/matplotlib/references/plot_types.md +0 -476
- package/bin/skills/matplotlib/references/styling_guide.md +0 -589
- package/bin/skills/matplotlib/scripts/plot_template.py +0 -401
- package/bin/skills/matplotlib/scripts/style_configurator.py +0 -409
- package/bin/skills/medchem/SKILL.md +0 -406
- package/bin/skills/medchem/references/api_guide.md +0 -600
- package/bin/skills/medchem/references/rules_catalog.md +0 -604
- package/bin/skills/medchem/scripts/filter_molecules.py +0 -418
- package/bin/skills/megatron-core/SKILL.md +0 -366
- package/bin/skills/megatron-core/references/benchmarks.md +0 -249
- package/bin/skills/megatron-core/references/parallelism-guide.md +0 -404
- package/bin/skills/megatron-core/references/production-examples.md +0 -473
- package/bin/skills/megatron-core/references/training-recipes.md +0 -547
- package/bin/skills/metabolomics-workbench-database/SKILL.md +0 -259
- package/bin/skills/metabolomics-workbench-database/references/api_reference.md +0 -494
- package/bin/skills/miles/SKILL.md +0 -315
- package/bin/skills/miles/references/api-reference.md +0 -141
- package/bin/skills/miles/references/troubleshooting.md +0 -352
- package/bin/skills/ml-paper-writing/SKILL.md +0 -937
- package/bin/skills/ml-paper-writing/references/checklists.md +0 -361
- package/bin/skills/ml-paper-writing/references/citation-workflow.md +0 -562
- package/bin/skills/ml-paper-writing/references/reviewer-guidelines.md +0 -367
- package/bin/skills/ml-paper-writing/references/sources.md +0 -159
- package/bin/skills/ml-paper-writing/references/writing-guide.md +0 -476
- package/bin/skills/ml-paper-writing/templates/README.md +0 -251
- package/bin/skills/ml-paper-writing/templates/aaai2026/README.md +0 -534
- package/bin/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-supp.tex +0 -144
- package/bin/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-template.tex +0 -952
- package/bin/skills/ml-paper-writing/templates/aaai2026/aaai2026.bib +0 -111
- package/bin/skills/ml-paper-writing/templates/aaai2026/aaai2026.bst +0 -1493
- package/bin/skills/ml-paper-writing/templates/aaai2026/aaai2026.sty +0 -315
- package/bin/skills/ml-paper-writing/templates/acl/README.md +0 -50
- package/bin/skills/ml-paper-writing/templates/acl/acl.sty +0 -312
- package/bin/skills/ml-paper-writing/templates/acl/acl_latex.tex +0 -377
- package/bin/skills/ml-paper-writing/templates/acl/acl_lualatex.tex +0 -101
- package/bin/skills/ml-paper-writing/templates/acl/acl_natbib.bst +0 -1940
- package/bin/skills/ml-paper-writing/templates/acl/anthology.bib.txt +0 -26
- package/bin/skills/ml-paper-writing/templates/acl/custom.bib +0 -70
- package/bin/skills/ml-paper-writing/templates/acl/formatting.md +0 -326
- package/bin/skills/ml-paper-writing/templates/colm2025/README.md +0 -3
- package/bin/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bib +0 -11
- package/bin/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bst +0 -1440
- package/bin/skills/ml-paper-writing/templates/colm2025/colm2025_conference.pdf +0 -0
- package/bin/skills/ml-paper-writing/templates/colm2025/colm2025_conference.sty +0 -218
- package/bin/skills/ml-paper-writing/templates/colm2025/colm2025_conference.tex +0 -305
- package/bin/skills/ml-paper-writing/templates/colm2025/fancyhdr.sty +0 -485
- package/bin/skills/ml-paper-writing/templates/colm2025/math_commands.tex +0 -508
- package/bin/skills/ml-paper-writing/templates/colm2025/natbib.sty +0 -1246
- package/bin/skills/ml-paper-writing/templates/iclr2026/fancyhdr.sty +0 -485
- package/bin/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bib +0 -24
- package/bin/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bst +0 -1440
- package/bin/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.pdf +0 -0
- package/bin/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.sty +0 -246
- package/bin/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.tex +0 -414
- package/bin/skills/ml-paper-writing/templates/iclr2026/math_commands.tex +0 -508
- package/bin/skills/ml-paper-writing/templates/iclr2026/natbib.sty +0 -1246
- package/bin/skills/ml-paper-writing/templates/icml2026/algorithm.sty +0 -79
- package/bin/skills/ml-paper-writing/templates/icml2026/algorithmic.sty +0 -201
- package/bin/skills/ml-paper-writing/templates/icml2026/example_paper.bib +0 -75
- package/bin/skills/ml-paper-writing/templates/icml2026/example_paper.pdf +0 -0
- package/bin/skills/ml-paper-writing/templates/icml2026/example_paper.tex +0 -662
- package/bin/skills/ml-paper-writing/templates/icml2026/fancyhdr.sty +0 -864
- package/bin/skills/ml-paper-writing/templates/icml2026/icml2026.bst +0 -1443
- package/bin/skills/ml-paper-writing/templates/icml2026/icml2026.sty +0 -767
- package/bin/skills/ml-paper-writing/templates/icml2026/icml_numpapers.pdf +0 -0
- package/bin/skills/ml-paper-writing/templates/neurips2025/Makefile +0 -36
- package/bin/skills/ml-paper-writing/templates/neurips2025/extra_pkgs.tex +0 -53
- package/bin/skills/ml-paper-writing/templates/neurips2025/main.tex +0 -38
- package/bin/skills/ml-paper-writing/templates/neurips2025/neurips.sty +0 -382
- package/bin/skills/mlflow/SKILL.md +0 -704
- package/bin/skills/mlflow/references/deployment.md +0 -744
- package/bin/skills/mlflow/references/model-registry.md +0 -770
- package/bin/skills/mlflow/references/tracking.md +0 -680
- package/bin/skills/modal/SKILL.md +0 -418
- package/bin/skills/modal/references/advanced-patterns.md +0 -695
- package/bin/skills/modal/references/examples-catalog.md +0 -423
- package/bin/skills/modal/references/troubleshooting.md +0 -494
- package/bin/skills/modal-research-gpu/SKILL.md +0 -238
- package/bin/skills/model-economics/SKILL.md +0 -238
- package/bin/skills/model-merging/SKILL.md +0 -539
- package/bin/skills/model-merging/references/evaluation.md +0 -462
- package/bin/skills/model-merging/references/examples.md +0 -428
- package/bin/skills/model-merging/references/methods.md +0 -352
- package/bin/skills/model-pruning/SKILL.md +0 -495
- package/bin/skills/model-pruning/references/wanda.md +0 -347
- package/bin/skills/moe-training/SKILL.md +0 -526
- package/bin/skills/moe-training/references/architectures.md +0 -432
- package/bin/skills/moe-training/references/inference.md +0 -348
- package/bin/skills/moe-training/references/training.md +0 -425
- package/bin/skills/molfeat/SKILL.md +0 -511
- package/bin/skills/molfeat/references/api_reference.md +0 -428
- package/bin/skills/molfeat/references/available_featurizers.md +0 -333
- package/bin/skills/molfeat/references/examples.md +0 -723
- package/bin/skills/nanogpt/SKILL.md +0 -290
- package/bin/skills/nanogpt/references/architecture.md +0 -382
- package/bin/skills/nanogpt/references/data.md +0 -476
- package/bin/skills/nanogpt/references/training.md +0 -564
- package/bin/skills/nemo-curator/SKILL.md +0 -383
- package/bin/skills/nemo-curator/references/deduplication.md +0 -87
- package/bin/skills/nemo-curator/references/filtering.md +0 -102
- package/bin/skills/nemo-evaluator/SKILL.md +0 -494
- package/bin/skills/nemo-evaluator/references/adapter-system.md +0 -340
- package/bin/skills/nemo-evaluator/references/configuration.md +0 -447
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +0 -315
- package/bin/skills/nemo-evaluator/references/execution-backends.md +0 -361
- package/bin/skills/nemo-guardrails/SKILL.md +0 -297
- package/bin/skills/networkx/SKILL.md +0 -437
- package/bin/skills/networkx/references/algorithms.md +0 -383
- package/bin/skills/networkx/references/generators.md +0 -378
- package/bin/skills/networkx/references/graph-basics.md +0 -283
- package/bin/skills/networkx/references/io.md +0 -441
- package/bin/skills/networkx/references/visualization.md +0 -529
- package/bin/skills/neurokit2/SKILL.md +0 -356
- package/bin/skills/neurokit2/references/bio_module.md +0 -417
- package/bin/skills/neurokit2/references/complexity.md +0 -715
- package/bin/skills/neurokit2/references/ecg_cardiac.md +0 -355
- package/bin/skills/neurokit2/references/eda.md +0 -497
- package/bin/skills/neurokit2/references/eeg.md +0 -506
- package/bin/skills/neurokit2/references/emg.md +0 -408
- package/bin/skills/neurokit2/references/eog.md +0 -407
- package/bin/skills/neurokit2/references/epochs_events.md +0 -471
- package/bin/skills/neurokit2/references/hrv.md +0 -480
- package/bin/skills/neurokit2/references/ppg.md +0 -413
- package/bin/skills/neurokit2/references/rsp.md +0 -510
- package/bin/skills/neurokit2/references/signal_processing.md +0 -648
- package/bin/skills/neuropixels-analysis/SKILL.md +0 -350
- package/bin/skills/neuropixels-analysis/assets/analysis_template.py +0 -271
- package/bin/skills/neuropixels-analysis/references/AI_CURATION.md +0 -345
- package/bin/skills/neuropixels-analysis/references/ANALYSIS.md +0 -392
- package/bin/skills/neuropixels-analysis/references/AUTOMATED_CURATION.md +0 -358
- package/bin/skills/neuropixels-analysis/references/MOTION_CORRECTION.md +0 -323
- package/bin/skills/neuropixels-analysis/references/PREPROCESSING.md +0 -273
- package/bin/skills/neuropixels-analysis/references/QUALITY_METRICS.md +0 -359
- package/bin/skills/neuropixels-analysis/references/SPIKE_SORTING.md +0 -339
- package/bin/skills/neuropixels-analysis/references/api_reference.md +0 -415
- package/bin/skills/neuropixels-analysis/references/plotting_guide.md +0 -454
- package/bin/skills/neuropixels-analysis/references/standard_workflow.md +0 -385
- package/bin/skills/neuropixels-analysis/scripts/compute_metrics.py +0 -178
- package/bin/skills/neuropixels-analysis/scripts/explore_recording.py +0 -168
- package/bin/skills/neuropixels-analysis/scripts/export_to_phy.py +0 -79
- package/bin/skills/neuropixels-analysis/scripts/neuropixels_pipeline.py +0 -432
- package/bin/skills/neuropixels-analysis/scripts/preprocess_recording.py +0 -122
- package/bin/skills/neuropixels-analysis/scripts/run_sorting.py +0 -98
- package/bin/skills/nnsight/SKILL.md +0 -436
- package/bin/skills/nnsight/references/README.md +0 -78
- package/bin/skills/nnsight/references/api.md +0 -344
- package/bin/skills/nnsight/references/tutorials.md +0 -300
- package/bin/skills/offer-k-dense-web/SKILL.md +0 -21
- package/bin/skills/omero-integration/SKILL.md +0 -251
- package/bin/skills/omero-integration/references/advanced.md +0 -631
- package/bin/skills/omero-integration/references/connection.md +0 -369
- package/bin/skills/omero-integration/references/data_access.md +0 -544
- package/bin/skills/omero-integration/references/image_processing.md +0 -665
- package/bin/skills/omero-integration/references/metadata.md +0 -688
- package/bin/skills/omero-integration/references/rois.md +0 -648
- package/bin/skills/omero-integration/references/scripts.md +0 -637
- package/bin/skills/omero-integration/references/tables.md +0 -532
- package/bin/skills/openalex-database/SKILL.md +0 -494
- package/bin/skills/openalex-database/references/api_guide.md +0 -371
- package/bin/skills/openalex-database/references/common_queries.md +0 -381
- package/bin/skills/openalex-database/scripts/openalex_client.py +0 -337
- package/bin/skills/openalex-database/scripts/query_helpers.py +0 -306
- package/bin/skills/openrlhf/SKILL.md +0 -249
- package/bin/skills/openrlhf/references/algorithm-comparison.md +0 -404
- package/bin/skills/openrlhf/references/custom-rewards.md +0 -530
- package/bin/skills/openrlhf/references/hybrid-engine.md +0 -287
- package/bin/skills/openrlhf/references/multi-node-training.md +0 -454
- package/bin/skills/opentargets-database/SKILL.md +0 -373
- package/bin/skills/opentargets-database/references/api_reference.md +0 -249
- package/bin/skills/opentargets-database/references/evidence_types.md +0 -306
- package/bin/skills/opentargets-database/references/target_annotations.md +0 -401
- package/bin/skills/opentargets-database/scripts/query_opentargets.py +0 -403
- package/bin/skills/opentrons-integration/SKILL.md +0 -573
- package/bin/skills/opentrons-integration/references/api_reference.md +0 -366
- package/bin/skills/opentrons-integration/scripts/basic_protocol_template.py +0 -67
- package/bin/skills/opentrons-integration/scripts/pcr_setup_template.py +0 -154
- package/bin/skills/opentrons-integration/scripts/serial_dilution_template.py +0 -96
- package/bin/skills/outlines/SKILL.md +0 -652
- package/bin/skills/outlines/references/backends.md +0 -615
- package/bin/skills/outlines/references/examples.md +0 -773
- package/bin/skills/outlines/references/json_generation.md +0 -652
- package/bin/skills/paper-2-web/SKILL.md +0 -491
- package/bin/skills/paper-2-web/references/installation.md +0 -141
- package/bin/skills/paper-2-web/references/paper2poster.md +0 -346
- package/bin/skills/paper-2-web/references/paper2video.md +0 -305
- package/bin/skills/paper-2-web/references/paper2web.md +0 -187
- package/bin/skills/paper-2-web/references/usage_examples.md +0 -436
- package/bin/skills/pathml/SKILL.md +0 -166
- package/bin/skills/pathml/references/data_management.md +0 -742
- package/bin/skills/pathml/references/graphs.md +0 -653
- package/bin/skills/pathml/references/image_loading.md +0 -448
- package/bin/skills/pathml/references/machine_learning.md +0 -725
- package/bin/skills/pathml/references/multiparametric.md +0 -686
- package/bin/skills/pathml/references/preprocessing.md +0 -722
- package/bin/skills/pdb-database/SKILL.md +0 -309
- package/bin/skills/pdb-database/references/api_reference.md +0 -617
- package/bin/skills/peer-review/SKILL.md +0 -702
- package/bin/skills/peer-review/references/calibration_guidelines.md +0 -196
- package/bin/skills/peer-review/references/common_issues.md +0 -552
- package/bin/skills/peer-review/references/paper_mechanics.md +0 -269
- package/bin/skills/peer-review/references/reporting_standards.md +0 -290
- package/bin/skills/peer-review/references/scoring_rubric.md +0 -239
- package/bin/skills/peft/SKILL.md +0 -431
- package/bin/skills/peft/references/advanced-usage.md +0 -514
- package/bin/skills/peft/references/troubleshooting.md +0 -480
- package/bin/skills/pennylane/SKILL.md +0 -226
- package/bin/skills/pennylane/references/advanced_features.md +0 -667
- package/bin/skills/pennylane/references/devices_backends.md +0 -596
- package/bin/skills/pennylane/references/getting_started.md +0 -227
- package/bin/skills/pennylane/references/optimization.md +0 -671
- package/bin/skills/pennylane/references/quantum_chemistry.md +0 -567
- package/bin/skills/pennylane/references/quantum_circuits.md +0 -437
- package/bin/skills/pennylane/references/quantum_ml.md +0 -571
- package/bin/skills/perplexity-search/SKILL.md +0 -448
- package/bin/skills/perplexity-search/assets/.env.example +0 -16
- package/bin/skills/perplexity-search/references/model_comparison.md +0 -386
- package/bin/skills/perplexity-search/references/openrouter_setup.md +0 -454
- package/bin/skills/perplexity-search/references/search_strategies.md +0 -258
- package/bin/skills/perplexity-search/scripts/perplexity_search.py +0 -277
- package/bin/skills/perplexity-search/scripts/setup_env.py +0 -171
- package/bin/skills/phoenix/SKILL.md +0 -475
- package/bin/skills/phoenix/references/advanced-usage.md +0 -619
- package/bin/skills/phoenix/references/troubleshooting.md +0 -538
- package/bin/skills/pinecone/SKILL.md +0 -358
- package/bin/skills/pinecone/references/deployment.md +0 -181
- package/bin/skills/plotly/SKILL.md +0 -267
- package/bin/skills/plotly/references/chart-types.md +0 -488
- package/bin/skills/plotly/references/export-interactivity.md +0 -453
- package/bin/skills/plotly/references/graph-objects.md +0 -302
- package/bin/skills/plotly/references/layouts-styling.md +0 -457
- package/bin/skills/plotly/references/plotly-express.md +0 -213
- package/bin/skills/polars/SKILL.md +0 -387
- package/bin/skills/polars/references/best_practices.md +0 -649
- package/bin/skills/polars/references/core_concepts.md +0 -378
- package/bin/skills/polars/references/io_guide.md +0 -557
- package/bin/skills/polars/references/operations.md +0 -602
- package/bin/skills/polars/references/pandas_migration.md +0 -417
- package/bin/skills/polars/references/transformations.md +0 -549
- package/bin/skills/pptx-posters/SKILL.md +0 -410
- package/bin/skills/pptx-posters/assets/poster_html_template.html +0 -257
- package/bin/skills/pptx-posters/assets/poster_quality_checklist.md +0 -358
- package/bin/skills/pptx-posters/references/poster_content_guide.md +0 -748
- package/bin/skills/pptx-posters/references/poster_design_principles.md +0 -806
- package/bin/skills/pptx-posters/references/poster_layout_design.md +0 -900
- package/bin/skills/prime-intellect-lab/README.md +0 -69
- package/bin/skills/prime-intellect-lab/SKILL.md +0 -598
- package/bin/skills/prime-intellect-lab/templates/basic_rl_training.toml +0 -82
- package/bin/skills/protocolsio-integration/SKILL.md +0 -421
- package/bin/skills/protocolsio-integration/references/additional_features.md +0 -387
- package/bin/skills/protocolsio-integration/references/authentication.md +0 -100
- package/bin/skills/protocolsio-integration/references/discussions.md +0 -225
- package/bin/skills/protocolsio-integration/references/file_manager.md +0 -412
- package/bin/skills/protocolsio-integration/references/protocols_api.md +0 -294
- package/bin/skills/protocolsio-integration/references/workspaces.md +0 -293
- package/bin/skills/pubchem-database/SKILL.md +0 -574
- package/bin/skills/pubchem-database/references/api_reference.md +0 -440
- package/bin/skills/pubchem-database/scripts/bioactivity_query.py +0 -367
- package/bin/skills/pubchem-database/scripts/compound_search.py +0 -297
- package/bin/skills/pubmed-database/SKILL.md +0 -460
- package/bin/skills/pubmed-database/references/api_reference.md +0 -298
- package/bin/skills/pubmed-database/references/common_queries.md +0 -453
- package/bin/skills/pubmed-database/references/search_syntax.md +0 -436
- package/bin/skills/pufferlib/SKILL.md +0 -436
- package/bin/skills/pufferlib/references/environments.md +0 -508
- package/bin/skills/pufferlib/references/integration.md +0 -621
- package/bin/skills/pufferlib/references/policies.md +0 -653
- package/bin/skills/pufferlib/references/training.md +0 -360
- package/bin/skills/pufferlib/references/vectorization.md +0 -557
- package/bin/skills/pufferlib/scripts/env_template.py +0 -340
- package/bin/skills/pufferlib/scripts/train_template.py +0 -239
- package/bin/skills/pydeseq2/SKILL.md +0 -559
- package/bin/skills/pydeseq2/references/api_reference.md +0 -228
- package/bin/skills/pydeseq2/references/workflow_guide.md +0 -582
- package/bin/skills/pydeseq2/scripts/run_deseq2_analysis.py +0 -353
- package/bin/skills/pydicom/SKILL.md +0 -434
- package/bin/skills/pydicom/references/common_tags.md +0 -228
- package/bin/skills/pydicom/references/transfer_syntaxes.md +0 -352
- package/bin/skills/pydicom/scripts/anonymize_dicom.py +0 -137
- package/bin/skills/pydicom/scripts/dicom_to_image.py +0 -172
- package/bin/skills/pydicom/scripts/extract_metadata.py +0 -173
- package/bin/skills/pyhealth/SKILL.md +0 -491
- package/bin/skills/pyhealth/references/datasets.md +0 -178
- package/bin/skills/pyhealth/references/medical_coding.md +0 -284
- package/bin/skills/pyhealth/references/models.md +0 -594
- package/bin/skills/pyhealth/references/preprocessing.md +0 -638
- package/bin/skills/pyhealth/references/tasks.md +0 -379
- package/bin/skills/pyhealth/references/training_evaluation.md +0 -648
- package/bin/skills/pylabrobot/SKILL.md +0 -185
- package/bin/skills/pylabrobot/references/analytical-equipment.md +0 -464
- package/bin/skills/pylabrobot/references/hardware-backends.md +0 -480
- package/bin/skills/pylabrobot/references/liquid-handling.md +0 -403
- package/bin/skills/pylabrobot/references/material-handling.md +0 -620
- package/bin/skills/pylabrobot/references/resources.md +0 -489
- package/bin/skills/pylabrobot/references/visualization.md +0 -532
- package/bin/skills/pymatgen/SKILL.md +0 -691
- package/bin/skills/pymatgen/references/analysis_modules.md +0 -530
- package/bin/skills/pymatgen/references/core_classes.md +0 -318
- package/bin/skills/pymatgen/references/io_formats.md +0 -469
- package/bin/skills/pymatgen/references/materials_project_api.md +0 -517
- package/bin/skills/pymatgen/references/transformations_workflows.md +0 -591
- package/bin/skills/pymatgen/scripts/phase_diagram_generator.py +0 -233
- package/bin/skills/pymatgen/scripts/structure_analyzer.py +0 -266
- package/bin/skills/pymatgen/scripts/structure_converter.py +0 -169
- package/bin/skills/pymc/SKILL.md +0 -572
- package/bin/skills/pymc/assets/hierarchical_model_template.py +0 -333
- package/bin/skills/pymc/assets/linear_regression_template.py +0 -241
- package/bin/skills/pymc/references/distributions.md +0 -320
- package/bin/skills/pymc/references/sampling_inference.md +0 -424
- package/bin/skills/pymc/references/workflows.md +0 -526
- package/bin/skills/pymc/scripts/model_comparison.py +0 -387
- package/bin/skills/pymc/scripts/model_diagnostics.py +0 -350
- package/bin/skills/pymoo/SKILL.md +0 -571
- package/bin/skills/pymoo/references/algorithms.md +0 -180
- package/bin/skills/pymoo/references/constraints_mcdm.md +0 -417
- package/bin/skills/pymoo/references/operators.md +0 -345
- package/bin/skills/pymoo/references/problems.md +0 -265
- package/bin/skills/pymoo/references/visualization.md +0 -353
- package/bin/skills/pymoo/scripts/custom_problem_example.py +0 -181
- package/bin/skills/pymoo/scripts/decision_making_example.py +0 -161
- package/bin/skills/pymoo/scripts/many_objective_example.py +0 -72
- package/bin/skills/pymoo/scripts/multi_objective_example.py +0 -63
- package/bin/skills/pymoo/scripts/single_objective_example.py +0 -59
- package/bin/skills/pyopenms/SKILL.md +0 -217
- package/bin/skills/pyopenms/references/data_structures.md +0 -497
- package/bin/skills/pyopenms/references/feature_detection.md +0 -410
- package/bin/skills/pyopenms/references/file_io.md +0 -349
- package/bin/skills/pyopenms/references/identification.md +0 -422
- package/bin/skills/pyopenms/references/metabolomics.md +0 -482
- package/bin/skills/pyopenms/references/signal_processing.md +0 -433
- package/bin/skills/pysam/SKILL.md +0 -265
- package/bin/skills/pysam/references/alignment_files.md +0 -280
- package/bin/skills/pysam/references/common_workflows.md +0 -520
- package/bin/skills/pysam/references/sequence_files.md +0 -407
- package/bin/skills/pysam/references/variant_files.md +0 -365
- package/bin/skills/pytdc/SKILL.md +0 -460
- package/bin/skills/pytdc/references/datasets.md +0 -246
- package/bin/skills/pytdc/references/oracles.md +0 -400
- package/bin/skills/pytdc/references/utilities.md +0 -684
- package/bin/skills/pytdc/scripts/benchmark_evaluation.py +0 -327
- package/bin/skills/pytdc/scripts/load_and_split_data.py +0 -214
- package/bin/skills/pytdc/scripts/molecular_generation.py +0 -404
- package/bin/skills/pytorch-fsdp/SKILL.md +0 -126
- package/bin/skills/pytorch-fsdp/references/index.md +0 -7
- package/bin/skills/pytorch-fsdp/references/other.md +0 -4249
- package/bin/skills/pytorch-lightning/SKILL.md +0 -346
- package/bin/skills/pytorch-lightning/references/callbacks.md +0 -436
- package/bin/skills/pytorch-lightning/references/distributed.md +0 -490
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +0 -556
- package/bin/skills/pyvene/SKILL.md +0 -473
- package/bin/skills/pyvene/references/README.md +0 -73
- package/bin/skills/pyvene/references/api.md +0 -383
- package/bin/skills/pyvene/references/tutorials.md +0 -376
- package/bin/skills/qdrant/SKILL.md +0 -493
- package/bin/skills/qdrant/references/advanced-usage.md +0 -648
- package/bin/skills/qdrant/references/troubleshooting.md +0 -631
- package/bin/skills/qiskit/SKILL.md +0 -275
- package/bin/skills/qiskit/references/algorithms.md +0 -607
- package/bin/skills/qiskit/references/backends.md +0 -433
- package/bin/skills/qiskit/references/circuits.md +0 -197
- package/bin/skills/qiskit/references/patterns.md +0 -533
- package/bin/skills/qiskit/references/primitives.md +0 -277
- package/bin/skills/qiskit/references/setup.md +0 -99
- package/bin/skills/qiskit/references/transpilation.md +0 -286
- package/bin/skills/qiskit/references/visualization.md +0 -415
- package/bin/skills/qutip/SKILL.md +0 -318
- package/bin/skills/qutip/references/advanced.md +0 -555
- package/bin/skills/qutip/references/analysis.md +0 -523
- package/bin/skills/qutip/references/core_concepts.md +0 -293
- package/bin/skills/qutip/references/time_evolution.md +0 -348
- package/bin/skills/qutip/references/visualization.md +0 -431
- package/bin/skills/ray-data/SKILL.md +0 -326
- package/bin/skills/ray-data/references/integration.md +0 -82
- package/bin/skills/ray-data/references/transformations.md +0 -83
- package/bin/skills/ray-train/SKILL.md +0 -406
- package/bin/skills/ray-train/references/multi-node.md +0 -628
- package/bin/skills/rdkit/SKILL.md +0 -780
- package/bin/skills/rdkit/references/api_reference.md +0 -432
- package/bin/skills/rdkit/references/descriptors_reference.md +0 -595
- package/bin/skills/rdkit/references/smarts_patterns.md +0 -668
- package/bin/skills/rdkit/scripts/molecular_properties.py +0 -243
- package/bin/skills/rdkit/scripts/similarity_search.py +0 -297
- package/bin/skills/rdkit/scripts/substructure_filter.py +0 -386
- package/bin/skills/reactome-database/SKILL.md +0 -278
- package/bin/skills/reactome-database/references/api_reference.md +0 -465
- package/bin/skills/reactome-database/scripts/reactome_query.py +0 -286
- package/bin/skills/research-grants/README.md +0 -285
- package/bin/skills/research-grants/SKILL.md +0 -938
- package/bin/skills/research-grants/assets/budget_justification_template.md +0 -453
- package/bin/skills/research-grants/assets/nih_specific_aims_template.md +0 -166
- package/bin/skills/research-grants/assets/nsf_project_summary_template.md +0 -92
- package/bin/skills/research-grants/references/broader_impacts.md +0 -392
- package/bin/skills/research-grants/references/darpa_guidelines.md +0 -636
- package/bin/skills/research-grants/references/doe_guidelines.md +0 -586
- package/bin/skills/research-grants/references/nih_guidelines.md +0 -851
- package/bin/skills/research-grants/references/nsf_guidelines.md +0 -570
- package/bin/skills/research-grants/references/specific_aims_guide.md +0 -458
- package/bin/skills/research-lookup/README.md +0 -156
- package/bin/skills/research-lookup/SKILL.md +0 -606
- package/bin/skills/research-lookup/examples.py +0 -174
- package/bin/skills/research-lookup/lookup.py +0 -187
- package/bin/skills/research-lookup/research_lookup.py +0 -483
- package/bin/skills/research-lookup/scripts/research_lookup.py +0 -483
- package/bin/skills/rowan/SKILL.md +0 -427
- package/bin/skills/rowan/references/api_reference.md +0 -413
- package/bin/skills/rowan/references/molecule_handling.md +0 -429
- package/bin/skills/rowan/references/proteins_and_organization.md +0 -499
- package/bin/skills/rowan/references/rdkit_native.md +0 -438
- package/bin/skills/rowan/references/results_interpretation.md +0 -481
- package/bin/skills/rowan/references/workflow_types.md +0 -591
- package/bin/skills/rwkv/SKILL.md +0 -260
- package/bin/skills/rwkv/references/architecture-details.md +0 -344
- package/bin/skills/rwkv/references/rwkv7.md +0 -386
- package/bin/skills/rwkv/references/state-management.md +0 -369
- package/bin/skills/saelens/SKILL.md +0 -386
- package/bin/skills/saelens/references/README.md +0 -70
- package/bin/skills/saelens/references/api.md +0 -333
- package/bin/skills/saelens/references/tutorials.md +0 -318
- package/bin/skills/scanpy/SKILL.md +0 -386
- package/bin/skills/scanpy/assets/analysis_template.py +0 -295
- package/bin/skills/scanpy/references/api_reference.md +0 -251
- package/bin/skills/scanpy/references/plotting_guide.md +0 -352
- package/bin/skills/scanpy/references/standard_workflow.md +0 -206
- package/bin/skills/scanpy/scripts/qc_analysis.py +0 -200
- package/bin/skills/scholar-evaluation/SKILL.md +0 -289
- package/bin/skills/scholar-evaluation/references/evaluation_framework.md +0 -663
- package/bin/skills/scholar-evaluation/scripts/calculate_scores.py +0 -366
- package/bin/skills/scientific-brainstorming/SKILL.md +0 -191
- package/bin/skills/scientific-brainstorming/references/brainstorming_methods.md +0 -326
- package/bin/skills/scientific-critical-thinking/SKILL.md +0 -566
- package/bin/skills/scientific-critical-thinking/references/common_biases.md +0 -364
- package/bin/skills/scientific-critical-thinking/references/evidence_hierarchy.md +0 -484
- package/bin/skills/scientific-critical-thinking/references/experimental_design.md +0 -496
- package/bin/skills/scientific-critical-thinking/references/logical_fallacies.md +0 -478
- package/bin/skills/scientific-critical-thinking/references/scientific_method.md +0 -169
- package/bin/skills/scientific-critical-thinking/references/statistical_pitfalls.md +0 -506
- package/bin/skills/scientific-schematics/QUICK_REFERENCE.md +0 -207
- package/bin/skills/scientific-schematics/README.md +0 -327
- package/bin/skills/scientific-schematics/SKILL.md +0 -615
- package/bin/skills/scientific-schematics/example_usage.sh +0 -89
- package/bin/skills/scientific-schematics/references/best_practices.md +0 -559
- package/bin/skills/scientific-schematics/scripts/generate_schematic.py +0 -135
- package/bin/skills/scientific-schematics/scripts/generate_schematic_ai.py +0 -837
- package/bin/skills/scientific-schematics/test_ai_generation.py +0 -243
- package/bin/skills/scientific-slides/SKILL.md +0 -942
- package/bin/skills/scientific-slides/assets/timing_guidelines.md +0 -597
- package/bin/skills/scientific-slides/references/data_visualization_slides.md +0 -708
- package/bin/skills/scientific-slides/references/presentation_structure.md +0 -642
- package/bin/skills/scientific-slides/references/slide_design_principles.md +0 -849
- package/bin/skills/scientific-slides/references/talk_types_guide.md +0 -687
- package/bin/skills/scientific-slides/references/visual_review_workflow.md +0 -775
- package/bin/skills/scientific-slides/scripts/generate_slide_image.py +0 -143
- package/bin/skills/scientific-slides/scripts/generate_slide_image_ai.py +0 -748
- package/bin/skills/scientific-slides/scripts/pdf_to_images.py +0 -201
- package/bin/skills/scientific-slides/scripts/slides_to_pdf.py +0 -220
- package/bin/skills/scientific-slides/scripts/validate_presentation.py +0 -367
- package/bin/skills/scientific-visualization/SKILL.md +0 -779
- package/bin/skills/scientific-visualization/assets/color_palettes.py +0 -197
- package/bin/skills/scientific-visualization/assets/nature.mplstyle +0 -63
- package/bin/skills/scientific-visualization/assets/presentation.mplstyle +0 -61
- package/bin/skills/scientific-visualization/assets/publication.mplstyle +0 -68
- package/bin/skills/scientific-visualization/references/color_palettes.md +0 -348
- package/bin/skills/scientific-visualization/references/journal_requirements.md +0 -320
- package/bin/skills/scientific-visualization/references/matplotlib_examples.md +0 -620
- package/bin/skills/scientific-visualization/references/publication_guidelines.md +0 -205
- package/bin/skills/scientific-visualization/scripts/figure_export.py +0 -343
- package/bin/skills/scientific-visualization/scripts/style_presets.py +0 -416
- package/bin/skills/scientific-writing/SKILL.md +0 -714
- package/bin/skills/scientific-writing/assets/REPORT_FORMATTING_GUIDE.md +0 -574
- package/bin/skills/scientific-writing/assets/scientific_report.sty +0 -606
- package/bin/skills/scientific-writing/assets/scientific_report_template.tex +0 -449
- package/bin/skills/scientific-writing/references/citation_styles.md +0 -720
- package/bin/skills/scientific-writing/references/figures_tables.md +0 -806
- package/bin/skills/scientific-writing/references/imrad_structure.md +0 -686
- package/bin/skills/scientific-writing/references/professional_report_formatting.md +0 -664
- package/bin/skills/scientific-writing/references/reporting_guidelines.md +0 -748
- package/bin/skills/scientific-writing/references/writing_principles.md +0 -824
- package/bin/skills/scikit-bio/SKILL.md +0 -437
- package/bin/skills/scikit-bio/references/api_reference.md +0 -749
- package/bin/skills/scikit-learn/SKILL.md +0 -521
- package/bin/skills/scikit-learn/references/model_evaluation.md +0 -592
- package/bin/skills/scikit-learn/references/pipelines_and_composition.md +0 -612
- package/bin/skills/scikit-learn/references/preprocessing.md +0 -606
- package/bin/skills/scikit-learn/references/quick_reference.md +0 -433
- package/bin/skills/scikit-learn/references/supervised_learning.md +0 -378
- package/bin/skills/scikit-learn/references/unsupervised_learning.md +0 -505
- package/bin/skills/scikit-learn/scripts/classification_pipeline.py +0 -257
- package/bin/skills/scikit-learn/scripts/clustering_analysis.py +0 -386
- package/bin/skills/scikit-survival/SKILL.md +0 -399
- package/bin/skills/scikit-survival/references/competing-risks.md +0 -397
- package/bin/skills/scikit-survival/references/cox-models.md +0 -182
- package/bin/skills/scikit-survival/references/data-handling.md +0 -494
- package/bin/skills/scikit-survival/references/ensemble-models.md +0 -327
- package/bin/skills/scikit-survival/references/evaluation-metrics.md +0 -378
- package/bin/skills/scikit-survival/references/svm-models.md +0 -411
- package/bin/skills/scvi-tools/SKILL.md +0 -190
- package/bin/skills/scvi-tools/references/differential-expression.md +0 -581
- package/bin/skills/scvi-tools/references/models-atac-seq.md +0 -321
- package/bin/skills/scvi-tools/references/models-multimodal.md +0 -367
- package/bin/skills/scvi-tools/references/models-scrna-seq.md +0 -330
- package/bin/skills/scvi-tools/references/models-spatial.md +0 -438
- package/bin/skills/scvi-tools/references/models-specialized.md +0 -408
- package/bin/skills/scvi-tools/references/theoretical-foundations.md +0 -438
- package/bin/skills/scvi-tools/references/workflows.md +0 -546
- package/bin/skills/seaborn/SKILL.md +0 -673
- package/bin/skills/seaborn/references/examples.md +0 -822
- package/bin/skills/seaborn/references/function_reference.md +0 -770
- package/bin/skills/seaborn/references/objects_interface.md +0 -964
- package/bin/skills/segment-anything/SKILL.md +0 -500
- package/bin/skills/segment-anything/references/advanced-usage.md +0 -589
- package/bin/skills/segment-anything/references/troubleshooting.md +0 -484
- package/bin/skills/sentence-transformers/SKILL.md +0 -255
- package/bin/skills/sentence-transformers/references/models.md +0 -123
- package/bin/skills/sentencepiece/SKILL.md +0 -235
- package/bin/skills/sentencepiece/references/algorithms.md +0 -200
- package/bin/skills/sentencepiece/references/training.md +0 -304
- package/bin/skills/sglang/SKILL.md +0 -442
- package/bin/skills/sglang/references/deployment.md +0 -490
- package/bin/skills/sglang/references/radix-attention.md +0 -413
- package/bin/skills/sglang/references/structured-generation.md +0 -541
- package/bin/skills/shap/SKILL.md +0 -566
- package/bin/skills/shap/references/explainers.md +0 -339
- package/bin/skills/shap/references/plots.md +0 -507
- package/bin/skills/shap/references/theory.md +0 -449
- package/bin/skills/shap/references/workflows.md +0 -605
- package/bin/skills/simpo/SKILL.md +0 -219
- package/bin/skills/simpo/references/datasets.md +0 -478
- package/bin/skills/simpo/references/hyperparameters.md +0 -452
- package/bin/skills/simpo/references/loss-functions.md +0 -350
- package/bin/skills/simpy/SKILL.md +0 -429
- package/bin/skills/simpy/references/events.md +0 -374
- package/bin/skills/simpy/references/monitoring.md +0 -475
- package/bin/skills/simpy/references/process-interaction.md +0 -424
- package/bin/skills/simpy/references/real-time.md +0 -395
- package/bin/skills/simpy/references/resources.md +0 -275
- package/bin/skills/simpy/scripts/basic_simulation_template.py +0 -193
- package/bin/skills/simpy/scripts/resource_monitor.py +0 -345
- package/bin/skills/skypilot/SKILL.md +0 -509
- package/bin/skills/skypilot/references/advanced-usage.md +0 -491
- package/bin/skills/skypilot/references/troubleshooting.md +0 -570
- package/bin/skills/slime/SKILL.md +0 -464
- package/bin/skills/slime/references/api-reference.md +0 -392
- package/bin/skills/slime/references/troubleshooting.md +0 -386
- package/bin/skills/speculative-decoding/SKILL.md +0 -467
- package/bin/skills/speculative-decoding/references/lookahead.md +0 -309
- package/bin/skills/speculative-decoding/references/medusa.md +0 -350
- package/bin/skills/stable-baselines3/SKILL.md +0 -299
- package/bin/skills/stable-baselines3/references/algorithms.md +0 -333
- package/bin/skills/stable-baselines3/references/callbacks.md +0 -556
- package/bin/skills/stable-baselines3/references/custom_environments.md +0 -526
- package/bin/skills/stable-baselines3/references/vectorized_envs.md +0 -568
- package/bin/skills/stable-baselines3/scripts/custom_env_template.py +0 -314
- package/bin/skills/stable-baselines3/scripts/evaluate_agent.py +0 -245
- package/bin/skills/stable-baselines3/scripts/train_rl_agent.py +0 -165
- package/bin/skills/stable-diffusion/SKILL.md +0 -519
- package/bin/skills/stable-diffusion/references/advanced-usage.md +0 -716
- package/bin/skills/stable-diffusion/references/troubleshooting.md +0 -555
- package/bin/skills/statistical-analysis/SKILL.md +0 -632
- package/bin/skills/statistical-analysis/references/assumptions_and_diagnostics.md +0 -369
- package/bin/skills/statistical-analysis/references/bayesian_statistics.md +0 -661
- package/bin/skills/statistical-analysis/references/effect_sizes_and_power.md +0 -581
- package/bin/skills/statistical-analysis/references/reporting_standards.md +0 -469
- package/bin/skills/statistical-analysis/references/test_selection_guide.md +0 -129
- package/bin/skills/statistical-analysis/scripts/assumption_checks.py +0 -539
- package/bin/skills/statsmodels/SKILL.md +0 -614
- package/bin/skills/statsmodels/references/discrete_choice.md +0 -669
- package/bin/skills/statsmodels/references/glm.md +0 -619
- package/bin/skills/statsmodels/references/linear_models.md +0 -447
- package/bin/skills/statsmodels/references/stats_diagnostics.md +0 -859
- package/bin/skills/statsmodels/references/time_series.md +0 -716
- package/bin/skills/string-database/SKILL.md +0 -534
- package/bin/skills/string-database/references/string_reference.md +0 -455
- package/bin/skills/string-database/scripts/string_api.py +0 -369
- package/bin/skills/sympy/SKILL.md +0 -500
- package/bin/skills/sympy/references/advanced-topics.md +0 -635
- package/bin/skills/sympy/references/code-generation-printing.md +0 -599
- package/bin/skills/sympy/references/core-capabilities.md +0 -348
- package/bin/skills/sympy/references/matrices-linear-algebra.md +0 -526
- package/bin/skills/sympy/references/physics-mechanics.md +0 -592
- package/bin/skills/tensorboard/SKILL.md +0 -629
- package/bin/skills/tensorboard/references/integrations.md +0 -638
- package/bin/skills/tensorboard/references/profiling.md +0 -545
- package/bin/skills/tensorboard/references/visualization.md +0 -620
- package/bin/skills/tensorpool/SKILL.md +0 -519
- package/bin/skills/tensorrt-llm/SKILL.md +0 -187
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +0 -298
- package/bin/skills/tensorrt-llm/references/optimization.md +0 -242
- package/bin/skills/tensorrt-llm/references/serving.md +0 -470
- package/bin/skills/tinker/SKILL.md +0 -466
- package/bin/skills/tinker/references/api-reference.md +0 -168
- package/bin/skills/tinker/references/dpo-and-preference.md +0 -174
- package/bin/skills/tinker/references/evaluations.md +0 -183
- package/bin/skills/tinker/references/getting-started.md +0 -157
- package/bin/skills/tinker/references/loss-functions.md +0 -163
- package/bin/skills/tinker/references/models-and-lora.md +0 -148
- package/bin/skills/tinker/references/recipes.md +0 -326
- package/bin/skills/tinker/references/reinforcement-learning.md +0 -357
- package/bin/skills/tinker/references/rendering.md +0 -255
- package/bin/skills/tinker/references/supervised-learning.md +0 -256
- package/bin/skills/tinker-training-cost/SKILL.md +0 -187
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +0 -123
- package/bin/skills/together-ai/SKILL.md +0 -722
- package/bin/skills/torch_geometric/SKILL.md +0 -676
- package/bin/skills/torch_geometric/references/datasets_reference.md +0 -574
- package/bin/skills/torch_geometric/references/layers_reference.md +0 -485
- package/bin/skills/torch_geometric/references/transforms_reference.md +0 -679
- package/bin/skills/torch_geometric/scripts/benchmark_model.py +0 -309
- package/bin/skills/torch_geometric/scripts/create_gnn_template.py +0 -529
- package/bin/skills/torch_geometric/scripts/visualize_graph.py +0 -313
- package/bin/skills/torchdrug/SKILL.md +0 -450
- package/bin/skills/torchdrug/references/core_concepts.md +0 -565
- package/bin/skills/torchdrug/references/datasets.md +0 -380
- package/bin/skills/torchdrug/references/knowledge_graphs.md +0 -320
- package/bin/skills/torchdrug/references/models_architectures.md +0 -541
- package/bin/skills/torchdrug/references/molecular_generation.md +0 -352
- package/bin/skills/torchdrug/references/molecular_property_prediction.md +0 -169
- package/bin/skills/torchdrug/references/protein_modeling.md +0 -272
- package/bin/skills/torchdrug/references/retrosynthesis.md +0 -436
- package/bin/skills/torchforge/SKILL.md +0 -433
- package/bin/skills/torchforge/references/api-reference.md +0 -327
- package/bin/skills/torchforge/references/troubleshooting.md +0 -409
- package/bin/skills/torchtitan/SKILL.md +0 -358
- package/bin/skills/torchtitan/references/checkpoint.md +0 -181
- package/bin/skills/torchtitan/references/custom-models.md +0 -258
- package/bin/skills/torchtitan/references/float8.md +0 -133
- package/bin/skills/torchtitan/references/fsdp.md +0 -126
- package/bin/skills/training-data-pipeline/SKILL.md +0 -427
- package/bin/skills/training-data-pipeline/references/data-quality.md +0 -136
- package/bin/skills/training-data-pipeline/references/frontier-distillation.md +0 -129
- package/bin/skills/training-data-pipeline/references/production-data-formatting.md +0 -126
- package/bin/skills/transformer-lens/SKILL.md +0 -346
- package/bin/skills/transformer-lens/references/README.md +0 -54
- package/bin/skills/transformer-lens/references/api.md +0 -362
- package/bin/skills/transformer-lens/references/tutorials.md +0 -339
- package/bin/skills/transformers/SKILL.md +0 -164
- package/bin/skills/transformers/references/generation.md +0 -467
- package/bin/skills/transformers/references/models.md +0 -361
- package/bin/skills/transformers/references/pipelines.md +0 -335
- package/bin/skills/transformers/references/tokenizers.md +0 -447
- package/bin/skills/transformers/references/training.md +0 -500
- package/bin/skills/treatment-plans/README.md +0 -488
- package/bin/skills/treatment-plans/SKILL.md +0 -1579
- package/bin/skills/treatment-plans/assets/STYLING_QUICK_REFERENCE.md +0 -185
- package/bin/skills/treatment-plans/assets/chronic_disease_management_plan.tex +0 -665
- package/bin/skills/treatment-plans/assets/general_medical_treatment_plan.tex +0 -547
- package/bin/skills/treatment-plans/assets/medical_treatment_plan.sty +0 -222
- package/bin/skills/treatment-plans/assets/mental_health_treatment_plan.tex +0 -774
- package/bin/skills/treatment-plans/assets/one_page_treatment_plan.tex +0 -193
- package/bin/skills/treatment-plans/assets/pain_management_plan.tex +0 -799
- package/bin/skills/treatment-plans/assets/perioperative_care_plan.tex +0 -753
- package/bin/skills/treatment-plans/assets/quality_checklist.md +0 -471
- package/bin/skills/treatment-plans/assets/rehabilitation_treatment_plan.tex +0 -756
- package/bin/skills/treatment-plans/references/goal_setting_frameworks.md +0 -411
- package/bin/skills/treatment-plans/references/intervention_guidelines.md +0 -507
- package/bin/skills/treatment-plans/references/regulatory_compliance.md +0 -476
- package/bin/skills/treatment-plans/references/specialty_specific_guidelines.md +0 -655
- package/bin/skills/treatment-plans/references/treatment_plan_standards.md +0 -485
- package/bin/skills/treatment-plans/scripts/check_completeness.py +0 -322
- package/bin/skills/treatment-plans/scripts/generate_template.py +0 -233
- package/bin/skills/treatment-plans/scripts/timeline_generator.py +0 -385
- package/bin/skills/treatment-plans/scripts/validate_treatment_plan.py +0 -369
- package/bin/skills/trl-fine-tuning/SKILL.md +0 -455
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +0 -227
- package/bin/skills/trl-fine-tuning/references/online-rl.md +0 -82
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +0 -122
- package/bin/skills/trl-fine-tuning/references/sft-training.md +0 -168
- package/bin/skills/umap-learn/SKILL.md +0 -479
- package/bin/skills/umap-learn/references/api_reference.md +0 -532
- package/bin/skills/uniprot-database/SKILL.md +0 -195
- package/bin/skills/uniprot-database/references/api_examples.md +0 -413
- package/bin/skills/uniprot-database/references/api_fields.md +0 -275
- package/bin/skills/uniprot-database/references/id_mapping_databases.md +0 -285
- package/bin/skills/uniprot-database/references/query_syntax.md +0 -256
- package/bin/skills/uniprot-database/scripts/uniprot_client.py +0 -341
- package/bin/skills/unsloth/SKILL.md +0 -635
- package/bin/skills/unsloth/docs/advanced-rl.md +0 -222
- package/bin/skills/unsloth/docs/chat-templates.md +0 -141
- package/bin/skills/unsloth/docs/datasets.md +0 -489
- package/bin/skills/unsloth/docs/docker-extended.md +0 -99
- package/bin/skills/unsloth/docs/dynamic-ggufs-2.0.md +0 -116
- package/bin/skills/unsloth/docs/dynamic-ggufs-aider.md +0 -118
- package/bin/skills/unsloth/docs/faq.md +0 -91
- package/bin/skills/unsloth/docs/fp16-vs-bf16.md +0 -61
- package/bin/skills/unsloth/docs/fp8-rl.md +0 -224
- package/bin/skills/unsloth/docs/glm-4.7-flash.md +0 -997
- package/bin/skills/unsloth/docs/inference-deployment-overview.md +0 -17
- package/bin/skills/unsloth/docs/inference.md +0 -27
- package/bin/skills/unsloth/docs/installation-docker.md +0 -155
- package/bin/skills/unsloth/docs/installation-pip.md +0 -148
- package/bin/skills/unsloth/docs/kernels-packing.md +0 -190
- package/bin/skills/unsloth/docs/kimi-k2.5.md +0 -634
- package/bin/skills/unsloth/docs/lm-studio.md +0 -235
- package/bin/skills/unsloth/docs/lora-hot-swapping.md +0 -75
- package/bin/skills/unsloth/docs/lora-hyperparameters.md +0 -363
- package/bin/skills/unsloth/docs/memory-efficient-rl.md +0 -267
- package/bin/skills/unsloth/docs/model-selection.md +0 -70
- package/bin/skills/unsloth/docs/models.md +0 -532
- package/bin/skills/unsloth/docs/multi-gpu-ddp.md +0 -90
- package/bin/skills/unsloth/docs/notebooks.md +0 -223
- package/bin/skills/unsloth/docs/overview.md +0 -110
- package/bin/skills/unsloth/docs/qwen3-coder-next-extended.md +0 -900
- package/bin/skills/unsloth/docs/qwen3-coder-next.md +0 -900
- package/bin/skills/unsloth/docs/requirements.md +0 -45
- package/bin/skills/unsloth/docs/reward-hacking.md +0 -25
- package/bin/skills/unsloth/docs/saving-to-gguf.md +0 -138
- package/bin/skills/unsloth/docs/saving-to-ollama.md +0 -46
- package/bin/skills/unsloth/docs/sglang-guide.md +0 -278
- package/bin/skills/unsloth/docs/speculative-decoding.md +0 -70
- package/bin/skills/unsloth/docs/tool-calling.md +0 -334
- package/bin/skills/unsloth/docs/troubleshooting-faq.md +0 -204
- package/bin/skills/unsloth/docs/troubleshooting-inference.md +0 -26
- package/bin/skills/unsloth/docs/tts-fine-tuning.md +0 -149
- package/bin/skills/unsloth/docs/tutorial-grpo.md +0 -273
- package/bin/skills/unsloth/docs/tutorial-llama3-ollama.md +0 -356
- package/bin/skills/unsloth/docs/vision-fine-tuning.md +0 -135
- package/bin/skills/unsloth/docs/vision-rl.md +0 -170
- package/bin/skills/unsloth/docs/vllm-engine-arguments.md +0 -43
- package/bin/skills/unsloth/docs/vllm-guide.md +0 -98
- package/bin/skills/uspto-database/SKILL.md +0 -607
- package/bin/skills/uspto-database/references/additional_apis.md +0 -394
- package/bin/skills/uspto-database/references/patentsearch_api.md +0 -266
- package/bin/skills/uspto-database/references/peds_api.md +0 -212
- package/bin/skills/uspto-database/references/trademark_api.md +0 -358
- package/bin/skills/uspto-database/scripts/patent_search.py +0 -290
- package/bin/skills/uspto-database/scripts/peds_client.py +0 -285
- package/bin/skills/uspto-database/scripts/trademark_client.py +0 -311
- package/bin/skills/vaex/SKILL.md +0 -182
- package/bin/skills/vaex/references/core_dataframes.md +0 -367
- package/bin/skills/vaex/references/data_processing.md +0 -555
- package/bin/skills/vaex/references/io_operations.md +0 -703
- package/bin/skills/vaex/references/machine_learning.md +0 -728
- package/bin/skills/vaex/references/performance.md +0 -571
- package/bin/skills/vaex/references/visualization.md +0 -613
- package/bin/skills/venue-templates/SKILL.md +0 -686
- package/bin/skills/venue-templates/assets/examples/cell_summary_example.md +0 -247
- package/bin/skills/venue-templates/assets/examples/medical_structured_abstract.md +0 -313
- package/bin/skills/venue-templates/assets/examples/nature_abstract_examples.md +0 -213
- package/bin/skills/venue-templates/assets/examples/neurips_introduction_example.md +0 -245
- package/bin/skills/venue-templates/assets/grants/nih_specific_aims.tex +0 -235
- package/bin/skills/venue-templates/assets/grants/nsf_proposal_template.tex +0 -375
- package/bin/skills/venue-templates/assets/journals/nature_article.tex +0 -171
- package/bin/skills/venue-templates/assets/journals/neurips_article.tex +0 -283
- package/bin/skills/venue-templates/assets/journals/plos_one.tex +0 -317
- package/bin/skills/venue-templates/assets/posters/beamerposter_academic.tex +0 -311
- package/bin/skills/venue-templates/references/cell_press_style.md +0 -483
- package/bin/skills/venue-templates/references/conferences_formatting.md +0 -564
- package/bin/skills/venue-templates/references/cs_conference_style.md +0 -463
- package/bin/skills/venue-templates/references/grants_requirements.md +0 -787
- package/bin/skills/venue-templates/references/journals_formatting.md +0 -486
- package/bin/skills/venue-templates/references/medical_journal_styles.md +0 -535
- package/bin/skills/venue-templates/references/ml_conference_style.md +0 -556
- package/bin/skills/venue-templates/references/nature_science_style.md +0 -405
- package/bin/skills/venue-templates/references/posters_guidelines.md +0 -628
- package/bin/skills/venue-templates/references/reviewer_expectations.md +0 -417
- package/bin/skills/venue-templates/references/venue_writing_styles.md +0 -321
- package/bin/skills/venue-templates/scripts/customize_template.py +0 -195
- package/bin/skills/venue-templates/scripts/query_template.py +0 -266
- package/bin/skills/venue-templates/scripts/validate_format.py +0 -250
- package/bin/skills/verl/SKILL.md +0 -391
- package/bin/skills/verl/references/api-reference.md +0 -301
- package/bin/skills/verl/references/troubleshooting.md +0 -391
- package/bin/skills/vllm/SKILL.md +0 -364
- package/bin/skills/vllm/references/optimization.md +0 -226
- package/bin/skills/vllm/references/quantization.md +0 -284
- package/bin/skills/vllm/references/server-deployment.md +0 -255
- package/bin/skills/vllm/references/troubleshooting.md +0 -447
- package/bin/skills/weights-and-biases/SKILL.md +0 -590
- package/bin/skills/weights-and-biases/references/artifacts.md +0 -584
- package/bin/skills/weights-and-biases/references/integrations.md +0 -700
- package/bin/skills/weights-and-biases/references/sweeps.md +0 -847
- package/bin/skills/whisper/SKILL.md +0 -317
- package/bin/skills/whisper/references/languages.md +0 -189
- package/bin/skills/zarr-python/SKILL.md +0 -779
- package/bin/skills/zarr-python/references/api_reference.md +0 -515
- package/bin/skills/zinc-database/SKILL.md +0 -404
- package/bin/skills/zinc-database/references/api_reference.md +0 -692
|
@@ -1,653 +0,0 @@
|
|
|
1
|
-
# Tokenization Algorithms Deep Dive
|
|
2
|
-
|
|
3
|
-
Comprehensive explanation of BPE, WordPiece, and Unigram algorithms.
|
|
4
|
-
|
|
5
|
-
## Byte-Pair Encoding (BPE)
|
|
6
|
-
|
|
7
|
-
### Algorithm overview
|
|
8
|
-
|
|
9
|
-
BPE iteratively merges the most frequent pair of tokens in a corpus.
|
|
10
|
-
|
|
11
|
-
**Training process**:
|
|
12
|
-
1. Initialize vocabulary with all characters
|
|
13
|
-
2. Count frequency of all adjacent token pairs
|
|
14
|
-
3. Merge most frequent pair into new token
|
|
15
|
-
4. Add new token to vocabulary
|
|
16
|
-
5. Update corpus with new token
|
|
17
|
-
6. Repeat until vocabulary size reached
|
|
18
|
-
|
|
19
|
-
### Step-by-step example
|
|
20
|
-
|
|
21
|
-
**Corpus**:
|
|
22
|
-
```
|
|
23
|
-
low: 5
|
|
24
|
-
lower: 2
|
|
25
|
-
newest: 6
|
|
26
|
-
widest: 3
|
|
27
|
-
```
|
|
28
|
-
|
|
29
|
-
**Iteration 1**:
|
|
30
|
-
```
|
|
31
|
-
Count pairs:
|
|
32
|
-
'e' + 's': 9 (newest: 6, widest: 3) ← most frequent
|
|
33
|
-
'l' + 'o': 7
|
|
34
|
-
'o' + 'w': 7
|
|
35
|
-
...
|
|
36
|
-
|
|
37
|
-
Merge: 'e' + 's' → 'es'
|
|
38
|
-
|
|
39
|
-
Updated corpus:
|
|
40
|
-
low: 5
|
|
41
|
-
lower: 2
|
|
42
|
-
newest: 6 → newes|t: 6
|
|
43
|
-
widest: 3 → wides|t: 3
|
|
44
|
-
|
|
45
|
-
Vocabulary: [a-z] + ['es']
|
|
46
|
-
```
|
|
47
|
-
|
|
48
|
-
**Iteration 2**:
|
|
49
|
-
```
|
|
50
|
-
Count pairs:
|
|
51
|
-
'es' + 't': 9 ← most frequent
|
|
52
|
-
'l' + 'o': 7
|
|
53
|
-
...
|
|
54
|
-
|
|
55
|
-
Merge: 'es' + 't' → 'est'
|
|
56
|
-
|
|
57
|
-
Updated corpus:
|
|
58
|
-
low: 5
|
|
59
|
-
lower: 2
|
|
60
|
-
newest: 6 → new|est: 6
|
|
61
|
-
widest: 3 → wid|est: 3
|
|
62
|
-
|
|
63
|
-
Vocabulary: [a-z] + ['es', 'est']
|
|
64
|
-
```
|
|
65
|
-
|
|
66
|
-
**Continue until desired vocabulary size...**
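
The training loop walked through above can be sketched in plain Python. This is a minimal illustration, not the `tokenizers` implementation: the helper names (`count_pairs`, `merge_pair`, `train_bpe`) are hypothetical, words are pre-split into characters, and ties are broken by taking the first pair seen, which real implementations may handle differently.

```python
from collections import Counter

def count_pairs(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs

def merge_pair(symbols, pair):
    """Replace every occurrence of `pair` in `symbols` with the merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def train_bpe(word_freqs, num_merges):
    """Return the learned merges and the re-segmented corpus."""
    words = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties: the first pair seen wins
        merges.append(best)
        words = {merge_pair(symbols, best): freq for symbols, freq in words.items()}
    return merges, words

corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, segmented = train_bpe(corpus, num_merges=3)
print(merges)  # [('e', 's'), ('es', 't'), ('l', 'o')]
```

The first two merges match iterations 1 and 2 above; the third depends on the tie-breaking rule, since `'l' + 'o'` and `'o' + 'w'` both occur 7 times.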
|
|
67
|
-
|
|
68
|
-
### Tokenization with trained BPE
|
|
69
|
-
|
|
70
|
-
Given vocabulary: `['l', 'o', 'w', 'e', 'r', 'n', 's', 't', 'i', 'd', 'es', 'est', 'lo', 'low', 'ne', 'new', 'newest', 'wi', 'wid', 'widest']`
|
|
71
|
-
|
|
72
|
-
Tokenize "lowest":
|
|
73
|
-
```
|
|
74
|
-
Step 1: Split into characters
|
|
75
|
-
['l', 'o', 'w', 'e', 's', 't']
|
|
76
|
-
|
|
77
|
-
Step 2: Apply merges in order learned during training
|
|
78
|
-
- Merge 'l' + 'o' → 'lo' (if this merge was learned)
|
|
79
|
-
- Merge 'lo' + 'w' → 'low' (if learned)
|
|
80
|
-
- Merge 'e' + 's' → 'es' (learned)
|
|
81
|
-
- Merge 'es' + 't' → 'est' (learned)
|
|
82
|
-
|
|
83
|
-
Final: ['low', 'est']
|
|
84
|
-
```
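
Encoding replays the learned merges in training order. A minimal sketch follows, assuming a hypothetical merge list that contains the four merges named in the steps above; `apply_merge` and `bpe_encode` are illustrative names, not library functions.

```python
def apply_merge(symbols, pair):
    """Merge every adjacent occurrence of `pair` in the symbol list."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged

def bpe_encode(word, merges):
    """Split into characters, then replay merges in the order they were learned."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols

# Assumed merge order for illustration only
merges = [("e", "s"), ("es", "t"), ("l", "o"), ("lo", "w")]
print(bpe_encode("lowest", merges))  # ['low', 'est']
```

A word that triggers none of the merges, such as "newer" with this list, simply stays split into characters.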
|
|
85
|
-
|
|
86
|
-
### Implementation
|
|
87
|
-
|
|
88
|
-
```python
|
|
89
|
-
from tokenizers import Tokenizer
|
|
90
|
-
from tokenizers.models import BPE
|
|
91
|
-
from tokenizers.trainers import BpeTrainer
|
|
92
|
-
from tokenizers.pre_tokenizers import Whitespace
|
|
93
|
-
|
|
94
|
-
# Initialize
|
|
95
|
-
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
|
|
96
|
-
tokenizer.pre_tokenizer = Whitespace()
|
|
97
|
-
|
|
98
|
-
# Configure trainer
|
|
99
|
-
trainer = BpeTrainer(
|
|
100
|
-
vocab_size=1000,
|
|
101
|
-
min_frequency=2,
|
|
102
|
-
special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"]
|
|
103
|
-
)
|
|
104
|
-
|
|
105
|
-
# Train
|
|
106
|
-
corpus = [
|
|
107
|
-
"This is a sample corpus for BPE training.",
|
|
108
|
-
"BPE learns subword units from the training data.",
|
|
109
|
-
# ... more sentences
|
|
110
|
-
]
|
|
111
|
-
|
|
112
|
-
tokenizer.train_from_iterator(corpus, trainer=trainer)
|
|
113
|
-
|
|
114
|
-
# Use
|
|
115
|
-
output = tokenizer.encode("This is tokenization")
|
|
116
|
-
print(output.tokens)  # e.g. ['This', 'is', 'token', 'ization']; the exact split depends on the training corpus
|
|
117
|
-
```
|
|
118
|
-
|
|
119
|
-
### Byte-level BPE (GPT-2 variant)
|
|
120
|
-
|
|
121
|
-
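**Problem**: Character-level BPE needs a base vocabulary that covers every Unicode character it may encounter (well over 100,000 exist); any character missing from it becomes an unknown token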
|
|
122
|
-
|
|
123
|
-
**Solution**: Operate on byte level (256 bytes)
|
|
124
|
-
|
|
125
|
-
```python
|
|
126
|
-
from tokenizers.pre_tokenizers import ByteLevel
|
|
127
|
-
from tokenizers.decoders import ByteLevel as ByteLevelDecoder
|
|
128
|
-
|
|
129
|
-
tokenizer = Tokenizer(BPE())
|
|
130
|
-
|
|
131
|
-
# Byte-level pre-tokenization
|
|
132
|
-
tokenizer.pre_tokenizer = ByteLevel()
|
|
133
|
-
tokenizer.decoder = ByteLevelDecoder()
|
|
134
|
-
|
|
135
|
-
# Once trained (e.g. BpeTrainer(initial_alphabet=ByteLevel.alphabet())), this handles any input, including emojis
|
|
136
|
-
text = "Hello 🌍 世界"
|
|
137
|
-
tokens = tokenizer.encode(text).tokens
|
|
138
|
-
```
|
|
139
|
-
|
|
140
|
-
**Advantages**:
|
|
141
|
-
- Handles any Unicode character (256 byte coverage)
|
|
142
|
-
- No unknown tokens (worst case: bytes)
|
|
143
|
-
- Used by GPT-2, GPT-3, BART
|
|
144
|
-
|
|
145
|
-
**Trade-offs**:
|
|
146
|
-
- Slightly worse compression (bytes vs characters)
|
|
147
|
-
- More tokens for non-ASCII text
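
The trade-off can be seen without any tokenizer library by comparing character and byte counts. The snippet below is a small illustration using only the standard library; it shows the raw UTF-8 bytes a byte-level model starts from, not the printable byte-to-character mapping GPT-2 applies on top.

```python
text = "Hello 🌍 世界"
data = text.encode("utf-8")

# Every byte fits the fixed 256-symbol base alphabet
assert all(0 <= b < 256 for b in data)

print(len(text), len(data))  # 10 17

# Non-ASCII characters expand to several base symbols each
bytes_per_char = {ch: len(ch.encode("utf-8")) for ch in text}
print(bytes_per_char["H"], bytes_per_char["世"], bytes_per_char["🌍"])  # 1 3 4

# Decoding is lossless, so no input ever maps to an unknown token
assert data.decode("utf-8") == text
```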
|
|
148
|
-
|
|
149
|
-
### BPE variants
|
|
150
|
-
|
|
151
|
-
**SentencePiece BPE**:
|
|
152
|
-
- Language-independent (no pre-tokenization)
|
|
153
|
-
- Treats input as a raw Unicode character stream (whitespace is encoded as `▁`)
|
|
154
|
-
- Used by LLaMA and Llama 2 (T5, ALBERT, and XLNet use SentencePiece's Unigram model instead)
|
|
155
|
-
|
|
156
|
-
**BPE-dropout**:
|
|
157
|
-
- Dropout during training (randomly skip merges)
|
|
158
|
-
- More robust tokenization at inference
|
|
159
|
-
- Reduces overfitting to training data
|
|
160
|
-
|
|
161
|
-
## WordPiece
|
|
162
|
-
|
|
163
|
-
### Algorithm overview
|
|
164
|
-
|
|
165
|
-
WordPiece is similar to BPE but uses a different merge selection criterion.
|
|
166
|
-
|
|
167
|
-
**Training process**:
|
|
168
|
-
1. Initialize vocabulary with all characters
|
|
169
|
-
2. Count frequency of all token pairs
|
|
170
|
-
3. Score each pair: `score = freq(pair) / (freq(first) × freq(second))`
|
|
171
|
-
4. Merge pair with highest score
|
|
172
|
-
5. Repeat until vocabulary size reached
|
|
173
|
-
|
|
174
|
-
### Why different scoring?
|
|
175
|
-
|
|
176
|
-
**BPE**: Merges most frequent pairs
|
|
177
|
-
- "aa" appears 100 times → high priority
|
|
178
|
-
- Even if 'a' appears 1000 times alone
|
|
179
|
-
|
|
180
|
-
**WordPiece**: Merges pairs whose parts co-occur more often than their individual frequencies would predict
|
|
181
|
-
- "aa" appears 100 times, 'a' appears 1000 times → low score (100 / (1000 × 1000))
|
|
182
|
-
- "th" appears 50 times, 't' appears 60 times, 'h' appears 55 times → high score (50 / (60 × 55))
|
|
183
|
-
- Prioritizes pairs that appear together more than expected
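
The two selection rules can be compared directly on the numbers above. This is a minimal sketch; `wordpiece_score` is a hypothetical helper that just evaluates the scoring formula from the training process.

```python
def wordpiece_score(pair_freq, first_freq, second_freq):
    """score = freq(pair) / (freq(first) * freq(second))"""
    return pair_freq / (first_freq * second_freq)

# (pair frequency, frequency of first part, frequency of second part)
candidates = {
    "aa": (100, 1000, 1000),
    "th": (50, 60, 55),
}

bpe_choice = max(candidates, key=lambda p: candidates[p][0])
wordpiece_choice = max(candidates, key=lambda p: wordpiece_score(*candidates[p]))

print(bpe_choice)        # aa (highest raw pair count)
print(wordpiece_choice)  # th (highest count relative to its parts)
```

Here the score for "aa" is 0.0001 and the score for "th" is about 0.015, so the two algorithms would pick different merges from the same counts.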
|
|
184
|
-
|
|
185
|
-
### Step-by-step example
|
|
186
|
-
|
|
187
|
-
**Corpus**:
|
|
188
|
-
```
|
|
189
|
-
low: 5
|
|
190
|
-
lower: 2
|
|
191
|
-
newest: 6
|
|
192
|
-
widest: 3
|
|
193
|
-
```
|
|
194
|
-
|
|
195
|
-
**Iteration 1**:
|
|
196
|
-
```
|
|
197
|
-
Count frequencies:
|
|
198
|
-
'e': 17 (lower: 2, newest: 12, widest: 3)
's': 9
't': 9
'i': 3
'd': 3
...

Count pairs:
'e' + 's': 9 (newest: 6, widest: 3)
's' + 't': 9 (newest: 6, widest: 3)
'i' + 'd': 3 (widest: 3)
...

Compute scores:
score('e' + 's') = 9 / (17 × 9) = 0.059
score('s' + 't') = 9 / (9 × 9) = 0.111
score('l' + 'o') = 7 / (7 × 7) = 0.143
score('i' + 'd') = 3 / (3 × 3) = 0.333 ← highest score

Choose: 'i' + 'd' → 'id'
|
|
214
|
-
```
|
|
215
|
-
|
|
216
|
-
**Key difference**: WordPiece favors pairs whose parts rarely occur apart, even when the pair itself is less frequent than BPE's top choice.
|
|
217
|
-
|
|
218
|
-
### Tokenization with WordPiece
|
|
219
|
-
|
|
220
|
-
Given vocabulary: `['##e', '##s', '##t', '##est', 'l', 'o', 'w', 'new', 'low']`

Tokenize "lowest":
```
Step 1: Find longest matching prefix
'lowest' → 'low' (matches)

Step 2: Find longest match for remainder (continuation pieces carry the ## prefix)
'##est' → '##est' (matches)

Final: ['low', '##est']
|
|
231
|
-
```
|
|
232
|
-
|
|
233
|
-
**If no match**:
|
|
234
|
-
```
|
|
235
|
-
Tokenize "unknownword":
|
|
236
|
-
'unknownword' → no match
|
|
237
|
-
'unknown' → no match
|
|
238
|
-
'unkn' → no match
|
|
239
|
-
'un' → no match
|
|
240
|
-
'u' → no match
|
|
241
|
-
→ [UNK]
|
|
242
|
-
```
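
The greedy longest-match-first procedure can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: the function name and the small vocabulary are hypothetical, continuation pieces are looked up with the `##` prefix, and a word with any unmatched remainder is mapped to a single `[UNK]`, as BERT-style tokenizers do.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first segmentation of a single word."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = prefix + piece  # non-initial pieces are continuations
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk_token]  # the whole word becomes unknown
        tokens.append(match)
        start = end
    return tokens

# Assumed toy vocabulary for illustration
vocab = {"low", "new", "l", "o", "w", "##e", "##s", "##t", "##est"}
print(wordpiece_tokenize("lowest", vocab))       # ['low', '##est']
print(wordpiece_tokenize("unknownword", vocab))  # ['[UNK]']
```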
|
|
243
|
-
|
|
244
|
-
### Implementation
|
|
245
|
-
|
|
246
|
-
```python
|
|
247
|
-
from tokenizers import Tokenizer
|
|
248
|
-
from tokenizers.models import WordPiece
|
|
249
|
-
from tokenizers.trainers import WordPieceTrainer
|
|
250
|
-
from tokenizers.normalizers import BertNormalizer
|
|
251
|
-
from tokenizers.pre_tokenizers import BertPreTokenizer
|
|
252
|
-
|
|
253
|
-
# Initialize BERT-style tokenizer
|
|
254
|
-
tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
|
|
255
|
-
|
|
256
|
-
# Normalization (lowercase, accent stripping)
|
|
257
|
-
tokenizer.normalizer = BertNormalizer(lowercase=True)
|
|
258
|
-
|
|
259
|
-
# Pre-tokenization (whitespace + punctuation)
|
|
260
|
-
tokenizer.pre_tokenizer = BertPreTokenizer()
|
|
261
|
-
|
|
262
|
-
# Configure trainer
|
|
263
|
-
trainer = WordPieceTrainer(
|
|
264
|
-
vocab_size=30522, # BERT vocab size
|
|
265
|
-
min_frequency=2,
|
|
266
|
-
special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"],
|
|
267
|
-
continuing_subword_prefix="##" # BERT uses ##
|
|
268
|
-
)
|
|
269
|
-
|
|
270
|
-
# Train
|
|
271
|
-
tokenizer.train_from_iterator(corpus, trainer=trainer)
|
|
272
|
-
|
|
273
|
-
# Use
|
|
274
|
-
output = tokenizer.encode("Tokenization works great!")
|
|
275
|
-
print(output.tokens)  # e.g. ['token', '##ization', 'works', 'great', '!']; the exact split depends on the training corpus
|
|
276
|
-
```
|
|
277
|
-
|
|
278
|
-
### Subword prefix
|
|
279
|
-
|
|
280
|
-
**BERT uses `##` prefix**:
|
|
281
|
-
```
|
|
282
|
-
"unbelievable" → ['un', '##believ', '##able']
|
|
283
|
-
```
|
|
284
|
-
|
|
285
|
-
**Why?**
|
|
286
|
-
- Indicates token is a continuation
|
|
287
|
-
- Allows reconstruction: remove ##, concatenate
|
|
288
|
-
- Helps model distinguish word boundaries
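
Reconstruction from prefixed pieces takes only a few lines. The helper below is a hypothetical sketch of the "remove ##, concatenate" rule; it ignores punctuation spacing and other cleanup a full decoder would perform.

```python
def join_wordpieces(tokens, prefix="##"):
    """Glue continuation pieces onto the preceding piece; separate words with spaces."""
    words = []
    for token in tokens:
        if token.startswith(prefix) and words:
            words[-1] += token[len(prefix):]
        else:
            words.append(token)
    return " ".join(words)

print(join_wordpieces(["un", "##believ", "##able", "story"]))  # unbelievable story
```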
|
|
289
|
-
|
|
290
|
-
### WordPiece advantages
|
|
291
|
-
|
|
292
|
-
**Semantic merges**:
|
|
293
|
-
- Prioritizes meaningful combinations
|
|
294
|
-
- "qu" has high score (always together)
|
|
295
|
-
- "qx" has low score (rare combination)
|
|
296
|
-
|
|
297
|
-
**Better for morphology**:
|
|
298
|
-
- Captures affixes: un-, -ing, -ed
|
|
299
|
-
- Preserves word stems
|
|
300
|
-
|
|
301
|
-
**Trade-offs**:
|
|
302
|
-
- Slower training than BPE
|
|
303
|
-
- More memory (stores vocabulary, not merges)
|
|
304
|
-
- Original implementation not open-source (HF reimplementation)
|
|
305
|
-
|
|
306
|
-
## Unigram
|
|
307
|
-
|
|
308
|
-
### Algorithm overview
|
|
309
|
-
|
|
310
|
-
Unigram works backward: start with large vocabulary, remove tokens.
|
|
311
|
-
|
|
312
|
-
**Training process**:
|
|
313
|
-
1. Initialize with large vocabulary (all substrings)
|
|
314
|
-
2. Estimate probability of each token (frequency-based)
|
|
315
|
-
3. For each token, compute loss increase if removed
|
|
316
|
-
4. Remove the 10-25% of tokens with the lowest loss impact (single characters are always kept so every word stays tokenizable)
|
|
317
|
-
5. Re-estimate probabilities
|
|
318
|
-
6. Repeat until desired vocabulary size
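
Step 3 can be illustrated with a deliberately simplified sketch. Assumptions, all hypothetical: the corpus is a single word, the probabilities are fixed toy values that are not renormalized after a removal, and the loss uses only the single best segmentation rather than the sum over all segmentations that EM-based training works with.

```python
from math import inf, log

def best_logprob(word, probs):
    """Log-probability of the most likely segmentation, or -inf if none exists."""
    best = [0.0] + [-inf] * len(word)
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in probs and best[start] > -inf:
                best[end] = max(best[end], best[start] + log(probs[piece]))
    return best[len(word)]

def corpus_loss(word_freqs, probs):
    """Negative log-likelihood of the corpus under the best segmentations."""
    return -sum(freq * best_logprob(word, probs) for word, freq in word_freqs.items())

def removal_cost(token, word_freqs, probs):
    """How much the loss grows if `token` is dropped from the vocabulary."""
    reduced = {t: p for t, p in probs.items() if t != token}
    return corpus_loss(word_freqs, reduced) - corpus_loss(word_freqs, probs)

probs = {"low": 0.02, "est": 0.03, "l": 0.01, "o": 0.015,
         "w": 0.01, "e": 0.02, "s": 0.015, "t": 0.015}
corpus = {"lowest": 1}

costs = {t: removal_cost(t, corpus, probs) for t in ("low", "est")}
print(sorted(costs, key=costs.get))  # ['est', 'low']
```

Tokens at the front of the sorted list hurt the likelihood least when removed, so they would be pruned first.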
|
|
319
|
-
|
|
320
|
-
### Probabilistic tokenization
|
|
321
|
-
|
|
322
|
-
**Unigram assumption**: Each token is independent.
|
|
323
|
-
|
|
324
|
-
Given vocabulary with probabilities:
|
|
325
|
-
```
|
|
326
|
-
P('low') = 0.02
|
|
327
|
-
P('l') = 0.01
|
|
328
|
-
P('o') = 0.015
|
|
329
|
-
P('w') = 0.01
|
|
330
|
-
P('est') = 0.03
|
|
331
|
-
P('e') = 0.02
|
|
332
|
-
P('s') = 0.015
|
|
333
|
-
P('t') = 0.015
|
|
334
|
-
```
|
|
335
|
-
|
|
336
|
-
Tokenize "lowest":
|
|
337
|
-
```
|
|
338
|
-
Option 1: ['low', 'est']
|
|
339
|
-
P = P('low') × P('est') = 0.02 × 0.03 = 0.0006
|
|
340
|
-
|
|
341
|
-
Option 2: ['l', 'o', 'w', 'est']
|
|
342
|
-
P = 0.01 × 0.015 × 0.01 × 0.03 = 0.000000045
|
|
343
|
-
|
|
344
|
-
Option 3: ['low', 'e', 's', 't']
|
|
345
|
-
P = 0.02 × 0.02 × 0.015 × 0.015 = 0.00000009
|
|
346
|
-
|
|
347
|
-
Choose option 1 (highest probability)
|
|
348
|
-
```
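
For a word this short, every segmentation can be enumerated and scored by brute force. The sketch below uses the probabilities listed above; `segmentations` is a hypothetical helper, and the approach is only practical for tiny inputs.

```python
from math import isclose, prod

def segmentations(word, vocab):
    """Yield every way to split `word` into pieces that are all in `vocab`."""
    if not word:
        yield []
        return
    for end in range(1, len(word) + 1):
        piece = word[:end]
        if piece in vocab:
            for rest in segmentations(word[end:], vocab):
                yield [piece] + rest

probs = {"low": 0.02, "l": 0.01, "o": 0.015, "w": 0.01,
         "est": 0.03, "e": 0.02, "s": 0.015, "t": 0.015}

# Score each segmentation as the product of its token probabilities
scored = sorted(
    ((prod(probs[t] for t in seg), seg) for seg in segmentations("lowest", probs)),
    reverse=True,
)
best_prob, best_seg = scored[0]
print(best_seg)  # ['low', 'est']
assert isclose(best_prob, 0.0006)
```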
|
|
349
|
-
|
|
350
|
-
### Viterbi algorithm
|
|
351
|
-
|
|
352
|
-
Finding best tokenization is expensive (exponential possibilities).
|
|
353
|
-
|
|
354
|
-
**Viterbi algorithm** (dynamic programming):
|
|
355
|
-
```python
|
|
356
|
-
from math import log

def tokenize_viterbi(word, vocab, probs):
    """Return the most probable segmentation of `word`, or [] if none exists."""
    n = len(word)
    # dp[i] = (best_log_prob, best_tokens) for word[:i]
    dp = [(float('-inf'), []) for _ in range(n + 1)]
    dp[0] = (0.0, [])  # log(1) = 0 for the empty prefix

    for i in range(1, n + 1):
        best_prob = float('-inf')
        best_tokens = []

        # Try every split point j; word[j:i] is the candidate last token
        for j in range(i):
            token = word[j:i]
            if token in vocab:
                prob = dp[j][0] + log(probs[token])
                if prob > best_prob:
                    best_prob = prob
                    best_tokens = dp[j][1] + [token]

        dp[i] = (best_prob, best_tokens)

    return dp[n][1]
|
|
378
|
-
```
|
|
379
|
-
|
|
380
|
-
**Time complexity**: O(n²) candidate substrings per word (O(n × max_piece_length) when piece length is capped) vs O(2^n) brute force
|
|
381
|
-
|
|
382
|
-
### Implementation
|
|
383
|
-
|
|
384
|
-
```python
|
|
385
|
-
from tokenizers import Tokenizer
|
|
386
|
-
from tokenizers.models import Unigram
|
|
387
|
-
from tokenizers.trainers import UnigramTrainer
|
|
388
|
-
|
|
389
|
-
# Initialize
|
|
390
|
-
tokenizer = Tokenizer(Unigram())
|
|
391
|
-
|
|
392
|
-
# Configure trainer
|
|
393
|
-
trainer = UnigramTrainer(
|
|
394
|
-
vocab_size=8000,
|
|
395
|
-
special_tokens=["<unk>", "<s>", "</s>"],
|
|
396
|
-
unk_token="<unk>",
|
|
397
|
-
max_piece_length=16, # Max token length
|
|
398
|
-
n_sub_iterations=2, # EM iterations
|
|
399
|
-
shrinking_factor=0.75 # Remove 25% each iteration
|
|
400
|
-
)
|
|
401
|
-
|
|
402
|
-
# Train
|
|
403
|
-
tokenizer.train_from_iterator(corpus, trainer=trainer)
|
|
404
|
-
|
|
405
|
-
# Use
|
|
406
|
-
output = tokenizer.encode("Tokenization with Unigram")
|
|
407
|
-
print(output.tokens)  # e.g. ['▁Token', 'ization', '▁with', '▁Un', 'igram'] with a Metaspace pre-tokenizer; actual pieces depend on the corpus
|
|
408
|
-
```
|
|
409
|
-
|
|
410
|
-
### Unigram advantages
|
|
411
|
-
|
|
412
|
-
**Probabilistic**:
|
|
413
|
-
- Multiple valid tokenizations
|
|
414
|
-
- Can sample different tokenizations (data augmentation)
|
|
415
|
-
|
|
416
|
-
**Subword regularization**:
|
|
417
|
-
```python
|
|
418
|
-
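# tokenizer.encode() is deterministic (it returns the single best split),
# so sampling is shown here with the SentencePiece library instead.
# Assumption: "unigram.model" is a placeholder path to a trained Unigram model.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="unigram.model")
for _ in range(3):
    tokens = sp.encode("tokenization", out_type=str,
                       enable_sampling=True, alpha=0.1, nbest_size=-1)
    print(tokens)

# Possible output (varies between calls):
# ['▁token', 'ization']
# ['▁tok', 'en', 'ization']
# ['▁token', 'iz', 'ation']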
|
|
427
|
-
```
|
|
428
|
-
|
|
429
|
-
**Language-independent**:
|
|
430
|
-
- No word boundaries needed
|
|
431
|
-
- Works for CJK languages (Chinese, Japanese, Korean)
|
|
432
|
-
- Treats input as character stream
|
|
433
|
-
|
|
434
|
-
**Trade-offs**:
|
|
435
|
-
- Slower training (EM algorithm)
|
|
436
|
-
- More hyperparameters
|
|
437
|
-
- Larger model (stores probabilities)
|
|
438
|
-
|
|
439
|
-
## Algorithm comparison
|
|
440
|
-
|
|
441
|
-
### Training speed
|
|
442
|
-
|
|
443
|
-
| Algorithm | Small (10MB) | Medium (100MB) | Large (1GB) |
|
|
444
|
-
|------------|--------------|----------------|-------------|
|
|
445
|
-
| BPE | 10-15 sec | 1-2 min | 10-20 min |
|
|
446
|
-
| WordPiece | 15-20 sec | 2-3 min | 15-30 min |
|
|
447
|
-
| Unigram | 20-30 sec | 3-5 min | 30-60 min |
|
|
448
|
-
|
|
449
|
-
**Tested on**: 16-core CPU, 30k vocab
|
|
450
|
-
|
|
451
|
-
### Tokenization quality
|
|
452
|
-
|
|
453
|
-
Measured on English Wikipedia (tokens per word and unknown-token rate):
|
|
454
|
-
|
|
455
|
-
| Algorithm | Vocab Size | Tokens/Word | Unknown Rate |
|
|
456
|
-
|------------|------------|-------------|--------------|
|
|
457
|
-
| BPE | 30k | 1.3 | 0.5% |
|
|
458
|
-
| WordPiece | 30k | 1.2 | 1.2% |
|
|
459
|
-
| Unigram | 8k | 1.5 | 0.3% |
|
|
460
|
-
|
|
461
|
-
**Key observations**:
|
|
462
|
-
- WordPiece: Slightly better compression
|
|
463
|
-
- BPE: Lower unknown rate than WordPiece
|
|
464
|
-
- Unigram: Smallest vocab, good coverage
|
|
465
|
-
|
|
466
|
-
### Compression ratio
|
|
467
|
-
|
|
468
|
-
Characters per token (higher = better compression):
|
|
469
|
-
|
|
470
|
-
| Language | BPE (30k) | WordPiece (30k) | Unigram (8k) |
|
|
471
|
-
|----------|-----------|-----------------|--------------|
|
|
472
|
-
| English | 4.2 | 4.5 | 3.8 |
|
|
473
|
-
| Chinese | 2.1 | 2.3 | 2.5 |
|
|
474
|
-
| Arabic | 3.5 | 3.8 | 3.2 |
|
|
475
|
-
|
|
476
|
-
**Best for each**:
|
|
477
|
-
- English: WordPiece
|
|
478
|
-
- Chinese: Unigram (language-independent)
|
|
479
|
-
- Arabic: WordPiece
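
The compression metric itself is simple to compute for any tokenizer output. The sketch below assumes a hypothetical input format (one list of plain token strings per word, with markers such as `##` or `▁` already stripped) and made-up sample data; it does not reproduce the figures in the table.

```python
def chars_per_token(tokenized_words):
    """Average number of characters covered by one token (higher = better compression)."""
    total_chars = sum(len(tok) for toks in tokenized_words for tok in toks)
    total_tokens = sum(len(toks) for toks in tokenized_words)
    return total_chars / total_tokens

def tokens_per_word(tokenized_words):
    """Average number of tokens needed per word (lower = better compression)."""
    return sum(len(toks) for toks in tokenized_words) / len(tokenized_words)

sample = [["low", "est"], ["new", "est"], ["w", "i", "d", "est"]]
print(chars_per_token(sample))  # 2.25
```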
|
|
480
|
-
|
|
481
|
-
### Use case recommendations
|
|
482
|
-
|
|
483
|
-
**BPE** - Best for:
|
|
484
|
-
- English language models
|
|
485
|
-
- Code (handles symbols well)
|
|
486
|
-
- Fast training needed
|
|
487
|
-
- **Models**: GPT-2, GPT-3, RoBERTa, BART
|
|
488
|
-
|
|
489
|
-
**WordPiece** - Best for:
|
|
490
|
-
- Masked language modeling (BERT-style)
|
|
491
|
-
- Morphologically rich languages
|
|
492
|
-
- Semantic understanding tasks
|
|
493
|
-
- **Models**: BERT, DistilBERT, ELECTRA
|
|
494
|
-
|
|
495
|
-
**Unigram** - Best for:
|
|
496
|
-
- Multilingual models
|
|
497
|
-
- Languages without word boundaries (CJK)
|
|
498
|
-
- Data augmentation via subword regularization
|
|
499
|
-
- **Models**: T5, ALBERT, XLNet (via SentencePiece)
|
|
500
|
-
|
|
501
|
-
## Advanced topics
|
|
502
|
-
|
|
503
|
-
### Handling rare words
|
|
504
|
-
|
|
505
|
-
**BPE approach**:
|
|
506
|
-
```
|
|
507
|
-
"antidisestablishmentarianism"
|
|
508
|
-
→ ['anti', 'dis', 'establish', 'ment', 'arian', 'ism']
|
|
509
|
-
```
|
|
510
|
-
|
|
511
|
-
**WordPiece approach**:
|
|
512
|
-
```
|
|
513
|
-
"antidisestablishmentarianism"
|
|
514
|
-
→ ['anti', '##dis', '##establish', '##ment', '##arian', '##ism']
|
|
515
|
-
```
|
|
516
|
-
|
|
517
|
-
**Unigram approach**:
|
|
518
|
-
```
|
|
519
|
-
"antidisestablishmentarianism"
|
|
520
|
-
→ ['▁anti', 'dis', 'establish', 'ment', 'arian', 'ism']
|
|
521
|
-
```
|
|
522
|
-
|
|
523
|
-
### Handling numbers
|
|
524
|
-
|
|
525
|
-
**Challenge**: Infinite number combinations
|
|
526
|
-
|
|
527
|
-
**BPE solution**: Byte-level (handles any digit sequence)
|
|
528
|
-
```python
|
|
529
|
-
tokenizer = Tokenizer(BPE())
|
|
530
|
-
tokenizer.pre_tokenizer = ByteLevel()
|
|
531
|
-
|
|
532
|
-
# Handles any number
|
|
533
|
-
# "123456789" → byte-level tokens
|
|
534
|
-
```
|
|
535
|
-
|
|
536
|
-
**WordPiece solution**: Digit pre-tokenization
|
|
537
|
-
```python
|
|
538
|
-
from tokenizers.pre_tokenizers import Digits
|
|
539
|
-
|
|
540
|
-
# Split digits individually or as groups
|
|
541
|
-
tokenizer.pre_tokenizer = Digits(individual_digits=True)
|
|
542
|
-
|
|
543
|
-
# "123" → ['1', '2', '3']
|
|
544
|
-
```
|
|
545
|
-
|
|
546
|
-
**Unigram solution**: Learns common number patterns
|
|
547
|
-
```python
|
|
548
|
-
# Learns patterns during training
|
|
549
|
-
# "2023" → ['202', '3'] or ['20', '23'], depending on the training data
|
|
550
|
-
```
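
The effect of digit pre-tokenization can be approximated with a regular expression. This is a standard-library sketch that mimics the splitting behavior described above; `split_digits` is a hypothetical helper, not the `Digits` pre-tokenizer itself.

```python
import re

def split_digits(text, individual_digits=True):
    """Separate digits from other text, either one digit at a time or as whole runs."""
    pattern = r"\d|\D+" if individual_digits else r"\d+|\D+"
    return re.findall(pattern, text)

print(split_digits("abc123"))                            # ['abc', '1', '2', '3']
print(split_digits("in 2023", individual_digits=False))  # ['in ', '2023']
```

Splitting every digit keeps the vocabulary small at the cost of longer sequences; grouping runs does the opposite.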
|
|
551
|
-
|
|
552
|
-
### Handling case sensitivity
|
|
553
|
-
|
|
554
|
-
**Lowercase (BERT)**:
|
|
555
|
-
```python
|
|
556
|
-
from tokenizers.normalizers import Lowercase
|
|
557
|
-
|
|
558
|
-
tokenizer.normalizer = Lowercase()
|
|
559
|
-
|
|
560
|
-
# "Hello WORLD" → "hello world" → ['hello', 'world']
|
|
561
|
-
```
|
|
562
|
-
|
|
563
|
-
**Preserve case (GPT-2)**:
|
|
564
|
-
```python
|
|
565
|
-
# No case normalization
|
|
566
|
-
tokenizer.normalizer = None
|
|
567
|
-
|
|
568
|
-
# "Hello WORLD" → ['Hello', 'WORLD']
|
|
569
|
-
```

**Cased tokens (RoBERTa)**:
```python
# Learns separate tokens for different cases
# Vocabulary: ['Hello', 'hello', 'HELLO', 'world', 'WORLD']
```

### Handling emojis and special characters

**Byte-level (GPT-2)**:
```python
from tokenizers.pre_tokenizers import ByteLevel

tokenizer.pre_tokenizer = ByteLevel()

# "Hello 🌍 👋" → byte-level representation (always works)
```
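
The reason byte-level handling "always works" is that any string encodes to UTF-8 bytes, and a byte has only 256 possible values, so a base vocabulary of 256 entries covers every input. A small standard-library sketch of that idea (it shows the raw bytes only, not the printable-character mapping that GPT-2 style tokenizers apply on top):

```python
text = "Hello \U0001F30D \U0001F44B"  # "Hello 🌍 👋"

# Every character, emoji included, becomes one or more byte values in 0..255.
byte_values = list(text.encode("utf-8"))

# The mapping is lossless: decoding the bytes restores the original string.
round_trip = bytes(byte_values).decode("utf-8")
```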

**Unicode normalization**:
```python
from tokenizers.normalizers import NFKC

tokenizer.normalizer = NFKC()

# "é" (composed) and "é" (decomposed) → both normalized to the composed form
```
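
The effect of NFKC can be checked with Python's standard `unicodedata` module, which implements the same Unicode normalization forms. A minimal sketch, independent of the `tokenizers` library:

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# The two strings render identically but differ before normalization.
differ_before = composed != decomposed

# NFKC maps both to the same composed form.
norm_composed = unicodedata.normalize("NFKC", composed)
norm_decomposed = unicodedata.normalize("NFKC", decomposed)

# NFKC also folds compatibility characters, e.g. the "ﬁ" ligature becomes "fi".
ligature = unicodedata.normalize("NFKC", "\ufb01")
```

Without this step, the two spellings of "é" would be tokenized as different inputs.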

## Troubleshooting

### Issue: Poor subword splitting

**Symptom**:
```
"running" → ['r', 'u', 'n', 'n', 'i', 'n', 'g'] (too granular)
```

**Solutions**:
1. Increase vocabulary size
2. Train longer (more merge iterations)
3. Lower `min_frequency` threshold

### Issue: Too many unknown tokens

**Symptom**:
```
5% of tokens are [UNK]
```

**Solutions**:
1. Increase vocabulary size
2. Use byte-level BPE (no UNK possible)
3. Verify training corpus is representative
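
A figure like the 5% above has to be measured on held-out text. The sketch below computes the unknown-token rate together with characters per token as a simple compression measure. `tokenizer_stats` and the toy whitespace tokenizer are hypothetical; with a real tokenizer, `tokenize` would wrap its encode call:

```python
def tokenizer_stats(texts, tokenize, unk_token="[UNK]"):
    """Return characters per token and unknown-token rate over a sample."""
    total_chars = total_tokens = unk_count = 0
    for text in texts:
        tokens = tokenize(text)
        total_chars += len(text)
        total_tokens += len(tokens)
        unk_count += sum(1 for token in tokens if token == unk_token)
    if total_tokens == 0:
        return {"chars_per_token": 0.0, "unk_rate": 0.0}
    return {
        "chars_per_token": total_chars / total_tokens,
        "unk_rate": unk_count / total_tokens,
    }


# Toy tokenizer (assumption): whitespace split with a two-word vocabulary.
toy_vocab = {"hello", "world"}


def toy_tokenize(text):
    return [word if word in toy_vocab else "[UNK]" for word in text.split()]


stats = tokenizer_stats(["hello world", "hello there"], toy_tokenize)
```

Here one of the four tokens is unknown, so `stats["unk_rate"]` is 0.25.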

### Issue: Inconsistent tokenization

**Symptom**:
```
"running" → ['run', 'ning']
"runner" → ['r', 'u', 'n', 'n', 'e', 'r']
```

**Solutions**:
1. Check that the same normalization is applied at training and inference time
2. Ensure pre-tokenization is deterministic
3. Consider Unigram, which picks the most probable segmentation of the whole word instead of applying merges greedily

## Best practices

1. **Match algorithm to model architecture**:
   - BERT-style → WordPiece
   - GPT-style → BPE
   - T5-style → Unigram

2. **Use byte-level for multilingual**:
   - Handles any Unicode
   - No unknown tokens

3. **Test on representative data**:
   - Measure compression ratio
   - Check unknown token rate
   - Inspect sample tokenizations

4. **Version control tokenizers**:
   - Save with model
   - Document special tokens
   - Track vocabulary changes
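
One way to track vocabulary changes, as suggested in the last item, is to store a content hash of the vocabulary and special tokens next to the model. A minimal standard-library sketch; `vocab_fingerprint` is a hypothetical helper and the dictionary layout is an assumption, not a format defined by any library:

```python
import hashlib
import json


def vocab_fingerprint(vocab, special_tokens):
    """Return a SHA-256 hex digest that is stable under reordering."""
    payload = json.dumps(
        {
            "vocab": sorted(vocab.items()),            # (token, id) pairs in fixed order
            "special_tokens": sorted(special_tokens),
        },
        ensure_ascii=False,
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


fp_a = vocab_fingerprint({"hello": 0, "world": 1}, ["[UNK]", "[PAD]"])
fp_b = vocab_fingerprint({"world": 1, "hello": 0}, ["[PAD]", "[UNK]"])
fp_c = vocab_fingerprint({"hello": 0, "world": 1, "new": 2}, ["[UNK]", "[PAD]"])
```

Comparing a stored fingerprint with a freshly computed one flags any vocabulary or special-token change before the tokenizer is paired with a model trained on a different version.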
|