@synsci/cli-darwin-x64-baseline 1.1.76 → 1.1.78
- package/bin/skills/adaptyv/SKILL.md +114 -0
- package/bin/skills/adaptyv/reference/api_reference.md +308 -0
- package/bin/skills/adaptyv/reference/examples.md +913 -0
- package/bin/skills/adaptyv/reference/experiments.md +360 -0
- package/bin/skills/adaptyv/reference/protein_optimization.md +637 -0
- package/bin/skills/aeon/SKILL.md +374 -0
- package/bin/skills/aeon/references/anomaly_detection.md +154 -0
- package/bin/skills/aeon/references/classification.md +144 -0
- package/bin/skills/aeon/references/clustering.md +123 -0
- package/bin/skills/aeon/references/datasets_benchmarking.md +387 -0
- package/bin/skills/aeon/references/distances.md +256 -0
- package/bin/skills/aeon/references/forecasting.md +140 -0
- package/bin/skills/aeon/references/networks.md +289 -0
- package/bin/skills/aeon/references/regression.md +118 -0
- package/bin/skills/aeon/references/segmentation.md +163 -0
- package/bin/skills/aeon/references/similarity_search.md +187 -0
- package/bin/skills/aeon/references/transformations.md +246 -0
- package/bin/skills/alphafold-database/SKILL.md +513 -0
- package/bin/skills/alphafold-database/references/api_reference.md +423 -0
- package/bin/skills/anndata/SKILL.md +400 -0
- package/bin/skills/anndata/references/best_practices.md +525 -0
- package/bin/skills/anndata/references/concatenation.md +396 -0
- package/bin/skills/anndata/references/data_structure.md +314 -0
- package/bin/skills/anndata/references/io_operations.md +404 -0
- package/bin/skills/anndata/references/manipulation.md +516 -0
- package/bin/skills/arboreto/SKILL.md +243 -0
- package/bin/skills/arboreto/references/algorithms.md +138 -0
- package/bin/skills/arboreto/references/basic_inference.md +151 -0
- package/bin/skills/arboreto/references/distributed_computing.md +242 -0
- package/bin/skills/arboreto/scripts/basic_grn_inference.py +97 -0
- package/bin/skills/astropy/SKILL.md +331 -0
- package/bin/skills/astropy/references/coordinates.md +273 -0
- package/bin/skills/astropy/references/cosmology.md +307 -0
- package/bin/skills/astropy/references/fits.md +396 -0
- package/bin/skills/astropy/references/tables.md +489 -0
- package/bin/skills/astropy/references/time.md +404 -0
- package/bin/skills/astropy/references/units.md +178 -0
- package/bin/skills/astropy/references/wcs_and_other_modules.md +373 -0
- package/bin/skills/benchling-integration/SKILL.md +480 -0
- package/bin/skills/benchling-integration/references/api_endpoints.md +883 -0
- package/bin/skills/benchling-integration/references/authentication.md +379 -0
- package/bin/skills/benchling-integration/references/sdk_reference.md +774 -0
- package/bin/skills/biopython/SKILL.md +443 -0
- package/bin/skills/biopython/references/advanced.md +577 -0
- package/bin/skills/biopython/references/alignment.md +362 -0
- package/bin/skills/biopython/references/blast.md +455 -0
- package/bin/skills/biopython/references/databases.md +484 -0
- package/bin/skills/biopython/references/phylogenetics.md +566 -0
- package/bin/skills/biopython/references/sequence_io.md +285 -0
- package/bin/skills/biopython/references/structure.md +564 -0
- package/bin/skills/biorxiv-database/SKILL.md +483 -0
- package/bin/skills/biorxiv-database/references/api_reference.md +280 -0
- package/bin/skills/biorxiv-database/scripts/biorxiv_search.py +445 -0
- package/bin/skills/bioservices/SKILL.md +361 -0
- package/bin/skills/bioservices/references/identifier_mapping.md +685 -0
- package/bin/skills/bioservices/references/services_reference.md +636 -0
- package/bin/skills/bioservices/references/workflow_patterns.md +811 -0
- package/bin/skills/bioservices/scripts/batch_id_converter.py +347 -0
- package/bin/skills/bioservices/scripts/compound_cross_reference.py +378 -0
- package/bin/skills/bioservices/scripts/pathway_analysis.py +309 -0
- package/bin/skills/bioservices/scripts/protein_analysis_workflow.py +408 -0
- package/bin/skills/brenda-database/SKILL.md +719 -0
- package/bin/skills/brenda-database/references/api_reference.md +537 -0
- package/bin/skills/brenda-database/scripts/brenda_queries.py +844 -0
- package/bin/skills/brenda-database/scripts/brenda_visualization.py +772 -0
- package/bin/skills/brenda-database/scripts/enzyme_pathway_builder.py +1053 -0
- package/bin/skills/cellxgene-census/SKILL.md +511 -0
- package/bin/skills/cellxgene-census/references/census_schema.md +182 -0
- package/bin/skills/cellxgene-census/references/common_patterns.md +351 -0
- package/bin/skills/chembl-database/SKILL.md +389 -0
- package/bin/skills/chembl-database/references/api_reference.md +272 -0
- package/bin/skills/chembl-database/scripts/example_queries.py +278 -0
- package/bin/skills/cirq/SKILL.md +346 -0
- package/bin/skills/cirq/references/building.md +307 -0
- package/bin/skills/cirq/references/experiments.md +572 -0
- package/bin/skills/cirq/references/hardware.md +515 -0
- package/bin/skills/cirq/references/noise.md +515 -0
- package/bin/skills/cirq/references/simulation.md +350 -0
- package/bin/skills/cirq/references/transformation.md +416 -0
- package/bin/skills/clinicaltrials-database/SKILL.md +507 -0
- package/bin/skills/clinicaltrials-database/references/api_reference.md +358 -0
- package/bin/skills/clinicaltrials-database/scripts/query_clinicaltrials.py +215 -0
- package/bin/skills/clinpgx-database/SKILL.md +638 -0
- package/bin/skills/clinpgx-database/references/api_reference.md +757 -0
- package/bin/skills/clinpgx-database/scripts/query_clinpgx.py +518 -0
- package/bin/skills/clinvar-database/SKILL.md +362 -0
- package/bin/skills/clinvar-database/references/api_reference.md +227 -0
- package/bin/skills/clinvar-database/references/clinical_significance.md +218 -0
- package/bin/skills/clinvar-database/references/data_formats.md +358 -0
- package/bin/skills/cobrapy/SKILL.md +463 -0
- package/bin/skills/cobrapy/references/api_quick_reference.md +655 -0
- package/bin/skills/cobrapy/references/workflows.md +593 -0
- package/bin/skills/cosmic-database/SKILL.md +336 -0
- package/bin/skills/cosmic-database/references/cosmic_data_reference.md +220 -0
- package/bin/skills/cosmic-database/scripts/download_cosmic.py +231 -0
- package/bin/skills/dask/SKILL.md +456 -0
- package/bin/skills/dask/references/arrays.md +497 -0
- package/bin/skills/dask/references/bags.md +468 -0
- package/bin/skills/dask/references/best-practices.md +277 -0
- package/bin/skills/dask/references/dataframes.md +368 -0
- package/bin/skills/dask/references/futures.md +541 -0
- package/bin/skills/dask/references/schedulers.md +504 -0
- package/bin/skills/datacommons-client/SKILL.md +255 -0
- package/bin/skills/datacommons-client/references/getting_started.md +417 -0
- package/bin/skills/datacommons-client/references/node.md +250 -0
- package/bin/skills/datacommons-client/references/observation.md +185 -0
- package/bin/skills/datacommons-client/references/resolve.md +246 -0
- package/bin/skills/datamol/SKILL.md +706 -0
- package/bin/skills/datamol/references/conformers_module.md +131 -0
- package/bin/skills/datamol/references/core_api.md +130 -0
- package/bin/skills/datamol/references/descriptors_viz.md +195 -0
- package/bin/skills/datamol/references/fragments_scaffolds.md +174 -0
- package/bin/skills/datamol/references/io_module.md +109 -0
- package/bin/skills/datamol/references/reactions_data.md +218 -0
- package/bin/skills/deepchem/SKILL.md +597 -0
- package/bin/skills/deepchem/references/api_reference.md +303 -0
- package/bin/skills/deepchem/references/workflows.md +491 -0
- package/bin/skills/deepchem/scripts/graph_neural_network.py +338 -0
- package/bin/skills/deepchem/scripts/predict_solubility.py +224 -0
- package/bin/skills/deepchem/scripts/transfer_learning.py +375 -0
- package/bin/skills/deeptools/SKILL.md +531 -0
- package/bin/skills/deeptools/assets/quick_reference.md +58 -0
- package/bin/skills/deeptools/references/effective_genome_sizes.md +116 -0
- package/bin/skills/deeptools/references/normalization_methods.md +410 -0
- package/bin/skills/deeptools/references/tools_reference.md +533 -0
- package/bin/skills/deeptools/references/workflows.md +474 -0
- package/bin/skills/deeptools/scripts/validate_files.py +195 -0
- package/bin/skills/deeptools/scripts/workflow_generator.py +454 -0
- package/bin/skills/denario/SKILL.md +215 -0
- package/bin/skills/denario/references/examples.md +494 -0
- package/bin/skills/denario/references/installation.md +213 -0
- package/bin/skills/denario/references/llm_configuration.md +265 -0
- package/bin/skills/denario/references/research_pipeline.md +471 -0
- package/bin/skills/diffdock/SKILL.md +483 -0
- package/bin/skills/diffdock/assets/batch_template.csv +4 -0
- package/bin/skills/diffdock/assets/custom_inference_config.yaml +90 -0
- package/bin/skills/diffdock/references/confidence_and_limitations.md +182 -0
- package/bin/skills/diffdock/references/parameters_reference.md +163 -0
- package/bin/skills/diffdock/references/workflows_examples.md +392 -0
- package/bin/skills/diffdock/scripts/analyze_results.py +334 -0
- package/bin/skills/diffdock/scripts/prepare_batch_csv.py +254 -0
- package/bin/skills/diffdock/scripts/setup_check.py +278 -0
- package/bin/skills/dnanexus-integration/SKILL.md +383 -0
- package/bin/skills/dnanexus-integration/references/app-development.md +247 -0
- package/bin/skills/dnanexus-integration/references/configuration.md +646 -0
- package/bin/skills/dnanexus-integration/references/data-operations.md +400 -0
- package/bin/skills/dnanexus-integration/references/job-execution.md +412 -0
- package/bin/skills/dnanexus-integration/references/python-sdk.md +523 -0
- package/bin/skills/document-skills/docx/LICENSE.txt +30 -0
- package/bin/skills/document-skills/docx/SKILL.md +233 -0
- package/bin/skills/document-skills/docx/docx-js.md +350 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +1499 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +146 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +1085 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +11 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +3081 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +23 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +185 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +287 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +1676 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +28 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +144 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +174 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +25 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +18 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +59 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +56 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +195 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +582 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +25 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +4439 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +570 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +509 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +12 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +108 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +96 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +3646 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +116 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +42 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +50 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +49 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +33 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/mce/mc.xsd +75 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2010.xsd +560 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2012.xsd +67 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2018.xsd +14 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cex-2018.xsd +20 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cid-2016.xsd +13 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +4 -0
- package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-symex-2015.xsd +8 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/pack.py +159 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/unpack.py +29 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validate.py +69 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/__init__.py +15 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/base.py +951 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/docx.py +274 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/pptx.py +315 -0
- package/bin/skills/document-skills/docx/ooxml/scripts/validation/redlining.py +279 -0
- package/bin/skills/document-skills/docx/ooxml.md +610 -0
- package/bin/skills/document-skills/docx/scripts/__init__.py +1 -0
- package/bin/skills/document-skills/docx/scripts/document.py +1276 -0
- package/bin/skills/document-skills/docx/scripts/templates/comments.xml +3 -0
- package/bin/skills/document-skills/docx/scripts/templates/commentsExtended.xml +3 -0
- package/bin/skills/document-skills/docx/scripts/templates/commentsExtensible.xml +3 -0
- package/bin/skills/document-skills/docx/scripts/templates/commentsIds.xml +3 -0
- package/bin/skills/document-skills/docx/scripts/templates/people.xml +3 -0
- package/bin/skills/document-skills/docx/scripts/utilities.py +374 -0
- package/bin/skills/document-skills/pdf/LICENSE.txt +30 -0
- package/bin/skills/document-skills/pdf/SKILL.md +330 -0
- package/bin/skills/document-skills/pdf/forms.md +205 -0
- package/bin/skills/document-skills/pdf/reference.md +612 -0
- package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes.py +70 -0
- package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes_test.py +226 -0
- package/bin/skills/document-skills/pdf/scripts/check_fillable_fields.py +12 -0
- package/bin/skills/document-skills/pdf/scripts/convert_pdf_to_images.py +35 -0
- package/bin/skills/document-skills/pdf/scripts/create_validation_image.py +41 -0
- package/bin/skills/document-skills/pdf/scripts/extract_form_field_info.py +152 -0
- package/bin/skills/document-skills/pdf/scripts/fill_fillable_fields.py +114 -0
- package/bin/skills/document-skills/pdf/scripts/fill_pdf_form_with_annotations.py +108 -0
- package/bin/skills/document-skills/pptx/LICENSE.txt +30 -0
- package/bin/skills/document-skills/pptx/SKILL.md +520 -0
- package/bin/skills/document-skills/pptx/html2pptx.md +625 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +1499 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +146 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +1085 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +11 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +3081 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +23 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +185 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +287 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +1676 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +28 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +144 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +174 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +25 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +18 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +59 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +56 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +195 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +582 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +25 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +4439 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +570 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +509 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +12 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +108 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +96 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +3646 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +116 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +42 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +50 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +49 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +33 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/mce/mc.xsd +75 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2010.xsd +560 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2012.xsd +67 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2018.xsd +14 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cex-2018.xsd +20 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cid-2016.xsd +13 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +4 -0
- package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-symex-2015.xsd +8 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/pack.py +159 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/unpack.py +29 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validate.py +69 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/__init__.py +15 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/base.py +951 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/docx.py +274 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/pptx.py +315 -0
- package/bin/skills/document-skills/pptx/ooxml/scripts/validation/redlining.py +279 -0
- package/bin/skills/document-skills/pptx/ooxml.md +427 -0
- package/bin/skills/document-skills/pptx/scripts/html2pptx.js +979 -0
- package/bin/skills/document-skills/pptx/scripts/inventory.py +1020 -0
- package/bin/skills/document-skills/pptx/scripts/rearrange.py +231 -0
- package/bin/skills/document-skills/pptx/scripts/replace.py +385 -0
- package/bin/skills/document-skills/pptx/scripts/thumbnail.py +450 -0
- package/bin/skills/document-skills/xlsx/LICENSE.txt +30 -0
- package/bin/skills/document-skills/xlsx/SKILL.md +325 -0
- package/bin/skills/document-skills/xlsx/recalc.py +178 -0
- package/bin/skills/drugbank-database/SKILL.md +190 -0
- package/bin/skills/drugbank-database/references/chemical-analysis.md +590 -0
- package/bin/skills/drugbank-database/references/data-access.md +242 -0
- package/bin/skills/drugbank-database/references/drug-queries.md +386 -0
- package/bin/skills/drugbank-database/references/interactions.md +425 -0
- package/bin/skills/drugbank-database/references/targets-pathways.md +518 -0
- package/bin/skills/drugbank-database/scripts/drugbank_helper.py +350 -0
- package/bin/skills/ena-database/SKILL.md +204 -0
- package/bin/skills/ena-database/references/api_reference.md +490 -0
- package/bin/skills/ensembl-database/SKILL.md +311 -0
- package/bin/skills/ensembl-database/references/api_endpoints.md +346 -0
- package/bin/skills/ensembl-database/scripts/ensembl_query.py +427 -0
- package/bin/skills/esm/SKILL.md +306 -0
- package/bin/skills/esm/references/esm-c-api.md +583 -0
- package/bin/skills/esm/references/esm3-api.md +452 -0
- package/bin/skills/esm/references/forge-api.md +657 -0
- package/bin/skills/esm/references/workflows.md +685 -0
- package/bin/skills/etetoolkit/SKILL.md +623 -0
- package/bin/skills/etetoolkit/references/api_reference.md +583 -0
- package/bin/skills/etetoolkit/references/visualization.md +783 -0
- package/bin/skills/etetoolkit/references/workflows.md +774 -0
- package/bin/skills/etetoolkit/scripts/quick_visualize.py +214 -0
- package/bin/skills/etetoolkit/scripts/tree_operations.py +229 -0
- package/bin/skills/exploratory-data-analysis/SKILL.md +446 -0
- package/bin/skills/exploratory-data-analysis/assets/report_template.md +196 -0
- package/bin/skills/exploratory-data-analysis/references/bioinformatics_genomics_formats.md +664 -0
- package/bin/skills/exploratory-data-analysis/references/chemistry_molecular_formats.md +664 -0
- package/bin/skills/exploratory-data-analysis/references/general_scientific_formats.md +518 -0
- package/bin/skills/exploratory-data-analysis/references/microscopy_imaging_formats.md +620 -0
- package/bin/skills/exploratory-data-analysis/references/proteomics_metabolomics_formats.md +517 -0
- package/bin/skills/exploratory-data-analysis/references/spectroscopy_analytical_formats.md +633 -0
- package/bin/skills/exploratory-data-analysis/scripts/eda_analyzer.py +547 -0
- package/bin/skills/fda-database/SKILL.md +518 -0
- package/bin/skills/fda-database/references/animal_veterinary.md +377 -0
- package/bin/skills/fda-database/references/api_basics.md +687 -0
- package/bin/skills/fda-database/references/devices.md +632 -0
- package/bin/skills/fda-database/references/drugs.md +468 -0
- package/bin/skills/fda-database/references/foods.md +374 -0
- package/bin/skills/fda-database/references/other.md +472 -0
- package/bin/skills/fda-database/scripts/fda_examples.py +335 -0
- package/bin/skills/fda-database/scripts/fda_query.py +440 -0
- package/bin/skills/flowio/SKILL.md +608 -0
- package/bin/skills/flowio/references/api_reference.md +372 -0
- package/bin/skills/fluidsim/SKILL.md +349 -0
- package/bin/skills/fluidsim/references/advanced_features.md +398 -0
- package/bin/skills/fluidsim/references/installation.md +68 -0
- package/bin/skills/fluidsim/references/output_analysis.md +283 -0
- package/bin/skills/fluidsim/references/parameters.md +198 -0
- package/bin/skills/fluidsim/references/simulation_workflow.md +172 -0
- package/bin/skills/fluidsim/references/solvers.md +94 -0
- package/bin/skills/fred-economic-data/SKILL.md +433 -0
- package/bin/skills/fred-economic-data/references/api_basics.md +212 -0
- package/bin/skills/fred-economic-data/references/categories.md +442 -0
- package/bin/skills/fred-economic-data/references/geofred.md +588 -0
- package/bin/skills/fred-economic-data/references/releases.md +642 -0
- package/bin/skills/fred-economic-data/references/series.md +584 -0
- package/bin/skills/fred-economic-data/references/sources.md +423 -0
- package/bin/skills/fred-economic-data/references/tags.md +485 -0
- package/bin/skills/fred-economic-data/scripts/fred_examples.py +354 -0
- package/bin/skills/fred-economic-data/scripts/fred_query.py +590 -0
- package/bin/skills/gene-database/SKILL.md +179 -0
- package/bin/skills/gene-database/references/api_reference.md +404 -0
- package/bin/skills/gene-database/references/common_workflows.md +428 -0
- package/bin/skills/gene-database/scripts/batch_gene_lookup.py +298 -0
- package/bin/skills/gene-database/scripts/fetch_gene_data.py +277 -0
- package/bin/skills/gene-database/scripts/query_gene.py +251 -0
- package/bin/skills/geniml/SKILL.md +318 -0
- package/bin/skills/geniml/references/bedspace.md +127 -0
- package/bin/skills/geniml/references/consensus_peaks.md +238 -0
- package/bin/skills/geniml/references/region2vec.md +90 -0
- package/bin/skills/geniml/references/scembed.md +197 -0
- package/bin/skills/geniml/references/utilities.md +385 -0
- package/bin/skills/geo-database/SKILL.md +815 -0
- package/bin/skills/geo-database/references/geo_reference.md +829 -0
- package/bin/skills/geopandas/SKILL.md +251 -0
- package/bin/skills/geopandas/references/crs-management.md +243 -0
- package/bin/skills/geopandas/references/data-io.md +165 -0
- package/bin/skills/geopandas/references/data-structures.md +70 -0
- package/bin/skills/geopandas/references/geometric-operations.md +221 -0
- package/bin/skills/geopandas/references/spatial-analysis.md +184 -0
- package/bin/skills/geopandas/references/visualization.md +243 -0
- package/bin/skills/get-available-resources/SKILL.md +277 -0
- package/bin/skills/get-available-resources/scripts/detect_resources.py +401 -0
- package/bin/skills/gget/SKILL.md +871 -0
- package/bin/skills/gget/references/database_info.md +300 -0
- package/bin/skills/gget/references/module_reference.md +467 -0
- package/bin/skills/gget/references/workflows.md +814 -0
- package/bin/skills/gget/scripts/batch_sequence_analysis.py +191 -0
- package/bin/skills/gget/scripts/enrichment_pipeline.py +235 -0
- package/bin/skills/gget/scripts/gene_analysis.py +161 -0
- package/bin/skills/gtars/SKILL.md +285 -0
- package/bin/skills/gtars/references/cli.md +222 -0
- package/bin/skills/gtars/references/coverage.md +172 -0
- package/bin/skills/gtars/references/overlap.md +156 -0
- package/bin/skills/gtars/references/python-api.md +211 -0
- package/bin/skills/gtars/references/refget.md +147 -0
- package/bin/skills/gtars/references/tokenizers.md +103 -0
- package/bin/skills/gwas-database/SKILL.md +608 -0
- package/bin/skills/gwas-database/references/api_reference.md +793 -0
- package/bin/skills/histolab/SKILL.md +678 -0
- package/bin/skills/histolab/references/filters_preprocessing.md +514 -0
- package/bin/skills/histolab/references/slide_management.md +172 -0
- package/bin/skills/histolab/references/tile_extraction.md +421 -0
- package/bin/skills/histolab/references/tissue_masks.md +251 -0
- package/bin/skills/histolab/references/visualization.md +547 -0
- package/bin/skills/hmdb-database/SKILL.md +196 -0
- package/bin/skills/hmdb-database/references/hmdb_data_fields.md +267 -0
- package/bin/skills/hypogenic/SKILL.md +655 -0
- package/bin/skills/hypogenic/references/config_template.yaml +150 -0
- package/bin/skills/imaging-data-commons/SKILL.md +1182 -0
- package/bin/skills/imaging-data-commons/references/bigquery_guide.md +556 -0
- package/bin/skills/imaging-data-commons/references/cli_guide.md +272 -0
- package/bin/skills/imaging-data-commons/references/cloud_storage_guide.md +333 -0
- package/bin/skills/imaging-data-commons/references/dicomweb_guide.md +399 -0
- package/bin/skills/infographics/SKILL.md +563 -0
- package/bin/skills/infographics/references/color_palettes.md +496 -0
- package/bin/skills/infographics/references/design_principles.md +636 -0
- package/bin/skills/infographics/references/infographic_types.md +907 -0
- package/bin/skills/infographics/scripts/generate_infographic.py +234 -0
- package/bin/skills/infographics/scripts/generate_infographic_ai.py +1290 -0
- package/bin/skills/iso-13485-certification/SKILL.md +680 -0
- package/bin/skills/iso-13485-certification/assets/templates/procedures/CAPA-procedure-template.md +453 -0
- package/bin/skills/iso-13485-certification/assets/templates/procedures/document-control-procedure-template.md +567 -0
- package/bin/skills/iso-13485-certification/assets/templates/quality-manual-template.md +521 -0
- package/bin/skills/iso-13485-certification/references/gap-analysis-checklist.md +568 -0
- package/bin/skills/iso-13485-certification/references/iso-13485-requirements.md +610 -0
- package/bin/skills/iso-13485-certification/references/mandatory-documents.md +606 -0
- package/bin/skills/iso-13485-certification/references/quality-manual-guide.md +688 -0
- package/bin/skills/iso-13485-certification/scripts/gap_analyzer.py +440 -0
- package/bin/skills/kegg-database/SKILL.md +377 -0
- package/bin/skills/kegg-database/references/kegg_reference.md +326 -0
- package/bin/skills/kegg-database/scripts/kegg_api.py +251 -0
- package/bin/skills/labarchive-integration/SKILL.md +268 -0
- package/bin/skills/labarchive-integration/references/api_reference.md +342 -0
- package/bin/skills/labarchive-integration/references/authentication_guide.md +357 -0
- package/bin/skills/labarchive-integration/references/integrations.md +425 -0
- package/bin/skills/labarchive-integration/scripts/entry_operations.py +334 -0
- package/bin/skills/labarchive-integration/scripts/notebook_operations.py +269 -0
- package/bin/skills/labarchive-integration/scripts/setup_config.py +205 -0
- package/bin/skills/lamindb/SKILL.md +390 -0
- package/bin/skills/lamindb/references/annotation-validation.md +513 -0
- package/bin/skills/lamindb/references/core-concepts.md +380 -0
- package/bin/skills/lamindb/references/data-management.md +433 -0
- package/bin/skills/lamindb/references/integrations.md +642 -0
- package/bin/skills/lamindb/references/ontologies.md +497 -0
- package/bin/skills/lamindb/references/setup-deployment.md +733 -0
- package/bin/skills/latchbio-integration/SKILL.md +353 -0
- package/bin/skills/latchbio-integration/references/data-management.md +427 -0
- package/bin/skills/latchbio-integration/references/resource-configuration.md +429 -0
- package/bin/skills/latchbio-integration/references/verified-workflows.md +487 -0
- package/bin/skills/latchbio-integration/references/workflow-creation.md +254 -0
- package/bin/skills/matchms/SKILL.md +203 -0
- package/bin/skills/matchms/references/filtering.md +288 -0
- package/bin/skills/matchms/references/importing_exporting.md +416 -0
- package/bin/skills/matchms/references/similarity.md +380 -0
- package/bin/skills/matchms/references/workflows.md +647 -0
- package/bin/skills/matlab/SKILL.md +376 -0
- package/bin/skills/matlab/references/data-import-export.md +479 -0
- package/bin/skills/matlab/references/executing-scripts.md +444 -0
- package/bin/skills/matlab/references/graphics-visualization.md +579 -0
- package/bin/skills/matlab/references/mathematics.md +553 -0
- package/bin/skills/matlab/references/matrices-arrays.md +349 -0
- package/bin/skills/matlab/references/octave-compatibility.md +544 -0
- package/bin/skills/matlab/references/programming.md +672 -0
- package/bin/skills/matlab/references/python-integration.md +433 -0
- package/bin/skills/matplotlib/SKILL.md +361 -0
- package/bin/skills/matplotlib/references/api_reference.md +412 -0
- package/bin/skills/matplotlib/references/common_issues.md +563 -0
- package/bin/skills/matplotlib/references/plot_types.md +476 -0
- package/bin/skills/matplotlib/references/styling_guide.md +589 -0
- package/bin/skills/matplotlib/scripts/plot_template.py +401 -0
- package/bin/skills/matplotlib/scripts/style_configurator.py +409 -0
- package/bin/skills/medchem/SKILL.md +406 -0
- package/bin/skills/medchem/references/api_guide.md +600 -0
- package/bin/skills/medchem/references/rules_catalog.md +604 -0
- package/bin/skills/medchem/scripts/filter_molecules.py +418 -0
- package/bin/skills/metabolomics-workbench-database/SKILL.md +259 -0
- package/bin/skills/metabolomics-workbench-database/references/api_reference.md +494 -0
- package/bin/skills/modal-research-gpu/SKILL.md +238 -0
- package/bin/skills/molfeat/SKILL.md +511 -0
- package/bin/skills/molfeat/references/api_reference.md +428 -0
- package/bin/skills/molfeat/references/available_featurizers.md +333 -0
- package/bin/skills/molfeat/references/examples.md +723 -0
- package/bin/skills/networkx/SKILL.md +437 -0
- package/bin/skills/networkx/references/algorithms.md +383 -0
- package/bin/skills/networkx/references/generators.md +378 -0
- package/bin/skills/networkx/references/graph-basics.md +283 -0
- package/bin/skills/networkx/references/io.md +441 -0
- package/bin/skills/networkx/references/visualization.md +529 -0
- package/bin/skills/neurokit2/SKILL.md +356 -0
- package/bin/skills/neurokit2/references/bio_module.md +417 -0
- package/bin/skills/neurokit2/references/complexity.md +715 -0
- package/bin/skills/neurokit2/references/ecg_cardiac.md +355 -0
- package/bin/skills/neurokit2/references/eda.md +497 -0
- package/bin/skills/neurokit2/references/eeg.md +506 -0
- package/bin/skills/neurokit2/references/emg.md +408 -0
- package/bin/skills/neurokit2/references/eog.md +407 -0
- package/bin/skills/neurokit2/references/epochs_events.md +471 -0
- package/bin/skills/neurokit2/references/hrv.md +480 -0
- package/bin/skills/neurokit2/references/ppg.md +413 -0
- package/bin/skills/neurokit2/references/rsp.md +510 -0
- package/bin/skills/neurokit2/references/signal_processing.md +648 -0
- package/bin/skills/neuropixels-analysis/SKILL.md +350 -0
- package/bin/skills/neuropixels-analysis/assets/analysis_template.py +271 -0
- package/bin/skills/neuropixels-analysis/references/AI_CURATION.md +345 -0
- package/bin/skills/neuropixels-analysis/references/ANALYSIS.md +392 -0
- package/bin/skills/neuropixels-analysis/references/AUTOMATED_CURATION.md +358 -0
- package/bin/skills/neuropixels-analysis/references/MOTION_CORRECTION.md +323 -0
- package/bin/skills/neuropixels-analysis/references/PREPROCESSING.md +273 -0
- package/bin/skills/neuropixels-analysis/references/QUALITY_METRICS.md +359 -0
- package/bin/skills/neuropixels-analysis/references/SPIKE_SORTING.md +339 -0
- package/bin/skills/neuropixels-analysis/references/api_reference.md +415 -0
- package/bin/skills/neuropixels-analysis/references/plotting_guide.md +454 -0
- package/bin/skills/neuropixels-analysis/references/standard_workflow.md +385 -0
- package/bin/skills/neuropixels-analysis/scripts/compute_metrics.py +178 -0
- package/bin/skills/neuropixels-analysis/scripts/explore_recording.py +168 -0
- package/bin/skills/neuropixels-analysis/scripts/export_to_phy.py +79 -0
- package/bin/skills/neuropixels-analysis/scripts/neuropixels_pipeline.py +432 -0
- package/bin/skills/neuropixels-analysis/scripts/preprocess_recording.py +122 -0
- package/bin/skills/neuropixels-analysis/scripts/run_sorting.py +98 -0
- package/bin/skills/offer-k-dense-web/SKILL.md +21 -0
- package/bin/skills/omero-integration/SKILL.md +251 -0
- package/bin/skills/omero-integration/references/advanced.md +631 -0
- package/bin/skills/omero-integration/references/connection.md +369 -0
- package/bin/skills/omero-integration/references/data_access.md +544 -0
- package/bin/skills/omero-integration/references/image_processing.md +665 -0
- package/bin/skills/omero-integration/references/metadata.md +688 -0
- package/bin/skills/omero-integration/references/rois.md +648 -0
- package/bin/skills/omero-integration/references/scripts.md +637 -0
- package/bin/skills/omero-integration/references/tables.md +532 -0
- package/bin/skills/openalex-database/SKILL.md +494 -0
- package/bin/skills/openalex-database/references/api_guide.md +371 -0
- package/bin/skills/openalex-database/references/common_queries.md +381 -0
- package/bin/skills/openalex-database/scripts/openalex_client.py +337 -0
- package/bin/skills/openalex-database/scripts/query_helpers.py +306 -0
- package/bin/skills/opentargets-database/SKILL.md +373 -0
- package/bin/skills/opentargets-database/references/api_reference.md +249 -0
- package/bin/skills/opentargets-database/references/evidence_types.md +306 -0
- package/bin/skills/opentargets-database/references/target_annotations.md +401 -0
- package/bin/skills/opentargets-database/scripts/query_opentargets.py +403 -0
- package/bin/skills/opentrons-integration/SKILL.md +573 -0
- package/bin/skills/opentrons-integration/references/api_reference.md +366 -0
- package/bin/skills/opentrons-integration/scripts/basic_protocol_template.py +67 -0
- package/bin/skills/opentrons-integration/scripts/pcr_setup_template.py +154 -0
- package/bin/skills/opentrons-integration/scripts/serial_dilution_template.py +96 -0
- package/bin/skills/pathml/SKILL.md +166 -0
- package/bin/skills/pathml/references/data_management.md +742 -0
- package/bin/skills/pathml/references/graphs.md +653 -0
- package/bin/skills/pathml/references/image_loading.md +448 -0
- package/bin/skills/pathml/references/machine_learning.md +725 -0
- package/bin/skills/pathml/references/multiparametric.md +686 -0
- package/bin/skills/pathml/references/preprocessing.md +722 -0
- package/bin/skills/pdb-database/SKILL.md +309 -0
- package/bin/skills/pdb-database/references/api_reference.md +617 -0
- package/bin/skills/pennylane/SKILL.md +226 -0
- package/bin/skills/pennylane/references/advanced_features.md +667 -0
- package/bin/skills/pennylane/references/devices_backends.md +596 -0
- package/bin/skills/pennylane/references/getting_started.md +227 -0
- package/bin/skills/pennylane/references/optimization.md +671 -0
- package/bin/skills/pennylane/references/quantum_chemistry.md +567 -0
- package/bin/skills/pennylane/references/quantum_circuits.md +437 -0
- package/bin/skills/pennylane/references/quantum_ml.md +571 -0
- package/bin/skills/perplexity-search/SKILL.md +448 -0
- package/bin/skills/perplexity-search/assets/.env.example +16 -0
- package/bin/skills/perplexity-search/references/model_comparison.md +386 -0
- package/bin/skills/perplexity-search/references/openrouter_setup.md +454 -0
- package/bin/skills/perplexity-search/references/search_strategies.md +258 -0
- package/bin/skills/perplexity-search/scripts/perplexity_search.py +277 -0
- package/bin/skills/perplexity-search/scripts/setup_env.py +171 -0
- package/bin/skills/plotly/SKILL.md +267 -0
- package/bin/skills/plotly/references/chart-types.md +488 -0
- package/bin/skills/plotly/references/export-interactivity.md +453 -0
- package/bin/skills/plotly/references/graph-objects.md +302 -0
- package/bin/skills/plotly/references/layouts-styling.md +457 -0
- package/bin/skills/plotly/references/plotly-express.md +213 -0
- package/bin/skills/polars/SKILL.md +387 -0
- package/bin/skills/polars/references/best_practices.md +649 -0
- package/bin/skills/polars/references/core_concepts.md +378 -0
- package/bin/skills/polars/references/io_guide.md +557 -0
- package/bin/skills/polars/references/operations.md +602 -0
- package/bin/skills/polars/references/pandas_migration.md +417 -0
- package/bin/skills/polars/references/transformations.md +549 -0
- package/bin/skills/protocolsio-integration/SKILL.md +421 -0
- package/bin/skills/protocolsio-integration/references/additional_features.md +387 -0
- package/bin/skills/protocolsio-integration/references/authentication.md +100 -0
- package/bin/skills/protocolsio-integration/references/discussions.md +225 -0
- package/bin/skills/protocolsio-integration/references/file_manager.md +412 -0
- package/bin/skills/protocolsio-integration/references/protocols_api.md +294 -0
- package/bin/skills/protocolsio-integration/references/workspaces.md +293 -0
- package/bin/skills/pubchem-database/SKILL.md +574 -0
- package/bin/skills/pubchem-database/references/api_reference.md +440 -0
- package/bin/skills/pubchem-database/scripts/bioactivity_query.py +367 -0
- package/bin/skills/pubchem-database/scripts/compound_search.py +297 -0
- package/bin/skills/pubmed-database/SKILL.md +460 -0
- package/bin/skills/pubmed-database/references/api_reference.md +298 -0
- package/bin/skills/pubmed-database/references/common_queries.md +453 -0
- package/bin/skills/pubmed-database/references/search_syntax.md +436 -0
- package/bin/skills/pufferlib/SKILL.md +436 -0
- package/bin/skills/pufferlib/references/environments.md +508 -0
- package/bin/skills/pufferlib/references/integration.md +621 -0
- package/bin/skills/pufferlib/references/policies.md +653 -0
- package/bin/skills/pufferlib/references/training.md +360 -0
- package/bin/skills/pufferlib/references/vectorization.md +557 -0
- package/bin/skills/pufferlib/scripts/env_template.py +340 -0
- package/bin/skills/pufferlib/scripts/train_template.py +239 -0
- package/bin/skills/pydeseq2/SKILL.md +559 -0
- package/bin/skills/pydeseq2/references/api_reference.md +228 -0
- package/bin/skills/pydeseq2/references/workflow_guide.md +582 -0
- package/bin/skills/pydeseq2/scripts/run_deseq2_analysis.py +353 -0
- package/bin/skills/pydicom/SKILL.md +434 -0
- package/bin/skills/pydicom/references/common_tags.md +228 -0
- package/bin/skills/pydicom/references/transfer_syntaxes.md +352 -0
- package/bin/skills/pydicom/scripts/anonymize_dicom.py +137 -0
- package/bin/skills/pydicom/scripts/dicom_to_image.py +172 -0
- package/bin/skills/pydicom/scripts/extract_metadata.py +173 -0
- package/bin/skills/pyhealth/SKILL.md +491 -0
- package/bin/skills/pyhealth/references/datasets.md +178 -0
- package/bin/skills/pyhealth/references/medical_coding.md +284 -0
- package/bin/skills/pyhealth/references/models.md +594 -0
- package/bin/skills/pyhealth/references/preprocessing.md +638 -0
- package/bin/skills/pyhealth/references/tasks.md +379 -0
- package/bin/skills/pyhealth/references/training_evaluation.md +648 -0
- package/bin/skills/pylabrobot/SKILL.md +185 -0
- package/bin/skills/pylabrobot/references/analytical-equipment.md +464 -0
- package/bin/skills/pylabrobot/references/hardware-backends.md +480 -0
- package/bin/skills/pylabrobot/references/liquid-handling.md +403 -0
- package/bin/skills/pylabrobot/references/material-handling.md +620 -0
- package/bin/skills/pylabrobot/references/resources.md +489 -0
- package/bin/skills/pylabrobot/references/visualization.md +532 -0
- package/bin/skills/pymatgen/SKILL.md +691 -0
- package/bin/skills/pymatgen/references/analysis_modules.md +530 -0
- package/bin/skills/pymatgen/references/core_classes.md +318 -0
- package/bin/skills/pymatgen/references/io_formats.md +469 -0
- package/bin/skills/pymatgen/references/materials_project_api.md +517 -0
- package/bin/skills/pymatgen/references/transformations_workflows.md +591 -0
- package/bin/skills/pymatgen/scripts/phase_diagram_generator.py +233 -0
- package/bin/skills/pymatgen/scripts/structure_analyzer.py +266 -0
- package/bin/skills/pymatgen/scripts/structure_converter.py +169 -0
- package/bin/skills/pymc/SKILL.md +572 -0
- package/bin/skills/pymc/assets/hierarchical_model_template.py +333 -0
- package/bin/skills/pymc/assets/linear_regression_template.py +241 -0
- package/bin/skills/pymc/references/distributions.md +320 -0
- package/bin/skills/pymc/references/sampling_inference.md +424 -0
- package/bin/skills/pymc/references/workflows.md +526 -0
- package/bin/skills/pymc/scripts/model_comparison.py +387 -0
- package/bin/skills/pymc/scripts/model_diagnostics.py +350 -0
- package/bin/skills/pymoo/SKILL.md +571 -0
- package/bin/skills/pymoo/references/algorithms.md +180 -0
- package/bin/skills/pymoo/references/constraints_mcdm.md +417 -0
- package/bin/skills/pymoo/references/operators.md +345 -0
- package/bin/skills/pymoo/references/problems.md +265 -0
- package/bin/skills/pymoo/references/visualization.md +353 -0
- package/bin/skills/pymoo/scripts/custom_problem_example.py +181 -0
- package/bin/skills/pymoo/scripts/decision_making_example.py +161 -0
- package/bin/skills/pymoo/scripts/many_objective_example.py +72 -0
- package/bin/skills/pymoo/scripts/multi_objective_example.py +63 -0
- package/bin/skills/pymoo/scripts/single_objective_example.py +59 -0
- package/bin/skills/pyopenms/SKILL.md +217 -0
- package/bin/skills/pyopenms/references/data_structures.md +497 -0
- package/bin/skills/pyopenms/references/feature_detection.md +410 -0
- package/bin/skills/pyopenms/references/file_io.md +349 -0
- package/bin/skills/pyopenms/references/identification.md +422 -0
- package/bin/skills/pyopenms/references/metabolomics.md +482 -0
- package/bin/skills/pyopenms/references/signal_processing.md +433 -0
- package/bin/skills/pysam/SKILL.md +265 -0
- package/bin/skills/pysam/references/alignment_files.md +280 -0
- package/bin/skills/pysam/references/common_workflows.md +520 -0
- package/bin/skills/pysam/references/sequence_files.md +407 -0
- package/bin/skills/pysam/references/variant_files.md +365 -0
- package/bin/skills/pytdc/SKILL.md +460 -0
- package/bin/skills/pytdc/references/datasets.md +246 -0
- package/bin/skills/pytdc/references/oracles.md +400 -0
- package/bin/skills/pytdc/references/utilities.md +684 -0
- package/bin/skills/pytdc/scripts/benchmark_evaluation.py +327 -0
- package/bin/skills/pytdc/scripts/load_and_split_data.py +214 -0
- package/bin/skills/pytdc/scripts/molecular_generation.py +404 -0
- package/bin/skills/qiskit/SKILL.md +275 -0
- package/bin/skills/qiskit/references/algorithms.md +607 -0
- package/bin/skills/qiskit/references/backends.md +433 -0
- package/bin/skills/qiskit/references/circuits.md +197 -0
- package/bin/skills/qiskit/references/patterns.md +533 -0
- package/bin/skills/qiskit/references/primitives.md +277 -0
- package/bin/skills/qiskit/references/setup.md +99 -0
- package/bin/skills/qiskit/references/transpilation.md +286 -0
- package/bin/skills/qiskit/references/visualization.md +415 -0
- package/bin/skills/qutip/SKILL.md +318 -0
- package/bin/skills/qutip/references/advanced.md +555 -0
- package/bin/skills/qutip/references/analysis.md +523 -0
- package/bin/skills/qutip/references/core_concepts.md +293 -0
- package/bin/skills/qutip/references/time_evolution.md +348 -0
- package/bin/skills/qutip/references/visualization.md +431 -0
- package/bin/skills/rdkit/SKILL.md +780 -0
- package/bin/skills/rdkit/references/api_reference.md +432 -0
- package/bin/skills/rdkit/references/descriptors_reference.md +595 -0
- package/bin/skills/rdkit/references/smarts_patterns.md +668 -0
- package/bin/skills/rdkit/scripts/molecular_properties.py +243 -0
- package/bin/skills/rdkit/scripts/similarity_search.py +297 -0
- package/bin/skills/rdkit/scripts/substructure_filter.py +386 -0
- package/bin/skills/reactome-database/SKILL.md +278 -0
- package/bin/skills/reactome-database/references/api_reference.md +465 -0
- package/bin/skills/reactome-database/scripts/reactome_query.py +286 -0
- package/bin/skills/rowan/SKILL.md +427 -0
- package/bin/skills/rowan/references/api_reference.md +413 -0
- package/bin/skills/rowan/references/molecule_handling.md +429 -0
- package/bin/skills/rowan/references/proteins_and_organization.md +499 -0
- package/bin/skills/rowan/references/rdkit_native.md +438 -0
- package/bin/skills/rowan/references/results_interpretation.md +481 -0
- package/bin/skills/rowan/references/workflow_types.md +591 -0
- package/bin/skills/scanpy/SKILL.md +386 -0
- package/bin/skills/scanpy/assets/analysis_template.py +295 -0
- package/bin/skills/scanpy/references/api_reference.md +251 -0
- package/bin/skills/scanpy/references/plotting_guide.md +352 -0
- package/bin/skills/scanpy/references/standard_workflow.md +206 -0
- package/bin/skills/scanpy/scripts/qc_analysis.py +200 -0
- package/bin/skills/scientific-brainstorming/SKILL.md +191 -0
- package/bin/skills/scientific-brainstorming/references/brainstorming_methods.md +326 -0
- package/bin/skills/scientific-visualization/SKILL.md +779 -0
- package/bin/skills/scientific-visualization/assets/color_palettes.py +197 -0
- package/bin/skills/scientific-visualization/assets/nature.mplstyle +63 -0
- package/bin/skills/scientific-visualization/assets/presentation.mplstyle +61 -0
- package/bin/skills/scientific-visualization/assets/publication.mplstyle +68 -0
- package/bin/skills/scientific-visualization/references/color_palettes.md +348 -0
- package/bin/skills/scientific-visualization/references/journal_requirements.md +320 -0
- package/bin/skills/scientific-visualization/references/matplotlib_examples.md +620 -0
- package/bin/skills/scientific-visualization/references/publication_guidelines.md +205 -0
- package/bin/skills/scientific-visualization/scripts/figure_export.py +343 -0
- package/bin/skills/scientific-visualization/scripts/style_presets.py +416 -0
- package/bin/skills/scikit-bio/SKILL.md +437 -0
- package/bin/skills/scikit-bio/references/api_reference.md +749 -0
- package/bin/skills/scikit-learn/SKILL.md +521 -0
- package/bin/skills/scikit-learn/references/model_evaluation.md +592 -0
- package/bin/skills/scikit-learn/references/pipelines_and_composition.md +612 -0
- package/bin/skills/scikit-learn/references/preprocessing.md +606 -0
- package/bin/skills/scikit-learn/references/quick_reference.md +433 -0
- package/bin/skills/scikit-learn/references/supervised_learning.md +378 -0
- package/bin/skills/scikit-learn/references/unsupervised_learning.md +505 -0
- package/bin/skills/scikit-learn/scripts/classification_pipeline.py +257 -0
- package/bin/skills/scikit-learn/scripts/clustering_analysis.py +386 -0
- package/bin/skills/scikit-survival/SKILL.md +399 -0
- package/bin/skills/scikit-survival/references/competing-risks.md +397 -0
- package/bin/skills/scikit-survival/references/cox-models.md +182 -0
- package/bin/skills/scikit-survival/references/data-handling.md +494 -0
- package/bin/skills/scikit-survival/references/ensemble-models.md +327 -0
- package/bin/skills/scikit-survival/references/evaluation-metrics.md +378 -0
- package/bin/skills/scikit-survival/references/svm-models.md +411 -0
- package/bin/skills/scvi-tools/SKILL.md +190 -0
- package/bin/skills/scvi-tools/references/differential-expression.md +581 -0
- package/bin/skills/scvi-tools/references/models-atac-seq.md +321 -0
- package/bin/skills/scvi-tools/references/models-multimodal.md +367 -0
- package/bin/skills/scvi-tools/references/models-scrna-seq.md +330 -0
- package/bin/skills/scvi-tools/references/models-spatial.md +438 -0
- package/bin/skills/scvi-tools/references/models-specialized.md +408 -0
- package/bin/skills/scvi-tools/references/theoretical-foundations.md +438 -0
- package/bin/skills/scvi-tools/references/workflows.md +546 -0
- package/bin/skills/seaborn/SKILL.md +673 -0
- package/bin/skills/seaborn/references/examples.md +822 -0
- package/bin/skills/seaborn/references/function_reference.md +770 -0
- package/bin/skills/seaborn/references/objects_interface.md +964 -0
- package/bin/skills/shap/SKILL.md +566 -0
- package/bin/skills/shap/references/explainers.md +339 -0
- package/bin/skills/shap/references/plots.md +507 -0
- package/bin/skills/shap/references/theory.md +449 -0
- package/bin/skills/shap/references/workflows.md +605 -0
- package/bin/skills/simpy/SKILL.md +429 -0
- package/bin/skills/simpy/references/events.md +374 -0
- package/bin/skills/simpy/references/monitoring.md +475 -0
- package/bin/skills/simpy/references/process-interaction.md +424 -0
- package/bin/skills/simpy/references/real-time.md +395 -0
- package/bin/skills/simpy/references/resources.md +275 -0
- package/bin/skills/simpy/scripts/basic_simulation_template.py +193 -0
- package/bin/skills/simpy/scripts/resource_monitor.py +345 -0
- package/bin/skills/stable-baselines3/SKILL.md +299 -0
- package/bin/skills/stable-baselines3/references/algorithms.md +333 -0
- package/bin/skills/stable-baselines3/references/callbacks.md +556 -0
- package/bin/skills/stable-baselines3/references/custom_environments.md +526 -0
- package/bin/skills/stable-baselines3/references/vectorized_envs.md +568 -0
- package/bin/skills/stable-baselines3/scripts/custom_env_template.py +314 -0
- package/bin/skills/stable-baselines3/scripts/evaluate_agent.py +245 -0
- package/bin/skills/stable-baselines3/scripts/train_rl_agent.py +165 -0
- package/bin/skills/statistical-analysis/SKILL.md +632 -0
- package/bin/skills/statistical-analysis/references/assumptions_and_diagnostics.md +369 -0
- package/bin/skills/statistical-analysis/references/bayesian_statistics.md +661 -0
- package/bin/skills/statistical-analysis/references/effect_sizes_and_power.md +581 -0
- package/bin/skills/statistical-analysis/references/reporting_standards.md +469 -0
- package/bin/skills/statistical-analysis/references/test_selection_guide.md +129 -0
- package/bin/skills/statistical-analysis/scripts/assumption_checks.py +539 -0
- package/bin/skills/statsmodels/SKILL.md +614 -0
- package/bin/skills/statsmodels/references/discrete_choice.md +669 -0
- package/bin/skills/statsmodels/references/glm.md +619 -0
- package/bin/skills/statsmodels/references/linear_models.md +447 -0
- package/bin/skills/statsmodels/references/stats_diagnostics.md +859 -0
- package/bin/skills/statsmodels/references/time_series.md +716 -0
- package/bin/skills/string-database/SKILL.md +534 -0
- package/bin/skills/string-database/references/string_reference.md +455 -0
- package/bin/skills/string-database/scripts/string_api.py +369 -0
- package/bin/skills/sympy/SKILL.md +500 -0
- package/bin/skills/sympy/references/advanced-topics.md +635 -0
- package/bin/skills/sympy/references/code-generation-printing.md +599 -0
- package/bin/skills/sympy/references/core-capabilities.md +348 -0
- package/bin/skills/sympy/references/matrices-linear-algebra.md +526 -0
- package/bin/skills/sympy/references/physics-mechanics.md +592 -0
- package/bin/skills/torch_geometric/SKILL.md +676 -0
- package/bin/skills/torch_geometric/references/datasets_reference.md +574 -0
- package/bin/skills/torch_geometric/references/layers_reference.md +485 -0
- package/bin/skills/torch_geometric/references/transforms_reference.md +679 -0
- package/bin/skills/torch_geometric/scripts/benchmark_model.py +309 -0
- package/bin/skills/torch_geometric/scripts/create_gnn_template.py +529 -0
- package/bin/skills/torch_geometric/scripts/visualize_graph.py +313 -0
- package/bin/skills/torchdrug/SKILL.md +450 -0
- package/bin/skills/torchdrug/references/core_concepts.md +565 -0
- package/bin/skills/torchdrug/references/datasets.md +380 -0
- package/bin/skills/torchdrug/references/knowledge_graphs.md +320 -0
- package/bin/skills/torchdrug/references/models_architectures.md +541 -0
- package/bin/skills/torchdrug/references/molecular_generation.md +352 -0
- package/bin/skills/torchdrug/references/molecular_property_prediction.md +169 -0
- package/bin/skills/torchdrug/references/protein_modeling.md +272 -0
- package/bin/skills/torchdrug/references/retrosynthesis.md +436 -0
- package/bin/skills/transformers/SKILL.md +164 -0
- package/bin/skills/transformers/references/generation.md +467 -0
- package/bin/skills/transformers/references/models.md +361 -0
- package/bin/skills/transformers/references/pipelines.md +335 -0
- package/bin/skills/transformers/references/tokenizers.md +447 -0
- package/bin/skills/transformers/references/training.md +500 -0
- package/bin/skills/umap-learn/SKILL.md +479 -0
- package/bin/skills/umap-learn/references/api_reference.md +532 -0
- package/bin/skills/uniprot-database/SKILL.md +195 -0
- package/bin/skills/uniprot-database/references/api_examples.md +413 -0
- package/bin/skills/uniprot-database/references/api_fields.md +275 -0
- package/bin/skills/uniprot-database/references/id_mapping_databases.md +285 -0
- package/bin/skills/uniprot-database/references/query_syntax.md +256 -0
- package/bin/skills/uniprot-database/scripts/uniprot_client.py +341 -0
- package/bin/skills/uspto-database/SKILL.md +607 -0
- package/bin/skills/uspto-database/references/additional_apis.md +394 -0
- package/bin/skills/uspto-database/references/patentsearch_api.md +266 -0
- package/bin/skills/uspto-database/references/peds_api.md +212 -0
- package/bin/skills/uspto-database/references/trademark_api.md +358 -0
- package/bin/skills/uspto-database/scripts/patent_search.py +290 -0
- package/bin/skills/uspto-database/scripts/peds_client.py +285 -0
- package/bin/skills/uspto-database/scripts/trademark_client.py +311 -0
- package/bin/skills/vaex/SKILL.md +182 -0
- package/bin/skills/vaex/references/core_dataframes.md +367 -0
- package/bin/skills/vaex/references/data_processing.md +555 -0
- package/bin/skills/vaex/references/io_operations.md +703 -0
- package/bin/skills/vaex/references/machine_learning.md +728 -0
- package/bin/skills/vaex/references/performance.md +571 -0
- package/bin/skills/vaex/references/visualization.md +613 -0
- package/bin/skills/zarr-python/SKILL.md +779 -0
- package/bin/skills/zarr-python/references/api_reference.md +515 -0
- package/bin/skills/zinc-database/SKILL.md +404 -0
- package/bin/skills/zinc-database/references/api_reference.md +692 -0
- package/bin/synsc +0 -0
- package/package.json +1 -1
@@ -0,0 +1,288 @@
# Matchms Filtering Functions Reference

This document provides a comprehensive reference of all filtering functions available in matchms for processing mass spectrometry data.

## Metadata Processing Filters

### Compound & Chemical Information

**add_compound_name(spectrum)**
- Adds compound name to the correct metadata field
- Standardizes compound name storage location

**clean_compound_name(spectrum)**
- Removes frequently seen unwanted additions from compound names
- Cleans up formatting inconsistencies

**derive_adduct_from_name(spectrum)**
- Extracts adduct information from compound names
- Moves adduct notation to proper metadata field

**derive_formula_from_name(spectrum)**
- Detects chemical formulas in compound names
- Relocates formulas to appropriate metadata field

**derive_annotation_from_compound_name(spectrum)**
- Retrieves SMILES/InChI from PubChem using compound name
- Automatically annotates chemical structures

### Chemical Structure Conversions

**derive_inchi_from_smiles(spectrum)**
- Generates InChI from SMILES strings
- Requires rdkit library

**derive_inchikey_from_inchi(spectrum)**
- Computes InChIKey from InChI
- 27-character hashed identifier
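
The 27-character layout mentioned above can be checked with a short, standard-library-only sketch. It assumes the standard InChIKey shape (14 letters, a hyphen, 10 letters, a hyphen, 1 letter); the helper name `looks_like_inchikey` is hypothetical and is not part of matchms.

```python
import re

# Standard InChIKey shape: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
# That adds up to 27 characters including the two hyphens.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(value: str) -> bool:
    """Return True if `value` has the shape of a standard InChIKey.

    This only checks the format; it does not verify that the key
    actually corresponds to any InChI string.
    """
    return INCHIKEY_PATTERN.fullmatch(value.strip()) is not None


aspirin_key = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
print(len(aspirin_key), looks_like_inchikey(aspirin_key))
```

A shape check like this is only a cheap sanity test; it cannot tell whether the key matches the InChI stored alongside it.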

**derive_smiles_from_inchi(spectrum)**
- Creates SMILES from InChI representation
- Requires rdkit library

**repair_inchi_inchikey_smiles(spectrum)**
- Corrects misplaced chemical identifiers
- Fixes metadata field confusion

**repair_not_matching_annotation(spectrum)**
- Ensures consistency between SMILES, InChI, and InChIKey
- Validates chemical structure annotations match

**add_fingerprint(spectrum, fingerprint_type="daylight", nbits=2048, radius=2)**
- Generates molecular fingerprints for similarity calculations
- Fingerprint types: "daylight", "morgan1", "morgan2", "morgan3"
- Used with FingerprintSimilarity scoring

### Mass & Charge Information

**add_precursor_mz(spectrum)**
- Normalizes precursor m/z values
- Standardizes precursor mass metadata

**add_parent_mass(spectrum, estimate_from_adduct=True)**
- Calculates neutral parent mass from precursor m/z and adduct
- Can estimate from adduct if not directly available
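
The arithmetic behind estimating a neutral mass from a precursor m/z and an adduct can be sketched as follows. This is a minimal illustration for singly charged ions only, not matchms's implementation: the function name `estimate_parent_mass` and the three-entry adduct table are assumptions, and the mass shifts are approximate.

```python
# Approximate mass shifts (in Da) that a few singly charged adducts add to
# the neutral molecule. Illustrative values; a real adduct table is larger
# and also has to handle multiply charged ions and multimers.
ADDUCT_MASS_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
}


def estimate_parent_mass(precursor_mz, adduct):
    """Estimate the neutral mass of a singly charged ion.

    Returns None when the adduct is not in the small table above.
    """
    shift = ADDUCT_MASS_SHIFT.get(adduct)
    if shift is None:
        return None
    # For charge +/-1 the observed m/z is simply neutral mass plus the shift.
    return precursor_mz - shift


print(estimate_parent_mass(181.0707, "[M+H]+"))
print(estimate_parent_mass(179.0561, "[M-H]-"))
```

Both calls describe the same hypothetical molecule seen in positive and negative mode, so the two estimates should agree to within measurement error.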

**correct_charge(spectrum)**
- Aligns charge values with ionmode
- Ensures charge sign matches ionization mode
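
One way to picture the sign alignment is the sketch below. It is an assumption about reasonable behavior, not a copy of matchms's logic: the function name is hypothetical, and the choice to fill a missing or zero charge with a single charge of the matching sign is made here for illustration.

```python
def align_charge_with_ionmode(charge, ionmode):
    """Return a charge whose sign agrees with the ionization mode.

    - Unknown ionmode: the charge is returned unchanged.
    - Missing or zero charge: assume a single charge of the matching sign.
    - Otherwise: keep the magnitude and fix the sign.
    """
    sign = {"positive": 1, "negative": -1}.get(ionmode)
    if sign is None:
        return charge
    if not charge:  # covers both None and 0
        return sign
    return sign * abs(charge)


print(align_charge_with_ionmode(-1, "positive"))
print(align_charge_with_ionmode(0, "negative"))
```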

**make_charge_int(spectrum)**
- Converts charge to integer format
- Standardizes charge representation

**clean_adduct(spectrum)**
- Standardizes adduct notation
- Corrects common adduct formatting issues

**interpret_pepmass(spectrum)**
- Parses pepmass field into component values
- Extracts precursor m/z and intensity from combined field
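
The idea of splitting a combined field can be illustrated with a standalone sketch. It assumes the common MGF convention in which PEPMASS holds the precursor m/z optionally followed by an intensity; the helper `split_pepmass` is hypothetical and is not the matchms function.

```python
def split_pepmass(pepmass):
    """Split a PEPMASS-style value into (precursor_mz, intensity).

    Accepts a whitespace-separated string, a single number, or a
    tuple/list of numbers. Missing components come back as None.
    """
    if isinstance(pepmass, str):
        parts = pepmass.split()
    elif isinstance(pepmass, (int, float)):
        parts = [pepmass]
    else:
        parts = list(pepmass)

    values = [float(part) for part in parts]
    precursor_mz = values[0] if values else None
    intensity = values[1] if len(values) > 1 else None
    return precursor_mz, intensity


print(split_pepmass("445.12 1200.5"))
print(split_pepmass("445.12"))
```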

### Ion Mode & Validation

**derive_ionmode(spectrum)**
- Determines ionmode from adduct information
- Infers positive/negative mode from adduct type
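
A simplified version of this inference reads the trailing charge sign of the adduct string. The sketch below is illustrative only: `ionmode_from_adduct` is a hypothetical helper, and it assumes adducts are written with an explicit trailing sign such as `[M+H]+`.

```python
def ionmode_from_adduct(adduct):
    """Infer 'positive' or 'negative' from the trailing sign of an adduct.

    Returns None when the adduct string carries no trailing charge sign.
    """
    text = adduct.strip()
    if text.endswith("+"):
        return "positive"
    if text.endswith("-"):
        return "negative"
    return None


for adduct in ("[M+H]+", "[M-H]-", "[M+2H]2+", "M+H"):
    print(adduct, ionmode_from_adduct(adduct))
```

Adducts written without a trailing sign (the last example) cannot be resolved this way and would need a lookup table of known adducts instead.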

**require_correct_ionmode(spectrum, ion_mode)**
- Filters spectra by specified ionmode
- Returns None if ionmode doesn't match
- Use: `spectrum = require_correct_ionmode(spectrum, "positive")`

**require_precursor_mz(spectrum, minimum_accepted_mz=0.0)**
- Validates precursor m/z presence and value
- Returns None if missing or below threshold

**require_precursor_below_mz(spectrum, maximum_accepted_mz=1000.0)**
- Enforces maximum precursor m/z limit
- Returns None if precursor exceeds threshold
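
Because the `require_*` filters signal rejection by returning None, a chain of them has to stop at the first None. The sketch below shows that pattern with plain dictionaries standing in for spectrum objects; all function names here are hypothetical stand-ins, not the matchms API.

```python
from functools import partial


def require_ionmode(spectrum, ion_mode):
    """Keep the spectrum only if its ionmode matches."""
    return spectrum if spectrum.get("ionmode") == ion_mode else None


def require_min_precursor_mz(spectrum, minimum_accepted_mz=0.0):
    """Keep the spectrum only if precursor_mz exists and is high enough."""
    mz = spectrum.get("precursor_mz")
    if mz is None or mz < minimum_accepted_mz:
        return None
    return spectrum


def apply_filters(spectrum, filters):
    """Apply filters in order, stopping as soon as one rejects the spectrum."""
    for spectrum_filter in filters:
        spectrum = spectrum_filter(spectrum)
        if spectrum is None:
            return None
    return spectrum


filters = [
    partial(require_ionmode, ion_mode="positive"),
    partial(require_min_precursor_mz, minimum_accepted_mz=50.0),
]
spectra = [
    {"ionmode": "positive", "precursor_mz": 181.07},
    {"ionmode": "negative", "precursor_mz": 179.06},
    {"ionmode": "positive"},  # precursor_mz missing
]
kept = [s for s in (apply_filters(s, filters) for s in spectra) if s is not None]
print(len(kept))
```

Checking for None after every step matters: passing a rejected spectrum into the next filter would otherwise raise an error on the missing object.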

### Retention Information

**add_retention_time(spectrum)**
- Harmonizes retention time as float values
- Standardizes RT metadata field

**add_retention_index(spectrum)**
- Stores retention index in standardized field
- Normalizes RI metadata

### Data Harmonization

**harmonize_undefined_inchi(spectrum, undefined="", aliases=None)**
- Standardizes undefined/empty InChI entries
- Replaces various "unknown" representations with consistent value

**harmonize_undefined_inchikey(spectrum, undefined="", aliases=None)**
- Standardizes undefined/empty InChIKey entries
- Unifies missing data representation

**harmonize_undefined_smiles(spectrum, undefined="", aliases=None)**
- Standardizes undefined/empty SMILES entries
- Consistent handling of missing structural data
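
The three harmonization filters share one idea: map every spelling of "no value" onto a single `undefined` marker. A minimal sketch of that idea on a bare string follows; the helper name and the default alias list are assumptions chosen for illustration and may differ from the aliases matchms uses.

```python
# Hypothetical default aliases for "no value"; compared case-insensitively.
DEFAULT_ALIASES = ("", "n/a", "na", "none", "unknown", "no data")


def harmonize_undefined(value, undefined="", aliases=None):
    """Replace any alias of 'undefined' with one consistent marker."""
    aliases = DEFAULT_ALIASES if aliases is None else aliases
    if value is None:
        return undefined
    normalized_aliases = {alias.lower() for alias in aliases}
    if str(value).strip().lower() in normalized_aliases:
        return undefined
    return value


for raw in ("N/A", None, "unknown", "CCO"):
    print(repr(raw), "->", repr(harmonize_undefined(raw)))
```

Downstream checks can then test for a single marker, for example `if smiles == ""`, instead of enumerating every placeholder a data source might have used.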

### Repair & Quality Functions

**repair_adduct_based_on_smiles(spectrum, mass_tolerance=0.1)**
- Corrects adduct using SMILES and mass matching
- Validates adduct matches calculated mass

**repair_parent_mass_is_mol_wt(spectrum, mass_tolerance=0.1)**
- Converts molecular weight to monoisotopic mass
- Fixes common metadata confusion

**repair_precursor_is_parent_mass(spectrum)**
- Fixes swapped precursor/parent mass values
- Corrects field misassignments

**repair_smiles_of_salts(spectrum, mass_tolerance=0.1)**
- Removes salt components to match parent mass
- Extracts relevant molecular fragment

**require_parent_mass_match_smiles(spectrum, mass_tolerance=0.1)**
- Validates parent mass against SMILES-calculated mass
- Returns None if masses don't match within tolerance

**require_valid_annotation(spectrum)**
- Ensures complete, consistent chemical annotations
- Validates SMILES, InChI, and InChIKey presence and consistency

## Peak Processing Filters

### Normalization & Selection

**normalize_intensities(spectrum)**
- Scales peak intensities to unit height (max = 1.0)
- Essential preprocessing step for similarity calculations

**select_by_intensity(spectrum, intensity_from=10.0, intensity_to=200.0)**
- Retains peaks within specified absolute intensity range
- Filters by raw intensity values

**select_by_relative_intensity(spectrum, intensity_from=0.0, intensity_to=1.0)**
- Keeps peaks within relative intensity bounds
- Filters as fraction of maximum intensity

**select_by_mz(spectrum, mz_from=0.0, mz_to=1000.0)**
- Filters peaks by m/z value range
- Removes peaks outside specified m/z window
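
The arithmetic behind intensity normalization and relative-intensity selection is simple enough to show on plain lists. This is a sketch of the idea only: the helpers `normalize` and `select_relative` are hypothetical names operating on Python lists, not the matchms filters, which work on Spectrum objects.

```python
def normalize(intensities):
    """Scale intensities so that the largest value becomes 1.0."""
    peak_max = max(intensities, default=0.0)
    if peak_max <= 0:
        return list(intensities)
    return [value / peak_max for value in intensities]


def select_relative(mz, intensities, intensity_from=0.0, intensity_to=1.0):
    """Keep peaks whose intensity, as a fraction of the maximum, is in range."""
    peak_max = max(intensities, default=0.0)
    if peak_max <= 0:
        return [], []
    kept = [
        (m, i)
        for m, i in zip(mz, intensities)
        if intensity_from <= i / peak_max <= intensity_to
    ]
    return [m for m, _ in kept], [i for _, i in kept]


mz_values = [100.0, 150.0, 200.0]
raw = [2.0, 50.0, 200.0]
scaled = normalize(raw)
kept_mz, kept_raw = select_relative(mz_values, raw, intensity_from=0.05)
```

Here the 100.0 peak sits at 1% of the base peak, so a 5% relative threshold drops it while the other two peaks remain.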

### Peak Reduction & Filtering

**reduce_to_number_of_peaks(spectrum, n_max=None, ratio_desired=None)**
- Removes lowest-intensity peaks when exceeding maximum
- Can specify absolute number or ratio
- Use: `spectrum = reduce_to_number_of_peaks(spectrum, n_max=100)`

**remove_peaks_around_precursor_mz(spectrum, mz_tolerance=17)**
- Eliminates peaks within tolerance of precursor
- Removes precursor and isotope peaks
- Common preprocessing for fragment-based similarity

**remove_peaks_outside_top_k(spectrum, k=10, ratio_desired=None)**
- Retains only peaks near k highest-intensity peaks
- Focuses on most informative signals

**require_minimum_number_of_peaks(spectrum, n_required=10)**
- Discards spectra with insufficient peaks
- Quality control filter
- Returns None if peak count below threshold

**require_minimum_number_of_high_peaks(spectrum, n_required=5, intensity_threshold=0.05)**
- Removes spectra lacking high-intensity peaks
- Ensures data quality
- Returns None if insufficient peaks above threshold
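
Two of the behaviours listed above, keeping only the most intense peaks and rejecting spectra with too few peaks, can be sketched on a plain list of `(mz, intensity)` tuples. The helper names `reduce_to_n_peaks` and `require_min_peaks` are hypothetical stand-ins, not the matchms functions.

```python
def reduce_to_n_peaks(peaks, n_max):
    """Keep the n_max most intense (mz, intensity) peaks, sorted by m/z."""
    if len(peaks) <= n_max:
        return sorted(peaks)
    strongest = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:n_max]
    return sorted(strongest)


def require_min_peaks(peaks, n_required):
    """Return the peaks unchanged, or None when there are too few of them."""
    return peaks if len(peaks) >= n_required else None


peaks = [(100.0, 0.1), (120.0, 0.9), (140.0, 0.5), (160.0, 0.05)]
top_two = reduce_to_n_peaks(peaks, n_max=2)
rejected = require_min_peaks(peaks, n_required=5)
```

The reduced list is re-sorted by m/z because downstream peak matching generally expects ascending m/z order.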

### Loss Calculation

**add_losses(spectrum, loss_mz_from=5.0, loss_mz_to=200.0)**
- Derives neutral losses from precursor mass
- Calculates loss = precursor_mz - fragment_mz
- Adds losses to spectrum for NeutralLossesCosine scoring
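
The loss formula above (loss = precursor_mz - fragment_mz, restricted to a window) can be worked through on plain tuples. The function `neutral_losses` is a hypothetical illustration of that arithmetic, not the matchms implementation.

```python
def neutral_losses(precursor_mz, peaks, loss_mz_from=5.0, loss_mz_to=200.0):
    """Compute (loss, intensity) pairs for fragments inside the loss window.

    Each loss is precursor_mz - fragment_mz; losses outside
    [loss_mz_from, loss_mz_to] are discarded.
    """
    losses = []
    for fragment_mz, intensity in peaks:
        loss = precursor_mz - fragment_mz
        if loss_mz_from <= loss <= loss_mz_to:
            losses.append((loss, intensity))
    return sorted(losses)


fragments = [(100.0, 0.2), (250.0, 1.0), (298.0, 0.4)]
losses = neutral_losses(300.0, fragments)
```

With a precursor at m/z 300.0, the fragment at 298.0 gives a loss of 2.0, which falls below the default lower bound of 5.0 and is therefore left out.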

## Pipeline Functions

**default_filters(spectrum)**
- Applies nine essential metadata filters sequentially:
  1. make_charge_int
  2. add_precursor_mz
  3. add_retention_time
  4. add_retention_index
  5. derive_adduct_from_name
  6. derive_formula_from_name
  7. clean_compound_name
  8. harmonize_undefined_smiles
  9. harmonize_undefined_inchi
- Recommended starting point for metadata harmonization

**SpectrumProcessor(filters)**
- Orchestrates multi-filter pipelines
- Accepts list of filter functions
- Example:
```python
from matchms import SpectrumProcessor
from matchms.filtering import (default_filters, normalize_intensities,
                               select_by_relative_intensity)

processor = SpectrumProcessor([
    default_filters,
    normalize_intensities,
    lambda s: select_by_relative_intensity(s, intensity_from=0.01)
])
# Call style as documented here; depending on the matchms version,
# processor.process_spectrum(spectrum) may be the form to use instead
processed = processor(spectrum)
```

## Common Filter Combinations

### Standard Preprocessing Pipeline
```python
from matchms.filtering import (default_filters, normalize_intensities,
                               select_by_relative_intensity,
                               require_minimum_number_of_peaks)

spectrum = default_filters(spectrum)
spectrum = normalize_intensities(spectrum)
spectrum = select_by_relative_intensity(spectrum, intensity_from=0.01)
spectrum = require_minimum_number_of_peaks(spectrum, n_required=5)
```

### Quality Control Pipeline
```python
from matchms.filtering import (require_precursor_mz, require_minimum_number_of_peaks,
                               require_minimum_number_of_high_peaks)

# Each require_* filter returns None when the spectrum fails its check,
# so stop applying further filters once that happens
spectrum = require_precursor_mz(spectrum, minimum_accepted_mz=50.0)
if spectrum is not None:
    spectrum = require_minimum_number_of_peaks(spectrum, n_required=10)
if spectrum is not None:
    spectrum = require_minimum_number_of_high_peaks(spectrum, n_required=5)

if spectrum is None:
    # Spectrum failed quality control
    print("Spectrum failed quality control")
```

### Chemical Annotation Pipeline
```python
from matchms.filtering import (derive_inchi_from_smiles, derive_inchikey_from_inchi,
                               add_fingerprint, require_valid_annotation)

spectrum = derive_inchi_from_smiles(spectrum)
spectrum = derive_inchikey_from_inchi(spectrum)
spectrum = add_fingerprint(spectrum, fingerprint_type="morgan2", nbits=2048)
spectrum = require_valid_annotation(spectrum)
```

### Peak Cleaning Pipeline
```python
from matchms.filtering import (normalize_intensities, remove_peaks_around_precursor_mz,
                               select_by_relative_intensity, reduce_to_number_of_peaks)

spectrum = normalize_intensities(spectrum)
spectrum = remove_peaks_around_precursor_mz(spectrum, mz_tolerance=17)
spectrum = select_by_relative_intensity(spectrum, intensity_from=0.01)
spectrum = reduce_to_number_of_peaks(spectrum, n_max=200)
```

## Notes on Filter Usage

1. **Order matters**: Apply filters in logical sequence (e.g., normalize before relative intensity selection)
2. **Filters return None**: Many filters return None for invalid spectra; check for None before proceeding
3. **Immutability**: Filters typically return modified copies; reassign results to variables
4. **Pipeline efficiency**: Use SpectrumProcessor for consistent multi-spectrum processing
5. **Documentation**: For detailed parameters, see matchms.readthedocs.io/en/latest/api/matchms.filtering.html
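
Notes 2 and 3 combine into a simple pattern: apply filters in order, reassign the result each time, and stop as soon as one returns None. The sketch below uses plain dictionaries in place of Spectrum objects; `apply_filters`, `require_three_peaks`, and `tag_processed` are hypothetical names invented for this illustration.

```python
def apply_filters(spectrum, filters):
    """Apply filters in order, stopping early once one returns None."""
    for spectrum_filter in filters:
        if spectrum is None:
            return None
        spectrum = spectrum_filter(spectrum)
    return spectrum


def require_three_peaks(spectrum):
    """Toy quality filter: reject spectra with fewer than three peaks."""
    return spectrum if len(spectrum["peaks"]) >= 3 else None


def tag_processed(spectrum):
    """Toy metadata filter: return a modified copy, leaving the input as is."""
    updated = dict(spectrum)
    updated["processed"] = True
    return updated


good = {"peaks": [100.0, 150.0, 200.0]}
result = apply_filters(good, [require_three_peaks, tag_processed])
failed = apply_filters({"peaks": [100.0]}, [require_three_peaks, tag_processed])
```

Because `tag_processed` returns a copy, `good` itself is left without the `processed` key, which is why the result has to be reassigned.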
@@ -0,0 +1,416 @@
# Matchms Importing and Exporting Reference

This document describes the file formats matchms supports for loading and saving mass spectrometry data.

## Importing Spectra

Matchms provides dedicated functions for loading spectra from various file formats. Most import functions return generators for memory-efficient processing of large files; `load_from_usi` instead returns a single spectrum.

### Common Import Pattern

```python
from matchms.importing import load_from_mgf

# Load spectra (returns generator)
spectra_generator = load_from_mgf("spectra.mgf")

# Convert to list for processing
spectra = list(spectra_generator)
```
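
Converting a generator to a list loads every spectrum at once. When that is too much, one option is to consume the generator in fixed-size batches. The helper `in_batches` below is a hypothetical, standard-library-only sketch and not a matchms function; it works on any iterable.

```python
from itertools import islice


def in_batches(iterable, batch_size):
    """Yield lists of up to batch_size items without reading everything first."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Demonstrated on a plain range; a spectrum generator would be used the same way
batches = list(in_batches(range(5), 2))
```

Under these assumptions, `for batch in in_batches(load_from_mgf("spectra.mgf"), 1000): ...` would keep at most 1000 spectra in memory at a time.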

## Supported Import Formats

### MGF (Mascot Generic Format)

**Function**: `load_from_mgf(filename, metadata_harmonization=True)`

**Description**: Loads spectra from MGF files, a common format for mass spectrometry data exchange.

**Parameters**:
- `filename` (str): Path to MGF file
- `metadata_harmonization` (bool, default=True): Apply automatic metadata key harmonization

**Example**:
```python
from matchms.importing import load_from_mgf

# Load with metadata harmonization
spectra = list(load_from_mgf("data.mgf"))

# Load without harmonization
spectra = list(load_from_mgf("data.mgf", metadata_harmonization=False))
```

**MGF Format**: Text-based format with BEGIN IONS/END IONS blocks containing metadata and peak lists.
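
To make the BEGIN IONS/END IONS block structure concrete, here is a deliberately minimal reader written with the standard library only. It is a sketch for understanding the layout, not a replacement for `load_from_mgf`: the function `parse_mgf` and the sample text are invented for illustration, and real files contain cases this ignores (comments, charge lists, extra peak columns).

```python
SAMPLE_MGF = """\
BEGIN IONS
TITLE=example
PEPMASS=300.0
CHARGE=1+
100.0 20.0
250.0 100.0
END IONS
"""


def parse_mgf(text):
    """Yield (metadata, peaks) for each BEGIN IONS/END IONS block in text."""
    metadata, peaks, inside = {}, [], False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "BEGIN IONS":
            metadata, peaks, inside = {}, [], True
        elif line == "END IONS":
            if inside:
                yield metadata, peaks
            inside = False
        elif inside and line:
            if "=" in line and not line[0].isdigit():
                # KEY=value metadata line
                key, value = line.split("=", 1)
                metadata[key.strip().lower()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace
                mz, intensity = line.split()[:2]
                peaks.append((float(mz), float(intensity)))


blocks = list(parse_mgf(SAMPLE_MGF))
```

Each block yields one metadata dictionary and one peak list, which is roughly the information a Spectrum object carries.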

---

### MSP (NIST Mass Spectral Library Format)

**Function**: `load_from_msp(filename, metadata_harmonization=True)`

**Description**: Loads spectra from MSP files, commonly used for spectral libraries.

**Parameters**:
- `filename` (str): Path to MSP file
- `metadata_harmonization` (bool, default=True): Apply automatic metadata harmonization

**Example**:
```python
from matchms.importing import load_from_msp

spectra = list(load_from_msp("library.msp"))
```

**MSP Format**: Text-based format with Name/MW/Comment fields followed by peak lists.

---

### mzML (Mass Spectrometry Markup Language)

**Function**: `load_from_mzml(filename, ms_level=2, metadata_harmonization=True)`

**Description**: Loads spectra from mzML files, the standard XML-based format for raw mass spectrometry data.

**Parameters**:
- `filename` (str): Path to mzML file
- `ms_level` (int, default=2): MS level to extract (1 for MS1, 2 for MS2/tandem)
- `metadata_harmonization` (bool, default=True): Apply automatic metadata harmonization

**Example**:
```python
from matchms.importing import load_from_mzml

# Load MS2 spectra (default)
ms2_spectra = list(load_from_mzml("data.mzML"))

# Load MS1 spectra
ms1_spectra = list(load_from_mzml("data.mzML", ms_level=1))
```

**mzML Format**: XML-based standard format containing raw instrument data and rich metadata.

---

### mzXML

**Function**: `load_from_mzxml(filename, ms_level=2, metadata_harmonization=True)`

**Description**: Loads spectra from mzXML files, an earlier XML-based format for mass spectrometry data.

**Parameters**:
- `filename` (str): Path to mzXML file
- `ms_level` (int, default=2): MS level to extract
- `metadata_harmonization` (bool, default=True): Apply automatic metadata harmonization

**Example**:
```python
from matchms.importing import load_from_mzxml

spectra = list(load_from_mzxml("data.mzXML"))
```

**mzXML Format**: XML-based format, predecessor to mzML.

---

### JSON (GNPS Format)

**Function**: `load_from_json(filename, metadata_harmonization=True)`

**Description**: Loads spectra from JSON files, particularly GNPS-compatible JSON format.

**Parameters**:
- `filename` (str): Path to JSON file
- `metadata_harmonization` (bool, default=True): Apply automatic metadata harmonization

**Example**:
```python
from matchms.importing import load_from_json

spectra = list(load_from_json("spectra.json"))
```

**JSON Format**: Structured JSON with spectrum metadata and peak arrays.

---

### Pickle (Python Serialization)

**Function**: `load_from_pickle(filename)`

**Description**: Loads previously saved matchms Spectrum objects from pickle files. Fast loading of preprocessed spectra.

**Parameters**:
- `filename` (str): Path to pickle file

**Example**:
```python
from matchms.importing import load_from_pickle

spectra = list(load_from_pickle("processed_spectra.pkl"))
```

**Use case**: Saving and loading preprocessed spectra for faster subsequent analyses.

---

### USI (Universal Spectrum Identifier)

**Function**: `load_from_usi(usi)`

**Description**: Loads a single spectrum from a metabolomics USI reference.

**Parameters**:
- `usi` (str): Universal Spectrum Identifier string

**Example**:
```python
from matchms.importing import load_from_usi

usi = "mzspec:GNPS:TASK-...:spectrum..."
spectrum = load_from_usi(usi)
```

**USI Format**: Standardized identifier for accessing spectra from online repositories.

---

## Exporting Spectra

Matchms provides functions to save processed spectra to various formats for sharing and archival.

### MGF Export

**Function**: `save_as_mgf(spectra, filename, write_mode='w')`

**Description**: Saves spectra to MGF format.

**Parameters**:
- `spectra` (list): List of Spectrum objects to save
- `filename` (str): Output file path
- `write_mode` (str, default='w'): File write mode ('w' for write, 'a' for append)

**Example**:
```python
from matchms.exporting import save_as_mgf

save_as_mgf(processed_spectra, "output.mgf")
```

---

### MSP Export

**Function**: `save_as_msp(spectra, filename, write_mode='w')`

**Description**: Saves spectra to MSP format.

**Parameters**:
- `spectra` (list): List of Spectrum objects to save
- `filename` (str): Output file path
- `write_mode` (str, default='w'): File write mode

**Example**:
```python
from matchms.exporting import save_as_msp

save_as_msp(library_spectra, "library.msp")
```

---

### JSON Export

**Function**: `save_as_json(spectra, filename, write_mode='w')`

**Description**: Saves spectra to JSON format (GNPS-compatible).

**Parameters**:
- `spectra` (list): List of Spectrum objects to save
- `filename` (str): Output file path
- `write_mode` (str, default='w'): File write mode

**Example**:
```python
from matchms.exporting import save_as_json

save_as_json(spectra, "spectra.json")
```

---

### Pickle Export

**Function**: `save_as_pickle(spectra, filename)`

**Description**: Saves spectra as Python pickle file. Preserves all Spectrum attributes and is fastest for loading.

**Parameters**:
- `spectra` (list): List of Spectrum objects to save
- `filename` (str): Output file path

**Example**:
```python
from matchms.exporting import save_as_pickle

save_as_pickle(processed_spectra, "processed.pkl")
```

**Advantages**:
- Fast save and load
- Preserves exact Spectrum state
- No format conversion overhead

**Disadvantages**:
- Not human-readable
- Python-specific (not portable to other languages)
- Pickle format may not be compatible across Python versions
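
On the compatibility point, one partial mitigation when pickling objects yourself is to pin the pickle protocol explicitly. The sketch below uses only Python's standard `pickle` module on plain data; `save_pickle` and `load_pickle` are hypothetical helpers, not matchms functions, and pinning the protocol does not help if the pickled classes themselves change between library versions.

```python
import pickle


def save_pickle(obj, path, protocol=4):
    """Write obj to path with a fixed pickle protocol.

    Protocol 4 has been readable since Python 3.4, so pinning it avoids
    depending on whatever default the writing interpreter happens to use.
    """
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=protocol)


def load_pickle(path):
    """Read back an object written by save_pickle."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

A round trip such as `load_pickle(path)` after `save_pickle(data, path)` should return an object equal to `data`.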

---

## Complete Import/Export Workflow

### Preprocessing and Saving Pipeline

```python
from matchms.importing import load_from_mgf
from matchms.exporting import save_as_mgf, save_as_pickle
from matchms.filtering import default_filters, normalize_intensities
from matchms.filtering import select_by_relative_intensity

# Load raw spectra
spectra = list(load_from_mgf("raw_data.mgf"))

# Process spectra
processed = []
for spectrum in spectra:
    spectrum = default_filters(spectrum)
    spectrum = normalize_intensities(spectrum)
    spectrum = select_by_relative_intensity(spectrum, intensity_from=0.01)
    if spectrum is not None:
        processed.append(spectrum)

# Save processed spectra (MGF for sharing)
save_as_mgf(processed, "processed_data.mgf")

# Save as pickle for fast reloading
save_as_pickle(processed, "processed_data.pkl")
```

### Format Conversion

```python
from matchms.importing import load_from_mzml
from matchms.exporting import save_as_mgf, save_as_msp

# Convert mzML to MGF
spectra = list(load_from_mzml("data.mzML", ms_level=2))
save_as_mgf(spectra, "data.mgf")

# Convert to MSP library format
save_as_msp(spectra, "data.msp")
```

### Loading from Multiple Files

```python
from matchms.importing import load_from_mgf
import glob

# Load all MGF files in directory
all_spectra = []
for mgf_file in glob.glob("data/*.mgf"):
    spectra = list(load_from_mgf(mgf_file))
    all_spectra.extend(spectra)

print(f"Loaded {len(all_spectra)} spectra from multiple files")
```
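
If the combined files are too large to hold in one list, the per-file generators can be chained lazily instead. This is a standard-library sketch: `load_many` is a hypothetical helper, and `fake_loader` stands in for a real loader such as `load_from_mgf` so that the example runs without any data files.

```python
from itertools import chain


def load_many(paths, loader):
    """Lazily yield every item produced by loader(path) for each path in turn."""
    return chain.from_iterable(loader(path) for path in paths)


def fake_loader(path):
    """Stand-in for a real spectrum loader; yields two labelled items per file."""
    yield f"{path}:0"
    yield f"{path}:1"


items = list(load_many(["a.mgf", "b.mgf"], fake_loader))
```

With real data, something like `load_many(sorted(glob.glob("data/*.mgf")), load_from_mgf)` would open each file only when iteration reaches it.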

### Memory-Efficient Processing

```python
from matchms.importing import load_from_mgf
from matchms.exporting import save_as_mgf
from matchms.filtering import default_filters, normalize_intensities

# Process large file without loading all into memory
def process_spectrum(spectrum):
    spectrum = default_filters(spectrum)
    spectrum = normalize_intensities(spectrum)
    return spectrum

# Stream processing: save_as_mgf takes a file path, so pass the path
# (not an open file handle) and append one spectrum at a time
for spectrum in load_from_mgf("large_file.mgf"):
    processed = process_spectrum(spectrum)
    if processed is not None:
        # Write immediately without storing in memory
        save_as_mgf([processed], "output.mgf", write_mode='a')
```

## Format Selection Guidelines

**MGF**:
- ✓ Widely supported
- ✓ Human-readable
- ✓ Good for data sharing
- ✓ Moderate file size
- Best for: Data exchange, GNPS uploads, publication data

**MSP**:
- ✓ Spectral library standard
- ✓ Human-readable
- ✓ Good metadata support
- Best for: Reference libraries, NIST format compatibility

**JSON**:
- ✓ Structured format
- ✓ GNPS compatible
- ✓ Easy to parse programmatically
- Best for: Web applications, GNPS integration, structured data

**Pickle**:
- ✓ Fastest save/load
- ✓ Preserves exact state
- ✗ Not portable to other languages
- ✗ Not human-readable
- Best for: Intermediate processing, Python-only workflows

**mzML/mzXML**:
- ✓ Raw instrument data
- ✓ Rich metadata
- ✓ Industry standard
- ✗ Large file size
- ✗ Slower to parse
- Best for: Raw data archival, multi-level MS data

## Metadata Harmonization

The `metadata_harmonization` parameter (available in most import functions) automatically standardizes metadata keys:

```python
from matchms.importing import load_from_mgf

# Without harmonization (load_from_mgf returns a generator, so collect it)
spectra = list(load_from_mgf("data.mgf", metadata_harmonization=False))
# May have: "PRECURSOR_MZ", "Precursor_mz", "precursormz"

# With harmonization (default)
spectra = list(load_from_mgf("data.mgf", metadata_harmonization=True))
# Standardized to: "precursor_mz"
```

**Recommended**: Keep harmonization enabled (default) for consistent metadata access across different data sources.
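
The effect of key harmonization can be sketched with a small dictionary transform. Everything here is illustrative: `harmonize_keys` and the `KEY_ALIASES` table are assumptions built from the three spellings mentioned above, while matchms maintains its own, more extensive key mapping.

```python
# Hypothetical alias table for this sketch; not the mapping used by matchms.
KEY_ALIASES = {"precursormz": "precursor_mz"}


def harmonize_keys(metadata, aliases=KEY_ALIASES):
    """Return a copy of metadata with lowercased, alias-resolved keys."""
    harmonized = {}
    for key, value in metadata.items():
        normalized = key.strip().lower().replace(" ", "_")
        harmonized[aliases.get(normalized, normalized)] = value
    return harmonized


variants = ["PRECURSOR_MZ", "Precursor_mz", "precursormz"]
results = [harmonize_keys({key: 300.0}) for key in variants]
```

All three spellings end up under the single key `precursor_mz`, so later code can look up one name regardless of the data source.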

## File Format Specifications

For detailed format specifications:
- **MGF**: http://www.matrixscience.com/help/data_file_help.html
- **MSP**: https://chemdata.nist.gov/mass-spc/ms-search/
- **mzML**: http://www.psidev.info/mzML
- **GNPS JSON**: https://gnps.ucsd.edu/

## Further Reading

For complete API documentation:
- https://matchms.readthedocs.io/en/latest/api/matchms.importing.html
- https://matchms.readthedocs.io/en/latest/api/matchms.exporting.html