PyPI - validmind - Versions diffs - 2.1.1__py3-none-any.whl → 2.2.2__py3-none-any.whl - Mend

validmind 2.1.1 (py3-none-any.whl) → 2.2.2 (py3-none-any.whl)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (110)

validmind/datasets/llm/rag/datasets/rfp_existing_questions_client_5.csv ADDED

@@ -0,0 +1,53 @@
+Project_Title,RFP_Question_ID,question,ground_truth,Area,Last_Accessed_At,Requester,Status
+"Generative AI Solutions for Fraud Detection and Financial Security
+",1,Please share your experience with developing AI-enabled applications and provide examples of notable projects.,"Our company has 15 years of experience in developing AI-based applications, with a strong portfolio in sectors such as healthcare, finance, and education. For instance, our project MediAI Insight for the healthcare industry demonstrated significant achievements in patient data analysis, resulting in a 30% reduction in diagnostic errors and a 40% improvement in treatment personalization. Our platform has engaged over 200 healthcare facilities, achieving a user satisfaction rate of 95%.",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",2,How do you maintain your AI applications with the newest AI technologies and advancements?,"We maintain a dedicated R&D team focused on integrating the latest AI advancements into our applications. This includes regular updates and feature enhancements based on cutting-edge technologies such as GPT (Generative Pre-trained Transformer) for natural language understanding, CNNs (Convolutional Neural Networks) for advanced image recognition tasks, and DQN (Deep Q-Networks) for decision-making processes in complex environments. Our commitment to these AI methodologies ensures that our applications remain innovative, with capabilities that adapt to evolving market demands and client needs. This approach has enabled us to enhance the predictive accuracy of our financial forecasting tools by 25% and improve the efficiency of our educational content personalization by 40%",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",3,Can your AI applications be tailored to meet unique organizational or user-specific requirements?,"Absolutely, customization is a core aspect of our offering. We work closely with clients to understand their specific needs and tailor our AI algorithms and app functionalities accordingly, using technologies such as TensorFlow for machine learning models, React for responsive UI/UX designs, and Kubernetes for scalable cloud deployment. This personalized approach allows us to optimize AI functionalities to match unique business processes, enhancing user experience and operational efficiency for each client. For example, for a retail client, we customized our recommendation engine to increase customer retention by 20% through more accurate and personalized product suggestions.",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",4,What actions do you undertake to secure user data and maintain privacy within your AI platforms?,"User privacy and data security are paramount. We implement robust measures such as end-to-end encryption to secure data transmissions, anonymization techniques to protect user identities, and comprehensive compliance with data protection laws like GDPR and CCPA. We also employ regular security audits and vulnerability assessments to ensure our systems are impenetrable. Additionally, our deployment of advanced intrusion detection systems and the use of secure coding practices reinforce our commitment to safeguarding user data at all times",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",5,What considerations do you take into account for user interface and user experience design in your AI applications?,"Our design philosophy centers on simplicity and intuitiveness. We conduct extensive user research and testing to inform our UI/UX designs, ensuring that our AI-based apps are accessible and engaging for all users, regardless of their technical expertise. This includes applying principles from human-centered design, utilizing accessibility guidelines such as WCAG 2.1, and conducting iterative testing with diverse user groups. Our commitment to inclusivity and usability leads to higher user adoption rates and satisfaction. For instance, feedback-driven enhancements in our visual design have improved user engagement by over 30% across our applications.",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",6,What support and maintenance do you provide following the deployment of AI applications?,"Post-launch, we offer comprehensive support and maintenance services, including regular updates, bug fixes, and performance optimization. Our support team is available 24/7 to assist with any issues or questions. We utilize a ticketing system that ensures swift response times, with an average initial response time of under 2 hours. Additionally, we provide monthly performance reports and hold quarterly reviews with clients to discuss system status and potential improvements. Our proactive approach includes using automated monitoring tools to detect and resolve issues before they impact users, ensuring that our applications perform optimally at all times. This service structure has been instrumental in maintaining a client satisfaction rate above 98%.",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",7,How do you evaluate the success of your AI applications in fulfilling client goals?,"Success measurement is tailored to each project's objectives. We establish key performance indicators (KPIs) in collaboration with our clients, such as user engagement rates, efficiency improvements, or return on investment (ROI). We then regularly review these metrics using advanced analytics platforms and business intelligence tools to assess the app’s impact. Our approach includes monthly performance analysis meetings where we provide detailed reports and insights on metrics like session duration, user retention rates, and cost savings achieved through automation. We also implement A/B testing to continuously refine and optimize the application based on real-world usage data, ensuring that we make data-driven improvements that align closely with our clients' strategic goals.",General,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",8,How are ethical issues like bias detection and data privacy managed in your LLM applications?,"We adhere to ethical AI practices by implementing bias detection and mitigation techniques during the training of our Large Language Models (LLMs). This involves using diverse datasets to prevent skewed results and deploying algorithms specifically designed to identify and correct bias in AI outputs. For data privacy, we employ data anonymization and secure data handling protocols, ensuring compliance with GDPR, CCPA, and other relevant regulations. Our systems use state-of-the-art encryption methods for data at rest and in transit, and our data governance policies are rigorously audited by third-party security firms to maintain high standards of data integrity and confidentiality. This commitment extends to regular training for our staff on the latest privacy laws and ethical AI use to ensure that our practices are up-to-date and effective.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",9,"Please outline the training process for your LLMs, including choices regarding data, models, and validation approaches.","Our LLM training process begins with the meticulous sourcing of diverse and comprehensive datasets from global sources, ensuring a rich variety that includes various languages, dialects, and cultural contexts. This diversity is critical for building models that perform equitably across different demographics. We leverage cutting-edge tools like Apache Kafka for real-time data streaming and Apache Hadoop for handling large datasets efficiently during preprocessing stages. For model architecture selection, we utilize TensorFlow and PyTorch frameworks to design and iterate on neural network structures that best suit each application's unique requirements, whether it's for predictive analytics in finance or customer service chatbots. Depending on the use case, we might choose from a variety of architectures such as Transformer models for their robust handling of sequential data or GANs (Generative Adversarial Networks) for generating new, synthetic data samples for training.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",10,How do your LLMs continuously learn and update to reflect new data and changing user expectations?,"We implement advanced continuous learning mechanisms that allow our Large Language Models (LLMs) to adapt over time by incorporating new data and feedback loops, ensuring our models remain current and effective. We utilize incremental learning techniques where the model is periodically updated with fresh data without the need for retraining from scratch. This is facilitated by employing online learning algorithms such as Online Gradient Descent, which can quickly adjust model weights in response to new information.
+ To efficiently manage this continuous learning process, we use tools like Apache Spark for handling large-scale data processing in a distributed computing environment. This allows for seamless integration of new data streams into our training datasets. We also implement active learning cycles where the models request human feedback on specific outputs that are uncertain, thus refining model predictions over time based on actual user interactions and feedback.
+ Additionally, we incorporate reinforcement learning techniques where models are rewarded for improvements in performance metrics like accuracy and user engagement. This helps in fine-tuning the models' responses based on what is most effective in real-world scenarios.
+ For monitoring and managing these updates, we use TensorFlow Extended (TFX) for a robust end-to-end platform that ensures our models are consistently validated against performance benchmarks before being deployed. This continuous adaptation framework guarantees that our LLMs are not only keeping pace with evolving user needs and preferences but are also progressively enhancing their relevance and effectiveness.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",11,What measures do you employ to ensure your LLMs' decisions are clear and explicable?,"We prioritize transparency and explainability in our AI models by incorporating advanced features such as model interpretability layers and providing comprehensive documentation on how model decisions are made. This approach ensures that users can understand and trust the outputs of our Large Language Models (LLMs). To achieve this, we integrate tools like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) into our models. These tools allow us to break down and communicate the reasoning behind each model decision, fostering trust and facilitating easier audits by stakeholders.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",12,How do you ensure your LLMs can efficiently handle increased loads and demand?,"We conduct extensive performance testing under various load conditions to assess scalability and ensure our LLMs can handle high-demand scenarios efficiently. This involves using tools like Apache JMeter and LoadRunner to simulate different levels of user interaction and data volume, allowing us to evaluate how our systems perform under stress. Additionally, we employ scalable cloud infrastructure, utilizing services like Amazon Web Services (AWS) Elastic Compute Cloud (EC2) and Google Cloud Platform (GCP) Compute Engine, which support dynamic scaling. Optimization techniques such as auto-scaling groups and load balancers are implemented to ensure that our resources adjust automatically based on real-time demands, providing both robustness and cost efficiency.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",13,"Could you provide case studies on successful LLM-based application deployments, focusing on challenges and solutions?","We can share case studies of successful LLM-based application deployments, highlighting specific challenges such as data scarcity or model interpretability, and detailing the strategies and solutions we implemented to overcome these challenges. For example, in a project involving natural language processing for a legal firm, we faced significant data scarcity. To address this, we employed techniques like synthetic data generation and transfer learning from related domains to enrich our training datasets. Additionally, the issue of model interpretability was critical for our client’s trust and regulatory compliance. We tackled this by integrating SHAP (SHapley Additive exPlanations) to provide clear, understandable insights into how our model's decisions were made, ensuring transparency and boosting user confidence in the AI system.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",14,Describe your strategy for integrating LLMs into existing organizational frameworks and systems.,"Our approach involves conducting a thorough analysis of the existing systems and workflows, designing integration plans that minimize disruption, and using APIs and custom connectors to ensure seamless integration of our LLM-based applications. We start by meticulously mapping the client's current infrastructure and operational flows to identify the most efficient points of integration. This is followed by the development of tailored integration plans that prioritize operational continuity and minimize downtime. To achieve seamless integration, we utilize robust APIs and develop custom connectors where necessary, ensuring compatibility with existing software platforms and databases. These tools allow for the smooth transfer of data and maintain the integrity and security of the system, ensuring that the new AI capabilities enhance functionality without compromising existing processes.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",15,"What is your approach to maintaining and supporting LLM-based applications post-deployment, especially in terms of updates and addressing model drift?","Our post-deployment support is designed to ensure sustained performance and relevance of our LLM-based applications. We actively monitor for model drift to detect and address any degradation in model accuracy over time due to changes in underlying data patterns. This includes implementing automated systems that alert our team to potential drifts, allowing for timely interventions. Regular model updates and improvements are also part of our support protocol, ensuring that our solutions adapt to new data and evolving industry standards. Additionally, our dedicated technical support team is available to swiftly address any operational issues or adapt to changes in client requirements. This comprehensive support structure guarantees that our applications continue to deliver optimal performance and align with our clients' strategic objectives.",Large Language Models,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",16,How does your AI strategy align with the NIST AI RMF for trustworthy and responsible use of AI?,"Our AI solution is meticulously designed to align with the NIST AI Risk Management Framework (RMF) guidelines, ensuring adherence to principles of trustworthiness and responsibility. We have implemented comprehensive governance structures that oversee the ethical development and deployment of our AI technologies. This includes risk identification and assessment processes where potential risks are analyzed and categorized at each stage of the AI lifecycle. To manage these risks, we have instituted robust risk management controls that are deeply integrated into our development and operational processes. These controls are based on the NIST framework’s best practices, ensuring that our AI solutions are not only effective but also secure and ethical, maintaining transparency and accountability at all times.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",17,Can you discuss your governance framework for managing AI risks in accordance with the NIST AI RMF?,"We have established an AI Risk Council that plays a pivotal role in overseeing AI risk management across our organization. This council is tasked with defining clear roles and responsibilities for AI governance, ensuring that there is a structured approach to managing AI risks. It also integrates AI risk management into our existing governance frameworks to enhance coherence and alignment with broader corporate policies and objectives. Additionally, the AI Risk Council promotes robust collaboration between various business units and our IT department. This collaboration is crucial for sharing insights, aligning strategies, and implementing comprehensive risk management practices effectively across the entire organization. This framework not only supports proactive risk management but also fosters an environment where AI technologies are used responsibly and ethically.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",18,How do you perform risk assessment and identification according to the NIST AI RMF's 'Map' function?,"We conduct thorough assessments of AI systems and the people using AI within our organization. This process involves meticulously identifying potential risks such as data privacy, security, bias, and legal compliance. We assess both the impact and the likelihood of each identified risk to effectively prioritize them. Our approach includes the use of sophisticated tools and methodologies, such as risk matrices and scenario analysis, to quantify and categorize risks accurately. This comprehensive assessment enables us to develop targeted risk mitigation strategies and allocate resources more efficiently, ensuring that the most critical risks are addressed promptly and effectively. This proactive risk management practice helps us maintain the integrity of our AI systems and uphold our ethical and legal responsibilities.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",19,What steps do you take to ensure the transparency and explainability of your AI systems' decisions?,"We prioritize transparency by incorporating explainability features into our AI models, providing detailed documentation on the decision-making processes, and ensuring that stakeholders can understand and trust the outputs of our AI systems. To achieve this, we integrate explainability tools like feature importance scores and decision trees that clearly outline how and why decisions are made by our AI. We supplement these technical tools with comprehensive documentation that describes the algorithms' functions in accessible language. This approach is designed to demystify the AI's operations for non-technical stakeholders, fostering a higher level of trust and acceptance. By ensuring that our AI systems are transparent and their workings understandable, we maintain open communication and build confidence among users and regulators alike.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",20,How do you monitor and assess AI risk exposure using metrics suggested by the NIST AI RMF's 'Measure' function?,"We have developed a set of Key Performance Indicators (KPIs) and metrics specifically designed to assess and analyze AI risk exposure across our systems. These metrics are tracked continuously to provide a clear, quantifiable measure of risk at any given time. To streamline this process, we utilize AI risk assessment tools that automate both data collection and analysis, enhancing the accuracy and efficiency of our monitoring efforts.
+ These tools employ advanced analytics to detect subtle shifts in risk patterns, enabling proactive risk management. Regular updates to our risk assessment protocols ensure that they remain aligned with current threat landscapes and regulatory requirements. This systematic monitoring and analysis not only help us maintain control over AI risks but also ensure that we can respond swiftly and effectively to any changes in risk levels, keeping our AI systems secure and compliant.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",21,How do you handle the management and mitigation of AI risks in line with the NIST AI RMF's 'Manage' function?,"We implement and maintain robust risk management controls to mitigate identified risks effectively. This comprehensive approach includes regular updates to our AI models to address evolving challenges and improve performance. We also implement stringent security measures, such as encryption, access controls, and continuous monitoring systems, to safeguard our data and systems from unauthorized access and potential breaches.
+ Furthermore, ensuring compliance with data protection laws is a critical part of our risk management strategy. We stay abreast of legal requirements in all operational jurisdictions, such as GDPR in Europe and CCPA in California, and integrate compliance measures into our AI deployments. Regular audits, both internal and by third-party assessors, help ensure that our practices are up-to-date and that we maintain the highest standards of data privacy and security. This holistic approach to risk management enables us to maintain trust and reliability in our AI applications.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",22,How do you ensure your AI solutions adhere to U.S. regulations on data privacy and security?,"We ensure compliance with U.S. regulations such as the Federal Information Security Modernization Act (FISMA) and other applicable laws and directives by adopting a risk-based approach to control selection and specification. This approach meticulously considers the constraints and requirements imposed by these regulations. We conduct regular audits and assessments to verify that our security controls meet or exceed the stipulated standards, ensuring that all our data handling and processing activities are fully compliant.
+ Our compliance framework is designed to adapt to the specific needs of the environments in which our systems operate, integrating best practices and guidance from regulatory bodies. We also engage with legal and compliance experts to stay updated on any changes in legislation, ensuring our practices remain in line with the latest requirements. This proactive and informed approach allows us to manage risk effectively while maintaining the highest levels of data protection and security as mandated by U.S. law.",AI Regulation,18/10/2022,Bank E,Under Review
+"Generative AI Solutions for Fraud Detection and Financial Security
+",23,How do you contribute to the ongoing development of AI risk management practices following the NIST AI RMF?,"We actively participate in industry working groups and public-private partnerships to contribute to the continual improvement of AI risk management practices. Our engagement in these collaborative efforts not only allows us to share our insights and strategies but also enables us to learn from the collective experiences of the industry, helping to elevate the standards of AI safety and reliability across the board. Additionally, we stay abreast of updates to the NIST AI Risk Management Framework (RMF) and adjust our practices accordingly. This commitment to staying current ensures that our risk management approaches align with the latest guidelines and best practices, reinforcing our dedication to leading-edge, responsible AI development and deployment.",AI Regulation,18/10/2022,Bank E,Under Review

validmind/datasets/llm/rag/rfp.py ADDED

@@ -0,0 +1,41 @@
+# Copyright © 2023-2024 ValidMind Inc. All rights reserved.
+# See the LICENSE file in the root of this repository for details.
+# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
+import os
+from csv import DictReader
+from uuid import uuid4
+import pandas as pd
+from sklearn.model_selection import train_test_split
+current_path = os.path.dirname(os.path.abspath(__file__))
+dataset_path = os.path.join(current_path, "datasets")
+drop_columns = ["RFP_Question_ID", "Last_Accessed_At"]
+target_column = "RFP_Answer"
+categorical_columns = []
+def load_data(full_dataset=False):
+    documents = []
+    for file in os.listdir(dataset_path):
+        if file.endswith(".csv"):
+            # use csv dict reader to load the csv file
+            with open(os.path.join(dataset_path, file)) as f:
+                reader = DictReader(f)
+                for row in reader:
+                    # add a unique id to the row
+                    row["id"] = str(uuid4())
+                    documents.append(row)
+    df = pd.DataFrame(documents)
+    df.drop(drop_columns, axis=1, inplace=True)
+    return df
+def preprocess(df):
+    df = df.copy()
+    train_df, test_df = train_test_split(df, test_size=0.20)
+    return train_df, test_df
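The loader above relies on `csv.DictReader` to cope with quoted fields that contain commas and embedded newlines, as the `Project_Title` column in the CSV does. The following is a minimal sketch of that behaviour on an in-memory sample. The sample rows, the `load_rows` helper and its `drop_columns` default are hypothetical and are not part of the package.

```python
import csv
import io
from uuid import uuid4

# Hypothetical two-row sample that mimics the shape of the RFP CSV:
# the title ends with a newline inside the quotes, and answers contain commas.
SAMPLE = (
    "Project_Title,RFP_Question_ID,question,ground_truth\n"
    '"Title A\n",1,First question?,"Answer one, with a comma"\n'
    '"Title A\n",2,Second question?,"Answer two\n spanning lines"\n'
)


def load_rows(text, drop_columns=("RFP_Question_ID",)):
    """Parse CSV text into dicts, tag each row with an id, drop unwanted columns."""
    # newline="" keeps embedded newlines untouched, as the csv docs recommend
    reader = csv.DictReader(io.StringIO(text, newline=""))
    rows = []
    for row in reader:
        row["id"] = str(uuid4())  # unique id per row, as in the diff
        for column in drop_columns:
            row.pop(column, None)  # tolerate files that lack the column
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
```

Each quoted multi-line field stays inside a single record, so the sample yields two rows rather than four. When reading from disk, the `csv` documentation recommends opening the file with `newline=""`, which the `open()` call in the diff does not pass.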

validmind/html_templates/__init__.py ADDED

File without changes

validmind/html_templates/content_blocks.py CHANGED

@@ -16,16 +16,50 @@ test_content_block_html = """
   <h2>{title}</h2>
   {description}
 </div>
-<h5 class="vm_required_context">
-  Required Inputs: <span style="font-size: 13px"><i>{required_inputs}</i></span>
-</h5>
-<table class="vm_params_table" style="display: {table_display};">
-    <tr>
-        <th>Parameter</th>
-        <th>Default Value</th>
-    </tr>
-    {params_table}
-</table>
+<div class="unset">
+  <h3>How to Run:</h3>
+  <button
+      onclick="(() => {{e = document.getElementById('expandable_instructions_{uuid}'); e.style.display === 'none' ? e.style.display = 'block' : e.style.display = 'none'}})()"
+  >Show/Hide Instructions</button>
+  <div id="expandable_instructions_{uuid}" style="display: {instructions_display};">
+  <h4>Code:</h4>
+    <pre>
+        <code class='language-python'>import validmind as vm
+# inputs dictionary maps your inputs to the expected input names
+# keys are the expected input names and values are the actual inputs
+# values may be string input_ids or the actual VMDataset or VMModel objects
+inputs = {example_inputs}
+params = {example_params}
+# to run and view the result of this test, run the following code:
+result = vm.tests.run_test(
+  "{test_id}", inputs=inputs, params=params
+)
+# To see the result of the test, ensure that you have called `vm.init()` and then run:
+result.log()</code>
+    </pre>
+    <h4 class="vm_required_context">
+      Required Inputs: <span style="font-size: 13px"><i>{required_inputs}</i></span>
+    </h4>
+    <div style="display: {table_display};">
+      <h4>Parameters:</h4>
+      <table class="vm_params_table" style="display: {table_display};">
+          <tr>
+              <th>Parameter</th>
+              <th>Default Value</th>
+          </tr>
+          {params_table}
+      </table>
+    </div>
+  </div>
+</div>
 <style>
 h5.vm_required_context {{
@@ -33,8 +67,9 @@ h5.vm_required_context {{
 }}
 table.vm_params_table {{
   margin-top: 20px;
-  width: 300px;
+  width: 350px;
   border-collapse: collapse;
+  border-color: --jp-border-color0;
 }}
 table.vm_params_table td, table.vm_params_table th {{
   text-align: right;
@@ -42,19 +77,59 @@ table.vm_params_table td, table.vm_params_table th {{
 table.vm_params_table td:first-child, table.vm_params_table th:first-child {{
   text-align: left;
 }}
+table.vm_params_table th {{
+  background-color: --jp-content-color0;
+  font-weight: bold;
+  font-size: 14px !important;
+}}
+table.vm_params_table tr:nth-child(even) {{
+  background-color: --jp-layout-color1;
+}}
 table.vm_params_table tr:nth-child(odd) {{
-  background-color: #f2f2f2;
+  background-color: --jp-layout-color2;
 }}
 table.vm_params_table tr:hover {{
-  background-color: #ddd;
+  background-color: --jp-layout-color3;
 }}
 table.vm_params_table td, table.vm_params_table th {{
   padding: 5px;
-  border: .8px solid #ddd;
+  border: .8px solid --jp-border-color0;
 }}
 </style>
 """
+python_syntax_highlighting = """
+<script defer type="module">
+import hljs from 'https://unpkg.com/@highlightjs/cdn-assets@11.9.0/es/highlight.min.js';
+import python from 'https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/es/languages/python.min.js';
+hljs.registerLanguage('python', python);
+hljs.highlightAll();
+</script>
+"""
+# FIXME: this is a bit too hacky
+math_jax_snippet = """
+<script>
+window.MathJax = {
+    tex2jax: {
+        inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
+        displayMath: [['$$', '$$'], ['\\[', '\\]']],
+        processEscapes: true,
+        skipTags: ['script', 'noscript', 'style', 'textarea', 'pre'],
+        ignoreClass: ".*",
+        processClass: "math"
+    }
+};
+setTimeout(function () {
+    var script = document.createElement('script');
+    script.type = 'text/javascript';
+    script.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/MathJax.js?config=TeX-AMS_HTML';
+    document.head.appendChild(script);
+}, 300);
+</script>
+"""
 failed_content_block_html = """
 <div
   class="lm-Widget p-Widget jupyter-widget-Collapse-header"
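The doubled braces in `test_content_block_html` suggest the template is filled in with `str.format`: `{{` and `}}` become literal braces in the CSS and JavaScript, while `{uuid}`, `{test_id}` and similar names are substituted. A minimal sketch of that mechanism follows, assuming `str.format` is indeed how the template is rendered. The shortened template, the `render` helper and the test id are hypothetical.

```python
# Shortened, hypothetical template in the same style as the diff:
# single braces are placeholders, doubled braces are literal CSS braces.
TEMPLATE = """
<div id="expandable_instructions_{uuid}" style="display: {instructions_display};">
  <code>result = vm.tests.run_test("{test_id}", inputs={example_inputs})</code>
</div>
<style>
table.vm_params_table {{
  width: 350px;
}}
</style>
"""


def render(test_id, example_inputs, uuid, show_instructions=False):
    """Fill the template; braces inside substituted values are not re-parsed."""
    return TEMPLATE.format(
        uuid=uuid,
        instructions_display="block" if show_instructions else "none",
        test_id=test_id,
        example_inputs=example_inputs,
    )


html = render(
    test_id="my_namespace.MyTest",  # hypothetical test id
    example_inputs={"dataset": "my_dataset"},
    uuid="abc123",
)
```

By contrast, `math_jax_snippet` uses single braces, so it would have to be emitted as-is rather than passed through `str.format`. Separately, CSS custom properties are normally dereferenced with `var(--name)`; a bare value such as `--jp-layout-color2` is not a valid colour, so browsers would ignore those declarations as written.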

validmind/models/__init__.py CHANGED

@@ -2,19 +2,22 @@
 # See the LICENSE file in the root of this repository for details.
 # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
-from .catboost import CatBoostModel
 from .foundation import FoundationModel, Prompt
+from .function import FunctionModel
 from .huggingface import HFModel
+from .metadata import MetadataModel
+from .pipeline import PipelineModel
 from .pytorch import PyTorchModel
-from .sklearn import SKlearnModel
-from .statsmodels import StatsModelsModel
-from .xgboost import XGBoostModel
+from .sklearn import CatBoostModel, SKlearnModel, StatsModelsModel, XGBoostModel
 __all__ = [
     "CatBoostModel",
     "FoundationModel",
+    "FunctionModel",
     "HFModel",
+    "MetadataModel",
     "Prompt",
+    "PipelineModel",
     "PyTorchModel",
     "SKlearnModel",
     "StatsModelsModel",

validmind/models/foundation.py CHANGED

@@ -7,7 +7,7 @@ from dataclasses import dataclass
 import pandas as pd
 from validmind.logging import get_logger
-from validmind.vm_models.model import ModelAttributes, VMModel
+from validmind.models.function import FunctionModel
 logger = get_logger(__name__)
@@ -18,7 +18,7 @@ class Prompt:
     variables: list
-class FoundationModel(VMModel):
+class FoundationModel(FunctionModel):
     """FoundationModel class wraps a Foundation LLM endpoint
     This class wraps a predict function that is user-defined and adapts it to works
@@ -29,22 +29,14 @@ class FoundationModel(VMModel):
           and return the result from the model
         prompt (Prompt): The prompt object that defines the prompt template and the
           variables (if any)
-        attributes (ModelAttributes, optional): The attributes of the model. Defaults to None.
+        name (str, optional): The name of the model. Defaults to name of the predict_fn
     """
-    def __init__(
-        self,
-        predict_fn: callable,
-        prompt: Prompt,  # prompt used for model (for now just a string)
-        attributes: ModelAttributes = None,
-        input_id: str = None,
-    ):
-        super().__init__(
-            attributes=attributes,
-            input_id=input_id,
-        )
-        self.predict_fn = predict_fn
-        self.prompt = prompt
+    def __post_init__(self):
+        if not getattr(self, "predict_fn") or not callable(self.predict_fn):
+            raise ValueError("FoundationModel requires a callable predict_fn")
+        self.name = self.name or self.predict_fn.__name__
     def _build_prompt(self, x: pd.DataFrame):
         """
@@ -59,21 +51,3 @@ class FoundationModel(VMModel):
         Predict method for the model. This is a wrapper around the model's
         """
         return [self.predict_fn(self._build_prompt(x[1])) for x in X.iterrows()]
-    def model_library(self):
-        """
-        Returns the model library name
-        """
-        return "FoundationModel"
-    def model_class(self):
-        """
-        Returns the model class name
-        """
-        return "FoundationModel"
-    def model_name(self):
-        """
-        Returns model name
-        """
-        return "FoundationModel"

validmind/models/function.py ADDED Viewed

@@ -0,0 +1,51 @@
+# Copyright © 2023-2024 ValidMind Inc. All rights reserved.
+# See the LICENSE file in the root of this repository for details.
+# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
+from validmind.vm_models.model import VMModel
+# semi-immutable dict
+class Input(dict):
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self._new = set()
+    def __setitem__(self, key, value):
+        self._new.add(key)
+        super().__setitem__(key, value)
+    def __delitem__(self, _):
+        raise TypeError("Cannot delete keys from Input")
+    def get_new(self):
+        return {k: self[k] for k in self._new}
+class FunctionModel(VMModel):
+    """
+    FunctionModel class wraps a user-defined predict function
+    Attributes:
+        predict_fn (callable): The predict function that should take a dictionary of
+            input features and return a prediction.
+        input_id (str, optional): The input ID for the model. Defaults to None.
+        name (str, optional): The name of the model. Defaults to the name of the predict_fn.
+    """
+    def __post_init__(self):
+        if not getattr(self, "predict_fn") or not callable(self.predict_fn):
+            raise ValueError("FunctionModel requires a callable predict_fn")
+        self.name = self.name or self.predict_fn.__name__
+    def predict(self, X):
+        """Compute predictions for the input (X)
+        Args:
+            X (pandas.DataFrame): The input features to predict on
+        Returns:
+            list: The predictions
+        """
+        return [self.predict_fn(x) for x in X.to_dict(orient="records")]
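
Note on the added `function.py`: the sketch below reproduces the `Input` class from the hunk above so it can be run without installing `validmind`, and shows what "semi-immutable" means in practice: keys can be added or overwritten, newly set keys are tracked, and `del` is rejected. The `predict_fn` and `records` names at the end are hypothetical; they only illustrate the per-record call pattern that `FunctionModel.predict` applies to `X.to_dict(orient="records")`, using a plain list of dicts in place of a DataFrame.

```python
class Input(dict):
    """Standalone copy of the semi-immutable dict shown in the diff."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Keys present at construction are not recorded as new.
        self._new = set()

    def __setitem__(self, key, value):
        # Every key assigned after construction is tracked.
        self._new.add(key)
        super().__setitem__(key, value)

    def __delitem__(self, _):
        raise TypeError("Cannot delete keys from Input")

    def get_new(self):
        return {k: self[k] for k in self._new}


row = Input({"question": "What is 2 + 2?"})
row["answer"] = "4"          # allowed, and recorded as new
new_keys = row.get_new()

try:
    del row["question"]      # rejected by __delitem__
    delete_error = None
except TypeError as exc:
    delete_error = str(exc)


# Hypothetical predict function: takes one record (a dict of input features).
def predict_fn(features):
    return features["x"] * 2


# Stand-in for X.to_dict(orient="records"), which yields one dict per row.
records = [{"x": 1}, {"x": 2}]
predictions = [predict_fn(record) for record in records]
```
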

validmind/models/huggingface.py CHANGED Viewed

@@ -4,41 +4,32 @@
 from dataclasses import dataclass
-import pandas as pd
 from validmind.errors import MissingOrInvalidModelPredictFnError
 from validmind.logging import get_logger
-from validmind.vm_models.model import (
-    ModelAttributes,
-    VMModel,
-    has_method_with_arguments,
-)
+from validmind.vm_models.model import VMModel, has_method_with_arguments
 logger = get_logger(__name__)
 @dataclass
 class HFModel(VMModel):
-    """
-    An Hugging Face model class that wraps a trained model instance and its associated data.
-    Attributes:
-        attributes (ModelAttributes, optional): The attributes of the model. Defaults to None.
-        model (object, optional): The trained model instance. Defaults to None.
-    """
     def __init__(
         self,
         input_id: str = None,
-        model: object = None,  # Trained model instance
-        attributes: ModelAttributes = None,
+        model: object = None,
+        attributes: object = None,
+        name: str = None,
+        **kwargs,
     ):
         super().__init__(
-            model=model,
-            input_id=input_id,
-            attributes=attributes,
+            input_id=input_id, model=model, attributes=attributes, name=name, **kwargs
         )
+    def __post_init__(self):
+        self.library = self.model.__class__.__module__.split(".")[0]
+        self.class_ = self.model.__class__.__name__
+        self.name = self.name or type(self.model).__name__
     def predict_proba(self, *args, **kwargs):
         """
         Invoke predict_proba from underline model
@@ -57,36 +48,15 @@ class HFModel(VMModel):
         Predict method for the model. This is a wrapper around the HF model's pipeline function
         """
         results = self.model([str(datapoint) for datapoint in data])
         tasks = self.model.__class__.__module__.split(".")
         if "text2text_generation" in tasks:
-            return pd.DataFrame(results).summary_text.values
+            return [result["summary_text"] for result in results]
         elif "text_classification" in tasks:
-            return pd.DataFrame(results).label.values
+            return [result["label"] for result in results]
         elif tasks[-1] == "feature_extraction":
-            # extract [CLS] token embedding for each input and wrap in dataframe
-            return pd.DataFrame([embedding[0][0] for embedding in results])
+            # Extract [CLS] token embedding for each input and return as list of lists
+            print(f"len(results): {len(results)}")
+            return [embedding[0][0] for embedding in results]
         else:
             return results
-    def model_library(self):
-        """
-        Returns the model library name
-        """
-        return self.model.__class__.__module__.split(".")[0]
-    def model_class(self):
-        """
-        Returns the model class name
-        """
-        return self.model.__class__.__name__
-    def model_name(self):
-        """
-        Returns model name
-        """
-        return type(self.model).__name__
-    def is_pytorch_model(self):
-        return self.model_library() == "torch"
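
Note on the `predict` change in `huggingface.py`: the hunk above replaces `pd.DataFrame(results).label.values` with a list comprehension, so the method now returns a plain Python list instead of a NumPy array and no longer needs pandas. A minimal sketch of the new extraction, assuming `results` has the list-of-dicts shape that a text-classification pipeline returns; the sample values are invented for illustration.

```python
# Illustrative stand-in for the output of a text-classification pipeline:
# one dict per input, each with a "label" and a "score".
results = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.91},
]

# Same extraction as the added line in the diff.
labels = [result["label"] for result in results]
```

Callers that relied on array-only behavior of the old return value (for example `.shape` or elementwise comparison) would need to wrap the result themselves.
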

validmind/models/metadata.py ADDED Viewed

@@ -0,0 +1,42 @@
+# Copyright © 2023-2024 ValidMind Inc. All rights reserved.
+# See the LICENSE file in the root of this repository for details.
+# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
+from validmind.errors import MissingOrInvalidModelPredictFnError
+from validmind.vm_models.model import VMModel
+class MetadataModel(VMModel):
+    """
+    MetadataModel is designed to represent a model that is not available for inference
+    for various reasons but for which metadata and pre-computed predictions are available.
+    Model attributes are required since this will be the only information we can
+    collect and log about the model.
+    This class should not be instantiated directly. Instead call `vm.init_model()` and
+    pass in a dictionary with the required metadata as `attributes`.
+    Attributes:
+        attributes (ModelAttributes): The attributes of the model. Required.
+        input_id (str, optional): The input ID for the model. Defaults to None.
+        name (str, optional): The name of the model. Defaults to the class name.
+    """
+    def __post_init__(self):
+        if not getattr(self, "attributes"):
+            raise ValueError("MetadataModel requires attributes")
+        self.name = self.name or "Metadata Model"
+    def predict(self, *args, **kwargs):
+        """Not implemented for MetadataModel"""
+        raise MissingOrInvalidModelPredictFnError(
+            "MetadataModel does not support inference"
+        )
+    def predict_proba(self, *args, **kwargs):
+        """Not implemented for MetadataModel"""
+        raise MissingOrInvalidModelPredictFnError(
+            "MetadataModel does not support inference"
+        )

validmind/models/pipeline.py ADDED Viewed

@@ -0,0 +1,66 @@
+# Copyright © 2023-2024 ValidMind Inc. All rights reserved.
+# See the LICENSE file in the root of this repository for details.
+# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
+from validmind.logging import get_logger
+from validmind.vm_models.model import ModelAttributes, ModelPipeline, VMModel
+logger = get_logger(__name__)
+class PipelineModel(VMModel):
+    """
+    An base class that wraps a trained model instance and its associated data.
+    Attributes:
+        pipeline (ModelPipeline): A pipeline of models to be executed. ModelPipeline
+            is just a simple container class with a list that can be chained with the
+            `|` operator.
+        input_id (str, optional): The input ID for the model. Defaults to None.
+        attributes (ModelAttributes, optional): The attributes of the model. Defaults to None.
+        name (str, optional): The name of the model. Defaults to the class name.
+    """
+    predict_col: str = None
+    def __init__(
+        self,
+        pipeline: ModelPipeline,
+        attributes: ModelAttributes = None,
+        input_id: str = None,
+        name: str = None,
+    ):
+        self.pipeline = pipeline
+        self.input_id = input_id
+        self.language = "Python"
+        self.library = self.__class__.__name__
+        self.library_version = "N/A"
+        self.class_ = self.__class__.__name__
+        self.name = name or self.__class__.__name__
+        self.attributes = attributes
+    def __or__(self, other):
+        if not isinstance(other, VMModel):
+            raise ValueError("Can only chain VMModel objects")
+        return ModelPipeline([self, other])
+    def serialize(self):
+        """
+        Serializes the model to a dictionary so it can be sent to the API
+        """
+        return {
+            "attributes": self.attributes.__dict__,
+        }
+    def predict(self, X):
+        X = X.copy()
+        for model in self.pipeline.models:
+            predictions = model.predict(X)
+            X[model.input_id] = predictions
+        return predictions
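
Note on `PipelineModel.predict`: each model's predictions are written back into the input under that model's `input_id`, so later stages can read earlier outputs as ordinary columns, and the last stage's predictions are returned. The sketch below mirrors that loop without `validmind` or pandas. `StubModel`, `run_pipeline`, and the dict-of-lists table are hypothetical stand-ins, not the library's API. Unlike the diff, the sketch initializes `predictions` to `None` so that an empty pipeline does not raise.

```python
class StubModel:
    """Hypothetical stand-in for a VMModel: an input_id plus a per-row function."""

    def __init__(self, input_id, fn):
        self.input_id = input_id
        self.fn = fn

    def predict(self, X):
        # X is a dict mapping column name -> list of values (a tiny "table").
        n_rows = len(next(iter(X.values())))
        rows = [{col: values[i] for col, values in X.items()} for i in range(n_rows)]
        return [self.fn(row) for row in rows]


def run_pipeline(models, X):
    # Copy the columns so the caller's data is not modified (X.copy() in the diff).
    X = {col: list(values) for col, values in X.items()}
    predictions = None
    for model in models:
        predictions = model.predict(X)
        # Expose this stage's output to later stages under its input_id.
        X[model.input_id] = predictions
    return predictions


retriever = StubModel("retriever", lambda row: row["question"].upper())
generator = StubModel(
    "generator", lambda row: f'{row["question"]} -> {row["retriever"]}'
)

data = {"question": ["a", "b"]}
result = run_pipeline([retriever, generator], data)
```
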
