lemonade-sdk 8.0.2__py3-none-any.whl → 8.0.4__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of lemonade-sdk might be problematic.
- lemonade/cli.py +2 -2
- lemonade/profilers/profiler.py +4 -1
- lemonade/tools/humaneval.py +1 -1
- lemonade/tools/mmlu.py +1 -1
- lemonade/tools/oga/load.py +3 -9
- lemonade/tools/perplexity.py +2 -2
- lemonade/tools/prompt.py +21 -6
- lemonade/tools/quark/quark_load.py +1 -1
- lemonade/tools/quark/quark_quantize.py +2 -2
- lemonade/tools/report/table.py +80 -0
- lemonade/tools/server/llamacpp.py +148 -16
- lemonade/tools/server/serve.py +73 -0
- lemonade/tools/server/static/styles.css +424 -4
- lemonade/tools/server/static/webapp.html +337 -38
- lemonade/tools/server/tray.py +25 -9
- lemonade/version.py +1 -1
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/METADATA +33 -36
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/RECORD +26 -26
- lemonade_server/model_manager.py +123 -36
- lemonade_server/pydantic_models.py +25 -1
- lemonade_server/server_models.json +53 -43
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/WHEEL +0 -0
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/entry_points.txt +0 -0
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/licenses/LICENSE +0 -0
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/licenses/NOTICE.md +0 -0
- {lemonade_sdk-8.0.2.dist-info → lemonade_sdk-8.0.4.dist-info}/top_level.txt +0 -0
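The per-file `+`/`-` summary above comes from unpacking both wheels (a `.whl` is just a zip archive) and diffing their contents file by file. A minimal sketch of that process in Python, using two tiny stand-in archives built on the fly (the file contents here are illustrative, not the real packages):

```python
import io
import difflib
import zipfile

def build_wheel(version: str, body: str) -> bytes:
    """Create an in-memory stand-in for a .whl (a plain zip archive)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("lemonade/version.py", f'__version__ = "{version}"\n')
        zf.writestr("lemonade/tools/prompt.py", body)
    return buf.getvalue()

def read_members(data: bytes) -> dict:
    """Map archive member name -> decoded file content."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {n: zf.read(n).decode() for n in zf.namelist()}

old = build_wheel("8.0.2", "print('old prompt tool')\n")
new = build_wheel("8.0.4", "print('new prompt tool')\n")
old_files, new_files = read_members(old), read_members(new)

# Per-file added/removed line counts, like the "+2 -2" entries above
for name in sorted(old_files.keys() | new_files.keys()):
    diff = list(difflib.unified_diff(
        old_files.get(name, "").splitlines(),
        new_files.get(name, "").splitlines(),
        lineterm="",
    ))
    added = sum(1 for l in diff if l.startswith("+") and not l.startswith("+++"))
    removed = sum(1 for l in diff if l.startswith("-") and not l.startswith("---"))
    if added or removed:
        print(f"{name} +{added} -{removed}")
```

Real diff services also compare metadata, hashes, and renamed `dist-info` directories, which this sketch omits.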
@@ -33,7 +33,47 @@
 <input type="text" id="chat-input" placeholder="Type your message..." />
 <button id="send-btn">Send</button>
 </div>
-</div>
+</div>
+<!-- App Suggestions Section -->
+<div class="app-suggestions-section">
+<div class="suggestion-text">
+Use Lemonade with your favorite app
+</div>
+<div class="app-logos-grid">
+<a href="https://lemonade-server.ai/docs/server/apps/open-webui/" target="_blank" class="app-logo-item" title="Open WebUI">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/openwebui.jpg" alt="Open WebUI" class="app-logo-img">
+<span class="app-name">Open WebUI</span>
+</a>
+<a href="https://lemonade-server.ai/docs/server/apps/continue/" target="_blank" class="app-logo-item" title="Continue">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/continue_dev.png" alt="Continue" class="app-logo-img">
+<span class="app-name">Continue</span>
+</a>
+<a href="https://github.com/amd/gaia" target="_blank" class="app-logo-item" title="Gaia">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/gaia.ico" alt="Gaia" class="app-logo-img">
+<span class="app-name">Gaia</span>
+</a>
+<a href="https://lemonade-server.ai/docs/server/apps/anythingLLM/" target="_blank" class="app-logo-item" title="AnythingLLM">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/anything_llm.png" alt="AnythingLLM" class="app-logo-img">
+<span class="app-name">AnythingLLM</span>
+</a>
+<a href="https://lemonade-server.ai/docs/server/apps/ai-dev-gallery/" target="_blank" class="app-logo-item" title="AI Dev Gallery">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/ai_dev_gallery.webp" alt="AI Dev Gallery" class="app-logo-img">
+<span class="app-name">AI Dev Gallery</span>
+</a>
+<a href="https://lemonade-server.ai/docs/server/apps/lm-eval/" target="_blank" class="app-logo-item" title="LM-Eval">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/lm_eval.png" alt="LM-Eval" class="app-logo-img">
+<span class="app-name">LM-Eval</span>
+</a>
+<a href="https://lemonade-server.ai/docs/server/apps/codeGPT/" target="_blank" class="app-logo-item" title="CodeGPT">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/codegpt.jpg" alt="CodeGPT" class="app-logo-img">
+<span class="app-name">CodeGPT</span>
+</a>
+<a href="https://github.com/lemonade-sdk/lemonade/blob/main/docs/server/apps/ai-toolkit.md" target="_blank" class="app-logo-item" title="AI Toolkit">
+<img src="https://raw.githubusercontent.com/lemonade-sdk/assets/refs/heads/main/partner_logos/ai_toolkit.png" alt="AI Toolkit" class="app-logo-img">
+<span class="app-name">AI Toolkit</span>
+</a>
+</div>
+</div>
 </div>
 <div class="tab-content" id="content-models"> <div class="model-mgmt-register-form collapsed"> <h3 class="model-mgmt-form-title" onclick="toggleAddModelForm()">
 Add a Model
@@ -109,27 +149,157 @@
 <div class="copyright">Copyright 2025 AMD</div>
 </footer>
 <script src="https://cdn.jsdelivr.net/npm/openai@4.21.0/dist/openai.min.js"></script>
-<script>
-
+<script src="https://cdn.jsdelivr.net/npm/marked@9.1.0/marked.min.js"></script>
+<script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
+<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
+<script>
+// Configure MathJax
+window.MathJax = {
+tex: {
+inlineMath: [['\\(', '\\)'], ['$', '$']],
+displayMath: [['\\[', '\\]'], ['$$', '$$']],
+processEscapes: true,
+processEnvironments: true
+},
+options: {
+skipHtmlTags: ['script', 'noscript', 'style', 'textarea', 'pre']
+}
+};
+</script>
+<script>
+// Configure marked.js for safe HTML rendering
+marked.setOptions({
+breaks: true,
+gfm: true,
+sanitize: false,
+smartLists: true,
+smartypants: true
+});
+
+// Function to unescape JSON strings
+function unescapeJsonString(str) {
+try {
+return str.replace(/\\n/g, '\n')
+.replace(/\\t/g, '\t')
+.replace(/\\r/g, '\r')
+.replace(/\\"/g, '"')
+.replace(/\\\\/g, '\\');
+} catch (error) {
+console.error('Error unescaping string:', error);
+return str;
+}
+}
+
+// Function to safely render markdown with MathJax support
+function renderMarkdown(text) {
+try {
+const html = marked.parse(text);
+// Trigger MathJax to process the new content
+if (window.MathJax && window.MathJax.typesetPromise) {
+// Use a timeout to ensure DOM is updated before typesetting
+setTimeout(() => {
+window.MathJax.typesetPromise();
+}, 0);
+}
+return html;
+} catch (error) {
+console.error('Error rendering markdown:', error);
+return text; // fallback to plain text
+}
+}
+
+// Tab switching logic
+function showTab(tab, updateHash = true) {
 document.getElementById('tab-chat').classList.remove('active');
 document.getElementById('tab-models').classList.remove('active');
 document.getElementById('content-chat').classList.remove('active');
 document.getElementById('content-models').classList.remove('active');
 if (tab === 'chat') {
 document.getElementById('tab-chat').classList.add('active');
-document.getElementById('content-chat').classList.add('active');
+document.getElementById('content-chat').classList.add('active');
+if (updateHash) {
+window.location.hash = 'llm-chat';
+}
 } else {
 document.getElementById('tab-models').classList.add('active');
-document.getElementById('content-models').classList.add('active');
+document.getElementById('content-models').classList.add('active');
+if (updateHash) {
+window.location.hash = 'model-management';
+}
 }
 }
 
+// Handle hash changes for anchor navigation
+function handleHashChange() {
+const hash = window.location.hash.slice(1); // Remove the # symbol
+if (hash === 'llm-chat') {
+showTab('chat', false);
+} else if (hash === 'model-management') {
+showTab('models', false);
+}
+}
+
+// Initialize tab based on URL hash on page load
+function initializeTabFromHash() {
+const hash = window.location.hash.slice(1);
+if (hash === 'llm-chat') {
+showTab('chat', false);
+} else if (hash === 'model-management') {
+showTab('models', false);
+}
+// If no hash or unrecognized hash, keep default (chat tab is already active)
+}
+
+// Listen for hash changes
+window.addEventListener('hashchange', handleHashChange);
+
+// Initialize on page load
+document.addEventListener('DOMContentLoaded', initializeTabFromHash);
+
 // Toggle Add Model form
 function toggleAddModelForm() {
 const form = document.querySelector('.model-mgmt-register-form');
 form.classList.toggle('collapsed');
 }
 
+// Handle image load failures for app logos
+function handleImageFailure(img) {
+const logoItem = img.closest('.app-logo-item');
+if (logoItem) {
+logoItem.classList.add('image-failed');
+}
+}
+
+// Set up image error handlers when DOM is loaded
+document.addEventListener('DOMContentLoaded', function() {
+const logoImages = document.querySelectorAll('.app-logo-img');
+logoImages.forEach(function(img) {
+let imageLoaded = false;
+
+img.addEventListener('load', function() {
+imageLoaded = true;
+});
+
+img.addEventListener('error', function() {
+if (!imageLoaded) {
+handleImageFailure(this);
+}
+});
+
+// Also check if image is already broken (cached failure)
+if (img.complete && img.naturalWidth === 0) {
+handleImageFailure(img);
+}
+
+// Timeout fallback for slow connections (5 seconds)
+setTimeout(function() {
+if (!imageLoaded && !img.complete) {
+handleImageFailure(img);
+}
+}, 5000);
+});
+});
+
 // Helper to get server base URL
 function getServerBaseUrl() {
 const port = window.SERVER_PORT || 8000;
@@ -151,17 +321,37 @@
 select.innerHTML = '<option>No models available</option>';
 return;
 }
+
+// Filter out embedding models from chat interface
+const allModels = window.SERVER_MODELS || {};
+let filteredModels = [];
 let defaultIndex = 0;
-
+
+data.data.forEach(function(model) {
 const modelId = model.id || model.name || model;
+const modelInfo = allModels[modelId] || {};
+const labels = modelInfo.labels || [];
+
+// Skip models with "embeddings" or "reranking" label
+if (labels.includes('embeddings') || labels.includes('reranking')) {
+return;
+}
+
+filteredModels.push(modelId);
 const opt = document.createElement('option');
 opt.value = modelId;
 opt.textContent = modelId;
 if (modelId === 'Llama-3.2-1B-Instruct-Hybrid') {
-defaultIndex =
+defaultIndex = filteredModels.length - 1;
 }
 select.appendChild(opt);
 });
+
+if (filteredModels.length === 0) {
+select.innerHTML = '<option>No chat models available</option>';
+return;
+}
+
 select.selectedIndex = defaultIndex;
 } catch (e) {
 const select = document.getElementById('model-select');
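The hunk above makes the chat dropdown skip any model whose registry entry carries an "embeddings" or "reranking" label. The same selection logic, sketched in Python against a hypothetical registry shaped like `server_models.json` entries (only `Llama-3.2-1B-Instruct-Hybrid` appears in the diff; the other names are made up for illustration):

```python
# Hypothetical registry; keys and labels mimic server_models.json entries
SERVER_MODELS = {
    "Llama-3.2-1B-Instruct-Hybrid": {"labels": []},
    "Some-Reasoning-Model": {"labels": ["reasoning"]},
    "Some-Embedding-Model": {"labels": ["embeddings"]},
    "Some-Reranker": {"labels": ["reranking"]},
}

def chat_models(models: dict) -> list:
    """Keep only models suitable for the chat UI."""
    skip = {"embeddings", "reranking"}
    return [
        name
        for name, info in models.items()
        if not skip & set(info.get("labels", []))
    ]

print(chat_models(SERVER_MODELS))
# → ['Llama-3.2-1B-Instruct-Hybrid', 'Some-Reasoning-Model']
```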
@@ -184,26 +374,24 @@
 
 // Add labels if they exist
 const modelData = allModels[modelId];
-if (modelData) {
-
-
-const
-
-
-
-
-
-
-
-
-
-
-
-
-});
-}
+if (modelData && modelData.labels && Array.isArray(modelData.labels)) {
+modelData.labels.forEach(label => {
+const labelSpan = document.createElement('span');
+const labelLower = label.toLowerCase();
+let labelClass = 'other';
+if (labelLower === 'vision') {
+labelClass = 'vision';
+} else if (labelLower === 'embeddings') {
+labelClass = 'embeddings';
+} else if (labelLower === 'reasoning') {
+labelClass = 'reasoning';
+} else if (labelLower === 'reranking') {
+labelClass = 'reranking';
+}
+labelSpan.className = `model-label ${labelClass}`;
+labelSpan.textContent = label;
+container.appendChild(labelSpan);
+});
 }
 
 return container;
@@ -325,16 +513,110 @@
 const modelSelect = document.getElementById('model-select');
 let messages = [];
 
-function appendMessage(role, text) {
+function appendMessage(role, text, isMarkdown = false) {
 const div = document.createElement('div');
 div.className = 'chat-message ' + role;
 // Add a bubble for iMessage style
 const bubble = document.createElement('div');
 bubble.className = 'chat-bubble ' + role;
-
+
+if (role === 'llm' && isMarkdown) {
+bubble.innerHTML = renderMarkdownWithThinkTokens(text);
+} else {
+bubble.textContent = text;
+}
+
 div.appendChild(bubble);
 chatHistory.appendChild(div);
 chatHistory.scrollTop = chatHistory.scrollHeight;
+return bubble; // Return the bubble element for streaming updates
+}
+
+function updateMessageContent(bubbleElement, text, isMarkdown = false) {
+if (isMarkdown) {
+bubbleElement.innerHTML = renderMarkdownWithThinkTokens(text);
+} else {
+bubbleElement.textContent = text;
+}
+}
+
+function renderMarkdownWithThinkTokens(text) {
+// Check if text contains opening think tag
+if (text.includes('<think>')) {
+if (text.includes('</think>')) {
+// Complete think block - handle as before
+const thinkMatch = text.match(/<think>(.*?)<\/think>/s);
+if (thinkMatch) {
+const thinkContent = thinkMatch[1].trim();
+const mainResponse = text.replace(/<think>.*?<\/think>/s, '').trim();
+
+// Create collapsible structure
+let html = '';
+if (thinkContent) {
+html += `
+<div class="think-tokens-container">
+<div class="think-tokens-header" onclick="toggleThinkTokens(this)">
+<span class="think-tokens-chevron">▼</span>
+<span class="think-tokens-label">Thinking...</span>
+</div>
+<div class="think-tokens-content">
+${renderMarkdown(thinkContent)}
+</div>
+</div>
+`;
+}
+if (mainResponse) {
+html += `<div class="main-response">${renderMarkdown(mainResponse)}</div>`;
+}
+return html;
+}
+} else {
+// Partial think block - only opening tag found, still being generated
+const thinkMatch = text.match(/<think>(.*)/s);
+if (thinkMatch) {
+const thinkContent = thinkMatch[1];
+const beforeThink = text.substring(0, text.indexOf('<think>'));
+
+let html = '';
+if (beforeThink.trim()) {
+html += `<div class="main-response">${renderMarkdown(beforeThink)}</div>`;
+}
+
+html += `
+<div class="think-tokens-container">
+<div class="think-tokens-header" onclick="toggleThinkTokens(this)">
+<span class="think-tokens-chevron">▼</span>
+<span class="think-tokens-label">Thinking...</span>
+</div>
+<div class="think-tokens-content">
+${renderMarkdown(thinkContent)}
+</div>
+</div>
+`;
+
+return html;
+}
+}
+}
+
+// Fallback to normal markdown rendering
+return renderMarkdown(text);
+}
+
+function toggleThinkTokens(header) {
+const container = header.parentElement;
+const content = container.querySelector('.think-tokens-content');
+const chevron = header.querySelector('.think-tokens-chevron');
+
+if (content.style.display === 'none') {
+content.style.display = 'block';
+chevron.textContent = '▼';
+container.classList.remove('collapsed');
+} else {
+content.style.display = 'none';
+chevron.textContent = '▶';
+container.classList.add('collapsed');
+}
 }
 
 async function sendMessage() {
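The `renderMarkdownWithThinkTokens` function added above splits a response into its `<think>…</think>` reasoning block and the main answer, and also handles the streaming case where the closing tag has not arrived yet. A Python sketch of the same splitting rules, using the same regex patterns (the DOM rendering is omitted):

```python
import re

def split_think_tokens(text: str):
    """Return (think_content, main_response), mirroring the webapp's regexes."""
    if "<think>" in text:
        if "</think>" in text:
            # Complete think block
            m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
            think = m.group(1).strip()
            main = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
            return think, main
        # Partial block: only the opening tag has streamed in so far
        idx = text.index("<think>")
        before = text[:idx]
        think = text[idx + len("<think>"):]
        return think, before.strip()
    # No think tokens at all
    return None, text

print(split_think_tokens("<think>step 1</think>The answer is 4."))
# → ('step 1', 'The answer is 4.')
```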
@@ -346,8 +628,7 @@
 sendBtn.disabled = true;
 // Streaming OpenAI completions (placeholder, adapt as needed)
 let llmText = '';
-appendMessage('llm', '...');
-const llmDiv = chatHistory.lastChild.querySelector('.chat-bubble.llm');
+const llmBubble = appendMessage('llm', '...');
 try {
 // Use the correct endpoint for chat completions
 const resp = await fetch(getServerBaseUrl() + '/api/v1/chat/completions', {
@@ -362,22 +643,40 @@
 if (!resp.body) throw new Error('No stream');
 const reader = resp.body.getReader();
 let decoder = new TextDecoder();
-
+llmBubble.textContent = '';
 while (true) {
 const { done, value } = await reader.read();
 if (done) break;
 const chunk = decoder.decode(value);
 if (chunk.trim() === 'data: [DONE]' || chunk.trim() === '[DONE]') continue;
-
-
-
-
-
+
+// Handle Server-Sent Events format
+const lines = chunk.split('\n');
+for (const line of lines) {
+if (line.startsWith('data: ')) {
+const jsonStr = line.substring(6).trim();
+if (jsonStr === '[DONE]') continue;
+
+try {
+const parsed = JSON.parse(jsonStr);
+if (parsed.choices && parsed.choices[0] && parsed.choices[0].delta && parsed.choices[0].delta.content) {
+llmText += parsed.choices[0].delta.content;
+updateMessageContent(llmBubble, llmText, true);
+}
+} catch (e) {
+// Fallback to regex parsing if JSON parsing fails
+const match = jsonStr.match(/"content"\s*:\s*"((?:\\.|[^"\\])*)"/);
+if (match && match[1]) {
+llmText += unescapeJsonString(match[1]);
+updateMessageContent(llmBubble, llmText, true);
+}
+}
+}
+}
 }
 }
 messages.push({ role: 'assistant', content: llmText });
 } catch (e) {
-
+llmBubble.textContent = '[Error: ' + e.message + ']';
 }
 sendBtn.disabled = false;
 }
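The streaming loop above consumes OpenAI-style Server-Sent Events, where each `data: ` line carries a JSON chunk whose `choices[0].delta.content` holds the next text fragment. A Python sketch of that accumulation, fed a canned chunk instead of a live response (the error fallback is simplified to a no-op rather than the webapp's regex extraction):

```python
import json

def accumulate_sse(chunk: str, text: str = "") -> str:
    """Append delta content from each `data: ` line of an SSE chunk."""
    for line in chunk.split("\n"):
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            continue  # end-of-stream sentinel, not JSON
        try:
            parsed = json.loads(payload)
            delta = parsed["choices"][0]["delta"]
            text += delta.get("content", "")
        except (json.JSONDecodeError, KeyError, IndexError):
            pass  # the webapp falls back to regex extraction here
    return text

chunk = (
    'data: {"choices": [{"delta": {"content": "Hello"}}]}\n'
    'data: {"choices": [{"delta": {"content": ", world"}}]}\n'
    "data: [DONE]\n"
)
print(accumulate_sse(chunk))  # → Hello, world
```

Note that a real reader must also handle JSON objects split across network chunks, which is exactly why the webapp keeps a regex fallback.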
lemonade/tools/server/tray.py
CHANGED
@@ -197,11 +197,17 @@ class LemonadeTray(SystemTray):
 """
 webbrowser.open("https://lemonade-server.ai/docs/")
 
+def open_llm_chat(self, _, __):
+"""
+Open the LLM chat in the default web browser.
+"""
+webbrowser.open(f"http://localhost:{self.port}/#llm-chat")
+
 def open_model_manager(self, _, __):
 """
 Open the model manager in the default web browser.
 """
-webbrowser.open(f"http://localhost:{self.port}
+webbrowser.open(f"http://localhost:{self.port}/#model-management")
 
 def check_server_state(self):
 """
@@ -339,16 +345,25 @@ class LemonadeTray(SystemTray):
 
 # Create menu items for all downloaded models
 model_menu_items = []
-
-
-
-
+if not self.downloaded_models:
+model_menu_items.append(
+MenuItem(
+"No models available: Use the Model Manager to pull models",
+None,
+enabled=False,
+)
+)
+else:
+for model_name, _ in self.downloaded_models.items():
+# Create a function that returns the lambda to properly capture the variables
+def create_handler(mod):
+return lambda icon, item: self.load_llm(icon, item, mod)
 
-
+model_item = MenuItem(model_name, create_handler(model_name))
 
-
-
-
+# Set checked property instead of modifying the text
+model_item.checked = model_name == self.loaded_llm
+model_menu_items.append(model_item)
 
 load_submenu = Menu(*model_menu_items)
 
@@ -391,6 +406,7 @@ class LemonadeTray(SystemTray):
 )
 
 items.append(MenuItem("Documentation", self.open_documentation))
+items.append(MenuItem("LLM Chat", self.open_llm_chat))
 items.append(MenuItem("Model Manager", self.open_model_manager))
 items.append(MenuItem("Show Logs", self.show_logs))
 items.append(Menu.SEPARATOR)
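The `create_handler` factory in the tray change above exists because Python closures capture variables by reference, not by value: lambdas created directly in the loop would all see the final value of `model_name`. A minimal demonstration of the pitfall and the factory fix:

```python
# Late binding: every lambda reads the loop variable's final value
broken = [lambda: name for name in ["a", "b", "c"]]
print([f() for f in broken])  # → ['c', 'c', 'c']

# Factory (as in create_handler): each call creates a fresh binding
def make_handler(captured):
    return lambda: captured

fixed = [make_handler(name) for name in ["a", "b", "c"]]
print([f() for f in fixed])  # → ['a', 'b', 'c']
```

An equivalent idiom is a default argument (`lambda name=name: name`), but a named factory reads more clearly in GUI callback code like the tray menu.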
lemonade/version.py
CHANGED
@@ -1 +1 @@
-__version__ = "8.0.
+__version__ = "8.0.4"
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lemonade-sdk
-Version: 8.0.
+Version: 8.0.4
 Summary: Lemonade SDK: Your LLM Aide for Validation and Deployment
 Author-email: lemonade@amd.com
 Requires-Python: >=3.10, <3.12
@@ -26,45 +26,49 @@ Requires-Dist: openai>=1.81.0
 Requires-Dist: transformers<=4.51.3
 Requires-Dist: jinja2
 Requires-Dist: tabulate
-Requires-Dist:
+Requires-Dist: sentencepiece
+Requires-Dist: huggingface-hub==0.33.0
+Provides-Extra: oga-hybrid
+Requires-Dist: onnx==1.16.1; extra == "oga-hybrid"
+Requires-Dist: numpy==1.26.4; extra == "oga-hybrid"
+Requires-Dist: protobuf>=6.30.1; extra == "oga-hybrid"
+Provides-Extra: oga-cpu
+Requires-Dist: onnxruntime-genai==0.8.2; extra == "oga-cpu"
+Requires-Dist: onnxruntime>=1.22.0; extra == "oga-cpu"
+Provides-Extra: dev
+Requires-Dist: torch>=2.6.0; extra == "dev"
+Requires-Dist: accelerate; extra == "dev"
+Requires-Dist: datasets; extra == "dev"
+Requires-Dist: pandas>=1.5.3; extra == "dev"
+Requires-Dist: matplotlib; extra == "dev"
+Requires-Dist: human-eval-windows==1.0.4; extra == "dev"
+Requires-Dist: lm-eval[api]; extra == "dev"
 Provides-Extra: oga-hybrid-minimal
-Requires-Dist:
-Requires-Dist: numpy==1.26.4; extra == "oga-hybrid-minimal"
-Requires-Dist: protobuf>=6.30.1; extra == "oga-hybrid-minimal"
+Requires-Dist: lemonade-sdk[oga-hybrid]; extra == "oga-hybrid-minimal"
 Provides-Extra: oga-cpu-minimal
-Requires-Dist:
-Requires-Dist: onnxruntime<1.22.0,>=1.10.1; extra == "oga-cpu-minimal"
+Requires-Dist: lemonade-sdk[oga-cpu]; extra == "oga-cpu-minimal"
 Provides-Extra: llm
-Requires-Dist:
-Requires-Dist: accelerate; extra == "llm"
-Requires-Dist: sentencepiece; extra == "llm"
-Requires-Dist: datasets; extra == "llm"
-Requires-Dist: pandas>=1.5.3; extra == "llm"
-Requires-Dist: matplotlib; extra == "llm"
-Requires-Dist: human-eval-windows==1.0.4; extra == "llm"
-Requires-Dist: lm-eval[api]; extra == "llm"
+Requires-Dist: lemonade-sdk[dev]; extra == "llm"
 Provides-Extra: llm-oga-cpu
-Requires-Dist: lemonade-sdk[oga-cpu
-Requires-Dist: lemonade-sdk[llm]; extra == "llm-oga-cpu"
+Requires-Dist: lemonade-sdk[dev,oga-cpu]; extra == "llm-oga-cpu"
 Provides-Extra: llm-oga-igpu
 Requires-Dist: onnxruntime-genai-directml==0.6.0; extra == "llm-oga-igpu"
 Requires-Dist: onnxruntime-directml<1.22.0,>=1.19.0; extra == "llm-oga-igpu"
 Requires-Dist: transformers<4.45.0; extra == "llm-oga-igpu"
-Requires-Dist: lemonade-sdk[
+Requires-Dist: lemonade-sdk[dev]; extra == "llm-oga-igpu"
 Provides-Extra: llm-oga-cuda
-Requires-Dist: onnxruntime-genai-cuda==0.
-Requires-Dist: onnxruntime-gpu
-Requires-Dist: transformers
-Requires-Dist: lemonade-sdk[
+Requires-Dist: onnxruntime-genai-cuda==0.8.2; extra == "llm-oga-cuda"
+Requires-Dist: onnxruntime-gpu>=1.22.0; extra == "llm-oga-cuda"
+Requires-Dist: transformers<=4.51.3; extra == "llm-oga-cuda"
+Requires-Dist: lemonade-sdk[dev]; extra == "llm-oga-cuda"
 Provides-Extra: llm-oga-npu
 Requires-Dist: onnx==1.16.0; extra == "llm-oga-npu"
 Requires-Dist: onnxruntime==1.18.0; extra == "llm-oga-npu"
 Requires-Dist: numpy==1.26.4; extra == "llm-oga-npu"
 Requires-Dist: protobuf>=6.30.1; extra == "llm-oga-npu"
-Requires-Dist: lemonade-sdk[
+Requires-Dist: lemonade-sdk[dev]; extra == "llm-oga-npu"
 Provides-Extra: llm-oga-hybrid
-Requires-Dist: lemonade-sdk[oga-hybrid
-Requires-Dist: lemonade-sdk[llm]; extra == "llm-oga-hybrid"
+Requires-Dist: lemonade-sdk[dev,oga-hybrid]; extra == "llm-oga-hybrid"
 Provides-Extra: llm-oga-unified
 Requires-Dist: lemonade-sdk[llm-oga-hybrid]; extra == "llm-oga-unified"
 Dynamic: author-email
@@ -78,7 +82,7 @@ Dynamic: summary
 
 [](https://github.com/lemonade-sdk/lemonade/tree/main/test "Check out our tests")
 [](docs/README.md#installation "Check out our instructions")
-[](docs/README.md#installation "Check out our instructions")
 
 ## 🍋 Lemonade SDK: Quickly serve, benchmark and deploy LLMs
 
@@ -93,8 +97,8 @@ The [Lemonade SDK](./docs/README.md) makes it easy to run Large Language Models
 The [Lemonade SDK](./docs/README.md) is comprised of the following:
 
 - 🌐 **[Lemonade Server](https://lemonade-server.ai/docs)**: A local LLM server for running ONNX and GGUF models using the OpenAI API standard. Install and enable your applications with NPU and GPU acceleration in minutes.
-- 🐍 **Lemonade API**: High-level Python API to directly integrate Lemonade LLMs into Python applications.
-- 🖥️ **Lemonade CLI**: The `lemonade` CLI lets you mix-and-match LLMs (ONNX, GGUF, SafeTensors) with measurement tools to characterize your models on your hardware. The available tools are:
+- 🐍 **[Lemonade API](./docs/lemonade_api.md)**: High-level Python API to directly integrate Lemonade LLMs into Python applications.
+- 🖥️ **[Lemonade CLI](./docs/dev_cli/README.md)**: The `lemonade` CLI lets you mix-and-match LLMs (ONNX, GGUF, SafeTensors) with measurement tools to characterize your models on your hardware. The available tools are:
 - Prompting with templates.
 - Measuring accuracy with a variety of tests.
 - Benchmarking to get the time-to-first-token and tokens per second.
@@ -149,14 +153,7 @@ Maximum LLM performance requires the right hardware accelerator with the right i
 </tbody>
 </table>
 
-
-
-#### Inference Engines Overview
-| Engine | Description |
-| :--- | :--- |
-| **OnnxRuntime GenAI (OGA)** | Microsoft engine that runs `.onnx` models and enables hardware vendors to provide their own execution providers (EPs) to support specialized hardware, such as neural processing units (NPUs). |
-| **llamacpp** | Community-driven engine with strong GPU acceleration, support for thousands of `.gguf` models, and advanced features such as vision-language models (VLMs) and mixture-of-experts (MoEs). |
-| **Hugging Face (HF)** | Hugging Face's `transformers` library can run the original `.safetensors` trained weights for models on Meta's PyTorch engine, which provides a source of truth for accuracy measurement. |
+To learn more about the supported hardware and software, visit the documentation [here](./docs/README.md#software-and-hardware-overview).
 
 ## Integrate Lemonade Server with Your Application
 
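In the METADATA change above, the duplicated per-extra dependency lists collapse into self-referencing extras: `llm` now just pulls in `lemonade-sdk[dev]`, and `llm-oga-cpu` pulls in `lemonade-sdk[dev,oga-cpu]`. A toy resolver over a few `Requires-Dist` lines taken from that diff shows how the indirection flattens out (hand-rolled parsing for illustration only, not a real packaging implementation):

```python
import re

# A handful of Requires-Dist lines from the new METADATA above
METADATA = """
Requires-Dist: torch>=2.6.0; extra == "dev"
Requires-Dist: lm-eval[api]; extra == "dev"
Requires-Dist: lemonade-sdk[dev]; extra == "llm"
Requires-Dist: lemonade-sdk[dev,oga-cpu]; extra == "llm-oga-cpu"
Requires-Dist: onnxruntime-genai==0.8.2; extra == "oga-cpu"
"""

def deps_for_extra(extra: str, seen=None) -> set:
    """Resolve an extra, following lemonade-sdk[...] self-references."""
    seen = seen if seen is not None else set()
    out = set()
    for m in re.finditer(r'Requires-Dist: (.+?); extra == "(.+?)"', METADATA):
        req, ex = m.groups()
        if ex != extra:
            continue
        self_ref = re.fullmatch(r"lemonade-sdk\[(.+)\]", req)
        if self_ref:
            # Recurse into each extra named in the self-reference
            for sub in self_ref.group(1).split(","):
                if sub not in seen:
                    seen.add(sub)
                    out |= deps_for_extra(sub, seen)
        else:
            out.add(req)
    return out

print(sorted(deps_for_extra("llm-oga-cpu")))
# → ['lm-eval[api]', 'onnxruntime-genai==0.8.2', 'torch>=2.6.0']
```

Real installers perform this resolution per the core-metadata spec; the sketch only illustrates why the new layout removes the duplication between `llm` and `dev`.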
|