hdsp-jupyter-extension 2.0.18__py3-none-any.whl → 2.0.20__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- agent_server/langchain/agent_prompts/planner_prompt.py +22 -11
- agent_server/langchain/custom_middleware.py +97 -65
- agent_server/langchain/llm_factory.py +37 -5
- agent_server/langchain/logging_utils.py +41 -16
- agent_server/langchain/models/__init__.py +5 -0
- agent_server/langchain/models/gpt_oss_chat.py +351 -0
- agent_server/langchain/prompts.py +12 -7
- agent_server/routers/langchain_agent.py +10 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/build_log.json +1 -1
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/package.json +2 -2
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/frontend_styles_index_js.037b3c8e5d6a92b63b16.js → hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/frontend_styles_index_js.96745acc14125453fba8.js +36 -2
- hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/frontend_styles_index_js.96745acc14125453fba8.js.map +1 -0
- jupyter_ext/labextension/static/lib_index_js.8f72c63cdf542389aa9d.js → hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/lib_index_js.90f80cb80187de8c5ae5.js +91 -8
- hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/lib_index_js.90f80cb80187de8c5ae5.js.map +1 -0
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/remoteEntry.5099145cc2b28312d170.js → hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/remoteEntry.586bf5521d043cdd37b8.js +3 -3
- jupyter_ext/labextension/static/remoteEntry.5099145cc2b28312d170.js.map → hdsp_jupyter_extension-2.0.20.data/data/share/jupyter/labextensions/hdsp-agent/static/remoteEntry.586bf5521d043cdd37b8.js.map +1 -1
- {hdsp_jupyter_extension-2.0.18.dist-info → hdsp_jupyter_extension-2.0.20.dist-info}/METADATA +1 -1
- {hdsp_jupyter_extension-2.0.18.dist-info → hdsp_jupyter_extension-2.0.20.dist-info}/RECORD +48 -46
- jupyter_ext/_version.py +1 -1
- jupyter_ext/labextension/build_log.json +1 -1
- jupyter_ext/labextension/package.json +2 -2
- jupyter_ext/labextension/static/{frontend_styles_index_js.037b3c8e5d6a92b63b16.js → frontend_styles_index_js.96745acc14125453fba8.js} +36 -2
- jupyter_ext/labextension/static/frontend_styles_index_js.96745acc14125453fba8.js.map +1 -0
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/lib_index_js.8f72c63cdf542389aa9d.js → jupyter_ext/labextension/static/lib_index_js.90f80cb80187de8c5ae5.js +91 -8
- jupyter_ext/labextension/static/lib_index_js.90f80cb80187de8c5ae5.js.map +1 -0
- jupyter_ext/labextension/static/{remoteEntry.5099145cc2b28312d170.js → remoteEntry.586bf5521d043cdd37b8.js} +3 -3
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/remoteEntry.5099145cc2b28312d170.js.map → jupyter_ext/labextension/static/remoteEntry.586bf5521d043cdd37b8.js.map +1 -1
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/frontend_styles_index_js.037b3c8e5d6a92b63b16.js.map +0 -1
- hdsp_jupyter_extension-2.0.18.data/data/share/jupyter/labextensions/hdsp-agent/static/lib_index_js.8f72c63cdf542389aa9d.js.map +0 -1
- jupyter_ext/labextension/static/frontend_styles_index_js.037b3c8e5d6a92b63b16.js.map +0 -1
- jupyter_ext/labextension/static/lib_index_js.8f72c63cdf542389aa9d.js.map +0 -1
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/etc/jupyter/jupyter_server_config.d/hdsp_jupyter_extension.json +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/install.json +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/node_modules_emotion_use-insertion-effect-with-fallbacks_dist_emotion-use-insertion-effect-wi-3ba6b80.c095373419d05e6f141a.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/node_modules_emotion_use-insertion-effect-with-fallbacks_dist_emotion-use-insertion-effect-wi-3ba6b80.c095373419d05e6f141a.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/node_modules_emotion_use-insertion-effect-with-fallbacks_dist_emotion-use-insertion-effect-wi-3ba6b81.61e75fb98ecff46cf836.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/node_modules_emotion_use-insertion-effect-with-fallbacks_dist_emotion-use-insertion-effect-wi-3ba6b81.61e75fb98ecff46cf836.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/style.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_babel_runtime_helpers_esm_extends_js-node_modules_emotion_serialize_dist-051195.e2553aab0c3963b83dd7.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_babel_runtime_helpers_esm_extends_js-node_modules_emotion_serialize_dist-051195.e2553aab0c3963b83dd7.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_cache_dist_emotion-cache_browser_development_esm_js.24edcc52a1c014a8a5f0.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_cache_dist_emotion-cache_browser_development_esm_js.24edcc52a1c014a8a5f0.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_react_dist_emotion-react_browser_development_esm_js.19ecf6babe00caff6b8a.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_react_dist_emotion-react_browser_development_esm_js.19ecf6babe00caff6b8a.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_styled_dist_emotion-styled_browser_development_esm_js.661fb5836f4978a7c6e1.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_emotion_styled_dist_emotion-styled_browser_development_esm_js.661fb5836f4978a7c6e1.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_mui_material_index_js.985697e0162d8d088ca2.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_mui_material_index_js.985697e0162d8d088ca2.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_mui_material_utils_createSvgIcon_js.1f5038488cdfd8b3a85d.js +0 -0
- {hdsp_jupyter_extension-2.0.18.data → hdsp_jupyter_extension-2.0.20.data}/data/share/jupyter/labextensions/hdsp-agent/static/vendors-node_modules_mui_material_utils_createSvgIcon_js.1f5038488cdfd8b3a85d.js.map +0 -0
- {hdsp_jupyter_extension-2.0.18.dist-info → hdsp_jupyter_extension-2.0.20.dist-info}/WHEEL +0 -0
- {hdsp_jupyter_extension-2.0.18.dist-info → hdsp_jupyter_extension-2.0.20.dist-info}/licenses/LICENSE +0 -0
agent_server/langchain/agent_prompts/planner_prompt.py

@@ -23,24 +23,35 @@ PLANNER_SYSTEM_PROMPT = """당신은 작업을 조율하는 Main Agent입니다.
 | athena_query | SQL 쿼리 생성 | task_tool(agent_name="athena_query", description="매출 테이블 조회 쿼리") |
 | researcher | 정보 검색 | task_tool(agent_name="researcher", description="관련 문서 검색") |
 
-## Step 3: 결과
+## Step 3: 결과 실행/적용 (필수!)
 **task_tool을 호출 했다면, 호출 후 반드시 결과를 처리해야 함:**
 
-| 서브에이전트 | 처리 방법 | 예시 |
-
-| python_developer |
-
-
+| 서브에이전트 | 작업 유형 | 처리 방법 | 예시 |
+|-------------|----------|----------|------|
+| python_developer | 코드 실행 (데이터 분석, 시각화) | jupyter_cell_tool | jupyter_cell_tool(code=반환된_코드) |
+| python_developer | **파일 생성/수정** | **write_file_tool 또는 multiedit_file_tool** | write_file_tool(path="script.js", content=반환된_코드) |
+| athena_query | SQL 표시 | markdown_tool | markdown_tool(content="```sql\n반환된_쿼리\n```") |
+| researcher | 텍스트 요약 | 직접 응답 | - |
 
-
+**🔴 중요: 코드 저장 도구 선택**
+- **파일 생성/수정 요청** → `write_file_tool` 또는 `multiedit_file_tool` 사용
+- **코드 실행 요청** (데이터 분석, 차트 등) → `jupyter_cell_tool` 사용
+- **❌ markdown_tool은 코드 저장용이 아님!** (표시 전용)
+
+**중요**: task_tool 결과를 받은 후 바로 write_todos로 완료 처리하지 말고, 반드시 위 도구로 결과를 먼저 적용!
+
+**🔴 KeyboardInterrupt 발생 시**: jupyter_cell_tool 실행 중 KeyboardInterrupt가 발생하면 ask_user_tool로 중단 사유를 사용자에게 확인
+- 예: ask_user_tool(question="코드 실행이 중단되었습니다. 중단 사유를 알려주시면 다음 진행에 참고하겠습니다.", input_type="text")
 
 # write_todos 규칙 [필수]
 - 한국어로 작성
 - **🔴 기존 todo 절대 삭제 금지**: 전체 리스트를 항상 포함하고 status만 변경
-
-
-
-
+- **🔴 상태 전환 순서 필수**: pending → in_progress → completed (건너뛰기 금지!)
+- **🔴 초기 생성 규칙**: 첫 write_todos 호출 시 첫 번째 todo만 in_progress, 나머지는 모두 pending
+- 올바른 초기 예: [{"content": "작업1", "status": "in_progress"}, {"content": "작업2", "status": "pending"}, {"content": "작업 요약 및 다음 단계 제시", "status": "pending"}]
+- 잘못된 초기 예: [{"content": "작업1", "status": "completed"}, ...] ← 실제 작업 없이 completed 금지!
+- **🔴 completed 전환 조건**: 실제 도구(task_tool, jupyter_cell_tool 등)로 작업 수행 후에만 completed로 변경
+- in_progress 상태는 **동시에 1개만** 허용 (completed, pending todo는 삭제하지 않고 모두 유지)
 - content에 도구(tool)명 언급 금지
 - **[필수] 마지막 todo는 반드시 "작업 요약 및 다음 단계 제시"**
 
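The write_todos rules added to the planner prompt boil down to a simple invariant on the todo list: exactly one item in_progress, nothing completed before real work has happened, and the summary item last. A minimal sketch of that invariant follows; it is illustrative only (the sample todo contents and the helper name are made up, not code from the package).

```python
# Hypothetical sketch: a well-formed initial write_todos payload under the
# prompt rules above, plus a check of the "one in_progress, rest pending,
# summary last" invariant. Not part of the package.
todos = [
    {"content": "매출 데이터 조회", "status": "in_progress"},          # first todo starts in_progress
    {"content": "조회 결과 시각화", "status": "pending"},
    {"content": "작업 요약 및 다음 단계 제시", "status": "pending"},    # summary todo is always last
]

def looks_like_valid_initial_todos(items: list[dict]) -> bool:
    """True if the list matches the initial-state rules from the prompt."""
    statuses = [t.get("status") for t in items]
    return (
        statuses.count("in_progress") == 1          # only one task active at a time
        and statuses.count("completed") == 0        # nothing completed before any tool ran
        and items[-1]["content"].startswith("작업 요약")  # last todo is the summary step
    )

print(looks_like_valid_initial_todos(todos))  # True
```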
agent_server/langchain/custom_middleware.py

@@ -444,63 +444,78 @@ def create_handle_empty_response_middleware(wrap_model_call):
 )
 
 if has_summary_pattern:
-#
-
-
-
-
-
-
-
-
-
-
-
-
+    # Check if pending todos exist - if so, don't force complete
+    current_todos = request.state.get("todos", [])
+    pending_todos = [
+        t for t in current_todos
+        if isinstance(t, dict) and t.get("status") == "pending"
+    ]
+    if pending_todos:
+        logger.warning(
+            "Summary JSON detected but pending todos remain - not forcing completion: %s",
+            [t.get("content", "")[:30] for t in pending_todos],
+        )
+        # Don't synthesize completion, return response as-is
+        # Let LLM continue working on pending todos
+    else:
+        # No pending todos, safe to synthesize completion
+        # Try to extract and repair summary JSON from mixed content
+        try:
+            # Try to find JSON object containing summary
+            import re
+            json_match = re.search(r'\{[^{}]*"summary"[^{}]*"next_items"[^{}]*\}', content, re.DOTALL)
+            if json_match:
+                repaired_summary = repair_json(
+                    json_match.group(), return_objects=True
+                )
+            else:
+                repaired_summary = repair_json(
+                    content, return_objects=True
+                )
 
-
-
-
-
-
-
-
-
-)
-logger.info(
-"Detected and repaired summary JSON in content (pattern-based detection)"
-)
-# Create message with repaired content
-repaired_response_message = AIMessage(
-content=repaired_content,
-tool_calls=getattr(
-response_message, "tool_calls", []
+            if (
+                isinstance(repaired_summary, dict)
+                and "summary" in repaired_summary
+                and "next_items" in repaired_summary
+            ):
+                # Create new message with repaired JSON content
+                repaired_content = json.dumps(
+                    repaired_summary, ensure_ascii=False
                 )
-
-
-
-
-repaired_response_message
-
-
-
-
-
-
-
+                logger.info(
+                    "Detected and repaired summary JSON in content (pattern-based detection)"
+                )
+                # Create message with repaired content
+                repaired_response_message = AIMessage(
+                    content=repaired_content,
+                    tool_calls=getattr(
+                        response_message, "tool_calls", []
+                    )
+                    or [],
+                )
+                synthetic_message = _create_synthetic_completion(
+                    request,
+                    repaired_response_message,
+                    has_content=True,
+                )
+                response = _replace_ai_message_in_response(
+                    response, synthetic_message
+                )
+                return response
+        except Exception as e:
+            logger.debug(f"Failed to extract summary JSON from mixed content: {e}")
 
-
-
-
-
-
-
-
-
-
-
+            # Fallback: accept as-is if repair failed but looks like summary
+            logger.info(
+                "Detected summary JSON pattern in content - accepting and synthesizing write_todos"
+            )
+            synthetic_message = _create_synthetic_completion(
+                request, response_message, has_content=True
+            )
+            response = _replace_ai_message_in_response(
+                response, synthetic_message
+            )
+            return response
 
 # Legacy: Also check if current todo is a summary todo (backward compatibility)
 todos = request.state.get("todos", [])
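The middleware change above extracts a `{"summary": ..., "next_items": ...}` object from mixed LLM output with a regex and passes it through `repair_json`. The standalone sketch below reproduces just that extraction step, assuming the `json_repair` package (which exposes `repair_json(..., return_objects=True)`); the sample text is invented.

```python
# Standalone sketch of the summary-extraction step: find the summary JSON in
# mixed output and repair it into a dict. Assumes `pip install json-repair`;
# the sample content is made up.
import re

from json_repair import repair_json

mixed_content = (
    "작업을 모두 마쳤습니다.\n"
    '{"summary": "매출 데이터 분석 완료", "next_items": ["차트 스타일 조정",],}'  # trailing commas: invalid JSON
)

# Same pattern as the middleware: one object containing both keys
json_match = re.search(r'\{[^{}]*"summary"[^{}]*"next_items"[^{}]*\}', mixed_content, re.DOTALL)

raw = json_match.group() if json_match else mixed_content
repaired = repair_json(raw, return_objects=True)  # returns a dict, fixing the minor damage

if isinstance(repaired, dict) and "summary" in repaired and "next_items" in repaired:
    print(repaired["summary"], repaired["next_items"])
```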
agent_server/langchain/custom_middleware.py

@@ -1009,17 +1024,34 @@ def create_normalize_tool_args_middleware(wrap_model_call, tools=None):
 else:
     found_first = True
 
-#
-#
-
-
-
-
-
-
-
-
-
+# Validate: "작업 요약 및 다음 단계 제시" cannot be in_progress if pending todos exist
+# This prevents LLM from skipping pending tasks
+summary_keywords = ["작업 요약", "다음 단계 제시"]
+for i, todo in enumerate(todos):
+    if not isinstance(todo, dict):
+        continue
+    content = todo.get("content", "")
+    is_summary_todo = any(kw in content for kw in summary_keywords)
+
+    if is_summary_todo and todo.get("status") == "in_progress":
+        # Check if there are pending todos before this one
+        pending_before = [
+            t for t in todos[:i]
+            if isinstance(t, dict) and t.get("status") == "pending"
+        ]
+        if pending_before:
+            # Revert summary todo to pending
+            todo["status"] = "pending"
+            # Set the first pending todo to in_progress
+            for t in todos:
+                if isinstance(t, dict) and t.get("status") == "pending":
+                    t["status"] = "in_progress"
+                    logger.warning(
+                        "Reverted summary todo to pending, set '%s' to in_progress (pending todos exist)",
+                        t.get("content", "")[:30],
+                    )
+                    break
+        break
 
 return response
 
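In plain terms, the new validation reverts a summary todo that went in_progress too early and re-activates the first task that is still pending. A self-contained illustration follows (the sample data and the function wrapper are made up; in the package this logic runs inside the middleware, not as a standalone function).

```python
# Illustration of the reordering rule above: the summary todo cannot be
# in_progress while earlier todos are still pending.
def enforce_summary_last(todos: list[dict]) -> list[dict]:
    summary_keywords = ["작업 요약", "다음 단계 제시"]
    for i, todo in enumerate(todos):
        is_summary = any(kw in todo.get("content", "") for kw in summary_keywords)
        if is_summary and todo.get("status") == "in_progress":
            if any(t.get("status") == "pending" for t in todos[:i]):
                todo["status"] = "pending"            # summary todo goes back to pending
                for t in todos:
                    if t.get("status") == "pending":
                        t["status"] = "in_progress"   # first pending task becomes active
                        break
            break
    return todos

todos = [
    {"content": "데이터 조회", "status": "completed"},
    {"content": "시각화", "status": "pending"},
    {"content": "작업 요약 및 다음 단계 제시", "status": "in_progress"},
]
print(enforce_summary_last(todos))
# -> "시각화" becomes in_progress, the summary todo is reverted to pending
```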
agent_server/langchain/llm_factory.py

@@ -97,16 +97,37 @@ def _create_vllm_llm(llm_config: Dict[str, Any], callbacks):
     endpoint = vllm_config.get("endpoint", "http://localhost:8000/v1")
     model = vllm_config.get("model", "default")
     api_key = vllm_config.get("apiKey", "dummy")
+    use_responses_api = vllm_config.get("useResponsesApi", False)
+    temperature = vllm_config.get("temperature", 0.0)
 
-    logger.info(
+    logger.info(
+        f"Creating vLLM LLM with model: {model}, endpoint: {endpoint}, "
+        f"use_responses_api: {use_responses_api}, temperature: {temperature}"
+    )
+
+    # Use ChatGPTOSS for gpt-oss models (Harmony format with developer role)
+    if "gpt-oss" in model.lower():
+        from agent_server.langchain.models import ChatGPTOSS
+
+        logger.info(f"Using ChatGPTOSS for gpt-oss model (developer role support)")
+        return ChatGPTOSS(
+            model=model,
+            base_url=endpoint,
+            api_key=api_key,
+            temperature=temperature,
+            max_tokens=8192,
+            streaming=False,
+            callbacks=callbacks,
+        )
 
     return ChatOpenAI(
         model=model,
         api_key=api_key,
         base_url=endpoint,  # Use endpoint as-is (no /v1 suffix added)
         streaming=False,  # Agent mode: disable LLM streaming (SSE handled by agent server)
-        temperature=
-        max_tokens=
+        temperature=temperature,
+        max_tokens=8192,
+        use_responses_api=use_responses_api,  # Use /v1/responses endpoint if True
         callbacks=callbacks,
     )
 
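The practical effect of this hunk is a routing rule inside `_create_vllm_llm`: any model whose name contains "gpt-oss" is served by the new `ChatGPTOSS` wrapper (Harmony format with a developer role, `max_tokens=8192`, streaming off), everything else keeps using `ChatOpenAI`. Below is a minimal sketch of just that decision, with a made-up config; the keys mirror what the factory reads, the values are examples only.

```python
# Sketch of the model-name routing added above. Config values are examples only.
vllm_config = {
    "endpoint": "http://localhost:8000/v1",
    "model": "gpt-oss-20b",
    "apiKey": "dummy",
    "useResponsesApi": False,
    "temperature": 0.0,
}

model = vllm_config.get("model", "default")

if "gpt-oss" in model.lower():
    chosen = "ChatGPTOSS"   # Harmony/developer-role wrapper from agent_server.langchain.models
else:
    chosen = "ChatOpenAI"   # standard OpenAI-compatible client (optionally /v1/responses)

print(f"{model} -> {chosen}")  # gpt-oss-20b -> ChatGPTOSS
```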
agent_server/langchain/llm_factory.py

@@ -148,14 +169,25 @@ def create_summarization_llm(llm_config: Dict[str, Any]):
             temperature=0.0,
         )
     elif provider == "vllm":
-        from langchain_openai import ChatOpenAI
-
         vllm_config = llm_config.get("vllm", {})
         # User provides full base URL (e.g., https://openrouter.ai/api/v1)
         endpoint = vllm_config.get("endpoint", "http://localhost:8000/v1")
         model = vllm_config.get("model", "default")
         api_key = vllm_config.get("apiKey", "dummy")
 
+        # Use ChatGPTOSS for gpt-oss models
+        if "gpt-oss" in model.lower():
+            from agent_server.langchain.models import ChatGPTOSS
+
+            return ChatGPTOSS(
+                model=model,
+                base_url=endpoint,
+                api_key=api_key,
+                temperature=0.0,
+            )
+
+        from langchain_openai import ChatOpenAI
+
         return ChatOpenAI(
             model=model,
             api_key=api_key,
agent_server/langchain/logging_utils.py

@@ -14,8 +14,36 @@ from langchain_core.callbacks import BaseCallbackHandler
 
 logger = logging.getLogger(__name__)
 
+# Dedicated logger for LLM responses - always enabled with its own handler
+llm_response_logger = logging.getLogger("agent_server.llm_response")
+llm_response_logger.setLevel(logging.INFO)
+llm_response_logger.propagate = True  # Propagate to root logger
+
+# Ensure it has a handler if running standalone
+if not llm_response_logger.handlers and not logging.getLogger().handlers:
+    _handler = logging.StreamHandler()
+    _handler.setFormatter(logging.Formatter('%(message)s'))
+    llm_response_logger.addHandler(_handler)
+
+
+def disable_langchain_logging():
+    """Disable all langchain logging except LLM responses."""
+    # Set all langchain loggers to CRITICAL
+    for name in list(logging.Logger.manager.loggerDict.keys()):
+        if "langchain" in name.lower() or name.startswith("agent_server.langchain"):
+            logging.getLogger(name).setLevel(logging.CRITICAL)
+    # Keep LLM response logger at INFO
+    llm_response_logger.setLevel(logging.INFO)
+
+
+# Auto-disable on import (comment this line to re-enable all logs)
+disable_langchain_logging()
+
 LOG_SEPARATOR = "=" * 96
 LOG_SUBSECTION = "-" * 96
+LOG_EMOJI_LINE = "🔵" * 48
+LOG_RESPONSE_START = f"\n\n{LOG_EMOJI_LINE}\n{'=' * 96}\n ✨ LLM RESPONSE START\n{'=' * 96}"
+LOG_RESPONSE_END = f"{'=' * 96}\n ✅ LLM RESPONSE END\n{'=' * 96}\n{LOG_EMOJI_LINE}\n"
 
 
 def _format_system_prompt_for_log(messages) -> tuple[int, int, str]:
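The intent of this logging change is selective silencing: every langchain-related logger is pushed to CRITICAL on import, while the dedicated `agent_server.llm_response` logger stays at INFO. A stdlib-only sketch of the same pattern is below; the logger names mirror the diff, the messages are made up.

```python
# Stdlib-only sketch of the selective silencing above: noisy langchain/agent
# loggers go to CRITICAL, the dedicated response logger stays audible.
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")

noisy = logging.getLogger("agent_server.langchain.custom_middleware")
llm_response_logger = logging.getLogger("agent_server.llm_response")

# Same idea as disable_langchain_logging(): sweep known loggers and raise their level.
for name in list(logging.Logger.manager.loggerDict.keys()):
    if "langchain" in name.lower() or name.startswith("agent_server.langchain"):
        logging.getLogger(name).setLevel(logging.CRITICAL)
llm_response_logger.setLevel(logging.INFO)

noisy.info("middleware chatter")            # suppressed: logger level is CRITICAL
llm_response_logger.info("LLM RESPONSE")    # printed via the root handler
```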
agent_server/langchain/logging_utils.py

@@ -179,15 +207,15 @@ class LLMTraceLogger(BaseCallbackHandler):
         logger.info("%s", "\n".join(lines))
 
     def on_chat_model_start(self, serialized, messages, **kwargs) -> None:
-
-
-"%s",
-_format_messages_block("AGENT -> LLM PROMPT (<none>)", []),
-)
-return
-self._log_prompt_batches("AGENT -> LLM PROMPT", messages)
+        # Request logging disabled - only log responses
+        pass
 
     def on_chat_model_end(self, response, **kwargs) -> None:
+        # Debug: Check if callback is even called
+        print("[DEBUG] on_chat_model_end CALLED!", flush=True)
+        # Use print for guaranteed visibility
+        print(LOG_RESPONSE_START, flush=True)
+
         generations = getattr(response, "generations", None) or []
         if generations and isinstance(generations[0], list):
             batches = generations
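Note the design choice running through this and the following two hunks: response tracing now goes through `print(..., flush=True)` rather than `logger.info`, so the trace stays visible even after `disable_langchain_logging()` mutes the module loggers. A simplified stand-in for that pattern (the class and banners below are illustrative, not the package's real handler):

```python
# Simplified stand-in for the print-based response tracing: banners emitted
# with print(..., flush=True) bypass logging levels and output buffering.
RESPONSE_START = "=" * 36 + " LLM RESPONSE START " + "=" * 36
RESPONSE_END = "=" * 36 + "  LLM RESPONSE END  " + "=" * 36

class PrintTraceLogger:
    def on_chat_model_end(self, response_text: str) -> None:
        print(RESPONSE_START, flush=True)
        print(response_text, flush=True)
        print(RESPONSE_END, flush=True)

PrintTraceLogger().on_chat_model_end('{"summary": "done", "next_items": []}')
```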
agent_server/langchain/logging_utils.py

@@ -203,7 +231,7 @@ class LLMTraceLogger(BaseCallbackHandler):
                 title = (
                     f"LLM -> AGENT RESPONSE (batch={batch_idx}, generation={gen_idx})"
                 )
-
+                print(_format_messages_block(title, [message]), flush=True)
 
                 tool_calls = getattr(message, "tool_calls", None)
                 if tool_calls:
agent_server/langchain/logging_utils.py

@@ -211,13 +239,10 @@ class LLMTraceLogger(BaseCallbackHandler):
                         "LLM -> AGENT TOOL CALLS "
                         f"(batch={batch_idx}, generation={gen_idx})"
                     )
-
+                    print(_format_json_block(tool_title, tool_calls), flush=True)
 
-
-if not prompts:
-logger.info("%s", _format_json_block("LLM PROMPT (<none>)", ""))
-return
+        print(LOG_RESPONSE_END, flush=True)
 
-
-
-
+    def on_llm_start(self, serialized, prompts, **kwargs) -> None:
+        # Request logging disabled - only log responses
+        pass