neuromeka-vfm 0.1.1__tar.gz → 0.1.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (22)
  1. neuromeka_vfm-0.1.3/PKG-INFO +159 -0
  2. neuromeka_vfm-0.1.3/README.md +113 -0
  3. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/pyproject.toml +1 -1
  4. neuromeka_vfm-0.1.3/src/neuromeka_vfm/examples/__init__.py +1 -0
  5. neuromeka_vfm-0.1.3/src/neuromeka_vfm/examples/pose_demo.py +364 -0
  6. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/pose_estimation.py +30 -21
  7. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/segmentation.py +39 -3
  8. neuromeka_vfm-0.1.3/src/neuromeka_vfm.egg-info/PKG-INFO +159 -0
  9. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm.egg-info/SOURCES.txt +3 -1
  10. neuromeka_vfm-0.1.1/PKG-INFO +0 -109
  11. neuromeka_vfm-0.1.1/README.md +0 -64
  12. neuromeka_vfm-0.1.1/src/neuromeka_vfm.egg-info/PKG-INFO +0 -109
  13. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/LICENSE +0 -0
  14. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/setup.cfg +0 -0
  15. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/__init__.py +0 -0
  16. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/compression.py +0 -0
  17. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/pickle_client.py +0 -0
  18. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm/upload_mesh.py +0 -0
  19. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm.egg-info/dependency_links.txt +0 -0
  20. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm.egg-info/entry_points.txt +0 -0
  21. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm.egg-info/requires.txt +0 -0
  22. {neuromeka_vfm-0.1.1 → neuromeka_vfm-0.1.3}/src/neuromeka_vfm.egg-info/top_level.txt +0 -0
@@ -0,0 +1,159 @@
+ Metadata-Version: 2.4
+ Name: neuromeka_vfm
+ Version: 0.1.3
+ Summary: Client utilities for Neuromeka VFM FoundationPose RPC (upload meshes, call server)
+ Author: Neuromeka
+ License: MIT License
+
+ Copyright (c) 2025 Neuromeka Co., Ltd.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: numpy
+ Requires-Dist: pyzmq
+ Requires-Dist: paramiko
+ Requires-Dist: av
+ Requires-Dist: opencv-python-headless
+ Dynamic: license-file
+
+ # neuromeka_vfm
+
+ A small utility package for communicating with Segmentation (SAM2, Grounding DINO) and Pose Estimation (NVIDIA FoundationPose) servers (RPC, ZeroMQ) from a client PC, and for uploading meshes to the host over SSH/SFTP.
+
+ - Website: http://www.neuromeka.com
+ - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
+ - PyPI package: https://pypi.org/project/neuromeka_vfm/
+ - Documents: https://docs.neuromeka.com
+
+ ## VFM (Vision Foundation Model) latency benchmark
+ Measured against a locally running server. Empty cells have not been measured yet.
+
+ **RTX 5060**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | 0.86 | 0.35 | 0.50 | 0.52 |
+ | DINOv2 | image prompt | 0.85 | 0.49 | 0.65 | 0.63 |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+ **RTX 5090**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | | | | |
+ | DINOv2 | image prompt | | | | |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+
+ ## Installation
+ ```bash
+ pip install neuromeka_vfm
+ ```
+
+ ## Usage examples
+ ### Python API
+ ```python
+ from neuromeka_vfm import PoseEstimation, upload_mesh
+ # (Optional) A realtime segmentation client is also included.
+
+ # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
+ upload_mesh(
+     host="192.168.10.72",
+     user="user",
+     password="pass",  # or key="~/.ssh/id_rsa"
+     local="mesh/123.stl",
+     remote="/home/user/meshes/123.stl",
+ )
+
+ # 2) PoseEstimation client
+ pose = PoseEstimation(host="192.168.10.72", port=5557)
+ pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
+ # ...
+ pose.close()
+
+ # 3) Realtime segmentation client (example)
+ from neuromeka_vfm import Segmentation
+ seg = Segmentation(
+     hostname="192.168.10.63",
+     port=5432,  # port of the corresponding Docker/server
+     compression_strategy="png",  # none | png | jpeg | h264
+     benchmark=False,
+ )
+ # seg.register_first_frame(...)
+ # seg.get_next(...)
+ # seg.reset()
+ # seg.finish()
+ ```
+
+ ## Notes
+ - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
+ - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
+
+
+
+ ## API reference (Python)
+
+ ### PoseEstimation (FoundationPose RPC)
+ - `PoseEstimation(host=None, port=None)`
+   - `host`: IP of the PC running the FoundationPose Docker server.
+   - `port`: RPC port (default 5557).
+ - `init(mesh_path, apply_scale=1.0, force_apply_color=False, apply_color=(160,160,160), est_refine_iter=10, track_refine_iter=3, min_n_views=40, inplane_step=60)`: registers the mesh with the server and initializes it.
+ - `register(rgb, depth, mask, K, iteration=None, check_vram=True)`: registers the first frame. If `iteration` is omitted, the server's default refine-iteration count is used; `check_vram=False` skips the GPU memory pre-check.
+ - `track(rgb, depth, K, iteration=None, bbox_xywh=None)`: tracks/updates the pose. If `bbox_xywh` is provided, the search is narrowed to that region.
+ - `reset()`: resets the session.
+ - `reset_object()`: re-invokes the server-side `reset_object` with the cached mesh.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Segmentation (realtime SAM2/GroundingDINO)
+ - `Segmentation(hostname, port, compression_strategy="none", benchmark=False)`:
+   - `compression_strategy`: `none|png|jpeg|h264`
+   - `hostname`: IP of the PC running the segmentation Docker server.
+ - `add_image_prompt(object_name, object_image)`: registers an image prompt.
+ - `register_first_frame(frame, prompt, use_image_prompt=False) -> bool`: registers the first frame; returns `True` on success. With `use_image_prompt=True`, every name must be registered beforehand via `add_image_prompt` (a missing name raises `ValueError`).
+ - `get_next(frame) -> dict[obj_id, mask] | None`: segmentation/tracking result for the next frame.
+ - `switch_compression_strategy(compression_strategy)`: switches the compression strategy at runtime.
+ - `reset()`: resets internal state and benchmark timers.
+ - `finish()`: clears local state.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Upload CLI/API
+ - `upload_mesh(host, user, port=22, password=None, key=None, local=None, remote=None)`: transfers a mesh over SSH/SFTP; either a password or a key is required.
+ - CLI: `neuromeka-upload-mesh --host ... --user ... (--password ... | --key ...) --local ... --remote ...`
+
+ ### Examples
+ - Realtime Pose + Segmentation demo: `python -m neuromeka_vfm.examples.pose_demo` (requires a RealSense camera and running servers).
+
+
+ ## Release notes
+ - 0.1.1: improved resource cleanup in PoseEstimation/Segmentation, server defaults used when iteration is omitted, added the pose demo example.
+ - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.
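The host-path/container-path relationship described in the notes above is easy to trip over: `upload_mesh` writes to the host side of the mount, while `pose.init` must reference the same file through the container side. A minimal sketch under the mount shown in the README (`-v /home/user/meshes:/app/modules/foundation_pose/mesh`), reusing the example addresses from the docs:

```python
from neuromeka_vfm import PoseEstimation, upload_mesh

# Write to the HOST side of the mount...
upload_mesh(
    host="192.168.10.72", user="user", password="pass",
    local="mesh/123.stl",
    remote="/home/user/meshes/123.stl",  # host path
)

# ...and reference the same file through the CONTAINER side of the mount.
pose = PoseEstimation(host="192.168.10.72", port=5557)
pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")  # container path
pose.close()
```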
@@ -0,0 +1,113 @@
+ # neuromeka_vfm
+
+ A small utility package for communicating with Segmentation (SAM2, Grounding DINO) and Pose Estimation (NVIDIA FoundationPose) servers (RPC, ZeroMQ) from a client PC, and for uploading meshes to the host over SSH/SFTP.
+
+ - Website: http://www.neuromeka.com
+ - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
+ - PyPI package: https://pypi.org/project/neuromeka_vfm/
+ - Documents: https://docs.neuromeka.com
+
+ ## VFM (Vision Foundation Model) latency benchmark
+ Measured against a locally running server. Empty cells have not been measured yet.
+
+ **RTX 5060**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | 0.86 | 0.35 | 0.50 | 0.52 |
+ | DINOv2 | image prompt | 0.85 | 0.49 | 0.65 | 0.63 |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+ **RTX 5090**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | | | | |
+ | DINOv2 | image prompt | | | | |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+
+ ## Installation
+ ```bash
+ pip install neuromeka_vfm
+ ```
+
+ ## Usage examples
+ ### Python API
+ ```python
+ from neuromeka_vfm import PoseEstimation, upload_mesh
+ # (Optional) A realtime segmentation client is also included.
+
+ # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
+ upload_mesh(
+     host="192.168.10.72",
+     user="user",
+     password="pass",  # or key="~/.ssh/id_rsa"
+     local="mesh/123.stl",
+     remote="/home/user/meshes/123.stl",
+ )
+
+ # 2) PoseEstimation client
+ pose = PoseEstimation(host="192.168.10.72", port=5557)
+ pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
+ # ...
+ pose.close()
+
+ # 3) Realtime segmentation client (example)
+ from neuromeka_vfm import Segmentation
+ seg = Segmentation(
+     hostname="192.168.10.63",
+     port=5432,  # port of the corresponding Docker/server
+     compression_strategy="png",  # none | png | jpeg | h264
+     benchmark=False,
+ )
+ # seg.register_first_frame(...)
+ # seg.get_next(...)
+ # seg.reset()
+ # seg.finish()
+ ```
+
+ ## Notes
+ - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
+ - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
+
+
+
+ ## API reference (Python)
+
+ ### PoseEstimation (FoundationPose RPC)
+ - `PoseEstimation(host=None, port=None)`
+   - `host`: IP of the PC running the FoundationPose Docker server.
+   - `port`: RPC port (default 5557).
+ - `init(mesh_path, apply_scale=1.0, force_apply_color=False, apply_color=(160,160,160), est_refine_iter=10, track_refine_iter=3, min_n_views=40, inplane_step=60)`: registers the mesh with the server and initializes it.
+ - `register(rgb, depth, mask, K, iteration=None, check_vram=True)`: registers the first frame. If `iteration` is omitted, the server's default refine-iteration count is used; `check_vram=False` skips the GPU memory pre-check.
+ - `track(rgb, depth, K, iteration=None, bbox_xywh=None)`: tracks/updates the pose. If `bbox_xywh` is provided, the search is narrowed to that region.
+ - `reset()`: resets the session.
+ - `reset_object()`: re-invokes the server-side `reset_object` with the cached mesh.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Segmentation (realtime SAM2/GroundingDINO)
+ - `Segmentation(hostname, port, compression_strategy="none", benchmark=False)`:
+   - `compression_strategy`: `none|png|jpeg|h264`
+   - `hostname`: IP of the PC running the segmentation Docker server.
+ - `add_image_prompt(object_name, object_image)`: registers an image prompt.
+ - `register_first_frame(frame, prompt, use_image_prompt=False) -> bool`: registers the first frame; returns `True` on success. With `use_image_prompt=True`, every name must be registered beforehand via `add_image_prompt` (a missing name raises `ValueError`).
+ - `get_next(frame) -> dict[obj_id, mask] | None`: segmentation/tracking result for the next frame.
+ - `switch_compression_strategy(compression_strategy)`: switches the compression strategy at runtime.
+ - `reset()`: resets internal state and benchmark timers.
+ - `finish()`: clears local state.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Upload CLI/API
+ - `upload_mesh(host, user, port=22, password=None, key=None, local=None, remote=None)`: transfers a mesh over SSH/SFTP; either a password or a key is required.
+ - CLI: `neuromeka-upload-mesh --host ... --user ... (--password ... | --key ...) --local ... --remote ...`
+
+ ### Examples
+ - Realtime Pose + Segmentation demo: `python -m neuromeka_vfm.examples.pose_demo` (requires a RealSense camera and running servers).
+
+
+ ## Release notes
+ - 0.1.1: improved resource cleanup in PoseEstimation/Segmentation, server defaults used when iteration is omitted, added the pose demo example.
+ - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "neuromeka_vfm"
- version = "0.1.1"
+ version = "0.1.3"
 description = "Client utilities for Neuromeka VFM FoundationPose RPC (upload meshes, call server)"
 readme = "README.md"
 requires-python = ">=3.8"
@@ -0,0 +1 @@
+ # Examples package for neuromeka_vfm
@@ -0,0 +1,364 @@
+ """
+ Real-time demo using neuromeka_vfm Segmentation + PoseEstimation clients.
+ - Streams RGB/Depth from an attached RealSense
+ - Uses segmentation masks to register/track in FoundationPose
+ - Renders 3D bounding box + axes overlay in a window
+
+ Requirements:
+ - A running segmentation server (SAM2/GroundingDINO) reachable by ZeroMQ
+ - A running FoundationPose RPC server
+ - A connected RealSense camera (pyrealsense2 installed)
+ """
+
+ import os
+ import time
+ from typing import Optional, Tuple
+
+ import cv2
+ import numpy as np
+ import pyrealsense2 as rs
+
+ from neuromeka_vfm import PoseEstimation, Segmentation
+
+
+ def to_homogeneous(pts: np.ndarray) -> np.ndarray:
+     """Append 1.0 to points for homogeneous projection."""
+     assert pts.ndim == 2, f"Expected (N,2|3), got {pts.shape}"
+     return np.concatenate((pts, np.ones((pts.shape[0], 1))), axis=-1)
+
+
+ def draw_posed_3d_box(
+     K: np.ndarray,
+     img: np.ndarray,
+     ob_in_cam: np.ndarray,
+     bbox: np.ndarray,
+     line_color=(0, 255, 0),
+     linewidth=2,
+ ) -> np.ndarray:
+     """Project a 3D bbox onto the image."""
+
+     def project_line(start, end, canvas):
+         pts = np.stack((start, end), axis=0).reshape(-1, 3)
+         pts = (ob_in_cam @ to_homogeneous(pts).T).T[:, :3]
+         projected = (K @ pts.T).T
+         uv = np.round(projected[:, :2] / projected[:, 2].reshape(-1, 1)).astype(int)
+         return cv2.line(
+             canvas,
+             uv[0].tolist(),
+             uv[1].tolist(),
+             color=line_color,
+             thickness=linewidth,
+             lineType=cv2.LINE_AA,
+         )
+
+     min_xyz = bbox.min(axis=0)
+     max_xyz = bbox.max(axis=0)
+     xmin, ymin, zmin = min_xyz
+     xmax, ymax, zmax = max_xyz
+
+     for y in [ymin, ymax]:
+         for z in [zmin, zmax]:
+             start = np.array([xmin, y, z])
+             end = start + np.array([xmax - xmin, 0, 0])
+             img = project_line(start, end, img)
+
+     for x in [xmin, xmax]:
+         for z in [zmin, zmax]:
+             start = np.array([x, ymin, z])
+             end = start + np.array([0, ymax - ymin, 0])
+             img = project_line(start, end, img)
+
+     for x in [xmin, xmax]:
+         for y in [ymin, ymax]:
+             start = np.array([x, y, zmin])
+             end = start + np.array([0, 0, zmax - zmin])
+             img = project_line(start, end, img)
+     return img
+
+
+ def project_point(pt: np.ndarray, K: np.ndarray, ob_in_cam: np.ndarray) -> np.ndarray:
+     pt = pt.reshape(4, 1)
+     projected = K @ ((ob_in_cam @ pt)[:3, :])
+     projected = projected.reshape(-1)
+     projected = projected / projected[2]
+     return projected[:2].round().astype(int)
+
+
+ def draw_axes(
+     img: np.ndarray,
+     ob_in_cam: np.ndarray,
+     scale: float,
+     K: np.ndarray,
+     thickness: int = 3,
+     transparency: float = 0.0,
+     is_input_rgb: bool = False,
+ ) -> np.ndarray:
+     """Overlay XYZ axes."""
+     if is_input_rgb:
+         img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
+     origin = tuple(project_point(np.array([0, 0, 0, 1]), K, ob_in_cam))
+     xx = tuple(project_point(np.array([scale, 0, 0, 1]), K, ob_in_cam))
+     yy = tuple(project_point(np.array([0, scale, 0, 1]), K, ob_in_cam))
+     zz = tuple(project_point(np.array([0, 0, scale, 1]), K, ob_in_cam))
+
+     base = img.copy()
+     next_img = cv2.arrowedLine(base.copy(), origin, xx, color=(0, 0, 255), thickness=thickness, line_type=cv2.LINE_AA)
+     mask = np.linalg.norm(next_img - base, axis=-1) > 0
+     base[mask] = base[mask] * transparency + next_img[mask] * (1 - transparency)
+
+     next_img = cv2.arrowedLine(base.copy(), origin, yy, color=(0, 255, 0), thickness=thickness, line_type=cv2.LINE_AA)
+     mask = np.linalg.norm(next_img - base, axis=-1) > 0
+     base[mask] = base[mask] * transparency + next_img[mask] * (1 - transparency)
+
+     next_img = cv2.arrowedLine(base.copy(), origin, zz, color=(255, 0, 0), thickness=thickness, line_type=cv2.LINE_AA)
+     mask = np.linalg.norm(next_img - base, axis=-1) > 0
+     base[mask] = base[mask] * transparency + next_img[mask] * (1 - transparency)
+
+     if is_input_rgb:
+         base = cv2.cvtColor(base.astype(np.uint8), cv2.COLOR_BGR2RGB)
+     return base.astype(np.uint8)
+
+
+ def setup_realsense(
+     rgb_shape: Tuple[int, int] = (960, 540),
+     depth_shape: Tuple[int, int] = (640, 480),
+     fps: int = 30,
+ ) -> Tuple[rs.pipeline, rs.align, float, np.ndarray]:
+     pipeline = rs.pipeline()
+     config = rs.config()
+     config.enable_stream(rs.stream.color, rgb_shape[0], rgb_shape[1], rs.format.bgr8, fps)
+     config.enable_stream(rs.stream.depth, depth_shape[0], depth_shape[1], rs.format.z16, fps)
+
+     profile = pipeline.start(config)
+     align_to_color = rs.align(rs.stream.color)
+
+     depth_sensor = profile.get_device().first_depth_sensor()
+     depth_scale = depth_sensor.get_depth_scale()
+
+     color_stream = profile.get_stream(rs.stream.color).as_video_stream_profile()
+     intr = color_stream.get_intrinsics()
+     cam_K = np.array(
+         [[intr.fx, 0.0, intr.ppx], [0.0, intr.fy, intr.ppy], [0.0, 0.0, 1.0]],
+         dtype=np.float32,
+     )
+     print("Intrinsics W,H:", intr.width, intr.height)
+     print("cam_K:\n", cam_K)
+     print("Depth scale:", depth_scale)
+     if depth_sensor.supports(rs.option.min_distance):
+         print("min_distance:", depth_sensor.get_option(rs.option.min_distance))
+     if depth_sensor.supports(rs.option.max_distance):
+         print("max_distance:", depth_sensor.get_option(rs.option.max_distance))
+
+     return pipeline, align_to_color, depth_scale, cam_K
+
+
+ def to_binary_mask(mask_obj: Optional[object]) -> Optional[np.ndarray]:
+     """Normalize mask outputs (dict/array) to a single uint8 binary mask."""
+     if mask_obj is None:
+         return None
+     if isinstance(mask_obj, dict):
+         if not mask_obj:
+             return None
+         mask_arrays = [np.asarray(m) for m in mask_obj.values() if m is not None]
+         if not mask_arrays:
+             return None
+         mask_arrays = [np.squeeze(m) for m in mask_arrays]
+         mask = np.zeros_like(mask_arrays[0], dtype=np.uint8)
+         for m in mask_arrays:
+             mask |= (np.asarray(m) > 0).astype(np.uint8)
+         if mask.ndim > 2:
+             mask = np.squeeze(mask)
+         return mask.astype(np.uint8)
+
+     mask_arr = np.asarray(mask_obj)
+     if mask_arr.ndim > 2:
+         mask_arr = np.squeeze(mask_arr)
+     if mask_arr.ndim > 2:
+         mask_arr = mask_arr[..., 0]
+     return (mask_arr > 0).astype(np.uint8)
+
+
+ def bbox_to_mask(bbox_xywh, shape_hw) -> Optional[np.ndarray]:
+     """Convert xywh bbox to a binary mask."""
+     if bbox_xywh is None:
+         return None
+     x, y, w, h = bbox_xywh
+     h_img, w_img = shape_hw
+     if w <= 0 or h <= 0:
+         return None
+     mask = np.zeros((h_img, w_img), dtype=np.uint8)
+     x0, y0 = int(x), int(y)
+     x1, y1 = min(w_img, x0 + int(w)), min(h_img, y0 + int(h))
+     mask[y0:y1, x0:x1] = 1
+     return mask
+
+
+ def overlay_mask(img_rgb: np.ndarray, mask: Optional[np.ndarray]) -> np.ndarray:
+     if mask is None:
+         return img_rgb
+     mask_bool = mask.astype(bool)
+     if not np.any(mask_bool):
+         return img_rgb
+     overlay = img_rgb.copy()
+     overlay_color = np.array([0, 255, 0], dtype=overlay.dtype)
+     overlay[mask_bool] = (
+         0.6 * overlay[mask_bool].astype(np.float32) + 0.4 * overlay_color.astype(np.float32)
+     ).astype(overlay.dtype)
+     return overlay
+
+
+ def bbox_from_mask(mask_np: np.ndarray):
+     erosion_size = 5
+     kernel = np.ones((erosion_size, erosion_size), np.uint8)
+     mask_np = cv2.erode(mask_np.astype(np.uint8), kernel, iterations=1)
+
+     rows = np.any(mask_np, axis=1)
+     cols = np.any(mask_np, axis=0)
+     if np.any(rows) and np.any(cols):
+         y_min, y_max = np.where(rows)[0][[0, -1]]
+         x_min, x_max = np.where(cols)[0][[0, -1]]
+         bbox = [x_min, y_min, x_max - x_min, y_max - y_min]
+     else:
+         bbox = [-1, -1, 0, 0]
+     return bbox
+
+
+ def main():
+     mesh_path = os.environ.get("FPOSE_MESH_PATH", "/app/modules/foundation_pose/mesh/drug_box.stl")
+     fpose_host = os.environ.get("FPOSE_HOST", "192.168.10.69")
+     fpose_port = int(os.environ.get("FPOSE_PORT", "5557"))
+     seg_host = os.environ.get("SEG_HOST", "192.168.10.69")
+     seg_port = int(os.environ.get("SEG_PORT", "5432"))
+     seg_prompt = os.environ.get("SEG_PROMPT", "piece")
+     seg_ref_image = os.environ.get("SEG_REF_IMAGE", "piece.jpg")
+     seg_compression = os.environ.get("SEG_COMPRESSION", "none")
+
+     pipeline, align_to_color, depth_scale, cam_K = setup_realsense()
+
+     print("Init detector. path", seg_ref_image)
+     detector = Segmentation(seg_host, seg_port, compression_strategy=seg_compression)
+     use_image_prompt = False
+     if os.path.exists(seg_ref_image):
+         ref_img = cv2.imread(seg_ref_image)
+         if ref_img is not None:
+             ref_rgb = cv2.cvtColor(ref_img, cv2.COLOR_BGR2RGB)
+             detector.add_image_prompt(seg_prompt, ref_rgb)
+             use_image_prompt = True
+             print(f"Loaded image prompt '{seg_ref_image}' for '{seg_prompt}'")
+         else:
+             print(f"Failed to load reference image at {seg_ref_image}, falling back to text prompt.")
+     else:
+         print(f"Reference image {seg_ref_image} not found; using text prompt only.")
+
+     print("Init FP")
+     pose = PoseEstimation(host=fpose_host, port=fpose_port)
+     init_resp = pose.init(
+         mesh_path=mesh_path,
+         apply_scale=float(os.environ.get("FPOSE_APPLY_SCALE", "1.0")),
+         force_apply_color=os.environ.get("FPOSE_FORCE_APPLY_COLOR", "false").lower() == "true",
+         apply_color=tuple(map(float, os.environ.get("FPOSE_APPLY_COLOR", "160,160,160").split(","))),
+         est_refine_iter=7,
+         track_refine_iter=3,
+         min_n_views=5,
+         inplane_step=150,
+     )
+     if init_resp.get("result") != "SUCCESS":
+         print("Init failed:", init_resp)
+         return
+     to_origin = np.asarray(init_resp["data"]["to_origin"])
+     bbox = np.asarray(init_resp["data"]["bbox"])
+
+     initialized = False
+     current_pose = None
+     current_mask = None
+
+     cv2.namedWindow("neuromeka_vfm Pose Demo", cv2.WINDOW_NORMAL)
+
+     try:
+         while True:
+             frames = pipeline.wait_for_frames()
+             frames = align_to_color.process(frames)
+             color_frame = frames.get_color_frame()
+             depth_frame = frames.get_depth_frame()
+             if not color_frame or not depth_frame:
+                 continue
+
+             color_bgr = np.asanyarray(color_frame.get_data())
+             depth = np.asanyarray(depth_frame.get_data()).astype(np.float32)
+             depth *= depth_scale
+             depth[(depth < 0.001) | np.isinf(depth)] = 0
+             color = cv2.cvtColor(color_bgr, cv2.COLOR_BGR2RGB)
+
+             if not initialized:
+                 tic = time.time()
+                 if detector.register_first_frame(color, seg_prompt, use_image_prompt=use_image_prompt):
+                     detector.get_next(color)
+                     init_mask_raw = detector.current_frame_masks
+                 else:
+                     init_mask_raw = None
+                     print("First segmentation failed.")
+                     cv2.imshow("neuromeka_vfm Pose Demo", color_bgr)
+                     if cv2.waitKey(1) & 0xFF == 27:
+                         break
+                     continue
+
+                 init_mask = to_binary_mask(init_mask_raw)
+                 if init_mask is None:
+                     print("No mask returned on first frame.")
+                     continue
+                 mask_uint8 = init_mask.astype(np.uint8) * 255
+
+                 resp = pose.register(rgb=color, depth=depth, mask=mask_uint8, K=cam_K, iteration=7)
+                 if resp.get("result") != "SUCCESS":
+                     print("Register failed:", resp)
+                     continue
+                 current_pose = np.asarray(resp["data"]["pose"])
+                 current_mask = init_mask
+                 initialized = True
+                 print(f"Registration: {time.time() - tic:.4f}s")
+             else:
+                 tic = time.time()
+                 detector.get_next(color)
+                 current_mask_raw = detector.current_frame_masks
+                 current_mask = to_binary_mask(current_mask_raw)
+                 bbox_xywh = bbox_from_mask(current_mask) if current_mask is not None else None
+                 if bbox_xywh is not None and (bbox_xywh[2] <= 0 or bbox_xywh[3] <= 0):
+                     bbox_xywh = None
+                 resp = pose.track(rgb=color, depth=depth, K=cam_K, iteration=3, bbox_xywh=bbox_xywh)
+                 if resp.get("result") != "SUCCESS":
+                     print("Track failed:", resp)
+                     continue
+                 data = resp.get("data", {})
+                 current_pose = np.asarray(data.get("pose"))
+                 print(f"Track: {time.time() - tic:.4f}s")
+
+             pose_np = current_pose[0] if current_pose.ndim == 3 else current_pose
+             center_pose = pose_np @ np.linalg.inv(to_origin)
+             vis_rgb = overlay_mask(color, current_mask)
+             vis_rgb = draw_posed_3d_box(cam_K, img=vis_rgb, ob_in_cam=center_pose, bbox=bbox)
+             vis_rgb = draw_axes(
+                 vis_rgb,
+                 ob_in_cam=center_pose,
+                 scale=0.1,
+                 K=cam_K,
+                 thickness=3,
+                 transparency=0,
+                 is_input_rgb=True,
+             )
+
+             vis_bgr = cv2.cvtColor(vis_rgb, cv2.COLOR_RGB2BGR)
+             cv2.imshow("neuromeka_vfm Pose Demo", vis_bgr)
+             if cv2.waitKey(1) & 0xFF == 27:
+                 break
+
+     except KeyboardInterrupt:
+         pass
+     finally:
+         detector.close()
+         pipeline.stop()
+         cv2.destroyAllWindows()
+         pose.close()
+
+
+ if __name__ == "__main__":
+     main()
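The demo is configured entirely through the environment variables read at the top of `main()`. A minimal launcher sketch, assuming the servers are reachable (the addresses below are just the demo defaults and will almost certainly need changing):

```python
# Sketch: the env var names come from main() above; values are illustrative.
import os

os.environ.setdefault("FPOSE_HOST", "192.168.10.69")   # FoundationPose RPC server
os.environ.setdefault("FPOSE_PORT", "5557")
os.environ.setdefault("SEG_HOST", "192.168.10.69")     # segmentation server
os.environ.setdefault("SEG_PORT", "5432")
os.environ.setdefault("SEG_PROMPT", "piece")           # text prompt (or image-prompt name)
os.environ.setdefault("SEG_COMPRESSION", "jpeg")       # none | png | jpeg | h264

from neuromeka_vfm.examples.pose_demo import main

main()
```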
@@ -41,29 +41,38 @@ class PoseEstimation:
            }
        )
 
-     def register(self, rgb: np.ndarray, depth: np.ndarray, mask: np.ndarray, K: np.ndarray, iteration: int = None):
-         return self.client.send_data(
-             {
-                 "operation": "register",
-                 "rgb": rgb,
-                 "depth": depth,
-                 "mask": mask,
-                 "K": K,
-                 "iteration": iteration,
-             }
-         )
+     def register(
+         self,
+         rgb: np.ndarray,
+         depth: np.ndarray,
+         mask: np.ndarray,
+         K: np.ndarray,
+         iteration: int = None,
+         check_vram: bool = True,
+     ):
+         payload = {
+             "operation": "register",
+             "rgb": rgb,
+             "depth": depth,
+             "mask": mask,
+             "K": K,
+             "check_vram": bool(check_vram),
+         }
+         if iteration is not None:
+             payload["iteration"] = iteration
+         return self.client.send_data(payload)
 
    def track(self, rgb: np.ndarray, depth: np.ndarray, K: np.ndarray, iteration: int = None, bbox_xywh=None):
-         return self.client.send_data(
-             {
-                 "operation": "track",
-                 "rgb": rgb,
-                 "depth": depth,
-                 "K": K,
-                 "iteration": iteration,
-                 "bbox_xywh": bbox_xywh,
-             }
-         )
+         payload = {
+             "operation": "track",
+             "rgb": rgb,
+             "depth": depth,
+             "K": K,
+             "bbox_xywh": bbox_xywh,
+         }
+         if iteration is not None:
+             payload["iteration"] = iteration
+         return self.client.send_data(payload)
 
    def reset(self):
        return self.client.send_data({"operation": "reset"})
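The net effect of this change: `iteration` is only included in the payload when explicitly provided, so the server's default refine-iteration count applies otherwise, and `register` gains a `check_vram` flag for skipping the GPU memory pre-check. A minimal sketch of the new call semantics, using placeholder frame data (real frames come from a camera, as in the demo above):

```python
import numpy as np
from neuromeka_vfm import PoseEstimation

# Placeholder inputs for illustration only.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.zeros((480, 640), dtype=np.float32)
mask = np.zeros((480, 640), dtype=np.uint8)
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)

pose = PoseEstimation(host="192.168.10.72", port=5557)
pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")

# iteration omitted -> the key is left out of the payload and the server
# falls back to its default; check_vram=False skips the VRAM pre-check.
resp = pose.register(rgb=rgb, depth=depth, mask=mask, K=K, check_vram=False)

# bbox_xywh, when given, narrows the tracking search to that region.
resp = pose.track(rgb=rgb, depth=depth, K=K, bbox_xywh=[120, 80, 200, 160])
pose.close()
```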
@@ -16,6 +16,7 @@ class Segmentation:
        self.client = PickleClient(hostname, port)
        self.tracking_object_ids = []
        self.current_frame_masks = {}
+         self.image_prompt_names = set()
        if compression_strategy in STRATEGIES:
            self.compression_strategy_name = compression_strategy
        else:
@@ -25,6 +26,25 @@ class Segmentation:
        self.call_time = {"add_image_prompt": 0, "register_first_frame": 0, "get_next": 0}
        self.call_count = {"add_image_prompt": 0, "register_first_frame": 0, "get_next": 0}
 
+     def _is_success(self, response):
+         """
+         Normalize server success flag.
+         Some servers return {"result": "SUCCESS"}, others {"success": true}, and
+         the segmentation server returns {"status": "success"}.
+         """
+         # Try known keys in order of common usage.
+         for key in ("result", "success", "status"):
+             flag = response.get(key)
+             if flag is None:
+                 continue
+             if isinstance(flag, str):
+                 return flag.lower() == "success"
+             if isinstance(flag, bool):
+                 return flag
+             return bool(flag)
+         # Fallback: anything truthy counts as success.
+         return bool(response)
+
    def switch_compression_strategy(self, compression_strategy):
        if compression_strategy in STRATEGIES:
            self.compression_strategy_name = compression_strategy
@@ -45,6 +65,8 @@ class Segmentation:
            start = time.time()
        data = {"operation": "add_image_prompt", "object_name": object_name, "object_image": object_image}
        response = self.client.send_data(data)
+         if self._is_success(response):
+             self.image_prompt_names.add(object_name)
        if self.benchmark:
            self.call_time["add_image_prompt"] += time.time() - start
            self.call_count["add_image_prompt"] += 1
@@ -53,16 +75,23 @@ class Segmentation:
    def register_first_frame(self, frame: np.ndarray, prompt: Union[str, List[str]], use_image_prompt: bool = False):
        if self.benchmark:
            start = time.time()
+         prompt_to_send = prompt
+         if use_image_prompt:
+             prompt_list = prompt if isinstance(prompt, list) else [prompt]
+             missing = [p for p in prompt_list if p not in self.image_prompt_names]
+             if missing:
+                 raise ValueError(f"Image prompt(s) not registered: {missing}. Call add_image_prompt first.")
+             prompt_to_send = prompt_list
        self.compression_strategy = STRATEGIES[self.compression_strategy_name](frame)
        data = {
            "operation": "start",
-             "prompt": prompt,
+             "prompt": prompt_to_send,
            "frame": self.compression_strategy.encode(frame),
            "use_image_prompt": use_image_prompt,
            "compression_strategy": self.compression_strategy_name,
        }
        response = self.client.send_data(data)
-         if response.get("result") == "SUCCESS":
+         if self._is_success(response):
            self.first_frame_registered = True
            self.tracking_object_ids = response["data"]["obj_ids"]
            masks = {}
@@ -88,7 +117,7 @@ class Segmentation:
        if self.benchmark:
            start = time.time()
        response = self.client.send_data({"operation": "get_next", "frame": self.compression_strategy.encode(frame)})
-         if response.get("result") == "SUCCESS":
+         if self._is_success(response):
            masks = {}
            for i, obj_id in enumerate(self.tracking_object_ids):
                mask = self.compression_strategy.decode(response["data"]["masks"][i])
@@ -111,6 +140,13 @@ class Segmentation:
        self.tracking_object_ids = []
        self.current_frame_masks = {}
 
+     def close(self):
+         """Close underlying ZeroMQ socket/context."""
+         try:
+             self.finish()
+         finally:
+             self.client.close()
+
 
# Backward-compat alias
NrmkRealtimeSegmentation = Segmentation
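Taken together, these changes validate image-prompt names client-side before the `start` request is sent, handle success flags from heterogeneous servers uniformly via `_is_success`, and give `Segmentation` a `close()` that mirrors `PoseEstimation.close()`. A minimal usage sketch of the image-prompt path, assuming a reachable segmentation server and a local `piece.jpg` reference image:

```python
import cv2
import numpy as np
from neuromeka_vfm import Segmentation

seg = Segmentation(hostname="192.168.10.63", port=5432, compression_strategy="jpeg")
try:
    # On success this also records "piece" in seg.image_prompt_names.
    ref_rgb = cv2.cvtColor(cv2.imread("piece.jpg"), cv2.COLOR_BGR2RGB)
    seg.add_image_prompt("piece", ref_rgb)

    frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
    # Raises ValueError if "piece" was never registered via add_image_prompt.
    if seg.register_first_frame(frame, "piece", use_image_prompt=True):
        masks = seg.get_next(frame)  # dict[obj_id, mask] or None
finally:
    seg.close()  # finish() plus ZeroMQ socket/context cleanup
```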
@@ -0,0 +1,159 @@
+ Metadata-Version: 2.4
+ Name: neuromeka_vfm
+ Version: 0.1.3
+ Summary: Client utilities for Neuromeka VFM FoundationPose RPC (upload meshes, call server)
+ Author: Neuromeka
+ License: MIT License
+
+ Copyright (c) 2025 Neuromeka Co., Ltd.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: numpy
+ Requires-Dist: pyzmq
+ Requires-Dist: paramiko
+ Requires-Dist: av
+ Requires-Dist: opencv-python-headless
+ Dynamic: license-file
+
+ # neuromeka_vfm
+
+ A small utility package for communicating with Segmentation (SAM2, Grounding DINO) and Pose Estimation (NVIDIA FoundationPose) servers (RPC, ZeroMQ) from a client PC, and for uploading meshes to the host over SSH/SFTP.
+
+ - Website: http://www.neuromeka.com
+ - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
+ - PyPI package: https://pypi.org/project/neuromeka_vfm/
+ - Documents: https://docs.neuromeka.com
+
+ ## VFM (Vision Foundation Model) latency benchmark
+ Measured against a locally running server. Empty cells have not been measured yet.
+
+ **RTX 5060**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | 0.86 | 0.35 | 0.50 | 0.52 |
+ | DINOv2 | image prompt | 0.85 | 0.49 | 0.65 | 0.63 |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+ **RTX 5090**
+ | Task | Prompt | None (s) | JPEG (s) | PNG (s) | h264 (s) |
+ | --- | --- | --- | --- | --- | --- |
+ | Grounding DINO | text (human . cup .) | | | | |
+ | DINOv2 | image prompt | | | | |
+ | SAM2 | - | | | | |
+ | FoundationPose registration | - | | | | |
+ | FoundationPose track | - | | | | |
+
+
+ ## Installation
+ ```bash
+ pip install neuromeka_vfm
+ ```
+
+ ## Usage examples
+ ### Python API
+ ```python
+ from neuromeka_vfm import PoseEstimation, upload_mesh
+ # (Optional) A realtime segmentation client is also included.
+
+ # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
+ upload_mesh(
+     host="192.168.10.72",
+     user="user",
+     password="pass",  # or key="~/.ssh/id_rsa"
+     local="mesh/123.stl",
+     remote="/home/user/meshes/123.stl",
+ )
+
+ # 2) PoseEstimation client
+ pose = PoseEstimation(host="192.168.10.72", port=5557)
+ pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
+ # ...
+ pose.close()
+
+ # 3) Realtime segmentation client (example)
+ from neuromeka_vfm import Segmentation
+ seg = Segmentation(
+     hostname="192.168.10.63",
+     port=5432,  # port of the corresponding Docker/server
+     compression_strategy="png",  # none | png | jpeg | h264
+     benchmark=False,
+ )
+ # seg.register_first_frame(...)
+ # seg.get_next(...)
+ # seg.reset()
+ # seg.finish()
+ ```
+
+ ## Notes
+ - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
+ - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
+
+
+
+ ## API reference (Python)
+
+ ### PoseEstimation (FoundationPose RPC)
+ - `PoseEstimation(host=None, port=None)`
+   - `host`: IP of the PC running the FoundationPose Docker server.
+   - `port`: RPC port (default 5557).
+ - `init(mesh_path, apply_scale=1.0, force_apply_color=False, apply_color=(160,160,160), est_refine_iter=10, track_refine_iter=3, min_n_views=40, inplane_step=60)`: registers the mesh with the server and initializes it.
+ - `register(rgb, depth, mask, K, iteration=None, check_vram=True)`: registers the first frame. If `iteration` is omitted, the server's default refine-iteration count is used; `check_vram=False` skips the GPU memory pre-check.
+ - `track(rgb, depth, K, iteration=None, bbox_xywh=None)`: tracks/updates the pose. If `bbox_xywh` is provided, the search is narrowed to that region.
+ - `reset()`: resets the session.
+ - `reset_object()`: re-invokes the server-side `reset_object` with the cached mesh.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Segmentation (realtime SAM2/GroundingDINO)
+ - `Segmentation(hostname, port, compression_strategy="none", benchmark=False)`:
+   - `compression_strategy`: `none|png|jpeg|h264`
+   - `hostname`: IP of the PC running the segmentation Docker server.
+ - `add_image_prompt(object_name, object_image)`: registers an image prompt.
+ - `register_first_frame(frame, prompt, use_image_prompt=False) -> bool`: registers the first frame; returns `True` on success. With `use_image_prompt=True`, every name must be registered beforehand via `add_image_prompt` (a missing name raises `ValueError`).
+ - `get_next(frame) -> dict[obj_id, mask] | None`: segmentation/tracking result for the next frame.
+ - `switch_compression_strategy(compression_strategy)`: switches the compression strategy at runtime.
+ - `reset()`: resets internal state and benchmark timers.
+ - `finish()`: clears local state.
+ - `close()`: cleans up the ZeroMQ socket/context (calling it after use is strongly recommended).
+
+ ### Upload CLI/API
+ - `upload_mesh(host, user, port=22, password=None, key=None, local=None, remote=None)`: transfers a mesh over SSH/SFTP; either a password or a key is required.
+ - CLI: `neuromeka-upload-mesh --host ... --user ... (--password ... | --key ...) --local ... --remote ...`
+
+ ### Examples
+ - Realtime Pose + Segmentation demo: `python -m neuromeka_vfm.examples.pose_demo` (requires a RealSense camera and running servers).
+
+
+ ## Release notes
+ - 0.1.1: improved resource cleanup in PoseEstimation/Segmentation, server defaults used when iteration is omitted, added the pose demo example.
+ - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.
@@ -12,4 +12,6 @@ src/neuromeka_vfm.egg-info/SOURCES.txt
src/neuromeka_vfm.egg-info/dependency_links.txt
src/neuromeka_vfm.egg-info/entry_points.txt
src/neuromeka_vfm.egg-info/requires.txt
- src/neuromeka_vfm.egg-info/top_level.txt
+ src/neuromeka_vfm.egg-info/top_level.txt
+ src/neuromeka_vfm/examples/__init__.py
+ src/neuromeka_vfm/examples/pose_demo.py
@@ -1,109 +0,0 @@
- Metadata-Version: 2.1
- Name: neuromeka_vfm
- Version: 0.1.1
- Summary: Client utilities for Neuromeka VFM FoundationPose RPC (upload meshes, call server)
- Author: Neuromeka
- License: MIT License
-
- Copyright (c) 2025 Neuromeka Co., Ltd.
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
- Classifier: Development Status :: 3 - Alpha
- Classifier: Intended Audience :: Developers
- Classifier: License :: OSI Approved :: MIT License
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Requires-Python: >=3.8
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: numpy
- Requires-Dist: pyzmq
- Requires-Dist: paramiko
- Requires-Dist: av
- Requires-Dist: opencv-python-headless
-
- # neuromeka_vfm
-
- A small utility package for communicating with a FoundationPose server (RPC, ZeroMQ) from a client PC and uploading meshes to the host over SSH/SFTP.
-
- ## Installation
- ```bash
- pip install neuromeka_vfm
- ```
-
- ### Local development
- ```bash
- pip install -e .
- ```
-
- ## Usage examples
- ### Python API
- ```python
- from neuromeka_vfm import PoseEstimation, upload_mesh
- # (Optional) A realtime segmentation client is also included.
-
- # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
- upload_mesh(
-     host="192.168.10.72",
-     user="user",
-     password="pass",  # or key="~/.ssh/id_rsa"
-     local="mesh/123.stl",
-     remote="/home/user/meshes/123.stl",
- )
-
- # 2) PoseEstimation client
- pose = PoseEstimation(host="192.168.10.72", port=5557)
- pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
- # ...
- pose.close()
-
- # 3) Realtime segmentation client (example)
- from neuromeka_vfm import Segmentation
- seg = Segmentation(
-     hostname="192.168.10.72",
-     port=5432,  # port of the corresponding Docker/server
-     compression_strategy="png",  # none | png | jpeg | h264
-     benchmark=False,
- )
- # seg.register_first_frame(...), seg.get_next(...), seg.finish(), seg.reset()
- ```
-
- ### CLI upload
- ```bash
- neuromeka-upload-mesh --host 192.168.10.72 --user user --password pass \
-     --local mesh/123.stl --remote /home/user/meshes/123.stl
- ```
-
- ## Notes
- - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
- - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
-
- ## Links
- - Website: http://www.neuromeka.com
- - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
- - PyPI package: https://pypi.org/project/neuromeka_vfm/
- - Documents: https://docs.neuromeka.com
-
- ## Release notes
- - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.
@@ -1,64 +0,0 @@
- # neuromeka_vfm
-
- A small utility package for communicating with a FoundationPose server (RPC, ZeroMQ) from a client PC and uploading meshes to the host over SSH/SFTP.
-
- ## Installation
- ```bash
- pip install neuromeka_vfm
- ```
-
- ### Local development
- ```bash
- pip install -e .
- ```
-
- ## Usage examples
- ### Python API
- ```python
- from neuromeka_vfm import PoseEstimation, upload_mesh
- # (Optional) A realtime segmentation client is also included.
-
- # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
- upload_mesh(
-     host="192.168.10.72",
-     user="user",
-     password="pass",  # or key="~/.ssh/id_rsa"
-     local="mesh/123.stl",
-     remote="/home/user/meshes/123.stl",
- )
-
- # 2) PoseEstimation client
- pose = PoseEstimation(host="192.168.10.72", port=5557)
- pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
- # ...
- pose.close()
-
- # 3) Realtime segmentation client (example)
- from neuromeka_vfm import Segmentation
- seg = Segmentation(
-     hostname="192.168.10.72",
-     port=5432,  # port of the corresponding Docker/server
-     compression_strategy="png",  # none | png | jpeg | h264
-     benchmark=False,
- )
- # seg.register_first_frame(...), seg.get_next(...), seg.finish(), seg.reset()
- ```
-
- ### CLI upload
- ```bash
- neuromeka-upload-mesh --host 192.168.10.72 --user user --password pass \
-     --local mesh/123.stl --remote /home/user/meshes/123.stl
- ```
-
- ## Notes
- - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
- - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
-
- ## Links
- - Website: http://www.neuromeka.com
- - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
- - PyPI package: https://pypi.org/project/neuromeka_vfm/
- - Documents: https://docs.neuromeka.com
-
- ## Release notes
- - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.
@@ -1,109 +0,0 @@
- Metadata-Version: 2.1
- Name: neuromeka_vfm
- Version: 0.1.1
- Summary: Client utilities for Neuromeka VFM FoundationPose RPC (upload meshes, call server)
- Author: Neuromeka
- License: MIT License
-
- Copyright (c) 2025 Neuromeka Co., Ltd.
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
- Classifier: Development Status :: 3 - Alpha
- Classifier: Intended Audience :: Developers
- Classifier: License :: OSI Approved :: MIT License
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Requires-Python: >=3.8
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: numpy
- Requires-Dist: pyzmq
- Requires-Dist: paramiko
- Requires-Dist: av
- Requires-Dist: opencv-python-headless
-
- # neuromeka_vfm
-
- A small utility package for communicating with a FoundationPose server (RPC, ZeroMQ) from a client PC and uploading meshes to the host over SSH/SFTP.
-
- ## Installation
- ```bash
- pip install neuromeka_vfm
- ```
-
- ### Local development
- ```bash
- pip install -e .
- ```
-
- ## Usage examples
- ### Python API
- ```python
- from neuromeka_vfm import PoseEstimation, upload_mesh
- # (Optional) A realtime segmentation client is also included.
-
- # 1) Upload a mesh to the server (the host path should be one mounted into the container with -v)
- upload_mesh(
-     host="192.168.10.72",
-     user="user",
-     password="pass",  # or key="~/.ssh/id_rsa"
-     local="mesh/123.stl",
-     remote="/home/user/meshes/123.stl",
- )
-
- # 2) PoseEstimation client
- pose = PoseEstimation(host="192.168.10.72", port=5557)
- pose.init(mesh_path="/app/modules/foundation_pose/mesh/123.stl")
- # ...
- pose.close()
-
- # 3) Realtime segmentation client (example)
- from neuromeka_vfm import Segmentation
- seg = Segmentation(
-     hostname="192.168.10.72",
-     port=5432,  # port of the corresponding Docker/server
-     compression_strategy="png",  # none | png | jpeg | h264
-     benchmark=False,
- )
- # seg.register_first_frame(...), seg.get_next(...), seg.finish(), seg.reset()
- ```
-
- ### CLI upload
- ```bash
- neuromeka-upload-mesh --host 192.168.10.72 --user user --password pass \
-     --local mesh/123.stl --remote /home/user/meshes/123.stl
- ```
-
- ## Notes
- - `remote` is a path on the **host**. If the container is started with a mount such as `-v /home/user/meshes:/app/modules/foundation_pose/mesh`, uploaded files are accessible inside the container immediately after upload.
- - The RPC port (default 5557) must be exposed by the server, e.g. `-p 5557:5557`.
-
- ## Links
- - Website: http://www.neuromeka.com
- - Source code: https://github.com/neuromeka-robotics/neuromeka_vfm
- - PyPI package: https://pypi.org/project/neuromeka_vfm/
- - Documents: https://docs.neuromeka.com
-
- ## Release notes
- - 0.1.0: initial public release. Includes the FoundationPose RPC client, the realtime segmentation client, and the SSH-based mesh upload CLI/API.