PyPI - synapse-sdk - Versions diffs - 2025.9.1__py3-none-any.whl → 2025.9.3__py3-none-any.whl - Mend - Supply Chain Defender

synapse-sdk 2025.9.1py3-none-any.whl → 2025.9.3py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of synapse-sdk might be problematic.

Files changed (80)

synapse_sdk/devtools/docs/i18n/ko/docusaurus-plugin-content-docs/current/api/clients/data-collection-mixin.md ADDED

@@ -0,0 +1,356 @@
+---
+id: data-collection-mixin
+title: DataCollectionClientMixin
+sidebar_position: 13
+---
+# DataCollectionClientMixin
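+Provides data collection and file management operations for the Synapse backend.
+## Overview
+`DataCollectionClientMixin` handles all operations related to data collections, file uploads, data units, and batch processing. This mixin is included in `BackendClient` automatically and provides methods for managing large-scale data operations.
+## Data Collection Operations
+### `list_data_collection()`
+Retrieves a list of all available data collections.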
+```python
+collections = client.list_data_collection()
+for collection in collections:
+    print(f"Collection: {collection['name']} (ID: {collection['id']})")
+```
+**Returns:**
+- `list`: A list of data collection objects
+### `get_data_collection(data_collection_id)`
+Retrieves detailed information about a specific data collection.
+```python
+collection = client.get_data_collection(123)
+print(f"Collection: {collection['name']}")
+print(f"Description: {collection['description']}")
+# Access the file specifications
+file_specs = collection['file_specifications']
+for spec in file_specs:
+    print(f"File type: {spec['name']}, required: {spec['is_required']}")
+```
+**Parameters:**
+- `data_collection_id` (int): Data collection ID
+**Returns:**
+- `dict`: Detailed collection information, including file specifications
+**Collection structure:**
+- `id`: Collection ID
+- `name`: Collection name
+- `description`: Collection description
+- `file_specifications`: List of required file types and formats
+- `project`: Associated project ID
+- `created_at`: Creation timestamp
+## File Operations
+### `create_data_file(file_path, use_chunked_upload=False)`
+Creates a data file on the backend and uploads it.
+```python
+from pathlib import Path
+# Regular upload for small files
+data_file = client.create_data_file(Path('/path/to/image.jpg'))
+print(f"Uploaded file ID: {data_file['id']}")
+# Chunked upload for large files (recommended above 50MB)
+large_file = client.create_data_file(
+    Path('/path/to/large_dataset.zip'),
+    use_chunked_upload=True
+)
+print(f"Large file uploaded: {large_file['id']}")
+```
+**Parameters:**
+- `file_path` (Path): A Path object pointing to the file to upload
+- `use_chunked_upload` (bool): Enables chunked upload for large files
+**Returns:**
+- `dict` or `str`: The file upload response, containing the file ID and metadata
+**When to use chunked upload:**
+- Files larger than 50MB
+- Unstable network connections
+- When upload progress tracking is needed
+- For better error recovery
+### `upload_data_file(organized_file, collection_id, use_chunked_upload=False)`
+Uploads organized file data to a specific collection.
+```python
+from pathlib import Path
+# Organize the file data
+organized_file = {
+    'files': {
+        'image': Path('/path/to/image.jpg'),
+        'annotation': Path('/path/to/annotation.json'),
+        'metadata': Path('/path/to/metadata.xml')
+    },
+    'meta': {
+        'origin_file_stem': 'sample_001',
+        'origin_file_extension': '.jpg',
+        'created_at': '2023-10-01T12:00:00Z',
+        'batch_id': 'batch_001'
+    }
+}
+# Upload to the collection
+result = client.upload_data_file(
+    organized_file=organized_file,
+    collection_id=123,
+    use_chunked_upload=False
+)
+```
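+**Parameters:**
+- `organized_file` (dict): Structured file data containing files and metadata
+- `collection_id` (int): Target data collection ID
+- `use_chunked_upload` (bool): Enables chunked upload
+**Organized file structure:**
+- `files` (dict): A dictionary mapping file types to file paths
+- `meta` (dict): Metadata associated with the file group
+**Returns:**
+- `dict`: The upload result, containing file references and IDs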
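As a sketch that is not part of the packaged file, the organized-file structure described above could be built from a flat list of paths by grouping files that share a stem. The helper name `group_by_stem` and the suffix-to-type mapping are hypothetical; the actual file-type keys would come from the collection's `file_specifications`.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file suffix to the file-type key used in 'files'.
SUFFIX_TO_TYPE = {'.jpg': 'image', '.json': 'annotation', '.xml': 'metadata'}


def group_by_stem(paths, suffix_to_type=SUFFIX_TO_TYPE):
    """Group paths sharing a stem into organized_file dicts (illustrative helper)."""
    groups = defaultdict(dict)
    for path in map(Path, paths):
        file_type = suffix_to_type.get(path.suffix.lower())
        if file_type is None:
            continue  # skip files whose suffix has no assumed type
        groups[path.stem][file_type] = path

    organized = []
    for stem, files in sorted(groups.items()):
        # Prefer the image's suffix as the origin extension when an image exists,
        # otherwise fall back to the first file in the group.
        origin = files.get('image') or next(iter(files.values()))
        organized.append({
            'files': files,
            'meta': {
                'origin_file_stem': stem,
                'origin_file_extension': origin.suffix,
            },
        })
    return organized
```

Each returned dict has the `files` and `meta` keys shown in the structure above, so it could presumably be passed as `organized_file` to `upload_data_file`; that has not been verified against the SDK.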
+### `create_data_units(uploaded_files)`
+Creates data units from previously uploaded files.
+```python
+# Previously uploaded files
+uploaded_files = [
+    {
+        'id': 1,
+        'file': {'image': 'file_id_123', 'annotation': 'file_id_124'},
+        'meta': {'batch': 'batch_001'}
+    },
+    {
+        'id': 2,
+        'file': {'image': 'file_id_125', 'annotation': 'file_id_126'},
+        'meta': {'batch': 'batch_001'}
+    }
+]
+# Create the data units
+data_units = client.create_data_units(uploaded_files)
+print(f"Created {len(data_units)} data units")
+```
+**Parameters:**
+- `uploaded_files` (list): A list of uploaded-file structures
+**Returns:**
+- `list`: The created data units, with IDs and metadata
+## Batch Processing
+The mixin supports efficient batch processing for large-scale operations:
+```python
+from pathlib import Path
+# Example: upload multiple files in batches
+file_paths = [
+    Path('/data/batch1/file1.jpg'),
+    Path('/data/batch1/file2.jpg'),
+    Path('/data/batch1/file3.jpg'),
+    # ... more files
+]
+# Process the files in batches
+batch_size = 10
+for i in range(0, len(file_paths), batch_size):
+    batch = file_paths[i:i+batch_size]
+    # Upload the batch
+    uploaded_files = []
+    for file_path in batch:
+        result = client.create_data_file(file_path)
+        uploaded_files.append({
+            'id': len(uploaded_files) + 1,
+            'file': {'image': result['id']},
+            'meta': {'batch': f'batch_{i//batch_size}'}
+        })
+    # Create data units for the batch
+    data_units = client.create_data_units(uploaded_files)
+    print(f"Batch {i//batch_size} processed: {len(data_units)} data units")
+```
+## Progress Tracking
+For large uploads, you can track progress:
+```python
+import os
+from pathlib import Path
+from tqdm import tqdm
+def upload_with_progress(file_paths, collection_id):
+    """Upload files with progress tracking."""
+    uploaded_files = []
+    with tqdm(total=len(file_paths), desc="Uploading files") as pbar:
+        for file_path in file_paths:
+            try:
+                # Check the file size to choose the upload method
+                file_size = os.path.getsize(file_path)
+                use_chunked = file_size > 50 * 1024 * 1024  # 50MB
+                # Upload the file
+                result = client.create_data_file(
+                    file_path,
+                    use_chunked_upload=use_chunked
+                )
+                # Organize for the collection
+                organized_file = {
+                    'files': {'primary': file_path},
+                    'meta': {
+                        'origin_file_stem': file_path.stem,
+                        'origin_file_extension': file_path.suffix,
+                        'file_size': file_size
+                    }
+                }
+                upload_result = client.upload_data_file(
+                    organized_file,
+                    collection_id,
+                    use_chunked_upload=use_chunked
+                )
+                uploaded_files.append(upload_result)
+                pbar.update(1)
+            except Exception as e:
+                print(f"Failed to upload {file_path}: {e}")
+                pbar.update(1)
+                continue
+    return uploaded_files
+# Usage
+file_paths = [Path(f'/data/file_{i}.jpg') for i in range(100)]
+results = upload_with_progress(file_paths, collection_id=123)
+```
+## Data Validation
+### File Specification Validation
+```python
+def validate_files_against_collection(file_paths, collection_id):
+    """Validate files against the collection's specifications."""
+    collection = client.get_data_collection(collection_id)
+    file_specs = collection['file_specifications']
+    # Build lookup sets from the specifications
+    required_types = {spec['name'] for spec in file_specs if spec['is_required']}
+    optional_types = {spec['name'] for spec in file_specs if not spec['is_required']}
+    # Validate the file organization
+    organized_files = []
+    for file_path in file_paths:
+        # Extract the file type from the path or metadata
+        file_type = extract_file_type(file_path)  # user-defined function
+        if file_type in required_types or file_type in optional_types:
+            organized_files.append({
+                'path': file_path,
+                'type': file_type,
+                'valid': True
+            })
+        else:
+            print(f"Warning: unknown file type '{file_type}' for {file_path}")
+            organized_files.append({
+                'path': file_path,
+                'type': file_type,
+                'valid': False
+            })
+    return organized_files
+def extract_file_type(file_path):
+    """Extract the file type from a path; implement according to your naming convention."""
+    # Example implementation
+    if 'image' in str(file_path):
+        return 'image'
+    elif 'annotation' in str(file_path):
+        return 'annotation'
+    elif 'metadata' in str(file_path):
+        return 'metadata'
+    else:
+        return 'unknown'
+```
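The validation above flags unknown file types but does not report required types that are missing from a file group. A minimal sketch of that complementary check follows; it is not part of the packaged file, and the helper name `missing_required_types` is hypothetical. It assumes spec dicts shaped like the `file_specifications` entries described earlier (`name`, `is_required`).

```python
def missing_required_types(organized_file, file_specs):
    """Return required spec names absent from organized_file['files'], sorted."""
    required = {spec['name'] for spec in file_specs if spec['is_required']}
    provided = set(organized_file.get('files', {}))
    return sorted(required - provided)


# Example with spec dicts shaped like the 'file_specifications' entries above.
specs = [
    {'name': 'image', 'is_required': True},
    {'name': 'annotation', 'is_required': True},
    {'name': 'metadata', 'is_required': False},
]
group = {'files': {'image': '/path/to/image.jpg'}, 'meta': {}}
print(missing_required_types(group, specs))  # → ['annotation']
```

Running a check like this before `upload_data_file` would surface incomplete groups locally instead of relying on a backend error.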
+## Error Handling and Retry Logic
+```python
+import time
+from synapse_sdk.clients.exceptions import ClientError
+def robust_upload(file_path, max_retries=3):
+    """Upload with retry logic for reliability."""
+    for attempt in range(max_retries):
+        try:
+            # Start with a regular upload; fall back to chunked upload on HTTP 413
+            result = client.create_data_file(file_path)
+            return result
+        except ClientError as e:
+            if e.status_code == 413:  # File too large
+                print(f"File {file_path} is too large, trying chunked upload")
+                try:
+                    return client.create_data_file(file_path, use_chunked_upload=True)
+                except Exception as retry_e:
+                    print(f"Chunked upload failed: {retry_e}")
+                    if attempt == max_retries - 1:
+                        raise
+            elif e.status_code == 429:  # Rate limited
+                wait_time = 2 ** attempt  # Exponential backoff
+                print(f"Rate limited, waiting {wait_time} seconds...")
+                time.sleep(wait_time)
+            else:
+                print(f"Upload failed (attempt {attempt + 1}): {e}")
+                if attempt == max_retries - 1:
+                    raise
+        except Exception as e:
+            print(f"Unexpected error (attempt {attempt + 1}): {e}")
+            if attempt == max_retries - 1:
+                raise
+            time.sleep(1)  # Brief pause before retrying
+    # Reached only when every attempt was rate limited; fail loudly instead of returning None
+    raise RuntimeError(f"Upload of {file_path} failed after {max_retries} attempts")
+```
+## See Also
+- [BackendClient](./backend.md) - Main backend client
+- [CoreClientMixin](./core-mixin.md) - Core file operations
+- [AnnotationClientMixin](./annotation-mixin.md) - Task and annotation management

synapse_sdk/devtools/docs/i18n/ko/docusaurus-plugin-content-docs/current/api/clients/hitl-mixin.md ADDED

@@ -0,0 +1,192 @@
+---
+id: hitl-mixin
+title: HITLClientMixin
+sidebar_position: 14
+---
+# HITLClientMixin
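+Provides Human-in-the-Loop (HITL) assignment management operations for the Synapse backend.
+## Overview
+`HITLClientMixin` handles all operations related to human-in-the-loop workflows, including assignment management and tagging. This mixin is included in `BackendClient` automatically and provides methods for managing human annotation and review workflows.
+## Assignment Operations
+### `get_assignment(pk)`
+Retrieves detailed information about a specific assignment.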
+```python
+assignment = client.get_assignment(789)
+print(f"Assignment: {assignment['id']}")
+print(f"Project: {assignment['project']}")
+print(f"Status: {assignment['status']}")
+print(f"Assignee: {assignment['assignee']}")
+print(f"Data: {assignment['data']}")
+```
+**Parameters:**
+- `pk` (int): Assignment ID
+**Returns:**
+- `dict`: Complete assignment information
+**Assignment structure:**
+- `id`: Assignment ID
+- `project`: Associated project ID
+- `status`: Assignment status (`pending`, `in_progress`, `completed`, `rejected`)
+- `assignee`: User ID of the assigned reviewer
+- `data`: Assignment data and annotations
+- `file`: Associated files
+- `created_at`: Creation timestamp
+- `updated_at`: Last-updated timestamp
+- `metadata`: Additional assignment metadata
+### `list_assignments(params=None, url_conversion=None, list_all=False)`
+Lists assignments, with comprehensive filtering and pagination support.
+```python
+# List assignments for a specific project
+assignments = client.list_assignments(params={'project': 123})
+# List assignments by status
+pending_assignments = client.list_assignments(params={
+    'project': 123,
+    'status': 'pending'
+})
+# List assignments for a specific assignee
+user_assignments = client.list_assignments(params={
+    'assignee': 456
+})
+# Fetch all assignments (pagination handled automatically)
+all_assignments = client.list_assignments(list_all=True)
+# List assignments with a custom URL conversion for files
+assignments = client.list_assignments(
+    params={'project': 123},
+    url_conversion={'files': lambda url: f"https://cdn.example.com{url}"}
+)
+```
+**Parameters:**
+- `params` (dict, optional): Filtering parameters
+- `url_conversion` (dict, optional): Custom URL conversion for file fields
+- `list_all` (bool): If True, handles pagination automatically
+**Common filtering params:**
+- `project`: Filter by project ID
+- `status`: Filter by assignment status
+- `assignee`: Filter by assigned user ID
+- `created_after`: Filter by creation date
+- `updated_after`: Filter by last-updated date
+- `priority`: Filter by assignment priority
+- `search`: Text search within assignment content
+**Returns:**
+- `tuple`: `(assignments_list, total_count)` when `list_all=False`
+- `list`: All assignments when `list_all=True`
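Because the documented return shape depends on `list_all`, calling code may want one normalized form. The sketch below is not part of the packaged file; the helper name `assignments_as_list` is hypothetical, and it relies only on the two return shapes listed above, which have not been verified against the SDK.

```python
def assignments_as_list(result):
    """Return just the assignments from either documented return shape."""
    if isinstance(result, tuple):
        # Shape documented for list_all=False: (assignments_list, total_count)
        assignments, _total_count = result
        return list(assignments)
    # Shape documented for list_all=True: all assignments
    return list(result)
```

A call such as `assignments_as_list(client.list_assignments(params={'project': 123}))` would then yield a plain list regardless of which mode was used.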
+### `set_tags_assignments(data, params=None)`
+Sets tags on multiple assignments in a single batch operation.
+```python
+# Set tags on multiple assignments
+client.set_tags_assignments({
+    'assignment_ids': [789, 790, 791],
+    'tag_ids': [1, 2, 3]  # Tag IDs to apply
+})
+# Set tags with the replace option
+client.set_tags_assignments(
+    {
+        'assignment_ids': [789, 790],
+        'tag_ids': [1, 2]
+    },
+    params={'replace': True}  # Replace existing tags
+)
+# Set a priority tag
+client.set_tags_assignments({
+    'assignment_ids': [789],
+    'tag_ids': [5]  # High-priority tag
+})
+```
+**Parameters:**
+- `data` (dict): Batch tagging data
+- `params` (dict, optional): Additional parameters
+**Data structure:**
+- `assignment_ids` (list): List of assignment IDs to tag
+- `tag_ids` (list): List of tag IDs to apply
+**Optional params:**
+- `replace` (bool): If True, replaces existing tags; if False, adds to existing tags
+- `notify` (bool): If True, notifies assignees of tag changes
+**Returns:**
+- `dict`: The result of the tagging operation
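For very long ID lists, it may be safer to split the request into smaller calls. The sketch below is not part of the packaged file: the helper name `set_tags_in_chunks` and the default `chunk_size` of 100 are hypothetical, and whether the backend limits request size at all is an assumption. It uses only the `set_tags_assignments(data, params=None)` signature documented above.

```python
def set_tags_in_chunks(client, assignment_ids, tag_ids, chunk_size=100, params=None):
    """Call client.set_tags_assignments once per chunk of assignment IDs."""
    if chunk_size < 1:
        raise ValueError('chunk_size must be at least 1')
    results = []
    for start in range(0, len(assignment_ids), chunk_size):
        chunk = assignment_ids[start:start + chunk_size]
        # Each call carries the same tag IDs and optional params
        results.append(client.set_tags_assignments(
            {'assignment_ids': chunk, 'tag_ids': tag_ids},
            params=params,
        ))
    return results
```

The function returns one result per chunk, so a failure partway through leaves earlier chunks already tagged; callers that need all-or-nothing behavior would have to handle that themselves.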
+## Error Handling
+```python
+from synapse_sdk.clients.exceptions import ClientError
+def robust_assignment_operations():
+    """Example of reliable assignment operations with error handling."""
+    try:
+        # Try to fetch the assignment
+        assignment = client.get_assignment(999)
+    except ClientError as e:
+        if e.status_code == 404:
+            print("Assignment not found")
+            return None
+        elif e.status_code == 403:
+            print("Permission denied - insufficient access rights")
+            return None
+        else:
+            print(f"Error fetching assignment: {e}")
+            raise
+    try:
+        # Try to set tags
+        client.set_tags_assignments({
+            'assignment_ids': [999],
+            'tag_ids': [1, 2, 3]
+        })
+    except ClientError as e:
+        if e.status_code == 400:
+            print(f"Invalid tagging data: {e.response}")
+        elif e.status_code == 404:
+            print("Assignment or tag not found")
+        else:
+            print(f"Error setting tags: {e}")
+    return assignment
+# Use the reliable operations
+assignment = robust_assignment_operations()
+```
+## See Also
+- [BackendClient](./backend.md) - Main backend client
+- [AnnotationClientMixin](./annotation-mixin.md) - Task and annotation management
+- [IntegrationClientMixin](./integration-mixin.md) - Plugin and job management