agora-python-server-sdk 2.1.6tar.gz → 2.2.0tar.gz

Potentially problematic release: this version of agora-python-server-sdk might be problematic.

Files changed (42)

{agora_python_server_sdk-2.1.6/agora_python_server_sdk.egg-info → agora_python_server_sdk-2.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: agora_python_server_sdk
-Version: 2.1.6
+Version: 2.2.0
 Summary: A Python SDK for Agora Server
 Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
 Classifier: Intended Audience :: Developers
@@ -51,6 +51,25 @@ python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx
 # Change log
+2025.01.08 Release 2.2.0
+-- Updates:
+  - Update the SDK version from 4.4.30 to 4.4.31. Done.
+-- FEAT:
+  - Add serviceconfigure.
+    - Add domain_limit. Done.
+    - Add should_callback_when_muted. Done.
+    - Add colorspacetype to ExternalVideoFrame to support the encoding of solid-color backgrounds in virtual human scenarios. Done.
+-- FEAT:
+  - Add the AudioMetaData interface: localuser::send_audio_meta_data. Done.
+  - Add the OnAudioMetaDataReceived callback to localuserObserver::on_audio_meta_data_received. Done.
+-- Sample modifications.
+2024.12.17 Release 2.1.7
+--Changes:
+  Fixed the TypeError issue in LocalUser::subscribe/unsubscribe audio/video.
+  Adjusted the default stopRecognizeCount for VAD from 30 to 50.
+  Modified sample_vad.
 ## 2024.12.09 Release 2.1.6
 - New Features:
   -- Added AudioVadManager to manage VAD (Voice Activity Detection) instances.
@@ -279,3 +298,41 @@ Store the LLM results in a cache as they are received.
 Perform a reverse scan of the cached data to find the most recent punctuation mark.
 Truncate the data from the start to the most recent punctuation mark and pass it to TTS for synthesis.
 Remove the truncated data from the cache. The remaining data should be moved to the beginning of the cache and continue waiting for additional data from the LLM.
+## VAD Configuration Parameters
+AgoraAudioVadConfigV2 Properties
+| Property Name | Type | Description | Default Value | Value Range |
+| --- | --- | --- | --- | --- |
+| preStartRecognizeCount | int | Number of audio frames saved before detecting speech | 16 | [0, ] |
+| startRecognizeCount | int | Total number of audio frames to detect speech start | 30 | [1, max] |
+| stopRecognizeCount | int | Number of audio frames to detect speech stop | 50 | [1, max] |
+| activePercent | float | Percentage of active frames in startRecognizeCount frames | 0.7 | [0.0, 1.0] |
+| inactivePercent | float | Percentage of inactive frames in stopRecognizeCount frames | 0.5 | [0.0, 1.0] |
+| startVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| stopVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| startRmsThreshold | int | Energy dB threshold for detecting speech start | -50 | [-100, 0] |
+| stopRmsThreshold | int | Energy dB threshold for detecting speech stop | -50 | [-100, 0] |
+Notes:
+startRmsThreshold and stopRmsThreshold:
+The higher the value, the louder the speaker's voice needs to be compared to the surrounding background noise.
+In quiet environments, it is recommended to use the default value of -50.
+In noisy environments, you can increase the threshold to between -40 and -30 to reduce false positives.
+Adjusting these thresholds based on the actual use case and audio characteristics can achieve optimal performance.
+stopRecognizeCount:
+This value reflects how long to wait after detecting non-human voice before concluding that the user has stopped speaking. It controls the gap between consecutive speech utterances. Within this gap, VAD will treat adjacent sentences as part of the same speech.
+A shorter gap will increase the likelihood of adjacent sentences being recognized as separate speech segments. Typically, it is recommended to set this value between 50 and 80.
+For example: "Good afternoon, [interval_between_sentences] what are some fun places to visit in Beijing?"
+If the interval_between_sentences between the speaker's phrases is greater than the stopRecognizeCount, the VAD will recognize the above as two separate VADs:
+VAD1: Good afternoon
+VAD2: What are some fun places to visit in Beijing?
+If the interval_between_sentences is less than stopRecognizeCount, the VAD will recognize the above as a single VAD:
+VAD: Good afternoon, what are some fun places to visit in Beijing?
+If latency is a concern, you can lower this value, or consult with the development team to determine how to manage latency while ensuring semantic continuity in speech recognition. This will help avoid the AI being interrupted too sensitively.
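
The stopRecognizeCount note above can be illustrated with a small, self-contained sketch. It is not the SDK's VAD algorithm: it assumes each frame has already been classified as voiced or not, and it ignores inactivePercent, the voice probabilities, and the RMS thresholds. The function name `split_utterances` and the frame counts are hypothetical.

```python
from typing import List, Tuple


def split_utterances(voiced: List[bool], stop_recognize_count: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive) of speech segments.

    A segment is closed only once `stop_recognize_count` consecutive
    non-voiced frames have been seen after the last voiced frame.
    """
    segments = []
    start = None       # first voiced frame of the currently open segment
    last_voiced = -1   # most recent voiced frame index
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if start is None:
                start = i
            last_voiced = i
        elif start is not None and i - last_voiced >= stop_recognize_count:
            segments.append((start, last_voiced + 1))
            start = None
    if start is not None:
        segments.append((start, last_voiced + 1))
    return segments


# "Good afternoon" (20 frames), a 60-frame pause, then the question (40 frames).
frames = [True] * 20 + [False] * 60 + [True] * 40
short_gap = split_utterances(frames, stop_recognize_count=50)  # pause exceeds 50
long_gap = split_utterances(frames, stop_recognize_count=80)   # pause is below 80
```

With a 60-frame pause, a threshold of 50 yields two segments while a threshold of 80 keeps the whole sentence together, which mirrors the VAD1/VAD2 versus single-VAD example in the note.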

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/README.md RENAMED

@@ -36,6 +36,25 @@ python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx
 # Change log
+2025.01.08 Release 2.2.0
+-- Updates:
+  - Update the SDK version from 4.4.30 to 4.4.31. Done.
+-- FEAT:
+  - Add serviceconfigure.
+    - Add domain_limit. Done.
+    - Add should_callback_when_muted. Done.
+    - Add colorspacetype to ExternalVideoFrame to support the encoding of solid-color backgrounds in virtual human scenarios. Done.
+-- FEAT:
+  - Add the AudioMetaData interface: localuser::send_audio_meta_data. Done.
+  - Add the OnAudioMetaDataReceived callback to localuserObserver::on_audio_meta_data_received. Done.
+-- Sample modifications.
+2024.12.17 Release 2.1.7
+--Changes:
+  Fixed the TypeError issue in LocalUser::subscribe/unsubscribe audio/video.
+  Adjusted the default stopRecognizeCount for VAD from 30 to 50.
+  Modified sample_vad.
 ## 2024.12.09 Release 2.1.6
 - New Features:
   -- Added AudioVadManager to manage VAD (Voice Activity Detection) instances.
@@ -263,4 +282,42 @@ To achieve a balance between clarity and minimal delay, the following steps shou
 Store the LLM results in a cache as they are received.
 Perform a reverse scan of the cached data to find the most recent punctuation mark.
 Truncate the data from the start to the most recent punctuation mark and pass it to TTS for synthesis.
-Remove the truncated data from the cache. The remaining data should be moved to the beginning of the cache and continue waiting for additional data from the LLM.
+Remove the truncated data from the cache. The remaining data should be moved to the beginning of the cache and continue waiting for additional data from the LLM.
+## VAD Configuration Parameters
+AgoraAudioVadConfigV2 Properties
+| Property Name | Type | Description | Default Value | Value Range |
+| --- | --- | --- | --- | --- |
+| preStartRecognizeCount | int | Number of audio frames saved before detecting speech | 16 | [0, ] |
+| startRecognizeCount | int | Total number of audio frames to detect speech start | 30 | [1, max] |
+| stopRecognizeCount | int | Number of audio frames to detect speech stop | 50 | [1, max] |
+| activePercent | float | Percentage of active frames in startRecognizeCount frames | 0.7 | [0.0, 1.0] |
+| inactivePercent | float | Percentage of inactive frames in stopRecognizeCount frames | 0.5 | [0.0, 1.0] |
+| startVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| stopVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| startRmsThreshold | int | Energy dB threshold for detecting speech start | -50 | [-100, 0] |
+| stopRmsThreshold | int | Energy dB threshold for detecting speech stop | -50 | [-100, 0] |
+Notes:
+startRmsThreshold and stopRmsThreshold:
+The higher the value, the louder the speaker's voice needs to be compared to the surrounding background noise.
+In quiet environments, it is recommended to use the default value of -50.
+In noisy environments, you can increase the threshold to between -40 and -30 to reduce false positives.
+Adjusting these thresholds based on the actual use case and audio characteristics can achieve optimal performance.
+stopRecognizeCount:
+This value reflects how long to wait after detecting non-human voice before concluding that the user has stopped speaking. It controls the gap between consecutive speech utterances. Within this gap, VAD will treat adjacent sentences as part of the same speech.
+A shorter gap will increase the likelihood of adjacent sentences being recognized as separate speech segments. Typically, it is recommended to set this value between 50 and 80.
+For example: "Good afternoon, [interval_between_sentences] what are some fun places to visit in Beijing?"
+If the interval_between_sentences between the speaker's phrases is greater than the stopRecognizeCount, the VAD will recognize the above as two separate VADs:
+VAD1: Good afternoon
+VAD2: What are some fun places to visit in Beijing?
+If the interval_between_sentences is less than stopRecognizeCount, the VAD will recognize the above as a single VAD:
+VAD: Good afternoon, what are some fun places to visit in Beijing?
+If latency is a concern, you can lower this value, or consult with the development team to determine how to manage latency while ensuring semantic continuity in speech recognition. This will help avoid the AI being interrupted too sensitively.
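
The LLM-to-TTS caching steps in the README context lines above (store results, reverse-scan for the latest punctuation mark, hand that prefix to TTS, keep the remainder) can be sketched as follows. This is a minimal illustration, not code from the package; the class name `SentenceCache` and the punctuation set are assumptions.

```python
# Punctuation marks treated as safe cut points (an assumed, non-exhaustive set).
PUNCTUATION = set(".,!?;:。，！？；：")


class SentenceCache:
    """Accumulates streamed LLM text and releases it up to the latest punctuation."""

    def __init__(self) -> None:
        self._buf = ""

    def feed(self, chunk: str) -> str:
        """Append `chunk`; return the text ready for TTS, or "" if none yet."""
        self._buf += chunk
        # Reverse scan: find the most recent punctuation mark in the cache.
        for i in range(len(self._buf) - 1, -1, -1):
            if self._buf[i] in PUNCTUATION:
                ready = self._buf[: i + 1]
                # The remainder moves to the front and waits for more data.
                self._buf = self._buf[i + 1:]
                return ready
        return ""


cache = SentenceCache()
first = cache.feed("Hello, wor")   # releases up to the comma
second = cache.feed("ld. How")     # releases the rest of the first sentence
third = cache.feed(" are")         # no punctuation yet, nothing released
```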

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/__init__.py RENAMED

@@ -26,13 +26,23 @@ def _check_download_and_extract_sdk():
     sdk_dir = os.path.join(agora_service_path, "agora_sdk")
     zip_path = os.path.join(agora_service_path, "agora_rtc_sdk.zip")
-    url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.30-20241024_101940-398537.zip"
+    #url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.30-20241024_101940-398537.zip"
+    # version 2.2.0 for linux
+    url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.31-20241223_111509-491956.zip"
     libagora_rtc_sdk_path = os.path.join(sdk_dir, "libagora_rtc_sdk.so")
-    rtc_md5 = "7031dd10d1681cd88fd89d68c5b54282"
+    #rtc_md5 = "7031dd10d1681cd88fd89d68c5b54282"
+    rtc_md5 = "f2a9e3ed15f872cb7e3b62d1528ac5cb"
     if sys.platform == 'darwin':
-        url = "https://download.agora.io/sdk/release/agora_rtc_sdk_mac_rel.v4.4.30_22472_FULL_20241024_1224_398653.zip"
+        #url = "https://download.agora.io/sdk/release/agora_rtc_sdk_mac_rel.v4.4.30_22472_FULL_20241024_1224_398653.zip"
+        # version   2.2.0 for mac
+        url = "https://download.agora.io/sdk/release/agora_sdk_mac_v4.4.31_23136_FULL_20241223_1245_492039.zip"
         libagora_rtc_sdk_path = os.path.join(sdk_dir, "libAgoraRtcKit.dylib")
-        rtc_md5 = "ca3ca14f9e2b7d97eb2594d1f32dab9f"
+        #rtc_md5 = "ca3ca14f9e2b7d97eb2594d1f32dab9f"
+        rtc_md5 = "6821cae218c8f31f8d720ac0c77edab0"
     if os.path.exists(libagora_rtc_sdk_path) and get_file_md5(libagora_rtc_sdk_path) == rtc_md5:
         return
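
The hunk above skips the download when the installed library's MD5 matches `rtc_md5`. The body of `get_file_md5` is not part of this diff, so the sketch below is only one plausible way to write such a check with the standard library; `file_md5` and `needs_download` are hypothetical names.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large SDK binaries are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(lib_path: str, expected_md5: str) -> bool:
    """True when the library is missing or its checksum differs from the pinned one."""
    return not (os.path.exists(lib_path) and file_md5(lib_path) == expected_md5)


# Demonstration on a throwaway file instead of a real SDK library.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
demo_path = tmp.name
demo_md5 = file_md5(demo_path)
up_to_date = not needs_download(demo_path, demo_md5)
stale = needs_download(demo_path, "0" * 32)
os.remove(demo_path)
```

MD5 is adequate for detecting a stale or truncated file, but it does not protect against a deliberately tampered download.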

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/_ctypes_handle/_ctypes_data.py RENAMED

@@ -718,6 +718,29 @@ class RemoteVideoTrackStatsInner(ctypes.Structure):
             stats.total_active_time,
             stats.publish_duration
         )
+class ColorSpaceTypeInner(ctypes.Structure):
+    _fields_ = [
+        ("primaries_id", ctypes.c_int),
+        ("transfer_id", ctypes.c_int),
+        ("matrix_id", ctypes.c_int),
+        ("range_id", ctypes.c_int)
+    ]
+    def get(self):
+        return ColorSpaceType(
+            primaries_id=self.primaries_id,
+            transfer_id=self.transfer_id,
+            matrix_id=self.matrix_id,
+            range_id=self.range_id
+        )
+    @staticmethod
+    def create(colorspace:ColorSpaceType) -> 'ColorSpaceTypeInner':
+        return ColorSpaceTypeInner(
+            primaries_id=colorspace.primaries_id,
+            transfer_id=colorspace.transfer_id,
+            matrix_id=colorspace.matrix_id,
+            range_id=colorspace.range_id
+        )
 class ExternalVideoFrameInner(ctypes.Structure):
@@ -741,7 +764,8 @@ class ExternalVideoFrameInner(ctypes.Structure):
         ("metadata_size", ctypes.c_int),
         ("alpha_buffer", ctypes.c_void_p),
         ("fill_alpha_buffer", ctypes.c_uint8),
-        ("alpha_mode", ctypes.c_int)
+        ("alpha_mode", ctypes.c_int),
+        ("color_space", ColorSpaceTypeInner)
     ]
     def get(self):
@@ -765,7 +789,8 @@ class ExternalVideoFrameInner(ctypes.Structure):
             metadata_size=self.metadata_size,
             alpha_buffer=self.alpha_buffer,
             fill_alpha_buffer=self.fill_alpha_buffer,
-            alpha_mode=self.alpha_mode
+            alpha_mode=self.alpha_mode,
+            color_space=self.color_space.get()
         )
     @staticmethod
@@ -815,7 +840,8 @@ class ExternalVideoFrameInner(ctypes.Structure):
             c_metadata_size,
             c_alpha_buffer_ptr,
             frame.fill_alpha_buffer,
-            frame.alpha_mode
+            frame.alpha_mode,
+            ColorSpaceTypeInner.create(frame.color_space)
         )
@@ -1105,6 +1131,7 @@ class AgoraServiceConfigInner(ctypes.Structure):
         ('audio_scenario', ctypes.c_int),
         ('use_string_uid', ctypes.c_int),
+        ('domain_limit', ctypes.c_int),
     ]
     def get(self):
@@ -1117,7 +1144,8 @@ class AgoraServiceConfigInner(ctypes.Structure):
             area_code=self.area_code,
             channel_profile=self.channel_profile,
             audio_scenario=self.audio_scenario,
-            use_string_uid=self.use_string_uid
+            use_string_uid=self.use_string_uid,
+            domain_limit=self.domain_limit
         )
     @staticmethod
@@ -1131,7 +1159,8 @@ class AgoraServiceConfigInner(ctypes.Structure):
             config.area_code,
             config.channel_profile,
             config.audio_scenario,
-            config.use_string_uid
+            config.use_string_uid,
+            config.domain_limit
         )
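
The hunks above pair each public dataclass with a `ctypes.Structure` that offers `get()` and `create()` converters, and embed the new color-space struct by value at the end of the frame struct. The following standalone sketch shows that pattern with simplified, hypothetical types (`ColorSpace`, `ColorSpaceC`, `FrameC`); it does not load the Agora library.

```python
import ctypes
from dataclasses import dataclass


@dataclass
class ColorSpace:
    primaries_id: int = 0
    transfer_id: int = 0
    matrix_id: int = 0
    range_id: int = 0


class ColorSpaceC(ctypes.Structure):
    # Field order and types must mirror the native struct definition.
    _fields_ = [
        ("primaries_id", ctypes.c_int),
        ("transfer_id", ctypes.c_int),
        ("matrix_id", ctypes.c_int),
        ("range_id", ctypes.c_int),
    ]

    def get(self) -> ColorSpace:
        """Copy the C fields into the Python-facing dataclass."""
        return ColorSpace(self.primaries_id, self.transfer_id, self.matrix_id, self.range_id)

    @staticmethod
    def create(cs: ColorSpace) -> "ColorSpaceC":
        """Build the C struct from the Python-facing dataclass."""
        return ColorSpaceC(cs.primaries_id, cs.transfer_id, cs.matrix_id, cs.range_id)


class FrameC(ctypes.Structure):
    # A cut-down frame: one existing field plus the struct embedded by value.
    _fields_ = [
        ("alpha_mode", ctypes.c_int),
        ("color_space", ColorSpaceC),
    ]


frame = FrameC(alpha_mode=1, color_space=ColorSpaceC.create(ColorSpace(primaries_id=1, range_id=2)))
round_trip = frame.color_space.get()
frame_size = ctypes.sizeof(FrameC)
```

Because the nested struct is stored by value, it enlarges the outer struct, so a field added this way only works when the native library's struct has the same field in the same position.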

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/_ctypes_handle/_local_user_observer.py RENAMED

@@ -44,6 +44,7 @@ ON_INTRA_REQUEST_RECEIVED_CALLBACK = ctypes.CFUNCTYPE(None, AGORA_HANDLE)
 ON_REMOTE_SUBSCRIBE_FALLBACK_TO_AUDIO_ONLY_CALLBACK = ctypes.CFUNCTYPE(None, AGORA_HANDLE, user_id_t, ctypes.c_int)
 ON_STREAM_MESSAGE_CALLBACK = ctypes.CFUNCTYPE(None, AGORA_HANDLE, user_id_t, ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t)
 ON_USER_STATE_CHANGED_CALLBACK = ctypes.CFUNCTYPE(None, AGORA_HANDLE, user_id_t, ctypes.c_uint32)
+ON_AUDIO_META_DATA_RECEIVED_CALLBACK = ctypes.CFUNCTYPE(None, AGORA_HANDLE, user_id_t, ctypes.c_char_p, ctypes.c_size_t)
 class RTCLocalUserObserverInner(ctypes.Structure):
@@ -85,7 +86,8 @@ class RTCLocalUserObserverInner(ctypes.Structure):
         ("on_intra_request_received", ON_INTRA_REQUEST_RECEIVED_CALLBACK),
         ("on_remote_subscribe_fallback_to_audio_only", ON_REMOTE_SUBSCRIBE_FALLBACK_TO_AUDIO_ONLY_CALLBACK),
         ("on_stream_message", ON_STREAM_MESSAGE_CALLBACK),
-        ("on_user_state_changed", ON_USER_STATE_CHANGED_CALLBACK)
+        ("on_user_state_changed", ON_USER_STATE_CHANGED_CALLBACK),
+        ("on_audio_meta_data_received", ON_AUDIO_META_DATA_RECEIVED_CALLBACK)
     ]
     def __init__(self, local_user_observer: IRTCLocalUserObserver, local_user: 'LocalUser') -> None:
@@ -128,6 +130,7 @@ class RTCLocalUserObserverInner(ctypes.Structure):
         self.on_remote_subscribe_fallback_to_audio_only = ON_REMOTE_SUBSCRIBE_FALLBACK_TO_AUDIO_ONLY_CALLBACK(self._on_remote_subscribe_fallback_to_audio_only)
         self.on_stream_message = ON_STREAM_MESSAGE_CALLBACK(self._on_stream_message)
         self.on_user_state_changed = ON_USER_STATE_CHANGED_CALLBACK(self._on_user_state_changed)
+        self.on_audio_meta_data_received = ON_AUDIO_META_DATA_RECEIVED_CALLBACK(self._on_audio_meta_data_received)
     """
     it seems that this interface does not provide much value to the user's business,
@@ -348,3 +351,8 @@ class RTCLocalUserObserverInner(ctypes.Structure):
         logger.debug(f"LocalUserCB _on_user_state_changed: {local_user_handle}, {user_id}, {state}")
         user_id_str = user_id.decode('utf-8') if user_id else ""
         self.local_user_observer.on_user_state_changed(self.local_user, user_id_str, state)
+    def _on_audio_meta_data_received(self, local_user_handle, user_id, audio_meta_data, size):
+        user_id_str = user_id.decode('utf-8') if user_id else ""
+        bytes_from_c = bytearray(ctypes.string_at(audio_meta_data, size))
+        self.local_user_observer.on_audio_meta_data_received(self.local_user, user_id_str, bytes_from_c)
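
The new callback above copies a pointer-plus-length payload into a `bytearray` with `ctypes.string_at`. The sketch below shows that copy in isolation, invoking the callback from Python rather than from a native library. One assumption differs from the diff: the payload parameter is declared as `ctypes.c_void_p`, because ctypes converts a `c_char_p` callback argument to a Python `bytes` object that ends at the first NUL byte, which may matter for binary metadata. `META_CB` and `_on_meta` are hypothetical names.

```python
import ctypes

# Callback prototype: void cb(const void* data, size_t size)
META_CB = ctypes.CFUNCTYPE(None, ctypes.c_void_p, ctypes.c_size_t)

received = []


def _on_meta(ptr, size):
    # string_at copies exactly `size` bytes starting at the raw address,
    # so embedded NUL bytes are preserved.
    received.append(bytearray(ctypes.string_at(ptr, size)))


# Keep a reference to the wrapper for as long as native code may call it.
callback = META_CB(_on_meta)

payload = b"ab\x00cd"
buf = ctypes.create_string_buffer(payload, len(payload))
callback(ctypes.cast(buf, ctypes.c_void_p), len(payload))
```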

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/agora_base.py RENAMED

@@ -307,6 +307,15 @@ class AgoraServiceConfig:
     channel_profile: ChannelProfileType = ChannelProfileType.CHANNEL_PROFILE_LIVE_BROADCASTING
     audio_scenario: AudioScenarioType = AudioScenarioType.AUDIO_SCENARIO_CHORUS
     use_string_uid: int = 0
+    #version 2.2.0
+    # default to 0
+    domain_limit: int = 0
+    '''
+    // if >0, when the remote user mutes itself, onplaybackbeforemixing is still called back with active packets
+    // if <=0, when the remote user mutes itself, onplaybackbeforemixing is no longer called back
+    // default is 0, i.e. when muted, no callback is triggered
+    '''
+    should_callbck_when_muted: int = 0
 @dataclass(kw_only=True)
@@ -423,6 +432,13 @@ class VideoEncoderConfiguration:
     mirror_mode: int = 0
     encode_alpha: int = 0
+@dataclass(kw_only=True)
+class ColorSpaceType:
+    primaries_id: int = 0
+    transfer_id: int = 0
+    matrix_id: int = 0
+    range_id: int = 0
 @dataclass(kw_only=True)
 class ExternalVideoFrame:
@@ -445,6 +461,7 @@ class ExternalVideoFrame:
     alpha_buffer: bytearray = None
     fill_alpha_buffer: int = 0
     alpha_mode: int = 0
+    color_space: ColorSpaceType = field(default_factory=ColorSpaceType)
 @dataclass(kw_only=True)
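
The `color_space` field above uses `field(default_factory=ColorSpaceType)` rather than a shared default instance. A minimal sketch of why that matters, using hypothetical stand-in classes (`ColorSpace`, `Frame`) and plain `@dataclass` so it also runs on Python versions without `kw_only`:

```python
from dataclasses import dataclass, field


@dataclass
class ColorSpace:
    primaries_id: int = 0
    transfer_id: int = 0
    matrix_id: int = 0
    range_id: int = 0


@dataclass
class Frame:
    alpha_mode: int = 0
    # default_factory builds a fresh ColorSpace for every Frame instance.
    color_space: ColorSpace = field(default_factory=ColorSpace)


a = Frame()
b = Frame()
a.color_space.range_id = 2  # mutating one frame's color space...
```

Here `b.color_space.range_id` stays 0, since each frame owns its own `ColorSpace` object.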

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/agora_service.py RENAMED

@@ -92,6 +92,10 @@ class AgoraService:
         # force audio vad v2 to be enabled
         agora_parameter.set_parameters("{\"che.audio.label.enable\": true}")
+        # version 2.2.0: callback when muted
+        if config.should_callbck_when_muted > 0:
+            agora_parameter.set_parameters("{\"rtc.audio.enable_user_silence_packet\": true}")
         if config.log_path:
             log_size = 512 * 1024
             if config.log_size > 0:
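
The hunk above passes hand-escaped JSON strings to `set_parameters`. As a sketch of an alternative, the same strings can be produced with `json.dumps`, which avoids manual quoting. The helper `build_parameters` is hypothetical; the two parameter keys are taken from the diff.

```python
import json
from typing import List


def build_parameters(should_callback_when_muted: int) -> List[str]:
    """Return the JSON parameter strings to hand to set_parameters, in order."""
    params = [json.dumps({"che.audio.label.enable": True})]
    if should_callback_when_muted > 0:
        params.append(json.dumps({"rtc.audio.enable_user_silence_packet": True}))
    return params


muted_callbacks_on = build_parameters(1)
muted_callbacks_off = build_parameters(0)
```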

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/local_user.py RENAMED

@@ -62,7 +62,7 @@ agora_local_user_subscribe_all_audio.argtypes = [AGORA_HANDLE]
 agora_local_user_unsubscribe_audio = agora_lib.agora_local_user_unsubscribe_audio
 agora_local_user_unsubscribe_audio.restype = AGORA_API_C_INT
-agora_local_user_unsubscribe_audio.argtypes = [AGORA_HANDLE, ctypes.c_uint]
+agora_local_user_unsubscribe_audio.argtypes = [AGORA_HANDLE, user_id_t]
 agora_local_user_unsubscribe_all_audio = agora_lib.agora_local_user_unsubscribe_all_audio
 agora_local_user_unsubscribe_all_audio.restype = AGORA_API_C_INT
@@ -184,7 +184,7 @@ agora_local_user_subscribe_all_video.argtypes = [AGORA_HANDLE, ctypes.POINTER(Vi
 agora_local_user_unsubscribe_video = agora_lib.agora_local_user_unsubscribe_video
 agora_local_user_unsubscribe_video.restype = AGORA_API_C_INT
-agora_local_user_unsubscribe_video.argtypes = [AGORA_HANDLE, ctypes.c_uint]
+agora_local_user_unsubscribe_video.argtypes = [AGORA_HANDLE, user_id_t]
 agora_local_user_unsubscribe_all_video = agora_lib.agora_local_user_unsubscribe_all_video
 agora_local_user_unsubscribe_all_video.restype = AGORA_API_C_INT
@@ -223,6 +223,15 @@ agora_local_user_set_audio_scenario = agora_lib.agora_local_user_set_audio_scena
 agora_local_user_set_audio_scenario.restype = AGORA_API_C_INT
 agora_local_user_set_audio_scenario.argtypes = [AGORA_HANDLE, ctypes.c_int]
+# version 2.2.0
+#AGORA_API_C_INT agora_local_user_send_audio_meta_data(AGORA_HANDLE agora_local_user, const char* meta_data, size_t length);
+agora_local_user_send_aduio_meta_data = agora_lib.agora_local_user_send_audio_meta_data
+agora_local_user_send_aduio_meta_data.restype = AGORA_API_C_INT
+agora_local_user_send_aduio_meta_data.argtypes = [AGORA_HANDLE, ctypes.c_char_p, ctypes.c_size_t]
 class LocalUser:
     def __init__(self, local_user_handle, connection):
@@ -361,7 +370,13 @@ class LocalUser:
         return ret
     def subscribe_audio(self, user_id):
-        ret = agora_local_user_subscribe_audio(self.user_handle, ctypes.c_char_p(user_id.encode()))
+        if user_id is None:
+            return -1
+        uid_str = user_id.encode('utf-8')
+        #ret = agora_local_user_subscribe_audio(self.user_handle, ctypes.create_string_buffer(uid_str))
+        # note: both ctypes.create_string_buffer and ctypes.c_char_p can pass the encoded Python string as a C char pointer,
+        # but ctypes.c_char_p is more suitable here because the C API never modifies the pointed-to content
+        ret = agora_local_user_subscribe_audio(self.user_handle, ctypes.c_char_p(uid_str))
         return ret
     def subscribe_all_audio(self):
@@ -369,7 +384,11 @@ class LocalUser:
         return ret
     def unsubscribe_audio(self, user_id):
-        ret = agora_local_user_unsubscribe_audio(self.user_handle, ctypes.c_char_p(user_id.encode()))
+        #validity check
+        if user_id is None:
+            return -1
+        uid_str = user_id.encode('utf-8')
+        ret = agora_local_user_unsubscribe_audio(self.user_handle, ctypes.c_char_p(uid_str))
         if ret < 0:
             logger.error("Failed to unsubscribe audio")
         else:
@@ -485,18 +504,33 @@ class LocalUser:
     #     return ret
     def subscribe_video(self, user_id, options: VideoSubscriptionOptions):
-        user_id_t = user_id.encode('utf-8')
+        if user_id is None:
+            return -1
+        uid_str = user_id.encode('utf-8')
-        ret = agora_local_user_subscribe_video(self.user_handle, user_id_t, ctypes.byref(options))
+        if options is None:
+            inner = VideoSubscriptionOptionsInner()
+        else:
+            inner = VideoSubscriptionOptionsInner.create(options)
+        c_ptr = ctypes.byref(inner)
+        ret = agora_local_user_subscribe_video(self.user_handle, ctypes.c_char_p(uid_str), c_ptr)
         return ret
     def subscribe_all_video(self, options: VideoSubscriptionOptions):
-        ret = agora_local_user_subscribe_all_video(self.user_handle, ctypes.byref(VideoSubscriptionOptionsInner.create(options)))
+        if options is None:
+            inner = VideoSubscriptionOptionsInner()
+        else:
+            inner = VideoSubscriptionOptionsInner.create(options)
+        ret = agora_local_user_subscribe_all_video(self.user_handle, ctypes.byref(inner))
         return ret
     def unsubscribe_video(self, user_id):
-        user_id_t = user_id.encode('utf-8')
-        ret = agora_local_user_unsubscribe_video(self.user_handle, user_id_t)
+        if user_id is None:
+            return -1
+        uid_str = user_id.encode('utf-8')
+        ret = agora_local_user_unsubscribe_video(self.user_handle, ctypes.c_char_p(uid_str))
         if ret < 0:
             logger.error("Failed to unsubscribe video")
         else:
@@ -569,3 +603,12 @@ class LocalUser:
     def set_audio_scenario(self, scenario_type: AudioScenarioType):
         ret = agora_local_user_set_audio_scenario(self.user_handle, scenario_type.value)
         return ret
+    # data can be a str or a bytes/bytearray object, unlike send_stream_message, which takes a str
+    def send_audio_meta_data(self, data):
+        # convert to a ctypes char buffer
+        if isinstance(data, str):
+            data = data.encode('utf-8')
+        c_data = ctypes.create_string_buffer(bytes(data))
+        size = len(data)
+        ret = agora_local_user_send_aduio_meta_data(self.user_handle, c_data, ctypes.c_size_t(size))
+        return ret
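
`send_audio_meta_data` above accepts `str`, `bytes`, or `bytearray` and hands the native call a buffer plus an explicit length. The normalization step can be sketched on its own, without the Agora library; `to_c_payload` is a hypothetical helper name.

```python
import ctypes
from typing import Tuple, Union


def to_c_payload(data: Union[str, bytes, bytearray]) -> Tuple[ctypes.Array, int]:
    """Return (buffer, length) suitable for a `const char*, size_t` argument pair."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    raw = bytes(data)
    # create_string_buffer appends a trailing NUL, so the buffer is one byte
    # longer than the payload; the explicit length excludes that terminator.
    return ctypes.create_string_buffer(raw), len(raw)


text_buf, text_len = to_c_payload("héllo")                 # "é" is two bytes in UTF-8
bin_buf, bin_len = to_c_payload(bytearray(b"\x00\x01"))    # binary data with a NUL byte
```

Passing the length separately is what lets binary metadata containing NUL bytes survive the trip through a `char*` parameter.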

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/local_user_observer.py RENAMED

@@ -117,3 +117,6 @@ class IRTCLocalUserObserver():
     def on_user_state_changed(self, agora_local_user, user_id, state):
         pass
+    # data is a bytearray object, unlike on_stream_message, which receives a str
+    def on_audio_meta_data_received(self, agora_local_user, user_id, data):
+        pass

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/agora/rtc/utils/audio_consumer.py RENAMED

@@ -67,7 +67,7 @@ class AudioConsumer:
         pass
     def consume(self):
-        print("consume begin")
+        #print("consume begin")
         if self._init == False:
             return -1
         now = time.time()*1000

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0/agora_python_server_sdk.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: agora_python_server_sdk
-Version: 2.1.6
+Version: 2.2.0
 Summary: A Python SDK for Agora Server
 Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
 Classifier: Intended Audience :: Developers
@@ -51,6 +51,25 @@ python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx
 # Change log
+2025.01.08 Release 2.2.0
+-- Updates:
+  - Update the SDK version from 4.4.30 to 4.4.31. Done.
+-- FEAT:
+  - Add serviceconfigure.
+    - Add domain_limit. Done.
+    - Add should_callback_when_muted. Done.
+    - Add colorspacetype to ExternalVideoFrame to support the encoding of solid-color backgrounds in virtual human scenarios. Done.
+-- FEAT:
+  - Add the AudioMetaData interface: localuser::send_audio_meta_data. Done.
+  - Add the OnAudioMetaDataReceived callback to localuserObserver::on_audio_meta_data_received. Done.
+-- Sample modifications.
+2024.12.17 Release 2.1.7
+--Changes:
+  Fixed the TypeError issue in LocalUser::subscribe/unsubscribe audio/video.
+  Adjusted the default stopRecognizeCount for VAD from 30 to 50.
+  Modified sample_vad.
 ## 2024.12.09 Release 2.1.6
 - New Features:
   -- Added AudioVadManager to manage VAD (Voice Activity Detection) instances.
@@ -279,3 +298,41 @@ Store the LLM results in a cache as they are received.
 Perform a reverse scan of the cached data to find the most recent punctuation mark.
 Truncate the data from the start to the most recent punctuation mark and pass it to TTS for synthesis.
 Remove the truncated data from the cache. The remaining data should be moved to the beginning of the cache and continue waiting for additional data from the LLM.
+## VAD Configuration Parameters
+AgoraAudioVadConfigV2 Properties
+| Property Name | Type | Description | Default Value | Value Range |
+| --- | --- | --- | --- | --- |
+| preStartRecognizeCount | int | Number of audio frames saved before detecting speech | 16 | [0, ] |
+| startRecognizeCount | int | Total number of audio frames to detect speech start | 30 | [1, max] |
+| stopRecognizeCount | int | Number of audio frames to detect speech stop | 50 | [1, max] |
+| activePercent | float | Percentage of active frames in startRecognizeCount frames | 0.7 | [0.0, 1.0] |
+| inactivePercent | float | Percentage of inactive frames in stopRecognizeCount frames | 0.5 | [0.0, 1.0] |
+| startVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| stopVoiceProb | int | Probability that an audio frame contains human voice | 70 | [0, 100] |
+| startRmsThreshold | int | Energy dB threshold for detecting speech start | -50 | [-100, 0] |
+| stopRmsThreshold | int | Energy dB threshold for detecting speech stop | -50 | [-100, 0] |
+Notes:
+startRmsThreshold and stopRmsThreshold:
+The higher the value, the louder the speaker's voice needs to be compared to the surrounding background noise.
+In quiet environments, it is recommended to use the default value of -50.
+In noisy environments, you can increase the threshold to between -40 and -30 to reduce false positives.
+Adjusting these thresholds based on the actual use case and audio characteristics can achieve optimal performance.
+stopRecognizeCount:
+This value reflects how long to wait after detecting non-human voice before concluding that the user has stopped speaking. It controls the gap between consecutive speech utterances. Within this gap, VAD will treat adjacent sentences as part of the same speech.
+A shorter gap will increase the likelihood of adjacent sentences being recognized as separate speech segments. Typically, it is recommended to set this value between 50 and 80.
+For example: "Good afternoon, [interval_between_sentences] what are some fun places to visit in Beijing?"
+If the interval_between_sentences between the speaker's phrases is greater than the stopRecognizeCount, the VAD will recognize the above as two separate VADs:
+VAD1: Good afternoon
+VAD2: What are some fun places to visit in Beijing?
+If the interval_between_sentences is less than stopRecognizeCount, the VAD will recognize the above as a single VAD:
+VAD: Good afternoon, what are some fun places to visit in Beijing?
+If latency is a concern, you can lower this value, or consult with the development team to determine how to manage latency while ensuring semantic continuity in speech recognition. This will help avoid the AI being interrupted too sensitively.

{agora_python_server_sdk-2.1.6 → agora_python_server_sdk-2.2.0}/setup.py RENAMED

@@ -20,10 +20,15 @@ class CustomInstallCommand(install):
         agora_service_path = os.path.join(site.getsitepackages()[0], 'agora', 'rtc')
         sdk_dir = os.path.join(agora_service_path, "agora_sdk")
         zip_path = os.path.join(agora_service_path, "agora_rtc_sdk.zip")
-        url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.30-20241024_101940-398537.zip"
+        '''# version before 2.2.0
+        #url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.30-20241024_101940-398537.zip"
+        #url = "https://download.agora.io/sdk/release/agora_rtc_sdk_mac_rel.v4.4.30_22472_FULL_20241024_1224_398653.zip"
+        '''
+        # version 2.2.0
+        url = "https://download.agora.io/sdk/release/agora_rtc_sdk-x86_64-linux-gnu-v4.4.31-20241223_111509-491956.zip"
         if sys.platform == 'darwin':
-            url = "https://download.agora.io/sdk/release/agora_rtc_sdk_mac_rel.v4.4.30_22472_FULL_20241024_1224_398653.zip"
+            url = "https://download.agora.io/sdk/release/agora_sdk_mac_v4.4.31_23136_FULL_20241223_1245_492039.zip"
         if os.path.exists(sdk_dir):
             os.system(f"rm -rf {sdk_dir}")
@@ -45,7 +50,7 @@ class CustomInstallCommand(install):
 setup(
     name='agora_python_server_sdk',
-    version='2.1.6',
+    version='2.2.0',
     description='A Python SDK for Agora Server',
     long_description=open('README.md').read(),
     long_description_content_type='text/markdown',
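
The install hook in the first setup.py hunk above clears an existing SDK directory by shelling out to `rm -rf`. As a sketch of a portable alternative (not what the package does), `shutil.rmtree` removes the tree without a shell, so paths containing spaces or shell metacharacters are handled safely. The helper name `remove_dir` and the temporary paths are hypothetical.

```python
import os
import shutil
import tempfile


def remove_dir(path: str) -> None:
    """Recursively delete `path` if it exists; do nothing otherwise."""
    shutil.rmtree(path, ignore_errors=True)


# Demonstration in a throwaway location rather than site-packages.
base = tempfile.mkdtemp()
sdk_dir = os.path.join(base, "agora sdk")  # a space would split an unquoted shell argument
os.makedirs(sdk_dir)
with open(os.path.join(sdk_dir, "stale.so"), "w") as fh:
    fh.write("old")

remove_dir(sdk_dir)
removed = not os.path.exists(sdk_dir)
remove_dir(sdk_dir)  # calling it again on a missing path is harmless
shutil.rmtree(base)
```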