agora-python-server-sdk 2.1.3__tar.gz → 2.1.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



Files changed (45)
  1. agora_python_server_sdk-2.1.5/PKG-INFO +140 -0
  2. agora_python_server_sdk-2.1.5/README.md +125 -0
  3. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_ctypes_data.py +27 -8
  4. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/agora_base.py +3 -3
  5. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/agora_service.py +11 -2
  6. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/local_user.py +1 -1
  7. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/rtc_connection.py +7 -4
  8. agora_python_server_sdk-2.1.5/agora/rtc/utils/audio_consumer.py +133 -0
  9. agora_python_server_sdk-2.1.5/agora/rtc/utils/vad_dump.py +104 -0
  10. agora_python_server_sdk-2.1.5/agora_python_server_sdk.egg-info/PKG-INFO +140 -0
  11. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora_python_server_sdk.egg-info/SOURCES.txt +2 -1
  12. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/setup.py +2 -2
  13. agora_python_server_sdk-2.1.3/PKG-INFO +0 -51
  14. agora_python_server_sdk-2.1.3/README.md +0 -36
  15. agora_python_server_sdk-2.1.3/agora/rtc/audio_vad.py +0 -164
  16. agora_python_server_sdk-2.1.3/agora_python_server_sdk.egg-info/PKG-INFO +0 -51
  17. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/MANIFEST.in +0 -0
  18. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/__init__.py +0 -0
  19. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_audio_frame_observer.py +0 -0
  20. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_local_user_observer.py +0 -0
  21. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_rtc_connection_observer.py +0 -0
  22. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_video_encoded_frame_observer.py +0 -0
  23. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_ctypes_handle/_video_frame_observer.py +0 -0
  24. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/_utils/globals.py +0 -0
  25. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/agora_parameter.py +0 -0
  26. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/audio_encoded_frame_sender.py +0 -0
  27. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/audio_frame_observer.py +0 -0
  28. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/audio_pcm_data_sender.py +0 -0
  29. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/audio_sessionctrl.py +0 -0
  30. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/local_audio_track.py +0 -0
  31. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/local_user_observer.py +0 -0
  32. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/local_video_track.py +0 -0
  33. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/media_node_factory.py +0 -0
  34. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/remote_audio_track.py +0 -0
  35. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/remote_video_track.py +0 -0
  36. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/rtc_connection_observer.py +0 -0
  37. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/video_encoded_frame_observer.py +0 -0
  38. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/video_encoded_image_sender.py +0 -0
  39. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/video_frame_observer.py +0 -0
  40. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/video_frame_sender.py +0 -0
  41. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora/rtc/voice_detection.py +0 -0
  42. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora_python_server_sdk.egg-info/dependency_links.txt +0 -0
  43. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/agora_python_server_sdk.egg-info/top_level.txt +0 -0
  44. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/pyproject.toml +0 -0
  45. {agora_python_server_sdk-2.1.3 → agora_python_server_sdk-2.1.5}/setup.cfg +0 -0
@@ -0,0 +1,140 @@
+ Metadata-Version: 2.1
+ Name: agora_python_server_sdk
+ Version: 2.1.5
+ Summary: A Python SDK for Agora Server
+ Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Topic :: Multimedia :: Sound/Audio
+ Classifier: Topic :: Multimedia :: Video
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+
+ # Note
+ - This is a Python SDK wrapper for the Agora RTC SDK.
+ - It supports Linux and Mac platforms.
+ - The examples are provided as very simple demonstrations and are not recommended for use in production environments.
+
+ # Very Important Notice !!!
+ - A process can only have one instance.
+ - An instance can have multiple connections.
+ - In all observers and callbacks, you must not call the SDK's own APIs or perform CPU-intensive tasks; data copying is allowed.
+
+ # Required Operating Systems and Python Versions
+ - Supported Linux versions:
+   - Ubuntu 18.04 LTS and above
+   - CentOS 7.0 and above
+
+ - Supported Mac versions:
+   - macOS 13 and above
+
+ - Python version:
+   - Python 3.10 and above
+
+ # Using Agora-Python-Server-SDK
+ ```
+ pip install agora_python_server_sdk
+ ```
+
+ # Running Examples
+
+ ## Preparing Test Data
+ - Download and unzip [test_data.zip](https://download.agora.io/demo/test/test_data_202408221437.zip) to the Agora-Python-Server-SDK directory.
+
+ ## Executing Test Script
+ ```
+ python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx --userId=xxx --audioFile=./test_data/demo.pcm --sampleRate=16000 --numOfChannels=1
+ ```
+
+ # Change log
+
+ ## 2024.12.03 release Version 2.1.5
+ - Modifications:
+   - LocalUser/audioTrack:
+     - When the scenario is chorus, developers don't need to call setSendDelayInMs.
+     - When the scenario is chorus, developers don't need to set the audio scenario of the track to chorus.
+     - NOTE: This reduces the work for developers. In AI scenarios, developers only need to set the service to chorus.
+ - Additions:
+   - Added the VadDump class, which can assist in troubleshooting VAD issues in the testing environment. However, it should not be enabled in the production environment.
+   - Added the on_volume_indication callback.
+   - Added the on_remote_video_track_state_changed callback.
+ - Removals:
+   - Removed the VAD V1 version, retaining only the V2 version. Refer to voice_detection.py and sample_audio_vad.py.
+ - Updates:
+   - Updated the relevant samples: audio consume and VAD.
+
+ ## 2024.11.12 release 2.1.4
+ - Modified the type of metadata in VideoFrame from str to bytes to be consistent with C++, so it can support byte streams.
+ - Modified the internal encapsulation of ExternalVideoFrame to support byte streams. For alpha encoding support, a logical check has been added: if fill_alpha_buffer is 0, the alpha buffer is not processed.
+
+ ## 2024.11.11 release 2.1.3
+ - Added a new sample: example_jpeg_send.py, which can push JPEG files or JPEG streams to a channel.
+ - Performance overhead, as noted in the example comments: for a 1920x1080 JPEG file, the process from reading the file to converting it to an RGBA bytearray takes approximately 11 milliseconds.
+
+ ## 2024.11.07 release 2.1.2
+ - Updated `user_id` in the `AudioVolumeInfoInner` and `AudioVolumeInfo` structures to `str` type.
+ - Fixed the bug in the `_on_audio_volume_indication` callback where it could only handle one speaker per callback.
+ - Corrected the parameter type in the `IRTCLocalUserObserver::on_audio_volume_indication` callback to `list`.
+
+ ## 2024.10.29 release 2.1.1
+ - Added the audio VAD interface version 2 and a corresponding example.
+
+ ## 2024.10.24 release 2.1.0
+ - Fixed some bugs.
+
+ ### Common Usage Q&A
+ ## What is the relationship between service and process?
+ - A process can only have one service, and the service can only be initialized once.
+ - A service can only have one media_node_factory.
+ - A service can have multiple connections.
+ - Call media_node_factory.release() and service.release() when the process exits.
+ ## If using Docker with one user per Docker instance, how should Docker be released when the user starts Docker and then logs out?
+ - In this case, create the service/media_node_factory and connection when the process starts.
+ - Release the service/media_node_factory and connection when the process exits, ensuring that...
+ ## If Docker is used to support multiple users and runs for a long time, what should be done?
+ - In this case, we recommend using a connection pool.
+ - Create the service/media_node_factory and a connection pool (new connections only, without initialization) when the process starts.
+ - When a user logs in, get a connection from the pool, initialize it, execute con.connect() and set up callbacks, and then join the channel.
+ - Handle business operations.
+ - When a user logs out, execute con.disconnect() and release the audio/video tracks and observers associated with the connection, but do not call con.release(); then put the connection back into the pool.
+ - When the process exits, release the connection pool (call con.release() on each connection) and then the service/media_node_factory, ensuring full resource release.
+
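The pool lifecycle described above can be sketched as follows. This is a minimal illustration, not part of the SDK: `FakeConnection` is a hypothetical stand-in for the SDK's `RTCConnection`, and only the connect/disconnect/release names mirror the real API.

```python
import queue

class FakeConnection:
    """Hypothetical stand-in for the SDK's RTCConnection; models only the lifecycle."""
    def __init__(self):
        self.connected = False
        self.released = False
    def connect(self, token, chan_id, user_id):
        self.connected = True
        return 0
    def disconnect(self):
        self.connected = False
        return 0
    def release(self):
        self.released = True

class ConnectionPool:
    def __init__(self, size, factory):
        self._pool = queue.Queue()
        for _ in range(size):       # create connections up front, do not initialize them
            self._pool.put(factory())
    def acquire(self):              # user login: take a connection from the pool
        return self._pool.get()
    def recycle(self, con):         # user logout: disconnect, but do NOT call release()
        con.disconnect()
        self._pool.put(con)
    def shutdown(self):             # process exit: release every pooled connection
        while not self._pool.empty():
            self._pool.get().release()

pool = ConnectionPool(2, FakeConnection)
con = pool.acquire()
con.connect("token", "chan", "uid")
pool.recycle(con)
pool.shutdown()
```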
+ ## Use of VAD
+ # Source code: voice_detection.py
+ # Sample code: example_audio_vad.py
+ # It is recommended to use the VAD V2 version; the class is AudioVadV2. Reference: voice_detection.py.
+ # Use of VAD:
+ 1. Call _vad_instance.init(AudioVadConfigV2) to initialize the vad instance. Reference: voice_detection.py. Assume the instance is _vad_instance.
+ 2. In audio_frame_observer::on_playback_audio_frame_before_mixing(audio_frame), call the vad module's process method: state, bytes = _vad_instance.process(audio_frame)
+ 3. Judge the value of the returned state and do the corresponding processing:
+    A. If state is _vad_instance._vad_state_startspeaking, the user is "starting to speak": speech recognition (STT/ASR) can be started.
+    B. If state is _vad_instance._vad_state_stopspeaking, the user is "stopping speaking": speech recognition (STT/ASR) can be stopped.
+    C. If state is _vad_instance._vad_state_speaking, the user is "speaking": speech recognition (STT/ASR) can be continued.
+    In all three cases, be sure to pass the returned bytes to the recognition module instead of the original audio_frame; otherwise the recognition result will be incorrect.
+ # Note:
+ If the vad module is used for speech recognition (STT/ASR) and similar operations, always pass the returned bytes to the recognition module instead of the original audio_frame; otherwise the recognition result will be incorrect.
+ # How to better troubleshoot VAD issues: this involves two aspects, configuration and debugging.
+ 1. Ensure that the initialization parameters of the vad module are correct. Reference: voice_detection.py.
+ 2. In state, bytes = on_playback_audio_frame_before_mixing(audio_frame):
+    A. Save the data of audio_frame to a local file to record the original audio. Reference: example_audio_pcm_send.py. For example, name it source_{time.time()*1000}.pcm.
+    B. Save the result of each vad processing step:
+       a. When state == start_speaking: create a new binary file, for example vad_{time.time()*1000}.pcm, and write bytes to it.
+       b. When state == speaking: write bytes to the file.
+       c. When state == stop_speaking: write bytes to the file and close the file.
+ Note: In this way, problems can be troubleshot by comparing the original audio file with the audio file processed by vad. This function can be disabled in the production environment.
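The dump-file steps above can be sketched as a small helper. This is a hypothetical debugging aid, not SDK code: the state constants stand in for the values exposed by the AudioVadV2 instance (e.g. `_vad_state_startspeaking`), and `send_to_asr` is a placeholder for the recognition hook.

```python
import os
import time

# Hypothetical state constants; the real values come from the AudioVadV2
# instance in voice_detection.py (e.g. _vad_instance._vad_state_startspeaking).
VAD_START_SPEAKING = 1
VAD_SPEAKING = 2
VAD_STOP_SPEAKING = 3

class VadSegmentWriter:
    """Debug helper: writes each detected speech segment to its own PCM file
    while forwarding the VAD output bytes (never the raw frame) to recognition."""
    def __init__(self, out_dir="."):
        self._out_dir = out_dir
        self._file = None
    def handle(self, state, data, send_to_asr):
        if state == VAD_START_SPEAKING:
            path = os.path.join(self._out_dir, f"vad_{int(time.time() * 1000)}.pcm")
            self._file = open(path, "wb")
        if data:
            send_to_asr(data)               # always the bytes returned by process()
            if self._file is not None:
                self._file.write(data)      # mirror the same bytes into the dump file
        if state == VAD_STOP_SPEAKING and self._file is not None:
            self._file.close()
            self._file = None
```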
+ ### How to push the audio generated by TTS into the channel?
+ # Source code: audio_consumer.py
+ # Sample code: example_audio_consumer.py
+ ### How to release resources?
@@ -151,7 +151,7 @@ class VideoFrameInner(ctypes.Structure):
              rotation=self.rotation,
              render_time_ms=self.render_time_ms,
              avsync_type=self.avsync_type,
-             metadata=ctypes.string_at(self.metadata_buffer, self.metadata_size).decode() if self.metadata_buffer else None,
+             metadata=ctypes.string_at(self.metadata_buffer, self.metadata_size) if self.metadata_buffer else None,
              shared_context=self.shared_context.decode() if self.shared_context else None,
              texture_id=self.texture_id,
              matrix=self.matrix,
@@ -770,12 +770,31 @@ class ExternalVideoFrameInner(ctypes.Structure):
 
      @staticmethod
      def create(frame: ExternalVideoFrame) -> 'ExternalVideoFrameInner':
-         c_buffer = (ctypes.c_uint8 * len(frame.buffer)).from_buffer(frame.buffer)
-         c_buffer_ptr = ctypes.cast(c_buffer, ctypes.c_void_p)
-         c_metadata = bytearray(frame.metadata.encode('utf-8'))
-         c_metadata_ptr = (ctypes.c_uint8 * len(c_metadata)).from_buffer(c_metadata)
-         c_alpha_buffer = (ctypes.c_uint8 * len(frame.alpha_buffer)).from_buffer(frame.alpha_buffer)
-         c_alpha_buffer_ptr = ctypes.cast(c_alpha_buffer, ctypes.c_void_p)
+         if frame.buffer is not None:
+             c_buffer = (ctypes.c_uint8 * len(frame.buffer)).from_buffer(frame.buffer)
+             c_buffer_ptr = ctypes.cast(c_buffer, ctypes.c_void_p)
+         else:
+             c_buffer_ptr = ctypes.c_void_p(0)
+
+         # alpha_buffer is only used when fill_alpha_buffer is set
+         if (frame.fill_alpha_buffer > 0) and (frame.alpha_buffer is not None):
+             c_alpha_buffer = (ctypes.c_uint8 * len(frame.alpha_buffer)).from_buffer(frame.alpha_buffer)
+             c_alpha_buffer_ptr = ctypes.cast(c_alpha_buffer, ctypes.c_void_p)
+         else:
+             c_alpha_buffer_ptr = ctypes.c_void_p(0)
+
+         if frame.metadata is not None and len(frame.metadata) > 0:
+             c_metadata_ptr = (ctypes.c_uint8 * len(frame.metadata)).from_buffer(frame.metadata)
+             c_metadata_size = len(frame.metadata)
+         else:
+             c_metadata_ptr = ctypes.c_void_p(0)
+             c_metadata_size = 0
+
          return ExternalVideoFrameInner(
              frame.type,
              frame.format,
@@ -793,7 +812,7 @@ class ExternalVideoFrameInner(ctypes.Structure):
              frame.texture_id,
              (ctypes.c_float * 16)(*frame.matrix),
              c_metadata_ptr,
-             len(c_metadata),
+             c_metadata_size,
              c_alpha_buffer_ptr,
              frame.fill_alpha_buffer,
              frame.alpha_mode
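The hunk above replaces unconditional `from_buffer` calls with per-buffer guards so that absent buffers become null pointers. The general pattern, written here as a hypothetical standalone helper (not part of the SDK), checks for `None` before taking `len()` and falls back to a null `c_void_p`:

```python
import ctypes

def buffer_to_ptr(buf):
    """Return (pointer, size) for an optional bytearray.

    Checking for None before calling len() avoids a TypeError, and a null
    c_void_p is passed to the native layer when the buffer is absent."""
    if buf is None or len(buf) == 0:
        return ctypes.c_void_p(0), 0
    # from_buffer requires a writable buffer such as bytearray
    c_arr = (ctypes.c_uint8 * len(buf)).from_buffer(buf)
    return ctypes.cast(c_arr, ctypes.c_void_p), len(buf)
```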
@@ -286,7 +286,7 @@ class VideoFrame():
      rotation: int = 0
      render_time_ms: int = 0
      avsync_type: int = 0
-     metadata: str = None
+     metadata: bytearray = None
      shared_context: str = None
      texture_id: int = 0
      matrix: list = None
@@ -441,8 +441,8 @@ class ExternalVideoFrame:
      egl_type: int = 0
      texture_id: int = 0
      matrix: list = field(default_factory=list)
-     metadata: str = ""
-     alpha_buffer: bytearray = field(default_factory=bytearray)
+     metadata: bytearray = None  # changed from str to bytearray to match the C++ version, i.e. to support bytes as metadata
+     alpha_buffer: bytearray = None
      fill_alpha_buffer: int = 0
      alpha_mode: int = 0
 
@@ -75,6 +75,8 @@ class AgoraService:
          if result == 0:
              self.inited = True
          logger.debug(f'Initialization result: {result}')
+         self._is_low_delay = True if config.audio_scenario == AudioScenarioType.AUDIO_SCENARIO_CHORUS else False
+
 
          # to enable plugin
          provider = "agora.builtin"
@@ -87,6 +89,9 @@ class AgoraService:
          agora_parameter = self.get_agora_parameter()
          agora_parameter.set_int("rtc.set_app_type", 18)
 
+         # force audio vad v2 to be enabled
+         agora_parameter.set_parameters("{\"che.audio.label.enable\": true}")
+
          if config.log_path:
              log_size = 512 * 1024
              if config.log_size > 0:
@@ -128,7 +133,7 @@ class AgoraService:
          rtc_conn_handle = agora_rtc_conn_create(self.service_handle, ctypes.byref(RTCConnConfigInner.create(con_config)))
          if rtc_conn_handle is None:
              return None
-         return RTCConnection(rtc_conn_handle)
+         return RTCConnection(rtc_conn_handle, self._is_low_delay)
 
      # createCustomAudioTrackPcm: create a custom audio track from a pcm data sender
      def create_custom_audio_track_pcm(self, audio_pcm_data_sender: AudioPcmDataSender) -> LocalAudioTrack:
@@ -138,7 +143,11 @@ class AgoraService:
          custom_audio_track = agora_service_create_custom_audio_track_pcm(self.service_handle, audio_pcm_data_sender.sender_handle)
          if custom_audio_track is None:
              return None
-         return LocalAudioTrack(custom_audio_track)
+         local_track = LocalAudioTrack(custom_audio_track)
+         # default for the AI scenario: set the minimum send delay to 10 ms
+         if local_track is not None:
+             local_track.set_send_delay_ms(10)
+         return local_track
      # mix_mode: MIX_ENABLED = 0, MIX_DISABLED = 1
 
      def create_custom_audio_track_encoded(self, audio_encoded_frame_sender: AudioEncodedFrameSender, mix_mode: int):
@@ -506,7 +506,7 @@ class LocalUser:
          if ret < 0:
              logger.error("Failed to unsubscribe all video")
          else:
-             self.del_remote_video_map_all(None)
+             self.del_remote_video_map(None)
          return ret
 
      def set_audio_volume_indication_parameters(self, interval_in_ms, smooth, report_vad):
@@ -53,16 +53,19 @@ agora_rtc_conn_renew_token.argtypes = [AGORA_HANDLE, ctypes.c_char_p]
 
 
  class RTCConnection:
-     def __init__(self, conn_handle) -> None:
+     def __init__(self, conn_handle, is_low_delay: bool = False) -> None:
          self.conn_handle = conn_handle
          self.con_observer = None
          self.local_user = None
          self.local_user_handle = agora_rtc_conn_get_local_user(conn_handle)
          if self.local_user_handle:
              self.local_user = LocalUser(self.local_user_handle, self)
+             # added to set low delay mode
+             if is_low_delay == True:
+                 self.local_user.set_audio_scenario(AudioScenarioType.AUDIO_SCENARIO_CHORUS)
          # add to map
-         AgoraHandleInstanceMap().set_local_user_map(self.conn_handle, self)
-         AgoraHandleInstanceMap().set_con_map(self.conn_handle, self)
+         # AgoraHandleInstanceMap().set_local_user_map(self.conn_handle, self)
+         # AgoraHandleInstanceMap().set_con_map(self.conn_handle, self)
 
      #
      def connect(self, token: str, chan_id: str, user_id: str) -> int:
@@ -130,7 +133,7 @@ class RTCConnection:
      def release(self):
          # release local user map
          if self.conn_handle:
-             AgoraHandleInstanceMap().del_local_user_map(self.conn_handle)
+             # AgoraHandleInstanceMap().del_local_user_map(self.conn_handle)
              agora_rtc_conn_release(self.conn_handle)
              self.conn_handle = None
              self.local_user = None
@@ -0,0 +1,133 @@
+ #!env python
+ import threading
+ import time
+ from agora.rtc.audio_pcm_data_sender import PcmAudioFrame
+ from agora.rtc.audio_pcm_data_sender import AudioPcmDataSender
+ import logging
+ import asyncio
+ logger = logging.getLogger(__name__)
+
+
+ """
+ # AudioConsumer
+ # Base class for consuming PCM data and pushing it into an RTC channel.
+ # In AI scenarios:
+ # When TTS returns data: call AudioConsumer::push_pcm_data to push the returned TTS data directly into the AudioConsumer.
+ # In a separate "timer" callback, call AudioConsumer::consume() to push the data to RTC.
+ # Recommendations:
+ # The "timer" can be asyncio-based, a threading.Timer, or combined with an existing business timer; any of these works. Just call AudioConsumer::consume() in the timer callback.
+ # The timer interval can match an existing business timer, or be adjusted to business needs; 40-80 ms is recommended.
+
+ How to use AudioConsumer:
+ 1. Prerequisites:
+    - The application layer must implement its own timer with an interval in [40 ms, 80 ms]. Its callback is referred to below as app::TimerFunc.
+    - One user corresponds to exactly one AudioConsumer object, i.e. one producer's output maps to one consumer.
+ 2. Usage:
+    A. Create an AudioConsumer object for each user_id that produces PCM data, so that one producer's output maps to one consumer.
+    B. When PCM data is generated (for example, returned by TTS), call AudioConsumer::push_pcm_data(data).
+    C. When it is time to consume (usually in app::TimerFunc), call AudioConsumer::consume(), which automatically consumes the data, i.e. pushes it into the RTC channel.
+    D. To interrupt (in AI scenarios, to stop playing the current AI response), call AudioConsumer::clear(), which clears the data currently buffered.
+    E. On exit, call release() to free resources.
+ """
33
+ class AudioConsumer:
34
+ def __init__(self, pcm_sender: AudioPcmDataSender, sample_rate: int, channels: int) -> None:
35
+ self._lock = threading.Lock()
36
+ self._start_time = 0
37
+ self._data = bytearray()
38
+ self._consumed_packages = 0
39
+ self._pcm_sender = pcm_sender
40
+ self._frame = PcmAudioFrame()
41
+ #init sample rate and channels
42
+ self._frame.sample_rate = sample_rate
43
+ self._frame.number_of_channels = channels
44
+
45
+ #audio parame
46
+ self._frame.bytes_per_sample = 2
47
+ self._bytes_per_frame = sample_rate // 100 * channels * 2
48
+ self._samples_per_channel = sample_rate // 100* channels
49
+ #init pcmaudioframe
50
+ self._frame.timestamp = 0
51
+
52
+ self._init = True
53
+
54
+ pass
55
+     def push_pcm_data(self, data) -> None:
+         if not self._init:
+             return
+         # append to the buffer under the lock
+         with self._lock:
+             self._data += data
+
+     def _reset(self):
+         if not self._init:
+             return
+         self._start_time = time.time() * 1000
+         self._consumed_packages = 0
69
+     def consume(self):
+         if not self._init:
+             return
+         now = time.time() * 1000
+         elapsed_time = int(now - self._start_time)
+         expected_total_packages = elapsed_time // 10  # one 10 ms package per 10 ms elapsed
+         besent_packages = expected_total_packages - self._consumed_packages
+         data_len = len(self._data)
+
+         # first run (or after a long gap): wait until at least 18 packages are buffered
+         if besent_packages > 18 and data_len // self._bytes_per_frame < 18:
+             return
+         if besent_packages > 18:  # reset to the start state and push up to 18 packages at once
+             self._reset()
+             besent_packages = min(18, data_len // self._bytes_per_frame)
+             self._consumed_packages = -besent_packages
+
+         # send no more packages than the buffer can cover
+         act_besent_packages = min(besent_packages, data_len // self._bytes_per_frame)
+         if act_besent_packages < 1:
+             return
+
+         # construct an audio frame and push it
+         with self._lock:
+             self._frame.data = self._data[:self._bytes_per_frame * act_besent_packages]
+             self._frame.timestamp = 0
+             self._frame.samples_per_channel = self._samples_per_channel * act_besent_packages
+
+             # drop the consumed bytes from the buffer
+             self._data = self._data[self._bytes_per_frame * act_besent_packages:]
+             self._consumed_packages += act_besent_packages
+
+         self._pcm_sender.send_audio_pcm_data(self._frame)
109
+
110
+     def len(self) -> int:
+         if not self._init:
+             return 0
+         with self._lock:
+             return len(self._data)
+
+     def clear(self):
+         if not self._init:
+             return
+         with self._lock:
+             self._data = bytearray()
+
+     def release(self):
+         if not self._init:
+             return
+
+         self._init = False
+         with self._lock:
+             self._data = None
+             self._frame = None
+             self._pcm_sender = None
+         self._lock = None
132
+
133
+
@@ -0,0 +1,104 @@
1
+ #!/usr/bin/env python
+ import time
+ from datetime import datetime
+ import logging
+ import os
+ from agora.rtc.agora_base import AudioFrame
+ logger = logging.getLogger(__name__)
+
+
+ """
+ ## VadDump helper class
+ """
+ class VadDump():
+     def __init__(self, path: str) -> None:
+         self._file_path = path
+         self._count = 0
+         self._frame_count = 0
+         self._is_open = False
+         self._source_file = None
+         self._label_file = None
+         self._vad_file = None
+         # create the base directory if it does not exist
+         if self._check_directory_exists(path) is False:
+             os.makedirs(path)
+         # create a timestamped subdirectory: <path>/YYYYMMDDHHMMSS
+         now = datetime.now()
+         self._file_path = "%s/%04d%02d%02d%02d%02d%02d" % (path, now.year, now.month, now.day, now.hour, now.minute, now.second)
+         os.makedirs(self._file_path)
34
+     def _check_directory_exists(self, path: str) -> bool:
+         return os.path.exists(path) and os.path.isdir(path)
+
+     def _create_vad_file(self) -> None:
+         self._close_vad_file()
+         # create a new vad segment file
+         vad_file_path = "%s/vad_%d.pcm" % (self._file_path, self._count)
+         self._vad_file = open(vad_file_path, "wb")
+         # increment the segment count
+         self._count += 1
+
+     def _close_vad_file(self) -> None:
+         if self._vad_file:
+             self._vad_file.close()
+             self._vad_file = None
49
+     def open(self) -> int:
+         if self._is_open is True:
+             return 1
+         self._is_open = True
+         # open the source pcm file
+         source_file_path = self._file_path + "/source.pcm"
+         self._source_file = open(source_file_path, "wb")
+         # open the label file
+         label_file_path = self._file_path + "/label.txt"
+         self._label_file = open(label_file_path, "w")
+         # vad segment files are created lazily on the first start-speaking frame
+         return 0
61
+     def write(self, frame: AudioFrame, vad_result_bytes: bytearray, vad_result_state: int) -> None:
+         if self._is_open is False:
+             return
+         # write the raw pcm to the source file
+         if self._source_file:
+             self._source_file.write(frame.buffer)
+         # format the frame's label information and write it to the label file
+         if self._label_file:
+             label_str = "ct:%d fct:%d state:%d far:%d vop:%d rms:%d pitch:%d mup:%d\n" % (self._count, self._frame_count, vad_result_state, frame.far_field_flag, frame.voice_prob, frame.rms, frame.pitch, frame.music_prob)
+             self._label_file.write(label_str)
+         # write the vad result
+         if vad_result_state == 1:  # start speaking: open a new vad segment file
+             self._create_vad_file()
+             if self._vad_file:
+                 self._vad_file.write(vad_result_bytes)
+         if vad_result_state == 2:  # speaking
+             if self._vad_file:
+                 self._vad_file.write(vad_result_bytes)
+         if vad_result_state == 3:  # stop speaking: write the final bytes and close the segment
+             if self._vad_file:
+                 self._vad_file.write(vad_result_bytes)
+             self._close_vad_file()
+         # increment the frame counter
+         self._frame_count += 1
87
+     def close(self) -> None:
+         if self._is_open == False:
+             return
+         self._is_open = False
+         self._close_vad_file()
+         if self._label_file:
+             self._label_file.close()
+             self._label_file = None
+         if self._source_file:
+             self._source_file.close()
+             self._source_file = None
+
+         # reset counters and path
+         self._count = 0
+         self._frame_count = 0
+         self._file_path = None
@@ -0,0 +1,140 @@
1
+ Metadata-Version: 2.1
2
+ Name: agora_python_server_sdk
3
+ Version: 2.1.5
4
+ Summary: A Python SDK for Agora Server
5
+ Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
6
+ Classifier: Intended Audience :: Developers
7
+ Classifier: License :: OSI Approved :: MIT License
8
+ Classifier: Topic :: Multimedia :: Sound/Audio
9
+ Classifier: Topic :: Multimedia :: Video
10
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
11
+ Classifier: Programming Language :: Python :: 3.10
12
+ Classifier: Programming Language :: Python :: 3 :: Only
13
+ Requires-Python: >=3.10
14
+ Description-Content-Type: text/markdown
15
+
16
+ # Note
17
+ - This is a Python SDK wrapper for the Agora RTC SDK.
18
+ - It supports Linux and Mac platforms.
19
+ - The examples are provided as very simple demonstrations and are not recommended for use in production environments.
20
+
21
+ # Very Important Notice !!!
22
+ - A process can only have one instance.
23
+ - An instance can have multiple connections.
24
+ - In all observers or callbacks, you must not call the SDK's own APIs, nor perform CPU-intensive tasks in the callbacks; data copying is allowed.
25
+
26
+ # Required Operating Systems and Python Versions
27
+ - Supported Linux versions:
28
+ - Ubuntu 18.04 LTS and above
29
+ - CentOS 7.0 and above
30
+
31
+ - Supported Mac versions:
32
+ - MacOS 13 and above
33
+
34
+ - Python version:
35
+ - Python 3.10 and above
36
+
37
+ # Using Agora-Python-Server-SDK
38
+ ```
39
+ pip install agora_python_server_sdk
40
+ ```
41
+
42
+ # Running Examples
43
+
44
+ ## Preparing Test Data
45
+ - Download and unzip [test_data.zip](https://download.agora.io/demo/test/test_data_202408221437.zip) to the Agora-Python-Server-SDK directory.
46
+
47
+ ## Executing Test Script
48
+ ```
49
+ python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx --userId=xxx --audioFile=./test_data/demo.pcm --sampleRate=16000 --numOfChannels=1
50
+ ```
51
+
52
+ # Change log
53
+
54
+ ## 2024.12.03 release Version 2.1.5
+ - Modifications:
+ - LocalUser/audioTrack:
+ -- When the scenario is chorus, developers no longer need to call setSendDelayInMs.
+ -- When the scenario is chorus, developers no longer need to set the audio scenario of the track to chorus.
+ -- NOTE: This reduces the burden on developers. In AI scenarios, developers only need to set the service to chorus.
+ - Additions:
+ -- Added the VadDump class, which can assist in troubleshooting vad issues in the testing environment. It should not be enabled in the online environment.
+ -- Added the on_volume_indication callback.
+ -- Added the on_remote_video_track_state_changed callback.
+ - Removals:
+ -- Removed the Vad V1 version, retaining only the V2 version. Refer to voice_detection.py and sample_audio_vad.py.
+ - Updates:
+ -- Updated the relevant samples: audio consumer and vad samples.
68
+
69
+ ## 2024.11.12 release 2.1.4
+ - Changed the type of metadata in videoFrame from str to bytes to be consistent with C++, so it can carry byte streams.
+ - Modified the internal encapsulation of ExternalVideoFrame to support byte streams. For alpha-encoding support, a check was added: if fill_alpha_buffer is 0, the alpha buffer is not processed.
+ ## 2024.11.11 release 2.1.3
+ - Added a new sample, example_jpeg_send.py, which can push JPEG files or JPEG streams to a channel.
+ - Performance overhead, as noted in the example comments: for a 1920x1080 JPEG file, the process from reading the file to converting it to an RGBA bytearray takes approximately 11 milliseconds.
78
+
79
+ ## 2024.11.07 release 2.1.2
+ - Updated `user_id` in the `AudioVolumeInfoInner` and `AudioVolumeInfo` structures to `str` type.
+ - Fixed the bug in the `_on_audio_volume_indication` callback, which previously handled only one speaker per callback instead of speaker_number entries.
+ - Corrected the parameter type in the `IRTCLocalUserObserver::on_audio_volume_indication` callback to `list`.
83
+
84
+ ## 2024.10.29 release 2.1.1
85
+
86
+ Add audio VAD interface of version 2 and corresponding example.
87
+
88
+ ## 2024.10.24 release 2.1.0
89
+
90
+ Fixed some bugs.
91
+
92
+
93
+ ### Common Usage Q&A
94
+ ## The relationship between service and process?
95
+ - A process can only have one service, and the service can only be initialized once.
96
+ - A service can only have one media_node_factory.
97
+ - A service can have multiple connections.
98
+ - Call media_node_factory.release() and service.release() when the process exits.
99
+ ## If using Docker with one user per Docker, when the user starts Docker and logs out, how should Docker be released?
100
+ - In this case, create service/media_node_factory and connection when the process starts.
101
+ - Release service/media_node_factory and connection when the process exits, ensuring that...
102
+ ## If Docker is used to support multiple users and Docker runs for a long time, what should be done?
103
+ - In this case, we recommend using the concept of a connection pool.
104
+ - Create service/media_node_factory and a connection pool (only new connections, without initialization) when the process starts.
105
+ - When a user logs in, get a connection from the connection pool, initialize it, execute con.connect() and set up callbacks, and then join the channel.
106
+ - Handle business operations.
107
+ - When a user logs out, execute con.disconnect() and release the audio/video tracks and observers associated with the connection, but do not call con.release(); then put the connection back into the connection pool.
108
+ - When the process exits, release each connection in the pool (con.release()) and then release service/media_node_factory, to ensure resource release and optimal performance.
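The pooling pattern described above can be sketched generically. The sketch below is illustrative, not SDK code: `factory` stands for any callable that creates an unconnected connection object, and only `release()` is assumed on the connection (connect/disconnect and observer handling stay in the caller, as the steps above describe):

```python
import queue


class ConnectionPool:
    """Minimal pool: connections are created up front and reused across users."""

    def __init__(self, factory, size: int):
        self._pool = queue.Queue()
        for _ in range(size):
            # only new connections here; initialization/con.connect() happens on checkout
            self._pool.put(factory())

    def acquire(self):
        # blocks until a pooled connection is free
        return self._pool.get()

    def release_to_pool(self, con) -> None:
        # caller must already have called con.disconnect() and detached tracks/observers
        self._pool.put(con)

    def close(self) -> None:
        # process exit: con.release() every pooled connection
        while not self._pool.empty():
            self._pool.get().release()
```

On logout the connection goes back into the pool without `con.release()`; only `close()` at process exit actually releases connections.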
109
+
110
+ ## Use of VAD
111
+ # Source code: voice_detection.py
112
+ # Sample code: example_audio_vad.py
113
+ # It is recommended to use the VAD V2 version; the class is AudioVadV2. Reference: voice_detection.py.
114
+ # Use of VAD:
115
+ 1. Call _vad_instance.init(AudioVadConfigV2) to initialize the vad instance. Reference: voice_detection.py. Assume the instance is _vad_instance.
+ 2. In audio_frame_observer::on_playback_audio_frame_before_mixing(audio_frame):
+ 3. Call the vad module's process method: state, bytes = _vad_instance.process(audio_frame)
+ Judge the value of the returned state and handle it accordingly.
+
+ A. If state is _vad_instance._vad_state_startspeaking, the user is "starting to speak": speech recognition (STT/ASR) can be started.
+ B. If state is _vad_instance._vad_state_stopspeaking, the user is "stopping speaking": speech recognition (STT/ASR) can be stopped.
+ C. If state is _vad_instance._vad_state_speaking, the user is "speaking": speech recognition (STT/ASR) can continue.
+ # Note:
+ In all three states, be sure to pass the returned bytes to the recognition module instead of the original audio_frame; otherwise the recognition result will be incorrect.
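As a minimal illustration of the A/B/C state handling, the sketch below accumulates the vad-returned bytes (never the raw audio_frame) into one utterance and hands it to a recognition callback. The state constants mirror the start/speaking/stop convention, but the class and names are illustrative, not part of the SDK:

```python
# Illustrative state constants mirroring the start/speaking/stop convention.
VAD_START_SPEAKING, VAD_SPEAKING, VAD_STOP_SPEAKING = 1, 2, 3


class SpeechSegmenter:
    """Accumulates the vad-returned bytes into one utterance for STT/ASR."""

    def __init__(self, on_utterance):
        self._buf = bytearray()
        self._on_utterance = on_utterance  # e.g. hand-off to an STT/ASR module

    def feed(self, state: int, vad_bytes: bytes) -> None:
        if state == VAD_START_SPEAKING:
            self._buf = bytearray(vad_bytes)       # a new utterance begins
        elif state == VAD_SPEAKING:
            self._buf += vad_bytes                 # keep accumulating
        elif state == VAD_STOP_SPEAKING:
            self._buf += vad_bytes
            self._on_utterance(bytes(self._buf))   # hand the whole utterance to ASR
            self._buf = bytearray()
```

The `feed` method would be called from on_playback_audio_frame_before_mixing with the state and bytes returned by `_vad_instance.process(audio_frame)`.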
126
+ # How to better troubleshoot VAD issues: this covers two aspects, configuration and debugging.
+ 1. Ensure that the initialization parameters of the vad module are correct. Reference: voice_detection.py.
+ 2. In state, bytes = _vad_instance.process(audio_frame) inside on_playback_audio_frame_before_mixing:
+
+ - A. Save the data of audio_frame to a local file; reference: example_audio_pcm_send.py. This records the original audio data; for example, name it source_{time.time()*1000}.pcm.
+ - B. Save the result of each vad processing step:
+
+ - a. When state == start_speaking: create a new binary file, for example named vad_{time.time()*1000}.pcm, and write bytes to it.
+ - b. When state == speaking: write bytes to the file.
+ - c. When state == stop_speaking: write bytes to the file and close it.
+ Note: This way, problems can be troubleshot by comparing the original audio file with the vad-processed audio file. This function should be disabled in the production environment.
137
+ ### How to push the audio generated by TTS into the channel?
138
+ # Source code: audio_consumer.py
139
+ # Sample code: example_audio_consumer.py
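A minimal sketch of the recommended timer loop, assuming a consumer object that exposes `consume()` (as AudioConsumer does); threading.Timer is one of the suggested options, and the helper name is illustrative:

```python
import threading


def start_consume_timer(consumer, interval_s: float = 0.05):
    """Call consumer.consume() roughly every 50 ms (within the recommended 40-80 ms window)."""
    stop = threading.Event()

    def tick():
        if stop.is_set():
            return
        consumer.consume()                         # drains buffered TTS pcm into the channel
        threading.Timer(interval_s, tick).start()  # re-arm the timer

    tick()
    return stop  # call stop.set() before consumer.release() on shutdown
```

With this in place, the TTS callback only needs to call push_pcm_data; pacing into the channel is handled entirely by the timer.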
140
+ ### How to release resources?
@@ -10,7 +10,6 @@ agora/rtc/audio_encoded_frame_sender.py
10
10
  agora/rtc/audio_frame_observer.py
11
11
  agora/rtc/audio_pcm_data_sender.py
12
12
  agora/rtc/audio_sessionctrl.py
13
- agora/rtc/audio_vad.py
14
13
  agora/rtc/local_audio_track.py
15
14
  agora/rtc/local_user.py
16
15
  agora/rtc/local_user_observer.py
@@ -32,6 +31,8 @@ agora/rtc/_ctypes_handle/_rtc_connection_observer.py
32
31
  agora/rtc/_ctypes_handle/_video_encoded_frame_observer.py
33
32
  agora/rtc/_ctypes_handle/_video_frame_observer.py
34
33
  agora/rtc/_utils/globals.py
34
+ agora/rtc/utils/audio_consumer.py
35
+ agora/rtc/utils/vad_dump.py
35
36
  agora_python_server_sdk.egg-info/PKG-INFO
36
37
  agora_python_server_sdk.egg-info/SOURCES.txt
37
38
  agora_python_server_sdk.egg-info/dependency_links.txt
@@ -45,12 +45,12 @@ class CustomInstallCommand(install):
45
45
 
46
46
  setup(
47
47
  name='agora_python_server_sdk',
48
- version='2.1.3',
48
+ version='2.1.5',
49
49
  description='A Python SDK for Agora Server',
50
50
  long_description=open('README.md').read(),
51
51
  long_description_content_type='text/markdown',
52
52
  url='https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK',
53
- packages=["agora.rtc", "agora.rtc._ctypes_handle", "agora.rtc._utils"],
53
+ packages=["agora.rtc", "agora.rtc._ctypes_handle", "agora.rtc._utils","agora.rtc.utils"],
54
54
  classifiers=[
55
55
  "Intended Audience :: Developers",
56
56
  'License :: OSI Approved :: MIT License',
@@ -1,51 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: agora_python_server_sdk
3
- Version: 2.1.3
4
- Summary: A Python SDK for Agora Server
5
- Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
6
- Classifier: Intended Audience :: Developers
7
- Classifier: License :: OSI Approved :: MIT License
8
- Classifier: Topic :: Multimedia :: Sound/Audio
9
- Classifier: Topic :: Multimedia :: Video
10
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
11
- Classifier: Programming Language :: Python :: 3.10
12
- Classifier: Programming Language :: Python :: 3 :: Only
13
- Requires-Python: >=3.10
14
- Description-Content-Type: text/markdown
15
-
16
- # Note
17
- - This is a Python SDK wrapper for the Agora RTC SDK.
18
- - It supports Linux and Mac platforms.
19
- - The examples are provided as very simple demonstrations and are not recommended for use in production environments.
20
-
21
- # Very Important Notice !!!
22
- - A process can only have one instance.
23
- - An instance can have multiple connections.
24
- - In all observers or callbacks, you must not call the SDK's own APIs, nor perform CPU-intensive tasks in the callbacks; data copying is allowed.
25
-
26
- # Required Operating Systems and Python Versions
27
- - Supported Linux versions:
28
- - Ubuntu 18.04 LTS and above
29
- - CentOS 7.0 and above
30
-
31
- - Supported Mac versions:
32
- - MacOS 13 and above
33
-
34
- - Python version:
35
- - Python 3.10 and above
36
-
37
- # Using Agora-Python-Server-SDK
38
- ```
39
- pip install agora_python_server_sdk
40
- ```
41
-
42
- # Running Examples
43
-
44
- ## Preparing Test Data
45
- - Download and unzip [test_data.zip](https://download.agora.io/demo/test/test_data_202408221437.zip) to the Agora-Python-Server-SDK directory.
46
-
47
- ## Executing Test Script
48
- ```
49
- python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx --audioFile=./test_data/demo.pcm --sampleRate=16000 --numOfChannels=1
50
- ```
51
-
@@ -1,36 +0,0 @@
1
- # Note
2
- - This is a Python SDK wrapper for the Agora RTC SDK.
3
- - It supports Linux and Mac platforms.
4
- - The examples are provided as very simple demonstrations and are not recommended for use in production environments.
5
-
6
- # Very Important Notice !!!
7
- - A process can only have one instance.
8
- - An instance can have multiple connections.
9
- - In all observers or callbacks, you must not call the SDK's own APIs, nor perform CPU-intensive tasks in the callbacks; data copying is allowed.
10
-
11
- # Required Operating Systems and Python Versions
12
- - Supported Linux versions:
13
- - Ubuntu 18.04 LTS and above
14
- - CentOS 7.0 and above
15
-
16
- - Supported Mac versions:
17
- - MacOS 13 and above
18
-
19
- - Python version:
20
- - Python 3.10 and above
21
-
22
- # Using Agora-Python-Server-SDK
23
- ```
24
- pip install agora_python_server_sdk
25
- ```
26
-
27
- # Running Examples
28
-
29
- ## Preparing Test Data
30
- - Download and unzip [test_data.zip](https://download.agora.io/demo/test/test_data_202408221437.zip) to the Agora-Python-Server-SDK directory.
31
-
32
- ## Executing Test Script
33
- ```
34
- python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx --audioFile=./test_data/demo.pcm --sampleRate=16000 --numOfChannels=1
35
- ```
36
-
@@ -1,164 +0,0 @@
1
- from . import lib_path
2
- import ctypes
3
- import os
4
- import sys
5
- from enum import Enum, IntEnum
6
- import logging
7
- logger = logging.getLogger(__name__)
8
-
9
-
10
- if sys.platform == 'darwin':
11
- agora_vad_lib_path = os.path.join(lib_path, 'libuap_aed.dylib')
12
- elif sys.platform == 'linux':
13
- agora_vad_lib_path = os.path.join(lib_path, 'libagora_uap_aed.so')
14
- try:
15
- agora_vad_lib = ctypes.CDLL(agora_vad_lib_path)
16
- except OSError as e:
17
- logger.error(f"Error loading the library: {e}")
18
- logger.error(f"Attempted to load from: {agora_vad_lib_path}")
19
- sys.exit(1)
20
-
21
-
22
- class VAD_STATE(ctypes.c_int):
23
- VAD_STATE_NONE_SPEAKING = 0
24
- VAD_STATE_START_SPEAKING = 1
25
- VAD_STATE_SPEAKING = 2
26
- VAD_STATE_STOP_SPEAKING = 3
27
-
28
-
29
- # struct def
30
- """
31
- typedef struct Vad_Config_ {
32
- int fftSz; // fft-size, only support: 128, 256, 512, 1024, default value is 1024
33
- int hopSz; // fft-Hop Size, will be used to check, default value is 160
34
- int anaWindowSz; // fft-window Size, will be used to calc rms, default value is 768
35
- int frqInputAvailableFlag; // whether Aed_InputData will contain external freq. power-sepctra, default value is 0
36
- int useCVersionAIModule; // whether to use the C version of AI submodules, default value is 0
37
- float voiceProbThr; // voice probability threshold 0.0f ~ 1.0f, default value is 0.8
38
- float rmsThr; // rms threshold in dB, default value is -40.0
39
- float jointThr; // joint threshold in dB, default value is 0.0
40
- float aggressive; // aggressive factor, greater value means more aggressive, default value is 5.0
41
- int startRecognizeCount; // start recognize count, buffer size for 10ms 16KHz 16bit 1channel PCM, default value is 10
42
- int stopRecognizeCount; // max recognize count, buffer size for 10ms 16KHz 16bit 1channel PCM, default value is 6
43
- int preStartRecognizeCount; // pre start recognize count, buffer size for 10ms 16KHz 16bit 1channel PCM, default value is 10
44
- float activePercent; // active percent, if over this percent, will be recognized as speaking, default value is 0.6
45
- float inactivePercent; // inactive percent, if below this percent, will be recognized as non-speaking, default value is 0.2
46
- } Vad_Config;
47
- """
48
-
49
-
50
- class VadConfig(ctypes.Structure):
51
- _fields_ = [
52
- ("fftSz", ctypes.c_int),
53
- ("hopSz", ctypes.c_int),
54
- ("anaWindowSz", ctypes.c_int),
55
- ("frqInputAvailableFlag", ctypes.c_int),
56
- ("useCVersionAIModule", ctypes.c_int),
57
- ("voiceProbThr", ctypes.c_float),
58
- ("rmsThr", ctypes.c_float),
59
- ("jointThr", ctypes.c_float),
60
- ("aggressive", ctypes.c_float),
61
- ("startRecognizeCount", ctypes.c_int),
62
- ("stopRecognizeCount", ctypes.c_int),
63
- ("preStartRecognizeCount", ctypes.c_int),
64
- ("activePercent", ctypes.c_float),
65
- ("inactivePercent", ctypes.c_float)
66
- ]
67
-
68
- def __init__(self) -> None:
69
- self.fftSz = 1024
70
- self.hopSz = 160
71
- self.anaWindowSz = 768
72
- self.frqInputAvailableFlag = 0
73
- self.useCVersionAIModule = 0
74
- self.voiceProbThr = 0.8
75
- self.rmsThr = -40.0
76
- self.jointThr = 0.0
77
- self.aggressive = 5.0
78
- self.startRecognizeCount = 10
79
- self.stopRecognizeCount = 6
80
- self.preStartRecognizeCount = 10
81
- self.activePercent = 0.6
82
- self.inactivePercent = 0.2
83
-
84
-
85
- # struct def
86
- class VadAudioData(ctypes.Structure):
87
- _fields_ = [
88
- ("audioData", ctypes.c_void_p),
89
- ("size", ctypes.c_int)
90
- ]
91
- # def __init__(self) -> None:
92
- # self.data = None
93
-
94
-
95
- """
96
- int Agora_UAP_VAD_Create(void** stPtr, const Vad_Config* config);
97
- int Agora_UAP_VAD_Destroy(void** stPtr);
98
- int Agora_UAP_VAD_Proc(void* stPtr, const Vad_AudioData* pIn, Vad_AudioData* pOut, VAD_STATE* state);
99
- """
100
- agora_uap_vad_create = agora_vad_lib.Agora_UAP_VAD_Create
101
- agora_uap_vad_create.restype = ctypes.c_int
102
- agora_uap_vad_create.argtypes = [ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(VadConfig)]
103
-
104
- agora_uap_vad_destroy = agora_vad_lib.Agora_UAP_VAD_Destroy
105
- agora_uap_vad_destroy.restype = ctypes.c_int
106
- agora_uap_vad_destroy.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
107
-
108
- agora_uap_vad_proc = agora_vad_lib.Agora_UAP_VAD_Proc
109
- agora_uap_vad_proc.restype = ctypes.c_int
110
- agora_uap_vad_proc.argtypes = [ctypes.c_void_p, ctypes.POINTER(VadAudioData), ctypes.POINTER(VadAudioData), ctypes.POINTER(VAD_STATE)]
111
-
112
-
113
- class AudioVad:
114
- def __init__(self) -> None:
115
- self.vadCfg = VadConfig()
116
-
117
- self.handler = None
118
- self.lastOutTs = 0
119
- self.initialized = False
120
- # return 0 if success, -1 if failed
121
-
122
- def Create(self, vadCfg):
123
- if self.initialized:
124
- return 0
125
- self.vadCfg = vadCfg
126
- self.initialized = True
127
- # creat handler
128
- self.handler = ctypes.c_void_p()
129
- ret = agora_uap_vad_create(ctypes.byref(self.handler), ctypes.byref(self.vadCfg))
130
- return ret
131
-
132
- # Destroy
133
- # return 0 if success, -1 if failed
134
- def Destroy(self):
135
- if self.initialized:
136
- agora_uap_vad_destroy(ctypes.byref(self.handler))
137
- self.initialized = False
138
- self.handler = None
139
- return 0
140
-
141
- # Proc
142
- # framein: bytearray object, include audio data
143
- # return ret, frameout, flag, ret: 0 if success, -1 if failed; frameout: bytearray object, include audio data; flag: 0 if non-speaking, 1 if speaking
144
- def Proc(self, framein):
145
- ret = -1
146
- if not self.initialized:
147
- return -1
148
-
149
- # supporse vadout is empty,vadin byte array
150
- inVadData = VadAudioData()
151
- buffer = (ctypes.c_ubyte * len(framein)).from_buffer(framein) # only a pointer to the buffer is needed, not a copy
152
- inVadData.audioData = ctypes.cast(buffer, ctypes.c_void_p)
153
- inVadData.size = len(framein)
154
-
155
- outVadData = VadAudioData(None, 0) # c api will allocate memory
156
- vadflag = VAD_STATE(0)
157
- ret = agora_uap_vad_proc(self.handler, ctypes.byref(inVadData), ctypes.byref(outVadData), ctypes.byref(vadflag))
158
-
159
- # convert from c_char to bytearray
160
- bytes_from_c = ctypes.string_at(outVadData.audioData, outVadData.size)
161
- frameout = bytearray(bytes_from_c)
162
- flag = vadflag.value
163
-
164
- return ret, frameout, flag
@@ -1,51 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: agora_python_server_sdk
3
- Version: 2.1.3
4
- Summary: A Python SDK for Agora Server
5
- Home-page: https://github.com/AgoraIO-Extensions/Agora-Python-Server-SDK
6
- Classifier: Intended Audience :: Developers
7
- Classifier: License :: OSI Approved :: MIT License
8
- Classifier: Topic :: Multimedia :: Sound/Audio
9
- Classifier: Topic :: Multimedia :: Video
10
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
11
- Classifier: Programming Language :: Python :: 3.10
12
- Classifier: Programming Language :: Python :: 3 :: Only
13
- Requires-Python: >=3.10
14
- Description-Content-Type: text/markdown
15
-
16
- # Note
17
- - This is a Python SDK wrapper for the Agora RTC SDK.
18
- - It supports Linux and Mac platforms.
19
- - The examples are provided as very simple demonstrations and are not recommended for use in production environments.
20
-
21
- # Very Important Notice !!!
22
- - A process can only have one instance.
23
- - An instance can have multiple connections.
24
- - In all observers or callbacks, you must not call the SDK's own APIs, nor perform CPU-intensive tasks in the callbacks; data copying is allowed.
25
-
26
- # Required Operating Systems and Python Versions
27
- - Supported Linux versions:
28
- - Ubuntu 18.04 LTS and above
29
- - CentOS 7.0 and above
30
-
31
- - Supported Mac versions:
32
- - MacOS 13 and above
33
-
34
- - Python version:
35
- - Python 3.10 and above
36
-
37
- # Using Agora-Python-Server-SDK
38
- ```
39
- pip install agora_python_server_sdk
40
- ```
41
-
42
- # Running Examples
43
-
44
- ## Preparing Test Data
45
- - Download and unzip [test_data.zip](https://download.agora.io/demo/test/test_data_202408221437.zip) to the Agora-Python-Server-SDK directory.
46
-
47
- ## Executing Test Script
48
- ```
49
- python agora_rtc/examples/example_audio_pcm_send.py --appId=xxx --channelId=xxx --audioFile=./test_data/demo.pcm --sampleRate=16000 --numOfChannels=1
50
- ```
51
-