Avatar Definition and Acquisition
Lightweight Avatar: An avatar that is rendered on the CPU in cloud rendering scenarios. The name "lightweight" distinguishes it from GPU rendering modes. For its detailed definition and custom generation, see 2D End Avatar Customization and Download Process.
List of SKUs Required for the Scenario
SKU | Required or Optional | Quantity | Remarks |
Custom Quota (choose one of the two options: Avatar Exclusive or Avatar Generic) | Required | 1 | |
Lightweight Cloud Service Hourly Package | Required | 1 | |
Custom Avatar Renewal Service (Monthly or Yearly) | Optional | 1 | |
API Call Solution
The invocation process is the same as before. It centers on two operations: stream creation (Step 1) and audio driving (Step 3). The remaining steps cover playback, closing the session, and optional keep-alive.
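The order of operations in this section (create the stream, open the session, drive with audio, close the session) can be sketched as a small state guard that rejects out-of-order calls. This is only an illustration: the class name `SessionLifecycle` and the state names are hypothetical and are not values returned by any API.

```javascript
// Hypothetical lifecycle guard. State names are illustrative only and do not
// correspond to fields or values of the real APIs.
const TRANSITIONS = {
  none: ['created'],              // Step 1: stream creation
  created: ['open', 'closed'],    // the session must be opened before driving
  open: ['driving', 'closed'],    // Step 3: audio driving
  driving: ['driving', 'closed'], // driving may repeat until the session closes
  closed: [],                     // Step 4: no further calls are valid
};

class SessionLifecycle {
  constructor() {
    this.state = 'none';
  }

  // Move to the next state, or throw if the order of calls is wrong.
  advance(next) {
    const allowed = TRANSITIONS[this.state] || [];
    if (!allowed.includes(next)) {
      throw new Error(`Cannot go from "${this.state}" to "${next}"`);
    }
    this.state = next;
    return this;
  }
}
```

A caller would invoke `advance` right before each real API call, so that, for example, an attempt to drive a session that has not been opened fails locally instead of on the server.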
Step 1: Stream Creation - Outputting a Real-Time Stream (RTMP, TRTC, WebRTC)
3. After the session is created, refer to Enabling a Session to open the session. Once the session is open, you can drive the digital human through the driver API.
Step 2: Parsing and Playing Video Streams
1. If you select RTMP, you can use third-party players that support the RTMP streaming protocol, such as VLC, for playback. The stream URL is generated from the result of the first step, stream creation.
2. If you select WebRTC, you can use the WebRTC Player SDK for playback. The stream URL is generated from the result of the first step, stream creation.
3. When you select TRTC, you need to integrate the TRTC SDK to parse the video. Parse the field information according to the format in the figure below and pass it to the TRTC SDK. Then, run the TRTC SDK as required to play the video.
Notes:
In the TRTC protocol scenario, if the avatar supports a transparent background, you can achieve the effect in two steps:
1. In the TRTC SDK, change the configuration from `trtc = TRTC.create();` to `trtc = TRTC.create({ enableSEI: true });`.
2. In Step 1 (stream creation), add the AlphaChannelEnable parameter and set it to True.
Step 3: Audio-Driven
Notes:
The lightweight avatar currently supports only the audio-driven mode. If you need the text-driven mode, refer to the End-Rendering Driver API and add a text-to-audio conversion in the pre-processing stage of the audio-driven flow.
3. This is an optional integration. You can refer to Heartbeat Instructions to ensure that the persistent connection does not time out and disconnect.
Step 4: Closing a Session
Consumption continues for as long as the video stream stays connected. When your business scenario no longer needs the video stream, promptly call the Close Session API to terminate the stream and close the session. By default, a session with no driving activity for 10 minutes is closed automatically. If you are unsure how many streams are currently active or have forgotten the sessionid of a stream, you can query it by calling any one of the three APIs for retrieving the session list.
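One way to avoid paying for forgotten streams is to list the sessions periodically and close the idle ones explicitly instead of waiting for the automatic close. The following is a minimal sketch under stated assumptions: the record shape `{ sessionId, lastDrivenAt }`, the one-minute default threshold, and the injected `listSessions` and `closeSession` functions are all hypothetical stand-ins for the real session list and Close Session APIs.

```javascript
// Hypothetical record shape: { sessionId: string, lastDrivenAt: number },
// with timestamps in epoch milliseconds. The real session list APIs may
// return different fields.
function findIdleSessions(sessions, nowMs, idleLimitMs = 60 * 1000) {
  // idleLimitMs is an arbitrary example threshold, not a documented value.
  return sessions
    .filter((s) => nowMs - s.lastDrivenAt >= idleLimitMs)
    .map((s) => s.sessionId);
}

// listSessions and closeSession are caller-supplied wrappers around the real
// APIs; their names and signatures are assumptions of this sketch.
async function closeIdleSessions(listSessions, closeSession, nowMs = Date.now()) {
  const sessions = await listSessions();
  const idleIds = findIdleSessions(sessions, nowMs);
  await Promise.all(idleIds.map((id) => closeSession(id)));
  return idleIds; // IDs of the sessions that were closed
}
```

Keeping the selection logic in a pure function makes the threshold easy to test without calling any remote service.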
Step 5: Continuous Keep-Alive (Optional)
By default, the video stream is disconnected automatically after 10 minutes without driving activity. To keep the video stream active for a longer time, you need to call the heartbeat instruction.
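A keep-alive loop could look like the sketch below. It assumes that sending the heartbeat at half of the 10-minute idle timeout is frequent enough; that ratio is a choice made for this example, not a documented requirement. The `sendHeartbeat` function is a hypothetical caller-supplied wrapper around the heartbeat instruction.

```javascript
const IDLE_TIMEOUT_MS = 10 * 60 * 1000; // default idle timeout stated above

// sendHeartbeat is a caller-supplied function (hypothetical) that issues the
// heartbeat instruction over the existing connection.
function createKeepAlive(sendHeartbeat, options = {}) {
  const idleTimeoutMs = options.idleTimeoutMs ?? IDLE_TIMEOUT_MS;
  const safetyFactor = options.safetyFactor ?? 0.5; // assumption: half the timeout
  const intervalMs = Math.floor(idleTimeoutMs * safetyFactor);
  let timer = null;

  return {
    intervalMs,
    start() {
      if (timer !== null) return; // already running
      timer = setInterval(() => {
        // Log failures instead of letting them crash the timer loop.
        Promise.resolve()
          .then(sendHeartbeat)
          .catch((err) => console.error('heartbeat failed:', err));
      }, intervalMs);
    },
    stop() {
      if (timer === null) return;
      clearInterval(timer);
      timer = null;
    },
    isRunning() {
      return timer !== null;
    },
  };
}
```

Call `stop()` before closing the session, so that the heartbeat does not keep an unneeded stream alive and continue consumption.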
Best Practices Demo
Notes:
This demo supports both GPU-based and CPU-based avatars. Because the CPU-based avatar (lightweight version) supports only the audio-driven mode, an interactive session started with a lightweight avatar imported into this demo can use only the audio-driven mode; the text-driven mode is not available.
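The driving constraint can be expressed as a small routing decision: text input for a CPU-rendered avatar must first pass through a text-to-audio step, while audio input goes straight to audio driving. The sketch below is illustrative only. The function name `planDrive`, the mode strings, and the step labels are hypothetical, and it assumes that GPU-based avatars accept text driving directly.

```javascript
// Returns the ordered processing steps for one driving request.
// renderMode: 'cpu' (lightweight avatar) or 'gpu'
// inputType:  'audio' or 'text'
// All string values are labels invented for this sketch.
function planDrive(renderMode, inputType) {
  if (inputType !== 'audio' && inputType !== 'text') {
    throw new Error(`Unsupported input type: ${inputType}`);
  }
  if (renderMode === 'cpu') {
    // Lightweight avatars are audio-driven only, so text is converted first.
    return inputType === 'text'
      ? ['text-to-audio', 'audio-drive']
      : ['audio-drive'];
  }
  if (renderMode === 'gpu') {
    // Assumption: GPU-based avatars support both driving modes directly.
    return inputType === 'text' ? ['text-drive'] : ['audio-drive'];
  }
  throw new Error(`Unsupported render mode: ${renderMode}`);
}
```

For example, `planDrive('cpu', 'text')` yields a two-step plan, which is where a text-to-audio service of your choice would be plugged in before the audio driver call.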