China issues first national standard for virtual digital humans

China's first national standard for virtual digital humans provides unified technical requirements and evaluation criteria for the research, production, and application of such entities in customer service, CGTN reports.


Customer service is one of the most significant application areas of digital human technology, and customer service virtual digital humans are now widely used across multiple sectors, including finance, government affairs and education.

Issued on October 5, the standard, titled "Information technology – General technical requirements for customer service virtual digital human," establishes a comprehensive technical specification system.

It defines a reference framework for customer service virtual digital human systems, covering modules such as avatar generation and visual, speech, and emotional interaction, and sets clear requirements for digital humans of different types and application scenarios.

Regarding avatar generation, the standard stipulates that 2D digital human avatars must provide complete and clear facial feature details, while 3D hyper-realistic digital human models must have at least 200,000 polygons to ensure fine geometric detail.

For interactive functions, it requires a digital human to support multi-modal interaction, including voice, gesture, and body movement, and to possess operational maintenance capabilities, such as keyword maintenance and corpus updates, to ensure continuous service optimization.

The standard specifies a lip-sync accuracy rate of no less than 90 percent, ensuring precise synchronization between the digital human's speech and lip movements.

The average success rates for gesture interaction and body movement interaction are each set at no less than 90 percent, making body language communication more natural.

It also sets an emotional interaction success rate of no less than 80 percent, requiring the digital human to accurately recognize user emotions such as joy, sadness, and anxiety, and provide appropriate feedback through methods like expression generation and emotional speech synthesis.

The standard also requires a speech interaction response time within two seconds and a semantic understanding accuracy rate of no less than 85 percent, supporting more empathetic and context-aware interactions.
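As an illustration only, the numeric thresholds reported above could be collected into a simple compliance check. The sketch below is not part of the standard or its testing methods, which are still under development; the function name, dictionary keys, and sample measurements are all hypothetical, and only the threshold values come from the report.

```python
# Minimum rates as reported in the article; the key names are hypothetical.
MIN_RATES = {
    "lip_sync_accuracy": 0.90,
    "gesture_success": 0.90,
    "body_movement_success": 0.90,
    "emotional_success": 0.80,
    "semantic_accuracy": 0.85,
}
MAX_SPEECH_RESPONSE_SECONDS = 2.0  # "within two seconds"
MIN_3D_POLYGONS = 200_000          # reported for 3D hyper-realistic models only


def failed_requirements(measured: dict) -> list:
    """Return the names of reported thresholds that the measurements miss."""
    failures = [
        name for name, floor in MIN_RATES.items() if measured[name] < floor
    ]
    if measured["speech_response_seconds"] > MAX_SPEECH_RESPONSE_SECONDS:
        failures.append("speech_response_seconds")
    # The polygon count is checked only when it is supplied (3D models).
    if "polygon_count" in measured and measured["polygon_count"] < MIN_3D_POLYGONS:
        failures.append("polygon_count")
    return failures


# Made-up measurements for a hypothetical 3D product.
sample = {
    "lip_sync_accuracy": 0.93,
    "gesture_success": 0.91,
    "body_movement_success": 0.90,
    "emotional_success": 0.78,
    "semantic_accuracy": 0.88,
    "speech_response_seconds": 1.6,
    "polygon_count": 250_000,
}
print(failed_requirements(sample))  # ['emotional_success']
```

In this made-up case, only the emotional interaction rate falls short of the reported 80 percent floor.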

The standard is applicable to the upgrade and transformation of existing 2D and 3D digital human products.

It also leaves room for the integrated application of new technologies, such as AI-generated content, allowing enterprises to make flexible choices based on their application scenarios.

Meanwhile, supporting testing methods are under development, which will provide enterprises with a unified testing benchmark to help them quickly identify issues and optimize their products.

Earlier, it was reported that China had unveiled the Earth System Forecasting Strategy for 2025–2035.
