The article introduces Alibaba Cloud Confidential AI, focusing on its use of confidential computing to protect sensitive AI model data and ensure secure cloud operations. As AI technology in the cloud develops rapidly and enterprises accelerate its integration into their products, protecting sensitive data and models in the cloud becomes a crucial challenge. Compromised user data and models may jeopardize intellectual property rights, undermine the competitive advantages and revenue of enterprises, and erode customer trust. When these happen, brand reputation and customer relationships may be significantly damaged.
To address these tough and urgent issues in AI model and data security, Alibaba Cloud, a council member of the OpenAnolis Open Source Community, proposes a solution derived from the operating system layer of artificial intelligence infrastructure (AI Infra). The Community provides security solutions based on confidential computing and an out-of-the-box software runtime stack.
In the AI Infra core technology session at the Apsara Conference 2024, ZHANG Jia, the owner of the Confidential Computing Special Interest Group (SIG) in the Community and a senior technical expert of the Alibaba Cloud Intelligence Group, shared his thoughts in a speech titled Confidential AI Best Practices. The following sections summarize the details.
Alibaba Cloud Confidential AI integrates confidential computing technology into a model service platform and adopts an innovative approach to ensure that sensitive data and models are processed inside a fully isolated and encrypted in-memory environment, known as a trusted execution environment (TEE). This way, Confidential AI provides a secure, trusted, and end-to-end universal framework and method to protect model data across the data lifecycle from the perspective of system-level security. It fills the gap in system-level security capabilities in the AI security field, delivers the utmost security for AI data and models, and provides full protection for AI systems.
The core technology behind Confidential AI is confidential computing, which is by nature a hardware-based security approach to establishing user trust. Where does user trust originate?
Cloud service providers use traditional virtualization-based security isolation technologies to prevent direct attacks on cloud platforms and subsequent attacks, through lateral movement, on the data and models of cloud platform tenants. These technologies satisfy the security requirements of cloud service providers, but not those of the tenants who provide private data and models. In the AI era, data and models are highly valuable and sensitive assets. Tenants realize the importance of data privacy protection, and because they do not have direct control over cloud platforms, they want to run their AI-related workloads in a TEE that acts like a remote safe. A TEE ensures that models and data stay inside a secure enclave and are not leaked to insecure locations, including the cloud platform itself. This helps reduce tenants' trust costs and security dependencies on cloud platforms.
The following section describes common data security issues in foundation model scenarios:
• Model data leaks: The security risks of the system in which foundation models are deployed may lead to leaks of highly confidential and sensitive training data, such as personal privacy data and enterprise data, and high-value model parameter information.
• Platform trust: Trust issues occur when the owners of models and cloud platforms are different. A model owner may not trust the cloud platforms provided by cloud service providers.
• Security in model sharing: If a model owner adopts traditional security technologies, the owner cannot effectively separate the ownership and use rights of models and data. As a result, the owner cannot grant other entities the permissions to use the models and data, control how data is used, maximize the values and utilization of data, or boost cross-organization cooperation and sharing.
• Security in user privacy: If a user enters prompts that contain confidential data of an enterprise, the data may be leaked to AI-generated content (AIGC) applications.
To address the preceding issues, Alibaba Cloud adopts Confidential AI and software-hardware integration-based confidential computing to provide end-to-end encryption and protection for model data from the perspective of system-level security. This significantly reduces the risk of sensitive data and model leaks.
Confidential AI can seamlessly adapt to mainstream AI inference frameworks without requiring users to change their internal code. It also allows users to deploy secure, trusted, and end-to-end frameworks for model deployment, training, and inference. Confidential AI also supports OpenAnolis Attestation Service (OAAS), a remote attestation service for confidential computing, to attest that execution environments, software, and data have not been tampered with.
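Conceptually, a remote attestation flow works like this: the TEE measures its firmware and workload, signs the measurement with a hardware-rooted key, and a verification service compares the measurement against known-good reference values before issuing a trust token. The sketch below is a simplified illustration of that flow, not the actual OAAS API; the HMAC key, function names, and token format are all hypothetical stand-ins.

```python
import hashlib
import hmac
import json

# Hypothetical shared HMAC key standing in for the hardware root of trust.
# Real TEEs sign evidence with a hardware-fused key and a vendor cert chain.
HW_KEY = b"simulated-hardware-root-of-trust"

def generate_evidence(firmware: bytes, workload: bytes) -> dict:
    """Inside the TEE: measure the software stack and sign the digest."""
    measurement = hashlib.sha256(firmware + workload).hexdigest()
    signature = hmac.new(HW_KEY, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "signature": signature}

def verify_evidence(evidence: dict, reference_values: set) -> dict:
    """Attestation service: check the signature, then the measurement."""
    expected = hmac.new(HW_KEY, evidence["measurement"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, evidence["signature"]):
        return {"trusted": False, "reason": "bad signature"}
    if evidence["measurement"] not in reference_values:
        return {"trusted": False, "reason": "unknown measurement"}
    # A real service would return a signed token (e.g. a JWT) here.
    return {"trusted": True, "token": json.dumps({"tee": "ok"})}

firmware, workload = b"fw-v1", b"model-server-v2"
refs = {hashlib.sha256(firmware + workload).hexdigest()}
result = verify_evidence(generate_evidence(firmware, workload), refs)
print(result["trusted"])  # True: environment matches reference values
```

The key point is that trust is anchored in the hardware signature, not in the cloud platform: if the workload is tampered with, its measurement no longer matches any reference value and verification fails.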
Confidential AI uses heterogeneous TEE technology to ensure that the CPU execution environment is secure and trusted, verify the GPU TEE, and establish end-to-end secure channels with the GPU TEE over a physical bus. This ensures data confidentiality and integrity between CPUs and GPUs.
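An end-to-end secure channel between the CPU TEE and the GPU TEE needs session keys on both sides. After mutual attestation, a common pattern is to derive per-direction keys with an HKDF-style construction (RFC 5869). The sketch below is illustrative only: it assumes a shared secret already established by an attested key exchange (a hypothetical prior step) and is not Alibaba Cloud's actual protocol.

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): condense input keying material into a PRK."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch the PRK into `length` output bytes."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical shared secret from an attested CPU<->GPU key exchange.
shared_secret = b"secret-from-attested-key-exchange"
prk = hkdf_extract(salt=b"cpu-gpu-channel", ikm=shared_secret)

# Separate keys for each direction, bound to the channel context via `info`.
k_cpu_to_gpu = hkdf_expand(prk, b"cpu->gpu", 32)
k_gpu_to_cpu = hkdf_expand(prk, b"gpu->cpu", 32)
assert k_cpu_to_gpu != k_gpu_to_cpu  # distinct per-direction keys
```

Deriving distinct keys per direction from one secret is a standard design choice: it prevents reflection attacks and lets either side rotate traffic keys without renegotiating the underlying secret.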
A heterogeneous server instance of Alibaba Cloud can carry both the workloads of standard virtual machines (VMs) and GPUs and the workloads of confidential VMs (CVMs) and GPU TEEs, with both types of workloads running on the same server. This helps satisfy the demands of customers who require different security levels.
Alibaba Cloud and other leading Internet companies co-founded Confidential Containers (CoCo), a project accepted into the Cloud Native Computing Foundation (CNCF) at the Sandbox maturity level. For more than two years, Alibaba Cloud has continuously devoted time and effort to the project, as reflected in its international practices and contributions. Alibaba Cloud has two Technical Committers (TCs) and three core sub-project maintainers in the CoCo community and is among the top two contributors to the community. The components and features for remote attestation and container image security, to which Alibaba Cloud is a main contributor, are included in the official release of CoCo.
Alibaba Cloud Confidential AI uses the core components that Alibaba Cloud developed and contributed to the CoCo community, showing the successful commercialization and use case of open source technologies in the CoCo community and helping elevate the CoCo project to the Incubation maturity level in CNCF.
The Cloud-native Confidential Computing SIG is committed to cooperating with other open source projects related to confidential computing. The SIG helps these projects adapt to Anolis OS faster and better, and provides users with an out-of-the-box confidential computing software stack. The SIG concentrates on its core projects to develop open source technology stacks for cloud-native confidential computing, simplify confidential computing adoption, and advance confidential computing technologies in cloud-native environments. The SIG has built full-stack security capabilities for confidential computing from the bottom up.
The following section describes some of the major achievements of the SIG:
• RATS-RS: a cross-TEE attested Transport Layer Security (TLS) library that is written in Rust (a memory-safe programming language) and developed on top of the Remote Attestation Procedures (RATS) architecture and model.
• Shelter: a TEE-based trusted sandboxing tool for running applications.
• SGXDataCenterAttestationPrimitives: Intel's Data Center Attestation Primitives (DCAP) library for SGX remote attestation.
• OAAS: a remote attestation service of the OpenAnolis Open Source Community.
OAAS supports mainstream CPU TEEs. Support for GPU TEE attestation will be rolled out in the future.
• Confidential AI Community Version: an open source version of the commercialized Alibaba Cloud Confidential AI. It supports TEEs of Chinese vendors as well as GPU TEEs, and is expected to be released by the end of the year.
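The central idea behind attested TLS, as in RATS-RS, is to bind attestation evidence to the transport channel, typically by embedding a hash of the channel's public key in the evidence, so the verifier knows that the attested TEE is the same party it is actually talking to. The following is a minimal sketch of that binding check only (in Python rather than Rust, with hypothetical names), not the RATS-RS API or a full TLS handshake.

```python
import hashlib

def make_evidence(measurement: str, channel_pubkey: bytes) -> dict:
    """Inside the TEE: include a hash of the TLS public key in the
    attestation evidence, binding the evidence to this channel."""
    return {
        "measurement": measurement,
        "pubkey_hash": hashlib.sha256(channel_pubkey).hexdigest(),
    }

def channel_is_bound(evidence: dict, observed_pubkey: bytes) -> bool:
    """Verifier side: the peer's observed TLS key must match the attested one."""
    return evidence["pubkey_hash"] == hashlib.sha256(observed_pubkey).hexdigest()

pubkey = b"-----BEGIN PUBLIC KEY----- (example) -----END PUBLIC KEY-----"
ev = make_evidence("sha256:abc123", pubkey)
print(channel_is_bound(ev, pubkey))        # True: same key, bound channel
print(channel_is_bound(ev, b"other-key"))  # False: possible man-in-the-middle
```

Without this binding, an attacker could relay genuine evidence from a real TEE while terminating the TLS connection itself; the pubkey hash check is what closes that gap.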
Alibaba Cloud Confidential AI is adopted by a number of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) services of Alibaba Cloud. IaaS services, such as Elastic GPU Service and Container Compute Service (ACS), allow users to customize, control, and deploy AI frameworks and applications in confidential computing-based CPU or GPU TEEs. PaaS services, such as Elastic Algorithm Service (EAS) of Platform for AI (PAI), a confidential computing and inference service, allow users to deploy it with a few clicks and use its features out of the box.
Conclusion
Alibaba Cloud's commercialization of Confidential AI, together with its open source Community Version, provides a significant reference implementation for enterprises and public institutions that want to adopt AI technology while preventing AI-related security risks at the system level. This helps enterprises and institutions avoid the adverse impact of impaired user trust on brand reputation and customer relationships. In addition, the adoption of Confidential AI can effectively boost the promotion and adoption of confidential computing technology in AI scenarios, making confidential computing an important solution for the large-scale deployment of machine learning workloads.
Join the DingTalk group of the Cloud-native Confidential Computing SIG of the OpenAnolis Open Source Community (group ID: 42822683) and share your ideas.