Developed jointly by Huawei and Zhejiang University, DeepSeek-R1-Safe stands out for its approach specifically designed to avoid policy violations. According to Reuters, this new version, produced in China, builds on the existing DeepSeek R1 model and prioritizes declining to engage with sensitive topics. The project progressed independently of the original DeepSeek team. The model was retrained on 1,000 Huawei Ascend AI chips, with only about a 1% performance drop compared to the original.
Although it appears robust from a safety perspective, the model is not error-free. According to Huawei, while it blocks nearly 100% of attempts to elicit restricted content under normal usage, that rate can drop to as low as 40% when users probe it with indirect phrasing, role-playing, or creative scenarios. This suggests the model remains vulnerable to jailbreak-style prompts framed as hypothetical situations.
DeepSeek-R1-Safe was designed to comply with China's current legal regulations, which expect AI models in the country to reflect approved values and observe restrictions on expression. Baidu's chatbot Ernie, which declines to answer questions about domestic politics or the Communist Party, is cited as an example. Similar adaptations are underway globally: in Saudi Arabia, bots such as Humain are being developed to suit Arabic and Islamic culture, while in the US, OpenAI's ChatGPT is noted as tending toward a Western perspective. Under the "America's AI Action Plan" launched by the Trump administration, AI systems working with the government are required to take an impartial, unbiased approach; in practice, this means models are expected to avoid radical rhetoric, racial theories, and topics related to gender issues.