The new artificial intelligence (AI) models from Chinese startup DeepSeek are causing a stir in the technology industry. The company claims that these models match or exceed the leading US models at a fraction of the cost.
The company garnered global attention when it disclosed that the cost of training its DeepSeek-V3 model using Nvidia H800 processors was less than $6 million.
DeepSeek’s AI Assistant, which is powered by DeepSeek-V3, has now surpassed ChatGPT to become the most popular free app on Apple’s App Store in the United States.
This has raised doubts about the rationale for the billions of dollars that US technology companies are investing in AI. As a result, shares of major companies, such as Nvidia, have fallen.
What is DeepSeek and why is it generating so much excitement?
DeepSeek is an AI platform that offers sophisticated artificial intelligence models, notably for tasks such as data analysis, natural language processing, and a broad range of machine learning applications.
Its models, including DeepSeek-V3 and DeepSeek-R1, are engineered to be highly efficient, cost-effective, and capable of performing intricate tasks with precision.
In late 2022, the release of OpenAI’s ChatGPT set off a race among Chinese technology companies to create their own AI-powered chatbots. However, when Baidu, the search engine giant, introduced the first Chinese equivalent of ChatGPT, the gap in AI capabilities between Chinese and American companies caused widespread disappointment in China.
DeepSeek has changed this narrative with its models, DeepSeek-V3 and DeepSeek-R1, which have drawn praise from Silicon Valley executives and engineers at US tech companies. The Chinese startup asserts that these models are comparable to the most sophisticated offerings of OpenAI and Meta.
They are also significantly cheaper. According to a post on DeepSeek’s official WeChat account, the newly released DeepSeek-R1 is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task.
Who is Liang Wenfeng, the man responsible for the development of DeepSeek?
According to Chinese corporate documents, Liang Wenfeng, the co-founder of the quantitative hedge fund High-Flyer, is the owner of DeepSeek, a Hangzhou-based startup.
Liang’s fund announced on its official WeChat account in March 2023 that it would move beyond trading to establish a “new and independent research group” focused on Artificial General Intelligence (AGI). DeepSeek was established later that year as part of this new direction.
AGI is defined by OpenAI, the developer of ChatGPT, as autonomous systems that surpass humans at most economically valuable work.
How much High-Flyer has invested in DeepSeek remains unclear. According to corporate records, High-Flyer owns patents related to the chip clusters used to train AI models, and the two organizations share office space. The fund’s AI division disclosed on WeChat in July 2022 that it manages a cluster of 10,000 A100 chips.
How does Beijing view DeepSeek?
DeepSeek’s success has drawn the attention of China’s top political figures. Xinhua, the state news agency, reported that Liang Wenfeng, the founder of DeepSeek, attended a closed-door symposium for business leaders and experts hosted by Chinese Premier Li Qiang on January 20, the same day DeepSeek-R1 was released.
Liang’s participation in the event suggests that DeepSeek’s accomplishments may align with Beijing’s objective of overcoming US export restrictions and attaining self-sufficiency in critical sectors such as AI.
Robin Li, the CEO of Baidu, attended a similar symposium the previous year.