DeepSeek: Meet the 'Nervous' 40-Year-Old Financier Behind the AI Phenomenon

DeepSeek Caught AI Specialists, Investors, and Governments off Guard. But Who Is Behind This Revolution?

The FastForward News Team

28 Jan 2025

A small blue whale emerged on Wall Street, shaking up the market value of Silicon Valley giants. DeepSeek, a groundbreaking phenomenon, caught AI specialists, investors, and governments off guard. But who is behind this revolution?

The mastermind is not a tech prodigy or a typical Silicon Valley nerd. Instead, Liang Wenfeng, a successful financier-turned-entrepreneur, entered the tech world and disrupted everything. His name is one we will hear much more about in the future.

From Hedge Funds to AI

Liang, a 40-year-old graduate of Zhejiang University, started his career in finance before founding his own hedge fund, High-Flyer, in 2015. The fund became highly successful, allowing Liang to focus on long-term research and development in AI without the pressure of external investors.

In 2021, Liang began purchasing thousands of Nvidia chips for an AI project, long before the Biden administration restricted the export of advanced semiconductors to China. His vision seemed eccentric at the time, and many dismissed him as a quirky billionaire with an unusual hobby.

One of his business partners told the Financial Times: “When we first met him, he was this nervous guy with wild hair talking about creating a 10,000-chip cluster to train his own models. We didn’t take him seriously. He simply said, ‘I want to build this, and it will be a game-changer.’ We thought only giants like ByteDance or Alibaba could pull it off.”

Despite initial skepticism, Liang turned his passion project, DeepSeek, into a serious venture. Focused on research and development, he recruited top talent from Chinese universities, offering salaries on par with those at ByteDance and other tech giants. This strategy helped DeepSeek attract skilled researchers, even if they lacked extensive AI experience.

Game-Changing Models

DeepSeek's journey began with the release of DeepSeek Coder in November 2023, an open-source model designed for coding tasks. It was followed by DeepSeek LLM, a 67B-parameter language model aiming to rival similar systems.

The company's breakthrough came in May 2024 with DeepSeek-V2, which gained widespread acclaim for its performance and low cost. The release disrupted China's AI market, forcing companies like ByteDance, Tencent, and Alibaba to slash prices to stay competitive.

Later, DeepSeek-Coder-V2, a 236B-parameter model, launched with advanced coding capabilities and an economical API, priced at $0.14 per million input tokens and $0.28 per million output tokens.
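To put those prices in perspective, here is a minimal sketch of the arithmetic in Python. The rates are the ones quoted above; the function name and the sample workload are hypothetical, chosen only for illustration, and real bills may differ depending on the provider's current pricing.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 0.14
OUTPUT_PRICE_PER_MILLION = 0.28


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the quoted rates."""
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"${cost:.2f}")
```

At these rates, a workload of that size would cost well under a dollar.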

In January 2025, DeepSeek released DeepSeek-R1, a reasoning-focused model designed to compete with OpenAI's o1. Remarkably, the company reported spending only about $5.6 million on computing power to train the underlying base model, DeepSeek-V3, significantly less than the hundreds of millions spent by U.S. tech giants.

Liang's team trained DeepSeek-R1 on Nvidia H800 chips, which are less powerful than Nvidia's Blackwell GPUs, showcasing their efficient approach to AI development.

DeepSeek's innovative strategies and cost-efficient models have already established it as a global AI leader, further solidifying China's position on the AI map.
