DeepSeek R1 (Product Review)

MAA1
3 min readJan 29, 2025


By now, the Internet is all about DeepSeek and what makes its LLM different from other LLMs like OpenAI's ChatGPT or Anthropic's Claude. What stands out to me is DeepSeek's thinking and reasoning output. As a simple example of DeepSeek's ability to think in real-time, I asked it to tell me how to build a chat app with a React front-end:

Image Credit: screenshot

Note that the first time I asked DeepSeek this question, I got a "Sorry, DeepSeek search service is busy. Please disable search or try again later." error. This was a useful cue to select "DeepThink (R1)" only and not use DeepSeek's web search capability.

Image Credit: screenshot

The above snippet from DeepSeek's response is a good example of its real-time thinking and reasoning capabilities. Not too dissimilar from ChatGPT, Claude and others is DeepSeek's code generation capability:

Image Credit: screenshot
Image Credit: screenshot
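The generated code itself isn't reproduced here, but the kind of output such a prompt typically yields can be sketched as the message-state logic a React front-end would wire into its UI. This is my own illustration of the pattern, not DeepSeek's actual output; all names and shapes are assumptions:

```typescript
// Minimal chat message state that a React component could drive via
// useReducer. Names and shapes are illustrative, not DeepSeek's output.
type Message = { id: number; author: "user" | "assistant"; text: string };

type ChatAction =
  | { type: "send"; text: string }      // user submits a message
  | { type: "receive"; text: string };  // assistant reply arrives

function chatReducer(state: Message[], action: ChatAction): Message[] {
  const nextId = state.length + 1;
  switch (action.type) {
    case "send":
      return [...state, { id: nextId, author: "user", text: action.text }];
    case "receive":
      return [...state, { id: nextId, author: "assistant", text: action.text }];
  }
}
```

In a React component, this reducer would back a `useReducer` hook, with a form dispatching `send` actions and a WebSocket or fetch handler dispatching `receive` actions.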

When I ask DeepSeek a follow-up question, the transparency of the reasoning the LLM used to arrive at its answer is impressive:

I believe there are several reasons that explain and justify the current hype surrounding DeepSeek:

  1. It's open source. This means that product companies can switch between providers and focus their efforts on the product and its customer experience. To illustrate the open source element of DeepSeek: since its release only a few days ago, more than 500 derivative models of DeepSeek have been created and the model has been downloaded more than 2.5m times (via Clement Delangue).
  2. Race to the bottom for foundational models. DeepSeek claims that creating DeepSeek R1 required limited investment, both in computing power and in the cost of tokens (the units of text a model processes, which API providers bill per million). DeepSeek's $0.55 and $2.19 per million input and output tokens respectively is significantly cheaper than ChatGPT's $15 and $60 respectively. Worth checking out the thoughts of Suno founder Mikey Shulman on this topic, as well as those of venture capital titan Marc Andreessen.
  3. Increasing focus on inference, i.e. using the model rather than training it. If it becomes easier and cheaper to train LLMs, the application of these models will take centre stage: running the model on new data, and an increased focus on the overall product experience.
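The pricing gap in point 2 is easy to quantify. A quick sketch, using the per-million-token prices quoted above (the workload sizes are my own illustrative assumption):

```typescript
// Cost in USD for a given token count at a per-million-token price.
function tokenCost(tokens: number, pricePerMillion: number): number {
  return (tokens / 1_000_000) * pricePerMillion;
}

// Hypothetical workload: 2M input tokens + 1M output tokens.
const deepseek = tokenCost(2_000_000, 0.55) + tokenCost(1_000_000, 2.19);
const openai = tokenCost(2_000_000, 15) + tokenCost(1_000_000, 60);
// deepseek ≈ $3.29 vs openai ≈ $90 — roughly a 27x price difference.
```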

Main learning: Crazy times. The pace of innovation is incredible, and the prospect of significant efficiencies is very exciting. I will definitely need more time to play with DeepSeek R1 and to think more about the implications (and risks) of its overnight success.


Written by MAA1

Product person, author of "My Product Management Toolkit" and “Managing Product = Managing Tension” — see https://bit.ly/3gH2dOD.
