DeepSeek is an interesting platform. We explore its history, the technology, whether and how you should use it, and what it means for AI. Beyond the feature we’ve got insights about the rise of Big Potato, AI in Washington, what chatbot to use, and how humans process information at 10 bits per second.
The Distilled Spirit
How Business is Done
🥔 The Rise of Big Potato (The Bittman Project)
As in a number of other American industries, there has been significant consolidation in the potato business. Four companies stand accused of forming a potato cartel, conspiring to keep the prices of frozen spuds high for American consumers.
🏛 OpenAI’s Growing Washington Office (MIT Technology Review)
OpenAI is playing the Washington game. Its lobbying spending rose significantly in 2024, and it hired an in-house lobbyist along the way. Spending appears set to rise further as a new regime moves in with a different set of interests.
🐕 DOGEMaxing (Second Best)
The specter of DOGE is stalking DC as I write this. DOGE is now organized under the banner of the US Digital Service — the outfit Obama created to fix Healthcare.gov. Samuel Hammond explains how it is charged with modernizing US software systems. We could see web3 come to government services soon.
Non-DeepSeek AI Thoughts
💯 Dr. Mollick’s Recommendations for AI Right Now (One Useful Thing)
Ethan Mollick shares his state-of-AI report for January 2025, comparing capabilities across the major paid general-purpose AI apps. This is worth the click for his feature-comparison chart alone. He walks through features like live modes, web access, and image generation across the platforms. His recommendation is not too different from our holiday recommendations: ChatGPT is still the best general choice, Gemini looks interesting, and Claude has a few particular strengths worth mentioning.
🔍 AI is a Proficient Historian
At one point in my life I thought I would become a historian. Had I followed that path, it looks like I might have had to worry about AI replacing me sooner rather than later. ChatGPT-4o is competent at rendering early modern text readable and has shed much of its hallucination problem. For more advanced work, OpenAI o1 does a much better job of interpreting complicated texts than its predecessors; moreover, it can come up with a historical interpretation of the work. It will be an interesting few years.
About That Mark I Cranium
Peter Gray illustrates how labeling problematic gaming and social media use the same way we label substance use disorders is counterproductive to treatment. Substance addiction works through entirely different brain mechanisms, the labels end up stereotyping people, and the framing may misconstrue causes and effects. Treating internet overuse as a time management issue might be more successful.
👁 Living at 10 Bits per Second (arxiv.org)
I try not to recommend academic papers here, as they are typically unreadable. This one is an exception: it is readable by a normal, well-educated human, and it is fascinating. It covers a subject we have touched on once already: how the human nervous system’s information throughput is limited to roughly 10 bits per second. The paper investigates why this is so, identifies some of the causes, and flags the places where further research is needed. It is a remarkable window into how people process information.
DeepSeek Distilled
DeepSeek is a Chinese AI company building impressive and cost-efficient models. Its name popped into headlines over the last few days thanks to its low-cost AI engine. It is low cost because DeepSeek has taken a different approach to training than OpenAI or Anthropic, and it appears to be paying off. Liang Wenfeng, the founder of DeepSeek, also founded a successful Chinese hedge fund, High-Flyer. That experience gave him significant financial resources and a strategic mindset, and he has funneled High-Flyer’s profits and GPU access into DeepSeek, enabling a focus on long-term research and attracting top-tier talent. The lab is unusual for being staffed entirely within China, composed of young and ambitious professionals. This environment, combined with Liang’s management style, has produced innovative and powerful AI models. This interview is a must-read.
It is very clear that DeepSeek has built some more efficient software, but I would be careful about jumping to the conclusions many in the West did on Monday. The $5.8m figure is real but likely covers only the marginal cost of the final training run. A lot of the doom and gloom about the AI bubble bursting is probably premature; there are still plenty of GPUs behind this product. Still, its existence has Silicon Valley squirming: it runs much cheaper than their platforms. Jevons paradox could be striking again, and the Chinese are in the lead.
The New and Exciting DeepSeek Models
DeepSeek gained attention in May 2024 with the release of DeepSeek-V2, a GPT-3.5-level model that was significantly cheaper to operate than competitors, costing approximately one-third as much per million tokens processed. Its key innovation was a Mixture of Experts (MoE) system, which activates only the necessary parts of the neural network for a given task, drastically improving efficiency.
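To make the routing idea concrete, here is a minimal top-k MoE sketch in PyTorch. It is a toy illustration under my own assumptions (the TinyMoE class, its sizes, and the simple linear router are invented for this example), not DeepSeek's actual architecture, which adds shared experts, load-balancing objectives, and other refinements.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy top-k routed mixture-of-experts feed-forward layer."""

    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores every expert for each token.
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, dim)
        scores, picks = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(scores, dim=-1)
        out = torch.zeros_like(x)
        # Only each token's chosen experts run; the rest stay idle.
        # That sparsity is where the compute savings come from.
        for i, expert in enumerate(self.experts):
            rows, slots = (picks == i).nonzero(as_tuple=True)
            if rows.numel():
                out[rows] += weights[rows, slots].unsqueeze(-1) * expert(x[rows])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

With 8 experts and top-2 routing, only a quarter of the feed-forward parameters touch any given token, which is where the per-token cost savings come from.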
In late 2024, DeepSeek released V3, a model trained on 14.8 trillion tokens that came close to matching the best in class, such as GPT-4o and Claude Sonnet. Remarkably, V3 was trained for under $6 million, about a tenth of GPT-4's cost, thanks to the efficiency of the MoE system. While slightly behind cutting-edge reasoning models like OpenAI o1, it marked a significant advance: roughly GPT-4o-class performance at a fraction of the price.
On Inauguration Day 2025, DeepSeek released DeepSeek R1, a powerful reasoning model that benchmarks near OpenAI o1 on language comprehension and problem-solving suites such as ARC and MMLU. Built for approximately $12 million, including V3's training and fine-tuning costs, R1 cost a fraction of OpenAI o1's reported $100 million training expense. In addition to the full 671-billion-parameter version, DeepSeek offers distilled versions for smaller hardware; you can run one on a home gaming PC. Demonstrations are impressive.
How to Play with It Yourself
DeepSeek provides its tools online and through mobile apps for iOS and Android, plus a Chrome extension for direct integration while browsing. As of this writing you cannot create new accounts on the platform due to “malicious activity,” but that restriction should lift soon.
Caveat: The tools are currently free, but users should note that DeepSeek's terms of service allow the company to retain and use data from interactions. As the data resides in China, caution is advised when sharing sensitive or identifiable information.
For privacy-conscious users, DeepSeek can be run locally with Ollama. Setup instructions available on Reddit make the process straightforward.
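If you prefer scripting to the command line, a minimal sketch using the official ollama Python client (pip install ollama) looks like the following. The deepseek-r1:7b tag is my assumption based on Ollama's model library at the time of writing; check the instructions above for current model names.

```python
# Minimal local-inference sketch with the `ollama` Python client.
# Assumes the Ollama server is installed and running, and that a
# distilled R1 build is published under the "deepseek-r1" tag.
import ollama

response = ollama.chat(
    model="deepseek-r1:7b",  # a ~7B distillation; pick a size your hardware can hold
    messages=[
        {"role": "user", "content": "Explain Jevons paradox in two sentences."}
    ],
)

# R1-style models emit their chain of thought inside <think> tags
# before the final answer.
print(response["message"]["content"])
```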
I would still avoid it for anything sensitive, but the public API pricing showcases how cost-effective (or deeply pocketed) the venture is. You can get DeepSeek R1 output for $2.19 per million output tokens; OpenAI o1-mini costs $12 per million output tokens; full-blown OpenAI o1 costs $60 by the same measure. OpenAI o1 is certainly a bit better in practice and much safer, but is it five to twenty-seven times better and safer?
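To put those prices in perspective, here is a back-of-envelope comparison. The workload size is hypothetical; the per-token prices are simply the ones quoted above and will surely drift.

```python
# Back-of-envelope cost comparison using the per-million-output-token
# prices quoted above (a snapshot; prices change frequently).
prices_usd_per_million = {
    "DeepSeek R1": 2.19,
    "OpenAI o1-mini": 12.00,
    "OpenAI o1": 60.00,
}

monthly_output_tokens = 50_000_000  # hypothetical workload

for model, price in prices_usd_per_million.items():
    cost = price * monthly_output_tokens / 1_000_000
    print(f"{model:>15}: ${cost:>9,.2f}/month")

# Output: DeepSeek R1 $109.50/mo, o1-mini $600.00/mo, o1 $3,000.00/mo,
# i.e. roughly 5.5x and 27x the cost of R1, respectively.
```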
Good Enough, Fast Enough, and Definitely Cheaper
DeepSeek's apps are limited compared to larger competitors. They are not multimodal, lack advanced real-time camera capabilities, and do not support voice interaction. OpenAI and Anthropic remain ahead in these areas. However, DeepSeek's core LLM is sufficiently powerful, especially for self-hosted scenarios, where it offers significant value.
DeepSeek's efficiency presents a challenge for the OpenAIs and Anthropics of the world. Its innovative use of the Mixture of Experts system allows for significant cost reductions, making it possible to train high-performing models at a fraction of the expense incurred by competitors. Additionally, DeepSeek's ability to achieve near-competitive results on far smaller budgets forces larger labs to reassess their own expensive and resource-intensive approaches. For organizations like OpenAI, which invest hundreds of millions in training, the diminishing lead could be concerning. As the gap narrows, questions arise: Is "good enough" combined with lower costs a better strategy? Could DeepSeek's approach represent a beachhead in the evolving AI market?
Bibliography
This report draws from the following sources:
- : Their DeepSeek coverage is excellent, particularly DeepSeek: What the Headlines Miss and the interview with Liang Wenfeng.
- : In-depth analyses of DeepSeek V3 and DeepSeek R1, tied together with Panic at the App Store.
- Ben Thompson has an excellent DeepSeek FAQ.
The Look
Charts #61: Substack is the new media. They are even having a live-streamed financial summit.

Did you enjoy reading this post? Hit the ♥ button above or below because it helps more people discover great Substacks like this one and it helps train your algorithm to get you more posts you like. Please share here or in your networks to help us grow!