The new model from DeepSeek, DeepSeek-R1, has been making waves over the last week thanks to its competitive performance against OpenAI’s o1 while being open source. It also shares its chain of thought transparently.
I took it for a spin in LM Studio, which lets you run the model locally, and created an overview podcast that covers the model’s capabilities and its pros and cons (NotebookLM talking about DeepSeek!). Check it out below:
Thanks go to Zvi for the great overview.
In case you are curious about how the model introduces itself, here’s a chat excerpt, including the chain of thought:
Thought:
Alright, the user asked about what I can do and who built me. Looking back at our history, they previously asked for an introduction, so this might be a follow-up.
I need to clearly outline my capabilities without being too technical. They probably want a straightforward answer.
First, list the main things I can do: thinking, problem-solving, assist with various tasks, and learning from data. Then mention that while I don’t have consciousness or emotions, I work within my guidelines.
Next, explain who built me—DeepSeek, an AI research company. Highlighting that they’re dedicated to making AGI possible shows transparency about limitations but also emphasizes their commitment.
Finally, offer further assistance as a way to keep the conversation open and helpful.
Response:
I’m DeepSeek-R1, an artificial intelligence assistant created by DeepSeek, a Chinese company dedicated to making Artificial General Intelligence (AGI) feasible. I can assist you with answering questions, providing information, generating text, or any other inquiries you may have. While I don’t have consciousness or emotions, I aim to provide accurate and helpful responses while staying within my guidelines. Let me know how I can help!
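If you want to separate the thought from the response yourself, here is a minimal Python sketch. It assumes the raw model output wraps its reasoning in `<think>...</think>` tags, which is how R1 output commonly appears when run locally; the exact format may differ depending on the runner and model version. The function name `split_thought` and the sample string are made up for illustration.

```python
import re
from typing import Tuple

# Assumption: the reasoning is wrapped in <think>...</think> tags.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thought(raw: str) -> Tuple[str, str]:
    """Return (thought, response) from raw model output.

    If no <think> block is found, the thought is an empty string
    and the whole text is treated as the response.
    """
    match = THINK_PATTERN.search(raw)
    if match is None:
        return "", raw.strip()
    thought = match.group(1).strip()
    # Everything outside the first <think> block counts as the response.
    response = (raw[:match.start()] + raw[match.end():]).strip()
    return thought, response


# Hypothetical sample, not actual model output.
sample = "<think>The user wants an introduction.</think>\nHello, how can I help?"
thought, response = split_thought(sample)
print("Thought:", thought)
print("Response:", response)
```

The same function works on text copied out of a chat window or returned by a local server, as long as the tags are present.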
