DeepSeek’s Popular AI App Is Explicitly Sending US Data to China

1 Like

This article comes across as a bit FUD-y, since you can run DeepSeek locally and entirely offline.

I don’t see how its data collection is any different from ChatGPT’s, apart from the obvious fact that its servers are based in China, which will set off alarm bells for some. However, the fact that it’s open source and can be locally hosted actually makes it a strong option for privacy advocates, IMO.

6 Likes

Yeah - you’re seeing a lot of different views and opinions on this tool, which wiped tens of billions off the AI market cap at large.

2 Likes

Haven’t they just had a massive cyberattack against them today too?

Pretty sure US big tech did that - has to be the logical opposition who would do it, no?

1 Like

I saw the news and immediately came here. Has the code been audited yet? I really want to use this, but I’m not downloading anything (or its forks) from China to my PC until I see that the code has been audited by a third party. Even running locally, I can never be sure they didn’t add something that phones home when I’m connected, if no one has taken the time to check.

1 Like

You can download the model from HuggingFace.

Which you can use with any number of AI programs like:

5 Likes

How many users are going to do that or are even capable of doing that on their hardware? I’d guess a really small percentage.

DeepSeek Privacy Policy

I am not familiar with other privacy policies or legalese, but collecting keystroke patterns and rhythms seems a bit extensive for my taste…

At the end of the day, you’re sending your queries to a third-party server. None of these AI chat services are good for privacy, because you basically just have to trust that they’re not logging your queries.

DeepSeek isn’t specifically bad when compared to all the other AI companies like OpenAI, Meta AI, Gemini, etc.

Your data isn’t safe with any of these companies, whether they’re based in the US or China; the only solution is to run the model locally. Otherwise you shouldn’t use AI at all, because it’s a privacy nightmare.

5 Likes

Depending on what you’re doing with AI, I don’t think the advice needs to be this absolute, especially because we can also use privacy-respecting AI services like Brave’s Leo and Duck.ai.

What is really the difference between AI chat services and traditional search engines in this regard? Don’t they have the same issue, in that you’re essentially trusting these third-party services with your data? At least with something like Brave’s Leo or Duck.ai, there is a promise that the data isn’t stored or used to train AI models, which is similar to the claims both Brave and DuckDuckGo make about their traditional search engines (although DuckDuckGo stores your searches).

So, I would argue that when it comes to privacy, these AI services are very similar to traditional search engines. While going completely local will always give you the best privacy, it isn’t absolutely needed if you value your privacy, and you can always use something like Tor to access the more privacy-respecting AI services if you want.

1 Like

I guess with a search engine it’s hard to self-host a search index, so you usually have to rely on third-party services like DuckDuckGo to get relevant results.

AI models don’t have this issue: they can be run completely offline, without any network connections. They’re already trained on a fixed dataset, so they don’t need to be constantly up to date like a search engine, which has to index new sites continuously.

DuckDuckGo’s AI and Brave’s Leo both probably have much better privacy policies than the other big tech AI chat services, but if there are very accessible private alternatives, then I don’t really see much of a reason to use them.

At least on iOS and macOS there’s Enclave AI, which makes this technology very accessible on low-powered devices like phones and laptops.

I haven’t looked into Android options, tbh. I’m not really a big fan of AI overall, but I think we should be strongly pushing people towards offline solutions if they have to use AI for any reason.

2 Likes

If anyone here wants to try the full-size model but their hardware is not beefy enough for it, they can try these quantized versions:

https://unsloth.ai/blog/deepseekr1-dynamic

We provide 4 dynamic quantized versions. The first three use an importance matrix to calibrate the quantization process (imatrix via llama.cpp) to allow lower-bit representations. The last, 212 GB version is a general 2-bit quant with no calibration done.

| MoE Bits | Disk Size | Quality | Link |
|----------|-----------|---------|------|
| 1.58-bit | 131 GB | Fair | DeepSeek-R1-UD-IQ1_S |
| 1.73-bit | 158 GB | Good | DeepSeek-R1-UD-IQ1_M |
| 2.22-bit | 183 GB | Better | DeepSeek-R1-UD-IQ2_XXS |
| 2.51-bit | 212 GB | Best | DeepSeek-R1-UD-Q2_K_XL |
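As a rough illustration of what those bit counts mean, here’s a toy symmetric block-wise quantizer in pure Python. This is purely illustrative: the real GGUF schemes (IQ1_S, Q2_K, etc.) and the imatrix calibration described above are far more sophisticated.

```python
def quantize_blockwise(values, bits=2, block=4):
    """Toy symmetric block-wise quantization: each block of weights is
    stored as tiny signed integers plus one float scale per block.
    Illustration only -- real GGUF quants are much cleverer."""
    qmax = 2 ** (bits - 1) - 1  # largest representable magnitude (1 for 2-bit)
    dequantized = []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        scale = max(abs(v) for v in chunk) / qmax or 1.0
        ints = [max(-qmax, min(qmax, round(v / scale))) for v in chunk]
        # On disk you'd keep `ints` + `scale`; here we dequantize right away
        dequantized.extend(q * scale for q in ints)
    return dequantized

weights = [0.8, -0.3, 0.05, -0.9, 0.4, 0.1, -0.2, 0.35]
approx = quantize_blockwise(weights, bits=2)
max_err = max(abs(a - w) for a, w in zip(approx, weights))
```

The point of the exercise: every weight collapses onto a handful of levels per block, which is why disk size shrinks so dramatically and why quality degrades gracefully rather than catastrophically.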

You can view our full R1 collection of GGUFs, including 4-bit, distilled versions & more: huggingface.co/collections/unsloth/deepseek-r1

https://www.reddit.com/r/selfhosted/comments/1ic8zil/yes_you_can_run_deepseekr1_locally_on_your_device/

3 Likes

I can’t sign up to DeepSeek with a cock.li address :frowning:

Relevant info:

1 Like

Any risk running locally?

It doesn’t make any connections; it’s just a model that you download and use with an AI program, so I would say the risk is very low. Still, if you’re uncomfortable with DeepSeek, you can try other models like Gemma or Llama.

Like others have said, probably avoid their cloud-based AI, because it has privacy issues.

2 Likes

If using Little Snitch or Portmaster, you can always block the connections to be absolutely sure.
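For a quick in-process sanity check, you can do something similar from Python by stubbing out socket creation so any connection attempt raises. This is a crude sketch of the same idea, not a substitute for an OS-level firewall like the ones mentioned, which is the robust approach:

```python
import socket

def forbid_network():
    """Replace socket creation so any outbound connection attempt in
    this process raises immediately. A crude in-process stand-in for
    a firewall rule -- illustration only."""
    def _blocked(*args, **kwargs):
        raise RuntimeError("network access blocked for this process")
    socket.socket = _blocked  # type: ignore[assignment]

forbid_network()
try:
    # Any library code trying to phone home would hit the same wall.
    socket.create_connection(("127.0.0.1", 9), timeout=1)
    reached = True
except RuntimeError:
    reached = False
print(reached)  # False
```

A firewall is still stronger, since this only covers code running inside the same Python process.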

7 Likes

Yeah, I know. Maybe I’m paranoid, but my fear was more that there could be hidden code that sends your data off that way. But maybe that’s impossible; I don’t know enough, honestly.

1 Like

The official models on HuggingFace by DeepSeek are all packaged in the Safetensors format, which is designed to avoid exactly this kind of problem.
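For the curious: a safetensors file is just a small JSON header plus raw tensor bytes, so there is no code path to execute on load (unlike pickle-based checkpoints). A minimal sketch of parsing that header with the Python standard library, using a hand-built toy blob rather than a real checkpoint:

```python
import json
import struct

def read_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header of a .safetensors blob.

    Layout: 8-byte little-endian length N, then N bytes of JSON
    describing tensor names/dtypes/shapes/offsets, then raw tensor
    data. Pure data -- nothing executes when the file is loaded."""
    (n,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + n])

# Build a tiny in-memory example: one 2x2 float32 tensor of zeros.
header = {"w": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
hbytes = json.dumps(header).encode()
blob = struct.pack("<Q", len(hbytes)) + hbytes + b"\x00" * 16

meta = read_safetensors_header(blob)
print(meta["w"]["shape"])  # [2, 2]
```

That is the whole trick: a loader only ever reads metadata and copies bytes, so a malicious model file can’t smuggle in a phone-home routine the way a pickled checkpoint could.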

Also, from reading online, it seems that the offline models are not censored the way their hosted version is. I can’t vouch for that personally, since I’ve only tried a few queries in the hosted version with a temporary email.

1 Like

The model R1 is really good. If I were a regular knowledge worker, I wouldn’t avoid it just because “China” as it provides enough value (I don’t see it any differently than using Microsoft / Google / Apple / Amazon software or using Lenovo / Xiaomi / BBK hardware).

1 Like