-
I agree that it is very probable that we aren't testing the curve yet. But I still don't see it as an obvious conclusion.
Originally Posted by omphalopsychos
can OpenAI expand their data 10-fold while keeping quality high? Yeah, this was one of my points. It seems to remain unclear.
-
01-30-2025 09:13 PM
-
I am not a computer engineer (my father was), but I've heard from engineers that the code for the DeepSeek model is not open source; only its operational code is. In layman's terms, they tell you how it runs but not how it works, so the model itself is still a black box.
Originally Posted by omphalopsychos
And share it with the Chinese Communist Party? Their association with the DeepSeek company's personnel has been confirmed.
Originally Posted by Tal_175
-
I don't necessarily "disagree" in a binary fashion. Neither of us has perfect knowledge. I think it's most likely that in a few months or sooner we'll see another major press release that will counterbalance the current DeepSeek narrative. I'm not saying this from any political interest, just based on my knowledge of the space.
-
The inference code (DeepSeek-V3/inference/model.py in the deepseek-ai/DeepSeek-V3 repository on GitHub) actually tells us a lot about how the model works; it's not a total black box. The model.py file gives insight into key architectural details like Mixture-of-Experts (MoE), Multi-Head Latent Attention (MLA), rotary positional embeddings, RMSNorm, and quantization methods. This tells us how the model processes inputs, structures computations, and optimizes inference efficiency.
What's missing is how it was trained: we don't have details on the dataset, optimizer, loss function, or hyperparameter tuning. But from the inference code, you can still infer a lot about how the model is structured and what design choices were made. So while we can't see how DeepSeek was trained, we can see exactly how it runs and what components define its architecture.
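To make two of those terms a bit more concrete, here is a minimal sketch of RMSNorm and top-k expert routing in plain Python. This is not DeepSeek's code: the function names, the softmax gate, and the toy "experts" are my own assumptions for illustration, and the real model.py will differ in the details.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root-mean-square, then apply a per-element learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights to sum to 1."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_layer(x, experts, gate_scores, k=2):
    """Weighted sum of the outputs of the selected experts only (the rest are skipped)."""
    routing = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, w in routing.items():
        y = experts[idx](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

# Toy usage: four hypothetical "experts" that just scale the input by 1, 2, 3, 4.
toy_experts = [lambda x, s=s: [v * s for v in x] for s in (1, 2, 3, 4)]
toy_output = moe_layer([1.0, 1.0], toy_experts, gate_scores=[0.1, 2.0, 1.0, -1.0], k=1)
```

The point of the routing step is that only k experts run per input, which is why a model can have a very large parameter count while keeping inference comparatively cheap.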
-
There may not be any code that converts raw data into a DeepSeek model. As I understand it, generating a model is not a fully automated, one-step process. In this case the "code" is the recipe and the ideas discussed in the technical report. I am not an expert in LLMs. I do have PhD-level training in AI and I work as a researcher/software engineer. But my direct experience with neural networks is limited to building a weather forecast system for a senior course project 20 years ago. The network had like a dozen nodes, lol.
Originally Posted by Mick-7
-
Here is an article explaining the DeepSeek training process. I assume it is based on the technical report:
How DeepSeek-R1 Was Built; For dummies
-
What have you seen to make you think they will do anything but pocket the extra productivity generated by AI? Goods and services will be cheaper to provide, but the costs will go up for the end user. Profit over progress has been the corporate model for decades. AI isn’t going to change that.
Originally Posted by omphalopsychos
You don’t get 3,000 billionaires by sharing increased productivity.
-
Idk history doesn’t really support the idea that technological progress just consolidates power without also creating new opportunities. The Industrial Revolution is a good example—yes, factory owners captured a lot of the gains, but industrialization also created an entirely new class of workers by giving landless peasants access to steady wages as machine operators. As subsistence farming declined and local trade became less viable, factory work provided economic stability at scale in a way that didn’t exist before. AI could do something similar—not by creating assembly lines, but by lowering the skill barrier for certain types of knowledge work, making economic participation possible for more people, even if the transition is disruptive.
You can’t assume one outcome is inevitable. Every major technological shift plays out as a struggle between competing forces—corporate interests, labor, government intervention, and shifts in consumer demand. Technology might centralize power in some areas, but it will also create new industries and new forms of work that aren’t obvious yet, just like industrialization did.
-
I like that perspective, hopeful.
-
If DeepSeek were as "open source" as some here suggest, its release would not have spooked the markets and the AI industry; they could have quickly figured it out. There'd be no fuss about it, just another over-hyped start-up. On top of that, Chinese companies often present opaque (or outright fake) business and financial profiles, which is a big reason why experienced stock traders I know won't touch Chinese company stocks.
The Italian government just blocked access to DeepSeek in their country:
Italy blocks access to the Chinese AI application DeepSeek to protect users' data | AP News
DeepSeek issued a statement saying: "European legislation does not apply to them." Good luck with that argument.
-
AI vs Human...
Found DeepSeek useful for finding information on converting USB 5V to center negative 9V for guitar pedals. It may prove useful for other small things in the future.
What the tool could not tell me is that it made more sense to simply adapt to available pedalboard batteries. So I've procured the $30 Horse brand battery/pedal board power supply mentioned earlier in the thread. Build quality is better than expected though we'll find out about things like isolation and shielding over time.
To adapt I kept my pedals to 100mA or less, with a single pedal pulling 300mA. This will get me to reverb, looper, preamp emulation, and a robust selection of cabinet IRs. Can't use the Strymons but that's OK. EV HoF and Joyo Cab Box are certainly good enough. So is a simple looper.
How long will this setup go on battery power? We'll see. In theory more hours than I need.
-
i think the better question is whether this progress will lead to less or more inequality.
Originally Posted by omphalopsychos
and if 150 years ago this new working class did so well, we wouldn't have had the german and russian revolutions, unions, and communism, now would we? the "steady wages" for 6 days of 14-16 hours declined fast as the supply of labor by far outnumbered demand. by the end of the 19th century wages in germany were back to the level of 1820. Berlin was a hell-hole for the working class.
Social question - Wikipedia
-
it is open source and they have figured it out. it is still bad news for oligarchs like elon musk who just bought 100k nvidia chips
Originally Posted by Mick-7

it applies to them as much as chinese regulation applies to jazzguitar.be
Originally Posted by Mick-7
-
If DeepSeek trades in European markets, it is subject to European law.
-
Anyone care to weigh in on OpenAI's claim that DeepSeek may be a distillation of ChatGPT? How would that work? For one thing, the number of parameters is around 0.7 trillion, I believe, whereas ChatGPT's is I think around 2 trillion, so not a huge difference. Also, my (very limited) understanding of distillation is that it involves pruning nodes that have the least significance from the 'teaching' network. How could this be done if OpenAI's topology and weights are unavailable?
(Just wanted to point out that, not only is DeepSeek supposedly much cheaper to train, but also cheaper to make inferences, too.)
I see the article that Tal linked explains how the training was done, without reference to ChatGPT. So what would prompt OpenAI to claim otherwise? (Aside from self-interest, of course.) Also, at the end of the article, the author states: "We thought model scaling hit a wall, but this approach is unlocking new possibilities". Which implies those power-law graphs from 2020 were no longer applicable, right?
-
Yikes!
-
I think Geoffrey Hinton is conflating evolution of behaviour and evolution of intelligence a bit there. Our instincts for survival, reproduction, protection of our offspring, etc. have nothing to do with our intelligence. Those instincts give us motivations and goals. They are byproducts of evolutionary processes of biological organisms in particular environmental conditions. Intelligence gives us a way to achieve those goals but not the goals themselves. It's not obvious at all that if you could remove sexual drive and desire to reproduce from an intelligent human, they would develop that desire by reasoning (hence all the nasty competition he was referring to that results from such goals). They would probably be better off without it. It's not clear at all that if you could remove the instinct for survival from all humans, they would conclude by reasoning that they are better off alive than not. Intelligence alone does not explain human behaviour; a series of random mutations and natural selection does.
Last edited by Tal_175; 01-31-2025 at 10:32 AM.
-
I feel like there's not actually enough info to say whether DeepSeek-R1 is a distillation of GPT models because DeepSeek hasn’t released their training code or detailed how they actually trained it. Distillation usually means training a smaller model on the outputs of a larger one, but since OpenAI hasn’t released ChatGPT’s weights or architecture, DeepSeek wouldn’t have been able to do it in the traditional sense. That said, OpenAI has hinted that DeepSeek might have used its models in ways that violate their terms of service, which could mean some kind of training on GPT-generated data—but without more transparency from DeepSeek, it’s just speculation.
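For anyone wondering what "training on the outputs of a larger one" looks like in practice, here is a rough sketch of the two flavours. This is a generic textbook-style illustration, not anything DeepSeek or OpenAI has published; the function names and the temperature value are my own assumptions. The first loss needs the teacher's full probability distribution, which you only get with access to the model internals or logits. The second only needs the text the teacher produced, which is the kind of thing you could collect through an API.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Requires the teacher's logits, so it is not possible with a closed model
    that only returns text.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_label_loss(teacher_token_id, student_logits):
    """Cross-entropy against the single token the teacher actually emitted.

    Only needs the teacher's generated text, not its weights or logits.
    """
    q = softmax(student_logits)
    return -math.log(q[teacher_token_id])
```

Either way the student is pushed toward the teacher's behaviour; neither flavour involves pruning the teacher's network, so the teacher's topology and weights do not need to be known for the second one.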
Originally Posted by CliffR
As for the scaling laws, that Vellum article saying “we thought model scaling hit a wall” is probably talking about practical constraints like cost and training time, not the actual power-law trends. The scaling laws still hold—larger models trained on more compute and data continue to improve, but brute-force scaling alone is becoming less practical. This is actually consistent with the Kaplan paper because it showed that you can’t just scale model size and compute without also scaling data or you hit diminishing returns.
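A quick numeric sketch of that last point, using a combined power-law form of the kind the Kaplan paper works with. The constants below are illustrative assumptions I picked for the example, not fitted values, so only the shape of the result matters, not the absolute numbers.

```python
def loss(n_params, n_tokens, n_c=8.8e13, d_c=5.4e13, alpha_n=0.076, alpha_d=0.095):
    """Toy loss as a joint power law in model size and dataset size.

    All four constants are placeholders chosen for illustration.
    """
    return ((n_c / n_params) ** (alpha_n / alpha_d) + d_c / n_tokens) ** alpha_d

fixed_tokens = 1e10
model_sizes = [1e8, 1e9, 1e10, 1e11, 1e12]

# Scale the model 10x at a time while holding the data fixed.
losses_fixed_data = [loss(n, fixed_tokens) for n in model_sizes]

# With data held fixed, the loss can never drop below this data-limited floor.
data_floor = (5.4e13 / fixed_tokens) ** 0.095

# Scaling data along with the model keeps the improvements coming.
loss_both_scaled = loss(1e12, 1e12)

for n, value in zip(model_sizes, losses_fixed_data):
    print(f"N={n:.0e}  D={fixed_tokens:.0e}  loss={value:.3f}")
print(f"floor at D={fixed_tokens:.0e}: {data_floor:.3f}")
print(f"N=1e12, D=1e12: {loss_both_scaled:.3f}")
```

With the data held fixed, each extra 10x in parameters buys less than the previous one and the curve flattens toward the floor. Scaling the data as well moves the floor down, which is the "you can't just scale one thing" result.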
-
Originally Posted by CliffR
Annoying clickbaity thumbnail, but I think GH is making good points. Nobody really knows how to regulate these models yet, and current attempts (like the EU AI Act and some U.S. state laws) seem premature. Regulators don’t understand the tech and the tech itself is evolving too fast for rigid rules to work.
-
Reading a lot of interesting stuff I was curious about in this thread. Thanks to those taking the time to keep us abreast and for putting comments in terms we can all understand. Things are moving fast and frankly, I'm really impressed with the knowledge some of you possess.
Just wanted to add that what I'm hearing from friends still working in tech is that it's becoming something of a key and marketable skill to know exactly how to phrase your question and in what sequence to place questions when working with these tools. Key to using them to write code of course but also important in seeking more complex answers to more complex questions. Probably germane when drilling into music theory or even jazz history.
-
100%. I always encourage scientists and engineers to integrate these tools into their work, and there’s a real skill to using them effectively. Some people are objectively better at it than others, and that difference creates marketable skills. The first is being able to provide the right direction to the AI to get the best possible output. If you look back at the discussion between me and CliffR a few weeks ago, you’ll see how much variance there was in the results we each got when trying to get AI to write a program—it’s not as simple as just asking and getting a perfect answer. The second skill is validating the output. AI can produce garbage, and if you don’t have a QA mechanism in place, you can easily end up with something that looks correct but isn’t. For subjective/qualitative content, that means actually reading and making sure the output makes sense. For code or anything more structured, you need a proper testing framework to confirm accuracy.
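As a small illustration of what I mean by a QA mechanism for code, here is a sketch using Python's built-in unittest module. The `transpose` function is a hypothetical stand-in for something an AI assistant might hand back; the tests are what you would write yourself before trusting it.

```python
import unittest

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def transpose(note, semitones):
    """Hypothetical AI-written function: shift a note name by a number of semitones."""
    return NOTES[(NOTES.index(note) + semitones) % 12]

class TransposeTests(unittest.TestCase):
    def test_up_a_fifth(self):
        self.assertEqual(transpose("C", 7), "G")

    def test_wraps_past_b(self):
        self.assertEqual(transpose("A", 5), "D")

    def test_negative_interval(self):
        self.assertEqual(transpose("C", -1), "B")

    def test_octave_is_identity(self):
        for note in NOTES:
            self.assertEqual(transpose(note, 12), note)

def run_checks():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TransposeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_checks().wasSuccessful()` tells you whether the generated code passed. The edge cases (wrap-around, negative intervals) are exactly where output that "looks correct" tends to fall over.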
Originally Posted by Spook410
I think this also ties back to the broader point about employment. It’s not like I’m firing engineers because we have AI—but I do expect them to know how to use AI to multiply their effectiveness. And if I’m hiring, I’d prefer someone who’s familiar with this toolset. It’s becoming just like knowing the latest programming languages, frameworks, or standard workplace tools—not something that replaces people, but something that changes what’s expected from them.
-
Will keep you posted when I hear back.
Originally Posted by djg
Last edited by omphalopsychos; 02-01-2025 at 02:34 AM.
-
would you even need ai for that? for a software that is basically 36 years old BIAB does an impressive job creating jazz solos. iirc the soloist function can be edited to play short phrases with fixed rhythms. so it could create contrafacts without any ai involved, right? and with ai and real tracks one could build a nice contrafact generator?
Originally Posted by omphalopsychos
-
Update: still sucks at music
-
No you don't "need" AI for it. But it's a test for AI to do it without guardrails/constraints.
Originally Posted by djg


