-
Originally Posted by omphalopsychos
Can OpenAI expand their data 10-fold while keeping quality high? Yeah, this was one of my points. It seems to remain unclear.
-
01-30-2025 09:13 PM
-
Originally Posted by omphalopsychos
Originally Posted by Tal_175
-
I don't necessarily "disagree" in a binary fashion. Neither of us has perfect knowledge. I think it's most likely that in a few months or sooner we'll see another major press release that counterbalances the current DeepSeek narrative. I'm not saying this from any political interest, just based on my knowledge of the space.
-
The inference code (model.py in the deepseek-ai/DeepSeek-V3 GitHub repo) actually tells us a lot about how the model works; it's not a total black box. The model.py file gives insight into key architectural details like Mixture-of-Experts (MoE), Multi-Head Latent Attention (MLA), rotary positional embeddings, RMSNorm, and quantization methods. This tells us how the model processes inputs, structures computations, and optimizes inference efficiency.
What’s missing is how it was trained—we don’t have details on the dataset, optimizer, loss function, or hyperparameter tuning. But from the inference code, you can still infer a lot about how the model is structured and what design choices were made. So while we can’t see how DeepSeek was trained, we can see exactly how it runs and what components define its architecture.
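To make that concrete, here's a toy sketch of the kind of top-k expert routing you can read straight out of an inference file like model.py. All the sizes, the gate design, and the expert count below are invented for illustration; the real model's router is considerably more elaborate (shared experts, load-balancing terms, etc.):

```python
# Toy top-k MoE routing: a gate scores experts per token, each token is
# processed by only its top-k experts, and outputs are combined by the
# normalized gate weights. Sizes here are made up for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=4, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts, bias=False)  # the router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x):                     # x: (tokens, dim)
        scores = self.gate(x)                 # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):            # each token visits only k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

print(TinyMoE()(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

The point is that this whole routing scheme is visible in the inference code: you can see that only a fraction of the parameters are active per token, even without knowing anything about training.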
-
Originally Posted by Mick-7
-
Here is an article explaining the DeepSeek training process. I assume it is based on the technical report:
How DeepSeek-R1 Was Built; For dummies
-
Originally Posted by omphalopsychos
You don’t get 3,000 billionaires by sharing increased productivity.
-
Idk history doesn’t really support the idea that technological progress just consolidates power without also creating new opportunities. The Industrial Revolution is a good example—yes, factory owners captured a lot of the gains, but industrialization also created an entirely new class of workers by giving landless peasants access to steady wages as machine operators. As subsistence farming declined and local trade became less viable, factory work provided economic stability at scale in a way that didn’t exist before. AI could do something similar—not by creating assembly lines, but by lowering the skill barrier for certain types of knowledge work, making economic participation possible for more people, even if the transition is disruptive.
You can’t assume one outcome is inevitable. Every major technological shift plays out as a struggle between competing forces—corporate interests, labor, government intervention, and shifts in consumer demand. Technology might centralize power in some areas, but it will also create new industries and new forms of work that aren’t obvious yet, just like industrialization did.
-
I like that perspective, hopeful.
-
If DeepSeek were as "open source" as some here suggest, its release would not have spooked the markets and the AI industry; they could have quickly figured it out. There'd be no fuss about it, just another over-hyped start-up. Not to mention that Chinese companies often present opaque (or outright fake) business and financial profiles, which is a big reason why experienced stock traders I know won't touch Chinese company stocks.
The Italian government just blocked access to DeepSeek in their country:
Italy blocks access to the Chinese AI application DeepSeek to protect users' data | AP News
DeepSeek issued a statement saying that European legislation "does not apply" to them. Good luck with that argument.
-
AI vs Human...
Found DeepSeek useful for finding information on converting USB 5V to center-negative 9V for guitar pedals. It may prove useful for other small things in the future.
What the tool could not tell me is that it made more sense to simply adapt to available pedalboard batteries. So I've procured the $30 Horse brand battery/pedalboard power supply mentioned earlier in the thread. Build quality is better than expected, though we'll find out about things like isolation and shielding over time.
To adapt, I kept my pedals to 100 mA or less, with a single pedal pulling 300 mA. This will get me reverb, a looper, preamp emulation, and a robust selection of cabinet IRs. Can't use the Strymons, but that's OK. The EV HoF and Joyo Cab Box are certainly good enough. So is a simple looper.
How long will this setup go on battery power? We'll see. In theory more hours than I need.
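For anyone curious about the "in theory" part, the math is just capacity divided by total draw. The capacity below is a placeholder since the actual spec isn't given; the current draws match the numbers above:

```python
# Back-of-envelope runtime for the board described above. Battery capacity
# is hypothetical (check the actual supply's spec sheet); the draws match
# the numbers mentioned in the post.
battery_mah = 4800                # placeholder capacity at 9 V
draws_ma = [100, 100, 100, 300]   # a few sub-100 mA pedals plus the 300 mA one
total_ma = sum(draws_ma)
print(f"{total_ma} mA total -> roughly {battery_mah / total_ma:.1f} hours per charge")
```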
-
Originally Posted by omphalopsychos
And if 150 years ago this new working class did so well, we wouldn't have had the German and Russian revolutions, unions, and communism, now would we? The "steady wages" for 6 days of 14-16 hours declined fast as the supply of labor far outnumbered demand. By the end of the 19th century, wages in Germany were back to the level of 1820. Berlin was a hell-hole for the working class.
Social question - Wikipedia
-
Originally Posted by Mick-7
-
If DeepSeek trades in European markets, it is subject to European law.
-
Anyone care to weigh in on OpenAI's claim that DeepSeek may be a distillation of ChatGPT? How would that work? For one thing, DeepSeek's parameter count is around 0.7 trillion, I believe, whereas ChatGPT's is, I think, around 2 trillion, so not a huge difference. Also, my (very limited) understanding of distillation is that it involves pruning the least significant nodes from the 'teaching' network. How could this be done if OpenAI's topology and weights are unavailable?
(Just wanted to point out that DeepSeek is supposedly not only much cheaper to train but also cheaper to run inference on.)
I see the article that Tal linked explains how the training was done, without reference to ChatGPT. So what would prompt OpenAI to claim otherwise? (Aside from self-interest, of course.) Also, at the end of the article, the author states: "We thought model scaling hit a wall, but this approach is unlocking new possibilities." Which implies those power-law graphs from 2020 are no longer applicable, right?
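For what it's worth, the usual form of distillation doesn't prune the teacher's network at all: the student is a separate, smaller model trained to match the teacher's outputs, which requires no access to the teacher's weights or topology. For a model reachable only through an API, the cruder analogue is generating lots of teacher responses and fine-tuning the student on them. Here's a toy sketch of the classic logit-matching version; the sizes, temperature, and the stand-in "teacher" are all made up:

```python
# Minimal sketch of classic knowledge distillation (Hinton et al., 2015).
# The student never sees the teacher's weights or topology, only its
# output logits. Sizes, temperature, and models are arbitrary.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))  # stand-in black box
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))    # much smaller

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature softens the teacher's distribution

for step in range(100):
    x = torch.randn(64, 32)            # unlabeled inputs
    with torch.no_grad():
        t_logits = teacher(x)          # all we ever need from the teacher
    s_logits = student(x)
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```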
-
Yikes!
-
I think Geoffrey Hinton is conflating the evolution of behaviour and the evolution of intelligence a bit there. Our instincts for survival, reproduction, protection of our offspring, etc. have nothing to do with our intelligence. Those instincts give us motivations and goals. They are byproducts of the evolutionary processes of biological organisms in particular environmental conditions. Intelligence gives us a way to achieve those goals, but not the goals themselves. It's not obvious at all that if you could remove the sexual drive and desire to reproduce from an intelligent human, they would develop that desire by reasoning (hence all the nasty competition he was referring to that results from such goals). They would probably be better off without it. Nor is it clear that if you could remove the instinct for survival from all humans, they would conclude by reasoning that they are better off alive than not. Intelligence alone does not explain human behaviour; a long series of random mutations and natural selection does.
Last edited by Tal_175; 01-31-2025 at 10:32 AM.
-
Originally Posted by CliffR
As for the scaling laws, that Vellum article saying "we thought model scaling hit a wall" is probably talking about practical constraints like cost and training time, not the actual power-law trends. The scaling laws still hold: larger models trained on more compute and data continue to improve, but brute-force scaling alone is becoming less practical. This is actually consistent with the Kaplan paper, because it showed that you can't just scale model size and compute without also scaling data, or you hit diminishing returns.
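For reference, Kaplan et al.'s joint law gives loss as a function of parameter count N and training tokens D, and plugging in numbers shows exactly that diminishing-returns effect. The constants below are as I remember them from the 2020 paper, so treat them as approximate:

```python
# Kaplan et al. (2020) joint scaling law L(N, D): test loss vs. parameters N
# and dataset tokens D. Constants quoted from memory; double-check the paper
# before relying on exact values.
ALPHA_N, ALPHA_D = 0.076, 0.095
N_C, D_C = 8.8e13, 5.4e13

def kaplan_loss(n_params: float, n_tokens: float) -> float:
    return ((N_C / n_params) ** (ALPHA_N / ALPHA_D) + D_C / n_tokens) ** ALPHA_D

# Scaling the model 10x with data held fixed buys far less than scaling both:
print(kaplan_loss(7e10, 3e11))   # baseline
print(kaplan_loss(7e11, 3e11))   # 10x params, same data: modest gain
print(kaplan_loss(7e11, 3e12))   # 10x params and 10x data: much bigger gain
```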
-
Originally Posted by CliffR
Annoying clickbaity thumbnail, but I think GH is making good points. Nobody really knows how to regulate these models yet, and current attempts (like the EU AI Act and some U.S. state laws) seem premature. Regulators don’t understand the tech and the tech itself is evolving too fast for rigid rules to work.
-
Reading a lot of interesting stuff I was curious about in this thread. Thanks to those taking the time to keep us abreast and for putting comments in terms we can all understand. Things are moving fast and frankly, I'm really impressed with the knowledge some of you possess.
Just wanted to add that what I'm hearing from friends still working in tech is that it's becoming something of a key and marketable skill to know exactly how to phrase your question and in what sequence to place questions when working with these tools. That's key to using them to write code, of course, but also important when seeking more complex answers to more complex questions. Probably germane when drilling into music theory or even jazz history.
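A trivial sketch of the "sequence your questions" idea; ask() below is a made-up stand-in for whatever chat API you'd actually call, since only the pattern matters. Each step feeds the prior answers back in as context instead of asking one giant question:

```python
# Illustration of sequencing prompts so each step builds on the last.
# ask() is a hypothetical placeholder, not a real API.
def ask(prompt: str, context: list[str]) -> str:
    # In real use this would call a model with context + [prompt].
    return f"<model answer to {prompt!r} given {len(context)} prior turns>"

context: list[str] = []
steps = [
    "List the chord tones of Cmaj7, Dm7, and G7.",
    "Using only those tones, outline a ii-V-I line in C major.",
    "Now annotate which approach notes in that line are chromatic.",
]
for prompt in steps:
    answer = ask(prompt, context)
    context += [prompt, answer]   # carry the thread forward
    print(answer)
```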
-
Originally Posted by Spook410
I think this also ties back to the broader point about employment. It’s not like I’m firing engineers because we have AI—but I do expect them to know how to use AI to multiply their effectiveness. And if I’m hiring, I’d prefer someone who’s familiar with this toolset. It’s becoming just like knowing the latest programming languages, frameworks, or standard workplace tools—not something that replaces people, but something that changes what’s expected from them.
-
Originally Posted by djg
Last edited by omphalopsychos; 02-01-2025 at 02:34 AM.
-
Originally Posted by omphalopsychos
-
Update: still sucks at music
-
Originally Posted by djg