The model was trained on self-play; it’s unclear exactly how, whether via regular chain-of-thought reasoning or some kind of MCTS scheme. It no longer relies only on ideas from internet data; that’s just where it started. It can learn from mistakes it made during training, from making lucky guesses, etc. Now it’s way better at solving math problems, programming, and writing comedy. At what point do we call what it’s doing reasoning? Just, like, never, because it’s a computer? Or do you object to the transformer architecture specifically, or what?
Yeah, I admit that the self-play approach is more promising, but it still starts with the internet data to know what things are. I think the transformer architecture is the limiting factor: until there’s a way for the model to do something beyond generating words one at a time, sequentially, it’s simply doing nothing more than a very advanced game of Mad Libs. I don’t know if they can get transformers to work in a different way, where it constructs a concept in a more abstract way and then progressively finds a way to put it into words; I know that arguably that’s what it’s doing currently, but the fact that it does it separately for each token means it’s not constructing any kind of abstraction.
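For readers who haven’t seen it spelled out, the “one word at a time” loop being debated here is just autoregressive decoding: the model re-reads everything written so far and emits exactly one more token, then repeats. Here is a minimal toy sketch; `next_token_logits` is a hypothetical stand-in for a real model’s forward pass, not any actual API.

```python
import random

random.seed(0)
vocab = ["the", "cat", "sat", "mat", "."]

def next_token_logits(context):
    # Hypothetical stand-in for a trained model: a real system would
    # compute these scores from the full context with a transformer.
    return [random.gauss(0, 1) for _ in vocab]

def generate(prompt, n_tokens):
    tokens = list(prompt)
    for _ in range(n_tokens):
        logits = next_token_logits(tokens)
        # Greedy decoding: append the single highest-scoring token,
        # then loop with the now-longer context.
        tokens.append(vocab[max(range(len(vocab)), key=lambda i: logits[i])])
    return tokens

print(generate(["the"], 4))
```

The point of contention in the thread is whether this loop, repeated enough times with a strong enough scorer, can amount to abstraction, not whether the loop itself is sophisticated.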
it constructs a concept in a more abstract way and then progressively finds a way to put it into words; I know that arguably that’s what it’s doing currently,
Correct!
but the fact that it does it separately for each token means it’s not constructing any kind of abstraction
No!!! You simply cannot make judgements like this based on vague ideas like “autocomplete on steroids” or “stochastic parrot”; those were good for conceptualizing GPT-2, maybe. It’s actually very inefficient, but by re-reading everything it has previously written (plus one new token) it’s actually acting sort of like an RNN. In fact, we know theoretically that with simplified attention the two architectures are mathematically equivalent.
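The equivalence being gestured at here is, I believe, the linear-attention result (the “Transformers are RNNs” paper by Katharopoulos et al.): if you replace softmax attention with a kernel feature map, causal attention can be computed either over the whole prefix at once or recurrently with a fixed-size running state, and the two give identical outputs. A toy NumPy sketch, not production code:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4  # sequence length, head dimension
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))

# Positive feature map standing in for the softmax kernel.
phi = lambda x: np.maximum(x, 0) + 1e-6

def parallel(Q, K, V):
    # "Transformer-style": each position attends over its whole prefix.
    out = np.zeros_like(V)
    for t in range(len(V)):
        num, den = np.zeros(d), 0.0
        for i in range(t + 1):
            w = phi(Q[t]) @ phi(K[i])
            num += w * V[i]
            den += w
        out[t] = num / den
    return out

def recurrent(Q, K, V):
    # "RNN-style": one token at a time, carrying only a running state.
    S = np.zeros((d, d))  # running sum of outer(phi(k_i), v_i)
    z = np.zeros(d)       # running sum of phi(k_i)
    out = np.zeros_like(V)
    for t in range(len(V)):
        S += np.outer(phi(K[t]), V[t])
        z += phi(K[t])
        out[t] = (phi(Q[t]) @ S) / (phi(Q[t]) @ z)
    return out

assert np.allclose(parallel(Q, K, V), recurrent(Q, K, V))
```

Note this holds for simplified (kernelized) attention, as the comment says; full softmax attention does not reduce to a fixed-size recurrent state this way.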
Let me put it like this. Suppose you had the ability to summon a great novelist as they were at some particular point in their life, to pull them from one exact moment in the past, and to do this as many times as you liked. You put a gun to their head, or perhaps offer them alcohol and cocaine, to start writing a novel. The moment they finish the first word, you shoot them in the head and summon the same version again. “Look, I’ve got a great first word for a novel, and if you can turn it into a good paragraph I’ll give you this bottle of gin and a gram of cocaine!” They think for a moment and begin to put down more words, but again you shoot them after word two. Rinse and repeat until a novel is formed. It takes a good while, but eventually you’ve got yourself a first draft. You may also have them refine the novel using the same technique, and you may want to give them some of the drugs and alcohol beforehand to improve their writing and help them put aside the fact that they’ve been summoned to the future by a sorcerer. Now I ask you, is there any theoretical reason why this novel wouldn’t be any good? Is the essence of it somehow different than any other novel? Can we judge it as not being real art or creativity?
Now I ask you, is there any theoretical reason why this novel wouldn’t be any good? Is the essence of it somehow different than any other novel? Can we judge it as not being real art or creativity?
Yes, it is not. You whipped up that fantastical paragraph with 99.99% fewer coal plants and a lifetime’s worth of creativity. That life was experienced not through meticulously analyzing each thought mathematically but with your body and mind together. More importantly, you’re typing that out because you believe in something, which is a theoretical concept that the computer is incapable of analyzing.
Be as debatey as you want. What I’m saying is not an entirely different argument; what I said is responding to your entire assertion. I’m not giving you credit for the thought experiment because it helps your argument; I’m trying to give you credit for it because it’s creative and beautiful.
Very well, I’ll take that as a sort of compliment lol.
So I guess I’ll start where I always do: do you think a machine, in principle, has the capability to be intelligent and/or creative? If not, I really don’t have any counter, though I suppose I’d be curious as to why. Like, I admit it’s possible there’s something non-physical or non-mechanistic driving our bodies that’s unknown to science. I find that very few hold this hard-line opinion, though; assuming you are also in that category…
So if that’s correct, what is it about the current paradigm of machine learning that you think is missing? Is it embodiment, is it the simplicity of artificial neurons compared to biological ones, something specific about the transformer architecture, a combination of these, or something else I haven’t thought of?
And hopefully it goes without saying: I don’t think o1-preview is a human-level AGI. I merely believe that we’re getting there quite soon and without too many new architectural innovations, possibly just one or two, none of them particularly groundbreaking. It’s fairly obvious what the next couple of steps will be, just as it was obvious three years ago that MCTS + LLM was the next step.
I’ll make it perfectly clear that I’m not in any way informed on the specific mechanisms or the mathematics involved with the algorithms behind LLMs. I have a computer science bachelor’s and worked as a full-stack developer. Back when I had a more free-form web-programming project due, it was still called NLP and was used to assess web-crawler data and make recommendations. That’s my entire hands-on experience, so that’s why I’m sticking entirely to the philosophical aspect.
Is it embodiment, is it the simplicity of artificial neurons compared to biological ones
That’s basically it. I believe that by bowing to our machine god we aren’t giving human intellect enough credit, because we don’t know, and might never know, how to measure it precisely enough to torture a machine with a soul. When you write code, the machine is always doing exactly what you told it to; it can’t make mistakes. If your code fucked up, you fucked up. Therefore, without a total understanding of our body, I do not believe we’ll be able to make a machine in our image, because we don’t have the full picture (please forgive me for that).
Even if you were to model a neuron 1:1 in code, you’ve got about the entire rest of the body left to recreate. Every thought and feeling is tied to your mind and body. It’s not really “mind and body”; it’s really just “body”. There are indescribable feelings that are expressed through song, poetry, paintings, dance, etc. There are entirely “illogical” processes that we simply cannot model. If we are ever able to, of course we can make a machine intelligent. If somehow we were able to get there, I think the most obvious question is then: why? Someone on this site might be able to come up with a better use of our top minds and resources.
You’ve given me a lot to think about. Maybe I should read “Attention Is All You Need” again.