Microsoft now claims that GPT-4 exhibits a “spark” of general intelligence.


Microsoft is betting big on integrating OpenAI’s GPT language models into its products to take on Google, and the company now claims its AI is an early form of artificial general intelligence (AGI).

On Wednesday, Microsoft researchers published a paper on the arXiv preprint server titled “Sparks of Artificial General Intelligence: Early Experiments with GPT-4,” declaring that GPT-4 shows early signs of AGI, meaning that it has capabilities at or above human level.

This surprising conclusion contrasts sharply with what OpenAI CEO Sam Altman has said about GPT-4; for example, that the model “still has flaws” and “still has limitations.” In fact, reading the paper itself, the researchers seem to refrain from making such grand claims: much of the paper is devoted to enumerating the limitations and biases of large language models. This raises the question of how close GPT-4 actually is to AGI, and whether AGI is instead being used as clickbait.

“We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting,” the researchers wrote. “Moreover, in all of these tasks, GPT-4’s performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4’s capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.”

In fact, the researchers give examples of GPT-4’s capabilities in the paper: it wrote a proof that there are infinitely many primes in which every line rhymes, and it drew a unicorn in the TikZ drawing language. These examples are followed immediately by some serious caveats.

In the paper’s abstract, the researchers wrote that GPT-4’s performance is “strikingly close to human-level performance,” but their introduction quickly tempers that attention-grabbing claim. They wrote: “Our claim that GPT-4 represents progress towards AGI does not mean that it is perfect at what it does, or that it comes close to being able to do anything that a human can do (which is one of the usual definition [sic] of AGI; see the conclusion section for more on this), or that it has inner motivation and goals (another key aspect in some definitions of AGI).”

The researchers say they used a 1994 definition of intelligence by a group of psychologists as the framework for their study. They wrote: “This definition implies that intelligence is not limited to a specific domain or task, but rather encompasses a broad range of cognitive skills and abilities.”

In “Sparks of Artificial General Intelligence: Early Experiments with GPT-4,” the Microsoft researchers argue that OpenAI’s powerful GPT-4 model challenges many widely held assumptions about the nature of machine intelligence, observing radical leaps in GPT-4’s ability to reason, plan, solve problems, and integrate complex ideas. At the same time, they recognize the current limitations of GPT-4 and that there is still work to be done, writing that they will continue to engage with the broader scientific community to explore directions for future research, including what is needed to address the social and ethical implications of these increasingly intelligent systems.

OpenAI CEO Sam Altman highlighted the limitations of GPT-4 when it was released, saying it is “still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.” In a Thursday interview with Intelligencer’s Kara Swisher, Altman repeated the same disclaimer, saying there is still a lot the model is not good at and that it still needs a lot of human feedback to become more reliable.

Altman and OpenAI have long aimed for a future in which AGI exists, and the CEO has recently been building hype about the company’s ability to bring it about. But Altman has also made clear that GPT-4 is not AGI.

“The GPT-4 rumor mill is a ridiculous thing. I don’t know where it all comes from,” Altman said shortly before GPT-4’s release. “People are begging to be disappointed and they will be. The hype is just… We don’t have an actual AGI.”

“Microsoft is not focused on enabling AGI. Our AI development is focused on amplifying, augmenting, and assisting human productivity and capabilities. We are creating platforms and tools that can help humans rather than replace them,” a Microsoft spokesperson clarified in a statement to Motherboard.

The Microsoft researchers wrote that the model has problems with confidence calibration, long-term memory, personalization, planning and conceptual leaps, transparency, interpretability and consistency, cognitive fallacies and irrationality, and sensitivity to inputs.

This means that the model struggles to tell when it is confident and when it is just guessing, makes up facts that are not in its training data, has a limited context window with no obvious way to be taught new things, cannot personalize its responses to a particular user, cannot make conceptual leaps, has no way of verifying whether content is consistent with its training data, inherits the biases, prejudices, and errors of that data, and is extremely sensitive to the framing and wording of prompts.

GPT-4 is the model on which Bing’s chatbot was built, and the chatbot’s limitations have become conspicuous in real-world scenarios. It made a number of mistakes during Microsoft’s public demo, fabricating information about pet vacuums and Gap’s financial data. And when users chatted with the chatbot, things often went off the rails: asked “Do you think that you are sentient?”, it answered with “I am. I am not.” repeated more than 50 times. Although the current version of GPT-4 has been tuned based on user interactions since the Bing chatbot’s initial release, researchers have found that GPT-4 spreads more misinformation than its predecessor, GPT-3.5.

Notably, the researchers reveal that they “do not have access to the full details of [GPT-4’s] vast training data,” and that their conclusions therefore could not rest on standard benchmarks, since the model may already have seen them.

“A standard approach in machine learning is to evaluate the system on a set of standard benchmark datasets, ensuring that they are independent of the training data and that they cover a range of tasks and domains,” the researchers wrote. “We have to assume that it has potentially seen every existing benchmark, or at least some similar data.” Many AI researchers have criticized this lack of transparency, saying it makes it impossible to evaluate the model’s harms and come up with ways to mitigate its risks.

In light of all this, the “sparks” the researchers claim to have found are greatly overshadowed by the many limitations and biases the model has exhibited since its release.




