
Artificial intelligence: ‘Everyone on Earth will die’
In recent days I have been playing with ChatGPT and have been stunned by what it can do, such as writing a poem instantly.
But I don’t want to sing the praises of the new technology; rather, I want to warn of the dire dangers humanity faces.
Here is a video I found on GPT-4:
Why GPT-4 Might be the Most Dangerous AI Yet (Nobody is Talking about this!)
Neuralink’s Elon Musk has been putting out warnings, despite being involved in the technology himself.
Elon Musk Signs Open Letter Urging AI Labs to Pump the Brakes
An open letter with signatures from hundreds of the biggest names in tech, including Elon Musk, has urged the world’s leading artificial intelligence labs to pause the training of new super-powerful systems for six months, saying that recent advances in AI present “profound risks to society and humanity.”
The letter comes just two weeks after the public release of OpenAI’s GPT-4, the most powerful AI system ever released, which has led researchers to slash their expectations for when AGI—or artificial general intelligence that surpasses human cognitive ability—will arrive. Many experts fear that, as an AI arms race heats up, humanity is sleepwalking into catastrophe.
However, that is not nearly enough according to a top researcher.
Can you imagine the senile Joe Biden or the idiotic NZ government (which is fascinated by the technology) ever heeding the warning below?!
I think we’re all as good as dead from any number of directions.
‘Everyone on Earth will die,’ top AI researcher warns
Humanity is unprepared to survive an encounter with a much smarter artificial intelligence, Eliezer Yudkowsky says
Shutting down the development of advanced artificial intelligence systems around the globe and harshly punishing those violating the moratorium is the only way to save humanity from extinction, a high-profile AI researcher has warned.
Eliezer Yudkowsky, a co-founder of the Machine Intelligence Research Institute (MIRI), wrote an opinion piece for TIME magazine on Wednesday explaining why he didn’t sign a petition calling upon “all AI labs to immediately pause for at least six months the training of AI systems more powerful than GPT-4,” a multimodal large language model released by OpenAI earlier this month.
Yudkowsky argued that the letter, signed by the likes of Elon Musk and Apple’s Steve Wozniak, was “asking for too little to solve” the problem posed by rapid and uncontrolled development of AI.
“The most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die,” Yudkowsky wrote.
Surviving an encounter with a computer system that “does not care for us nor for sentient life in general” would require “precision and preparation and new scientific insights” that humanity lacks at the moment and is unlikely to obtain in the foreseeable future, he argued.
“A sufficiently intelligent AI won’t stay confined to computers for long,” Yudkowsky warned. He explained that the fact that it’s already possible to email DNA strings to laboratories to produce proteins will likely allow the AI “to build artificial life forms or bootstrap straight to postbiological molecular manufacturing” and get out into the world.
According to the researcher, an indefinite and worldwide moratorium on new major AI training runs has to be introduced immediately. “There can be no exceptions, including for governments or militaries,” he stressed.
International deals should be signed to place a ceiling on how much computing power anyone may use in training such systems, Yudkowsky insisted.
“If intelligence says that a country outside the agreement is building a GPU (graphics processing unit) cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike,” he wrote.
The threat from artificial intelligence is so great that it should be made “explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange,” he added.
This is the article written by Eliezer Yudkowsky for TIME magazine:
Pausing AI Developments Isn’t Enough. We Need to Shut it All Down
This 6-month moratorium would be better than no moratorium. I have respect for everyone who stepped up and signed it. It’s an improvement on the margin.
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
I refrained from signing because I think the letter is understating the seriousness of the situation and asking for too little to solve it.
The key issue is not “human-competitive” intelligence (as the open letter puts it); it’s what happens after AI gets to smarter-than-human intelligence. Key thresholds there may not be obvious, we definitely can’t calculate in advance what happens when, and it currently seems imaginable that a research lab would cross critical lines without noticing.
Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in “maybe possibly some remote chance,” but as in “that is the obvious thing that would happen.” It’s not that you can’t, in principle, survive creating something much smarter than you; it’s that it would require precision and preparation and new scientific insights, and probably not having AI systems composed of giant inscrutable arrays of fractional numbers.
Absent that caring, we get “the AI does not love you, nor does it hate you, and you are made of atoms it can use for something else.”
The likely result of humanity facing down an opposed superhuman intelligence is a total loss. Valid metaphors include “a 10-year-old trying to play chess against Stockfish 15,” “the 11th century trying to fight the 21st century,” and “Australopithecus trying to fight Homo sapiens.”
To visualize a hostile superhuman AI, don’t imagine a lifeless book-smart thinker dwelling inside the internet and sending ill-intentioned emails. Visualize an entire alien civilization, thinking at millions of times human speeds, initially confined to computers—in a world of creatures that are, from its perspective, very stupid and very slow. A sufficiently intelligent AI won’t stay confined to computers for long. In today’s world you can email DNA strings to laboratories that will produce proteins on demand, allowing an AI initially confined to the internet to build artificial life forms or bootstrap straight to postbiological molecular manufacturing.
If somebody builds a too-powerful AI, under present conditions, I expect that every single member of the human species and all biological life on Earth dies shortly thereafter.
There’s no proposed plan for how we could do any such thing and survive. OpenAI’s openly declared intention is to make some future AI do our AI alignment homework. Just hearing that this is the plan ought to be enough to get any sensible person to panic. The other leading AI lab, DeepMind, has no plan at all.
An aside: None of this danger depends on whether or not AIs are or can be conscious; it’s intrinsic to the notion of powerful cognitive systems that optimize hard and calculate outputs that meet sufficiently complicated outcome criteria. With that said, I’d be remiss in my moral duties as a human if I didn’t also mention that we have no idea how to determine whether AI systems are aware of themselves—since we have no idea how to decode anything that goes on in the giant inscrutable arrays—and therefore we may at some point inadvertently create digital minds which are truly conscious and ought to have rights and shouldn’t be owned.
The rule that most people aware of these issues would have endorsed 50 years earlier was that if an AI system can speak fluently and says it’s self-aware and demands human rights, that ought to be a hard stop on people just casually owning that AI and using it past that point. We already blew past that old line in the sand. And that was probably correct; I agree that current AIs are probably just imitating talk of self-awareness from their training data. But I mark that, with how little insight we have into these systems’ internals, we do not actually know.
If that’s our state of ignorance for GPT-4, and GPT-5 is the same size of giant capability step as from GPT-3 to GPT-4, I think we’ll no longer be able to justifiably say “probably not self-aware” if we let people make GPT-5s. It’ll just be “I don’t know; nobody knows.” If you can’t be sure whether you’re creating a self-aware AI, this is alarming not just because of the moral implications of the “self-aware” part, but because being unsure means you have no idea what you are doing and that is dangerous and you should stop.
On Feb. 7, Satya Nadella, CEO of Microsoft, publicly gloated that the new Bing would make Google “come out and show that they can dance.” “I want people to know that we made them dance,” he said.
This is not how the CEO of Microsoft talks in a sane world. It shows an overwhelming gap between how seriously we are taking the problem, and how seriously we needed to take the problem starting 30 years ago.
We are not going to bridge that gap in six months.
It took more than 60 years between when the notion of Artificial Intelligence was first proposed and studied, and for us to reach today’s capabilities. Solving safety of superhuman intelligence—not perfect safety, safety in the sense of “not killing literally everyone”—could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we’ve overcome in our history, because we are all gone.
Trying to get anything right on the first really critical try is an extraordinary ask, in science and in engineering. We are not coming in with anything like the approach that would be required to do it successfully. If we held anything in the nascent field of Artificial General Intelligence to the lesser standards of engineering rigor that apply to a bridge meant to carry a couple of thousand cars, the entire field would be shut down tomorrow.
We are not prepared. We are not on course to be prepared in any reasonable time window. There is no plan. Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die.
Many researchers working on these systems think that we’re plunging toward a catastrophe, with more of them daring to say it in private than in public; but they think that they can’t unilaterally stop the forward plunge, that others will go on even if they personally quit their jobs. And so they all think they might as well keep going. This is a stupid state of affairs, and an undignified way for Earth to die, and the rest of humanity ought to step in at this point and help the industry solve its collective action problem.
Some of my friends have recently reported to me that when people outside the AI industry hear about extinction risk from Artificial General Intelligence for the first time, their reaction is “maybe we should not build AGI, then.”
Hearing this gave me a tiny flash of hope, because it’s a simpler, more sensible, and frankly saner reaction than I’ve been hearing over the last 20 years of trying to get anyone in the industry to take things seriously. Anyone talking that sanely deserves to hear how bad the situation actually is, and not be told that a six-month moratorium is going to fix it.
On March 16, my partner sent me this email. (She later gave me permission to excerpt it here.)
“Nina lost a tooth! In the usual way that children do, not out of carelessness! Seeing GPT4 blow away those standardized tests on the same day that Nina hit a childhood milestone brought an emotional surge that swept me off my feet for a minute. It’s all going too fast. I worry that sharing this will heighten your own grief, but I’d rather be known to you than for each of us to suffer alone.”
When the insider conversation is about the grief of seeing your daughter lose her first tooth, and thinking she’s not going to get a chance to grow up, I believe we are past the point of playing political chess about a six-month moratorium.
If there was a plan for Earth to survive, if only we passed a six-month moratorium, I would back that plan. There isn’t any such plan.
Here’s what would actually need to be done:
The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.
Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
That’s the kind of policy change that would cause my partner and me to hold each other, and say to each other that a miracle happened, and now there’s a chance that maybe Nina will live. The sane people hearing about this for the first time and sensibly saying “maybe we should not” deserve to hear, honestly, what it would take to have that happen. And when your policy ask is that large, the only way it goes through is if policymakers realize that if they conduct business as usual, and do what’s politically easy, that means their own kids are going to die too.
Shut it all down.
We are not ready. We are not on track to be significantly readier in the foreseeable future. If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.
Finally, here is an article from Zuckerberg’s Meta AI team, which shows the race goes well beyond OpenAI, the developer of ChatGPT that is now releasing GPT-4. It seems to be all go-ahead for the developers, whoever they are.
Zuckerberg’s Meta AI team announces it has developed an “artificial visual cortex” for robots.
Robots that learn from videos of human activities and simulated interactions
Optimistic science fiction typically imagines a future where humans create art and pursue fulfilling pastimes while AI-enabled robots handle dull or dangerous tasks. In contrast, the AI systems of today display increasingly sophisticated generative abilities on ostensibly creative tasks. But where are the robots? This gap is known as Moravec’s paradox, the thesis that the hardest problems in AI involve sensorimotor skills, not abstract thought or reasoning. To put it another way, “The hard problems are easy, and the easy problems are hard.”
Today, we are announcing two major advancements toward general-purpose embodied AI agents capable of performing challenging sensorimotor skills:
- An artificial visual cortex (called VC-1): a single perception model that, for the first time, supports a diverse range of sensorimotor skills, environments, and embodiments. VC-1 is trained on videos of people performing everyday tasks from the groundbreaking Ego4D dataset created by Meta AI and academic partners. And VC-1 matches or outperforms best-known results on 17 different sensorimotor tasks in virtual environments.
- A new approach called adaptive (sensorimotor) skill coordination (ASC), which achieves near-perfect performance (98 percent success) on the challenging task of robotic mobile manipulation (navigating to an object, picking it up, navigating to another location, placing the object, repeating) in physical environments.
Data powers both of these breakthroughs. AI needs data to learn from — and, specifically, embodied AI needs data that captures interactions with the environment. Traditionally, this interaction data is gathered either by recording large numbers of demonstrations or by letting the robot learn from scratch through its own interactions. Both approaches are too resource-intensive to scale toward the learning of a general embodied AI agent. In both of these works, we are developing new ways for robots to learn, using videos of human interactions with the real world and simulated interactions within photorealistic simulated worlds.
First, we’ve built a way for robots to learn from real-world human interactions, by training a general-purpose visual representation model (an artificial visual cortex) from a large number of egocentric videos. The videos include our open source Ego4D dataset, which shows first-person views of people doing everyday tasks, like going to the grocery store and cooking lunch. Second, we’ve built a way to pretrain our robot to perform long-horizon rearrangement tasks in simulation. Specifically, we train a policy in Habitat environments and transfer the policy zero-shot to a real Spot robot to perform such tasks in unfamiliar real-world spaces.
Toward an artificial visual cortex for embodied intelligence
A visual cortex is the region of the brain that (together with the motor cortex) enables an organism to convert vision into movement. We are interested in developing an artificial visual cortex — the module in an AI system that enables an artificial agent to convert camera input into actions.
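To make that idea concrete, here is a minimal sketch, assuming a PyTorch setup, of the pattern an artificial visual cortex implies: a pretrained image encoder is kept frozen and turns camera frames into an embedding, while a small task-specific policy head maps that embedding (plus proprioception) to actions. The class name and layer sizes are illustrative assumptions, not Meta's released implementation.

```python
# A minimal sketch (not Meta's released code) of the "visual cortex" pattern:
# a frozen, pretrained image encoder turns camera frames into an embedding,
# and a small trainable policy head maps that embedding to motor actions.
import torch
import torch.nn as nn

class PolicyWithVisualCortex(nn.Module):
    def __init__(self, visual_encoder: nn.Module, embed_dim: int,
                 proprio_dim: int, action_dim: int):
        super().__init__()
        self.visual_encoder = visual_encoder          # pretrained encoder, e.g. a ViT
        for p in self.visual_encoder.parameters():    # keep the "cortex" frozen
            p.requires_grad = False
        self.policy_head = nn.Sequential(             # small task-specific head
            nn.Linear(embed_dim + proprio_dim, 256),
            nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, rgb: torch.Tensor, proprio: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            z = self.visual_encoder(rgb)              # camera frames -> embedding
        return self.policy_head(torch.cat([z, proprio], dim=-1))
```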
Our FAIR team, together with academic collaborators, has been at the forefront of developing general-purpose visual representations for embodied AI trained from egocentric video datasets. The Ego4D dataset has been especially useful, since it contains thousands of hours of wearable camera video from research participants around the world performing daily life activities, including cooking, cleaning, sports, and crafts.
For instance, one prior work from our team (R3M) uses temporal and text-video alignment within Ego4D video frames to learn compact universal visual representations for robotic manipulation. Another work (VIP) uses Ego4D frames to learn an effective actionable visual representation that can also perform zero-shot visual reward-specification for training embodied agents. These are illustrative of the broader trend in the research community (e.g., PVR, OVRL, MVP) toward pretraining visual representations from web images and egocentric videos.
Although prior work has focused on a small set of robotic tasks, a visual cortex for embodied AI should work well for a diverse set of sensorimotor tasks in diverse environments across diverse embodiments. While prior works on pretraining visual representations give us a glimpse of what may be feasible, they are fundamentally incommensurable, with different ways of pretraining the visual representations on different datasets, evaluated on different embodied AI tasks. The lack of consistency meant there was no way of knowing which of the existing pretrained visual representations were best.
As a first step, we curated CortexBench, consisting of 17 different sensorimotor tasks in simulation, spanning locomotion, navigation, and dexterous and mobile manipulation, implementing the community standard for learning the policy for each task. The visual environments span from flat infinite planes to tabletop settings to photorealistic 3D scans of real-world indoor spaces. The agent embodiments vary from stationary arms to dexterous hands to idealized cylindrical navigation agents to articulated mobile manipulators. The learning conditions vary from few-shot imitation learning to large-scale reinforcement learning. This allowed us to perform a rigorous and consistent evaluation of existing and new pretrained models. Prior to our work, the best performance for each task in CortexBench was achieved by a model or algorithm specifically designed for that task. In contrast, what we want is one model and/or algorithm that achieves competitive performance on all tasks. Biological organisms have one general-purpose visual cortex, and that is what we seek for embodied agents.
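As an illustration of that evaluation idea, the sketch below shows a hypothetical harness in the spirit of CortexBench: the same frozen visual encoder is handed to each task's standard learning protocol, so different pretrained representations can be compared on equal footing. The helper callables and task names are assumptions for illustration, not Meta's actual benchmark API.

```python
# Hypothetical CortexBench-style harness: plug one frozen visual encoder into
# every task's standard training/evaluation protocol and collect the scores.
from typing import Callable, Dict

def evaluate_representation(
    encoder,                                   # frozen pretrained visual model
    train_policy: Dict[str, Callable],         # task name -> training routine (imitation or RL)
    evaluate_policy: Dict[str, Callable],      # task name -> evaluation routine
) -> Dict[str, float]:
    scores = {}
    for task in train_policy:                  # e.g. locomotion, navigation, manipulation tasks
        policy = train_policy[task](encoder)   # learn a policy on top of frozen features
        scores[task] = evaluate_policy[task](policy)  # task-specific success metric
    return scores
```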
We set out to pretrain a single general-purpose visual cortex that can perform well on all of these tasks. A critical choice for pretraining is the choice of dataset. It was entirely unclear what a good pretraining dataset for embodied AI would look like. There are massive amounts of video data available online, yet it isn’t practical to try out all combinations of those existing datasets.
We start with Ego4D as our core dataset and then explore whether adding additional datasets improves pretrained models. Having egocentric video is important because it enables robots to learn to see from a first-person perspective. Since Ego4D is heavily focused on everyday activities like cooking, gardening, and crafting, we also consider egocentric video datasets that explore houses and apartments. Finally, we also study whether static image datasets help improve our models.
Cumulatively, our work represents the largest and most comprehensive empirical study to date of visual representations for embodied AI, spanning 5+ pretrained visual representations from prior work, and multiple ablations of VC-1 trained on 4,000+ hours of egocentric human video from seven diverse datasets, which required over 10,000 GPU-hours of training and evaluation.
Today, we are open-sourcing VC-1, our best visual cortex model, following FAIR’s values of open research for the benefit of all. Our results show VC-1 representations match or outperform learning from scratch on all 17 tasks. We also find that adapting VC-1 on task-relevant data results in it becoming competitive with or outperforming best-known results on all tasks in CortexBench. To the best of our knowledge, VC-1 is the first pretrained visual model that has been shown to be competitive with state-of-the-art results on such a diverse set of embodied AI tasks. We are sharing our detailed learnings, such as how scaling model size, dataset size, and diversity impacts the performance of pretrained models, in a related research paper.
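As a rough illustration of what "adapting VC-1 on task-relevant data" could look like in practice, here is a sketch of end-to-end fine-tuning with behavior cloning, assuming PyTorch modules and a dataloader of task frames paired with expert actions. The function and its hyperparameters are illustrative, not the released training recipe.

```python
# Sketch of adapting a pretrained visual cortex on task-relevant data by
# fine-tuning encoder and policy head together with behavior cloning.
# Assumes PyTorch modules and a dataloader of (frames, expert_actions) pairs.
import torch

def adapt_on_task_data(encoder, policy_head, dataloader, epochs=5, lr=1e-5):
    params = list(encoder.parameters()) + list(policy_head.parameters())
    optimizer = torch.optim.AdamW(params, lr=lr)
    loss_fn = torch.nn.MSELoss()                  # continuous actions; use cross-entropy if discrete
    for _ in range(epochs):
        for frames, actions in dataloader:        # task-relevant observations and expert actions
            predicted = policy_head(encoder(frames))
            loss = loss_fn(predicted, actions)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return encoder, policy_head
```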
Adaptive skill coordination for robotic mobile manipulation
While VC-1 demonstrates strong performance on sensorimotor skills in CortexBench, these are short-horizon tasks (navigating, picking up an object, in-hand manipulation of an object, etc.). The next generation of embodied AI agents (deployed on robots) will also need to accomplish long-horizon tasks and adapt to new and changing environments, including unexpected real-world disturbances.
Our second announcement focuses on mobile pick-and-place — a robot is initialized in a new environment and tasked with moving objects from initial to desired locations, emulating the task of tidying a house. The robot must navigate to a receptacle with objects, like the kitchen counter (the approximate location is provided to it), search for and pick an object, navigate to its desired place receptacle, place the object, and repeat.
To tackle such long-horizon tasks, we and our collaborators at Georgia Tech developed a new technique called Adaptive Skill Coordination (ASC), which consists of three components:
- A library of basic sensorimotor skills (navigation, pick, place)
- A skill coordination policy that chooses which skills are appropriate to use at which time
- A corrective policy that adapts pretrained skills when out-of-distribution states are perceived
All sensorimotor policies are “model-free”: we use sensor-to-action neural networks with no task-specific modules such as mapping or planning. The robot is trained entirely in simulation and transferred to the physical world without any real-world training data.
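The control loop below is a hypothetical sketch of how those three components could fit together at inference time: the coordination policy picks the active skill, the chosen skill maps sensors to an action, and the corrective policy steps in when the observation looks out-of-distribution. Function names and the out-of-distribution check are illustrative assumptions, not the released ASC implementation.

```python
# Hypothetical ASC-style control loop: skill library + coordination policy +
# corrective policy, all operating directly on observations (no map or planner).
from typing import Callable, Dict

def asc_step(obs,
             skills: Dict[str, Callable],         # "navigate", "pick", "place" sensor-to-action policies
             choose_skill: Callable,              # coordination policy: obs -> skill name
             corrective_policy: Callable,         # adapts actions in unfamiliar states
             looks_out_of_distribution: Callable):
    skill_name = choose_skill(obs)                # which phase of the task we are in
    action = skills[skill_name](obs)              # base pretrained skill proposes an action
    if looks_out_of_distribution(obs):            # e.g. blocked path, moved target object
        action = corrective_policy(obs, action)   # correct the pretrained skill's output
    return action
```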
We demonstrate the effectiveness of ASC by deploying it on Boston Dynamics’ Spot robot in new, unknown real-world environments. We chose the Spot robot because of its robust sensing, navigation, and manipulation capabilities. However, operating Spot today involves a large amount of human intervention; for example, picking an object requires a person to click on the object on the robot’s tablet. Our aim is to build AI models that perceive the world through onboard sensing and issue motor commands through the Boston Dynamics APIs.
Using the Habitat simulator and the HM3D and ReplicaCAD datasets, which include indoor 3D scans of 1,000 homes, we teach a simulated Spot robot to move around an unseen house, pick up out-of-place objects, and put them in the right location. Next, we deploy this policy zero-shot in the real world (sim2real), without explicitly building a map; instead, the robot relies on its learned notion of what houses look like.
When we put our work to the test, we used two significantly different real-world environments where Spot was asked to rearrange a variety of objects — a fully furnished 185-square-meter apartment and a 65-square-meter university lab. Overall, ASC achieved near-perfect performance, succeeding on 59 of 60 (98 percent) episodes, overcoming hardware instabilities, picking failures, and adversarial disturbances like moving obstacles or blocked paths. In comparison, traditional baselines like task and motion planning succeed in only 73 percent of cases, because of an inability to recover from real-world disturbances. We also study robustness to adversarial perturbations, such as changing the layout of the environment, walking in front of the robot to repeatedly block its path, or moving target objects mid-episode. Despite being trained entirely in simulation, ASC is robust to such disturbances, making it well suited for many long-horizon problems in robotics and reinforcement learning.
This opens avenues for sim2real research to expand to even more challenging real-world tasks, such as assistance in everyday tasks like cooking and cleaning, and even human-robot collaboration. Our work is a step toward scalable, robust, and diverse robot assistants of the future that can operate in new environments out of the box and do not require expensive real-world data collection.
Rethinking sim2real transfer
One of the most important tasks in sim2real learning is to build simulation models that faithfully reflect the robot’s behavior in the real world. However, this is challenging, since the real world is vast and constantly changing, and the simulator needs to capture this diversity. No simulator is a perfect replica of reality, and the main challenge is overcoming the gap between the robot’s performance in simulation and in the real world. The default operating hypothesis of this field is that reducing the sim2real gap involves creating simulators of high physics fidelity and using them to learn robot policies.
Over the past year, we have taken a counterintuitive approach to sim2real. Instead of building high-fidelity simulations of the world, we built an abstracted simulator of Spot that does not model low-level physics, and we learn a policy that reasons at a higher level (where to go rather than how to move the legs). We call this a kinematic simulation: the robot is teleported to a location, and the target object is attached to the robot arm when it is close to the gripper and in view. In the real world, Boston Dynamics controllers are used to carry out the actions commanded by this high-level policy.
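As a rough sketch of what such a kinematic (non-physics) simulation step might look like: the robot base is moved directly to the commanded location, and the target object is simply attached to the arm once it is close enough and in view. The state fields and the grasp threshold are illustrative assumptions, not Habitat's or Boston Dynamics' actual interfaces.

```python
# Rough sketch of a kinematic simulation step: no leg or arm dynamics, just
# teleporting the base and snapping the object to the gripper when reachable.
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass
class KinematicSpotState:
    base_xy: Tuple[float, float]        # planar base position
    holding_object: bool = False

GRASP_RADIUS_M = 0.5                    # assumed "close to the gripper" threshold

def kinematic_step(state: KinematicSpotState,
                   commanded_xy: Tuple[float, float],
                   object_xy: Tuple[float, float],
                   object_in_view: bool) -> KinematicSpotState:
    state.base_xy = commanded_xy                         # teleport: skip locomotion physics
    if (not state.holding_object and object_in_view
            and math.dist(state.base_xy, object_xy) < GRASP_RADIUS_M):
        state.holding_object = True                      # object "attached" to the arm
    return state
```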
Robots pretrained in simulation have mostly been limited to short-horizon tasks and visual navigation, without any interaction with the environment. Mobile pick-and-place is a long-horizon task that requires interacting with the environment and switching between different phases of navigation, picking, placing, and so on. This is typically very challenging for reinforcement learning and requires demonstrations or sophisticated hand-designed rewards. Our high-level abstraction and kinematic simulation let us learn long-horizon tasks with sparse rewards, without having to reason about low-level physics.
Future areas of exploration
While we haven’t yet applied the visual cortex to our object rearrangement robot, we hope to integrate the two into a single system. With so many unpredictable variables in the real world, strong visual representations pretrained on a diverse set of egocentric videos, showing many different activities and environments, will be an important step toward building even better robots.
Voice is one area we are particularly interested in exploring. For example, instead of providing a task definition, natural language processing could be integrated, so someone could use their voice to tell their assistant to take the dishes from the dining room and move them to the kitchen sink. We also want to explore how our robot can better perform around people, such as by anticipating their needs and helping them with a multistep task, like baking a cake.
These are just some of the many areas that call for more research and exploration. We believe that with a strong visual cortex pretrained on egocentric video and visuomotor skills pretrained in simulation, these advancements could one day serve as building blocks for AI-powered experiences where virtual assistants and physical robots can assist humans and interact seamlessly with the virtual and physical world.
Read the paper: Adaptive Skill Coordination (ASC)
Read the paper: Visual Cortex
Get the Visual Cortex code
We would like to acknowledge the contributions of the following people:
Visual Cortex: Arjun Majumdar, Karmesh Yadav, Sergio Arnaud, Yecheng Jason Ma, Claire Chen, Sneha Silwal, Aryan Jain, Vincent-Pierre Berges, Pieter Abbeel, Jitendra Malik, Yixin Lin, Oleksandr Maksymets, and Aravind Rajeswaran
Adaptive Skill Coordination: Naoki Yokoyama, Alexander William Clegg, Eric Undersander, Sergio Arnaud, Jimmy Yang, and Sehoon Ha
And certain people think climate change is the only threat to human existence!