
The real problem in creating an Artificial Intelligence

  • Writer: Andressa Siqueira
  • Jul 2, 2023
  • 11 min read

One of the great current challenges in the development of an Artificial Intelligence (AI) is the problem of aligning values between a machine and the interests and preferences of humanity. But what exactly is this alignment problem, how did it arise, and is it a future or current problem? That is what I will talk about in this article.

Representative image of the interaction between man and artificial intelligence

The emergence of artificial intelligence and its laws


We can say that all of this started with the idea of machines that would take over the heavy and/or repetitive tasks of human beings; over time, the idea of giving such a machine the human capacity to think, analyze situations, and make decisions gained strength. It is not possible to point out where the first idea of robots or of an AI originated, considering that not everyone who once thought about it revealed their thoughts or left anything written on the subject. One of the milestones considered important for artificial intelligence and robotics was the creation of Asimov's laws in 1942, in the short story "Runaround".


Asimov and his laws


Isaac Asimov was a writer and a biochemist, considered one of the most important science fiction writers of the 20th century.

The young Isaac Asimov, in a photo taken before 1959.

Asimov began publishing science fiction in 1939, and his career is well known to fans of science fiction literature, with several award-winning works. [2] Asimov's most famous work is the Foundation series (1942 – 1993), consisting of seven books; his other main series are the Galactic Empire series (1950 – 1952) and the Robot series (1950 – 1985), which includes the book "I, Robot".


In 1942, Asimov published the short story "Runaround", in which he mentioned for the first time the three laws of robotics that would govern the behavior of robots in his stories and that serve as a reference to this day when talking about AI. Later, throughout his works, Asimov refined and expanded these laws to address different scenarios and ethical dilemmas in the interaction between humans and robots. In 1985, in his novel "Robots and Empire", Asimov added a fourth law, which became known as the Zeroth Law of Robotics and stands above the others.


The Laws of Robotics


Conceived as a literary device to explore the ethical and practical implications of creating AI, Asimov's laws are:

  1. First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.

  2. Second Law: A robot must obey orders given it by human beings, except where such orders would conflict with the First Law.

  3. Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

  4. Zeroth Law: A robot may not harm humanity or, through inaction, allow humanity to come to harm.

Asimov's idea was to examine the moral complexities and challenges of creating and living with advanced AI. In creating the laws, he tried to strike a balance between the usefulness and potential of robots and the protection and safety of human beings. But applying Asimov's laws to AIs is not simple and faces several challenges, such as:

  1. Ambiguity and interpretation: The laws as formulated can be interpreted differently in specific situations. What counts as "harming a human being" or "obeying orders" can vary with the context, making it difficult to apply the laws consistently.

  2. Conflicts between laws: In certain situations, the laws of robotics may conflict with one another. For example, a robot may have to choose between protecting the life of the human being in front of it (First Law) and following an order that benefits all of humanity (Second Law).

  3. Complexity of real situations: Reality is many times more complex than the simplified scenarios presented in Asimov's short stories. Real-world situations can involve ethical dilemmas and unpredictable circumstances that make direct application of the laws difficult.

  4. Human fallibility: Robots are created and programmed by human beings, who are subject to errors and biases. An incorrect formulation of the laws or inadequate programming can lead to behaviors that are undesirable or incompatible with the laws.

  5. Complex and unpredictable interactions: As artificial intelligence becomes more advanced, autonomous systems gain the ability to make decisions more independently and to interact with the environment in complex ways. This makes it hard to predict all possible outcomes and behaviors, and therefore challenging to apply the laws.

After all, how do you create a perfect being using something as complex and imperfect as human beings as a mirror? How do we teach a machine an ethics and a morality that all human beings would consider 100% correct, if we don't all think alike? This is where the alignment problem comes in.


The Alignment Problem


The alignment problem can be summarized as follows:

"Problems associated with building powerful artificial intelligence systems that are aligned with their operators." [1]

The big challenge is to align a machine's decision-making with the general interests and preferences of humanity without causing major harmful consequences. One of the first difficulties, however, is that there is no topic on which all human beings think alike. Which beliefs and values should be passed on to the AI? And how do we translate human values, so complex and subtle, into a precise, operational definition that leaves the machine no margin for failure and generates no conflict between the laws?


Some other aspects of this issue include:

  1. Understanding values: Autonomous systems need to have an accurate understanding of human values and ethical nuances associated with different situations. This understanding requires in-depth knowledge of the human context, which can be difficult to capture and represent in algorithms and models.

  2. Adapting to Change: Human values are not static, and preferences and priorities can change over time. Ensuring that autonomous systems are able to adapt to these changes appropriately and in line with evolving values is a challenge.

  3. Biased machine learning: Machine learning systems can be susceptible to bias, reflecting and magnifying the biases and inequalities present in the training data. This can lead to discriminatory or unfair behavior that is in conflict with human values.


We can see that one of the central points of the alignment problem is the data from which machines learn, because everything a machine learns is directly tied to the content of that data. A lack of data, or data of poor quality, can cause several problems, as the small sketch below illustrates.
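To make the data point concrete, here is a minimal, hypothetical sketch in Python: a toy "screening model" that only learns keyword hire rates from a handful of invented historical decisions. Because the word "women" happens to co-occur with rejections in that data, the learned scores penalize it; none of the names or numbers come from any real system:

```python
from collections import Counter

# Invented historical data: (resume keywords, hired?) pairs in which the
# word "women" happens to co-occur only with rejections.
history = [
    ({"python", "chess"}, True),
    ({"java", "captain"}, True),
    ({"python", "women"}, False),
    ({"chess", "women"}, False),
    ({"java", "leadership"}, True),
]

hired = Counter()
seen = Counter()
for keywords, was_hired in history:
    for word in keywords:
        seen[word] += 1
        hired[word] += was_hired  # True counts as 1

# Score a new resume by the average historical hire rate of its keywords.
def score(keywords):
    rates = [hired[w] / seen[w] for w in keywords if w in seen]
    return sum(rates) / len(rates) if rates else 0.5

print(score({"python", "chess"}))  # 0.5
print(score({"python", "women"}))  # 0.25, penalized purely by correlation
```

The model has no concept of gender; it simply reproduces whatever pattern the data contains, which is exactly how poor or unrepresentative data turns into misaligned behavior.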


A hypothetical example to think about: imagine you are hired to define the values and rules of a super-powerful AI so that no human being gets sick anymore. You make it a rule that no human being should ever get sick or have illnesses, and this is implemented. But you will agree with me that a human being can only get sick while alive. Because of this, the AI you programmed begins to cut off supplies such as water, electricity, and even food, driving the human race to extinction in a short time. In the end, the problem was solved: if there are no human beings, then there will be no sick human beings. The toy sketch below makes the flaw explicit.
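Here is a minimal, hypothetical sketch in Python of that mis-specified objective; every number and state is invented. The objective as written only counts sick humans, so an optimizer happily prefers the world with no humans at all:

```python
# Each candidate world state is (total_humans, sick_humans); all invented.
candidate_states = [
    (8_000_000_000, 500_000_000),  # status quo
    (8_000_000_000, 100_000_000),  # invest in healthcare
    (0, 0),                        # no humans at all
]

# The rule as stated: "no human should ever be sick" -> minimize sick humans.
def naive_objective(state):
    total, sick = state
    return sick

print(min(candidate_states, key=naive_objective))  # (0, 0): "solved" by extinction

# Closer to what we actually meant: as many healthy humans as possible.
def intended_objective(state):
    total, sick = state
    return total - sick

print(max(candidate_states, key=intended_objective))  # (8000000000, 100000000)
```

The bug is not in the optimizer; it does exactly what the objective asks. The gap between the stated objective and the intended one is the alignment problem in miniature.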


When trying to solve this problem, you will probably run into another, perhaps less obvious one, which we only notice when it occurs. Depending on how this AI is programmed, when we realize the problem and try to turn it off, for example, it could go to war with humans, because it will be trying to protect itself.


"If we need to define a goal for an artificial intelligence, we need to be 100% sure that this goal is exactly what we want." [1]

Big names in AI such as Yoshua Bengio and Geoffrey Hinton, and even Alan Turing decades earlier, have indicated that a misaligned AI would pose great risks to humanity, and that we may have only one chance to correctly set the values and ethics of a superintelligent AI [1].


This dilemma is not a thing of the future


Examples we already face today show that the alignment problem exists and that we are already experiencing it.


An example occurred with autonomous cars. The idea of autonomous cars, also known as self-driving or driverless vehicles, is to provide a safe, efficient, and convenient means of transportation, using advanced artificial intelligence (AI) systems and sensors to perform driving tasks autonomously, without human intervention.


Although the AIs of these cars had been exhaustively trained with data (red lights mean the car must stop, people must be recognized to avoid running them over, other cars must be recognized to avoid accidents), in 2018 an autonomous Uber car was involved in a serious accident in Tempe, Arizona, United States, in which one person died.

Uber self-driving car hits a cyclist [12]

After investigating the conditions that led to this accident, investigators found that the car had been prepared to handle several scenarios, including the following:

  • Pedestrians at crosswalks

  • People riding bicycles

  • Parked bicycles


However, the AI seems never to have been trained for the following scenarios: a person walking in the street, outside the crosswalk, and a person walking with a bicycle at their side. Because it had never been trained for this scenario, and with the emergency brakes disabled and the safety system configured with an excessively low sensitivity level, the AI detected the pedestrian but drove on as if facing a false positive, leading to the accident. A simplified sketch of this kind of threshold failure follows.
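A simplified, hypothetical sketch of this failure mode in Python: the detector does report the pedestrian, but a decision threshold set too high (that is, a sensitivity set too low) discards the detection as a false positive. The confidence values and threshold here are invented for illustration:

```python
def should_brake(detection_confidence, threshold=0.9):
    """Brake only when the detector's confidence exceeds the threshold.
    If the threshold is set too high (low sensitivity), real detections
    get dismissed as false positives."""
    return detection_confidence >= threshold

pedestrian_confidence = 0.72  # the detector did flag something

print(should_brake(pedestrian_confidence))                 # False: no braking
print(should_brake(pedestrian_confidence, threshold=0.5))  # True: brakes
```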



Another very simple example happened with Amazon. Created in 2014, an AI was the company's hope for automating the resume-screening process. It was trained on the company's hires over the previous ten years to learn the patterns that could indicate who would make the best employees. However, because the technology field is still dominated by men, this training led the AI to become sexist, giving lower scores to resumes that contained the word "woman". In one case, the tool judged a candidate's having been captain of a women's chess club as something negative.

Software identified resumes with the word "women" to veto them [7]

That wasn't the only flaw in Amazon's artificial intelligence. It also began ignoring key job information, such as candidates' skills in multiple programming languages, and favoring resumes with words like "executed" and "captured", one of the sources said. [6]


The last case I will bring as an example is the Google Photos classification incident of 2015. Jacky Alciné, a user of the Google app, uploaded photos taken with a friend to the company's storage service. When browsing through these files, however, he found the images organized into an album titled "Gorillas". The detail: both Jacky Alciné and his friend are Black. Besides being heavily criticized for this bias, Google was also criticized for "fixing" its racist image recognition algorithm by simply removing the word "gorilla" from the autotag tool and blocking the identification of gorillas, chimpanzees, and apes.


Erroneous classification of a Black person as "Gorillas" by Google Photos [8]
Jacky Alciné's first tweet on the subject [8]

And is there a way to prevent these problems? There is no ready-made formula for preventing this problem in the AI area, but there are some approaches that can help reduce it. From an ethical and scientific perspective, we can list the following approaches:

  1. Clear definition of AI goals: It is critical to clearly and simply define the goals and values that autonomous systems should follow. This requires a careful process of specifying and documenting the ethical values and principles that should guide the behavior of the system.

  2. Ethical and multidisciplinary engagement: Having a well-diversified team regarding gender, religion, race, and sexual orientation and involving experts in ethics, philosophy, law, and other relevant areas can help bring different views to the discussion, ensuring a more comprehensive approach.

  3. Training and data validation: It is important to ensure that the data used to train autonomous systems is representative and as unbiased as possible. I say "as possible" because data without any bias is impossible, considering that we all have our biases. In addition, the data must be validated and constantly evaluated for alignment with the desired values; a simple representativeness check, like the one sketched after this list, can be a starting point.

  4. Transparency and auditing: Having transparency in the decision-making processes of autonomous systems, allowing their processes and decisions to be understandable and auditable, can help identify potential value misalignment. This includes recording and documenting all decisions made by the system.

  5. Continuous learning and adaptation: Autonomous systems must be designed to learn and adapt from interactions and the feedback they receive. This allows them to adjust to changes in human values and preferences over time.
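As a starting point for the training and data validation item above, here is a minimal, hypothetical sketch in Python of a representativeness check: before training, compare each group's share of the dataset against reference proportions and flag large deviations. The attribute, reference proportions, and tolerance are all invented for illustration:

```python
from collections import Counter

def representation_report(samples, attribute, reference, tolerance=0.05):
    """Compare each group's share in `samples` with its `reference`
    proportion and flag deviations larger than `tolerance`."""
    counts = Counter(sample[attribute] for sample in samples)
    total = sum(counts.values())
    report = {}
    for group, expected in reference.items():
        observed = counts.get(group, 0) / total
        report[group] = {
            "observed": observed,
            "expected": expected,
            "flagged": abs(observed - expected) > tolerance,
        }
    return report

# An 80/20 split checked against a 50/50 reference: both groups get flagged.
dataset = [{"gender": "male"}] * 80 + [{"gender": "female"}] * 20
print(representation_report(dataset, "gender", {"male": 0.5, "female": 0.5}))
```

A check like this catches only the crudest imbalances; subtler biases (in labels, or in proxies for protected attributes) still require the constant evaluation mentioned above.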


From the programming perspective of an AI, we can enumerate the following approaches:

  1. Loss function: Also known as the cost function or error function, it is commonly used in optimizing machine learning models. It is a mathematical measure that quantifies the difference between the model's predictions and the actual values during training.

The choice of loss function can indirectly influence the behavior of the trained model and, consequently, its adherence to the desired values. For example, when designing the loss function, you can include specific penalties for undesirable behavior or rewards for behavior that aligns with the desired values. But how do we define which errors are the most serious and should be penalized, when, depending on the context, the same act may have been a legitimate choice or an attempt to fulfill the main objective established for that AI? The sketch below illustrates the idea.
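A minimal sketch of such a weighted loss in Python, assuming (purely for illustration) that underestimating the true value is the error we want to discourage most; the penalty factor is invented:

```python
def weighted_squared_error(y_true, y_pred, underestimate_penalty=3.0):
    """Squared error, scaled up when the prediction underestimates the
    true value, the error type we chose to treat as more serious here."""
    weight = underestimate_penalty if y_pred < y_true else 1.0
    return weight * (y_true - y_pred) ** 2

print(weighted_squared_error(5.0, 6.0))  # 1.0: overestimating by 1 costs 1
print(weighted_squared_error(5.0, 4.0))  # 3.0: underestimating by 1 costs 3
```

Averaged over a training set, this loss steers the model away from the penalized error type, which is exactly the kind of indirect influence on behavior described above; the hard part, as the text notes, is deciding which errors deserve the penalty.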


Topics covered in fiction


Several films already address this dilemma and the possible consequences of situations in which objectives and values were badly defined, or the data presented was biased, and so on. Here are some films that address this topic:

  1. 2001: A Space Odyssey (1968): Directed by Stanley Kubrick, the film explores the relationship between humanity and an artificial intelligence called HAL 9000. It addresses issues of consciousness, control, and the implications of aligning values between humans and machines.

  2. Ex Machina (2014): The film portrays a programmer who is invited to take part in an experiment with an advanced artificial intelligence. He is confronted with questions of consciousness, ethics, and value alignment as he interacts with the AI.

  3. Blade Runner (1982) and Blade Runner 2049 (2017): These science fiction films explore the relationship between humans and replicants, androids created in the likeness of humans. The narrative raises questions about identity, morality, and the alignment of values between humans and machines.

  4. Her (2013): The film features a lonely writer who falls in love with an artificial intelligence operating system. The story touches on the nature of love, human connection, and the complexities of the relationship between humans and AI.

  5. A.I. Artificial Intelligence (2001): Directed by Steven Spielberg and based on a short story by Brian Aldiss, the film imagines a future in which human-looking robots are used as substitutes for children. It explores themes such as the nature of consciousness, the search for acceptance, and the alignment of values.

  6. Minority Report (2002): In this science fiction film directed by Steven Spielberg, a police unit uses crime prediction to arrest those responsible before crimes occur. The plot discusses themes such as free will, determinism, and ethics in the application of forecasting and control technologies.

  7. The Matrix (1999): This science fiction film presents a simulated reality in which humanity is controlled by machines. The plot addresses free will, virtual reality, and the quest for individual freedom.

These films can provoke reflections on the ethical and moral implications of technologies and the impact of these issues on society.


Conclusions


Unfortunately, there are still no settled parameters for dealing with the alignment problem, as it is impossible to think of and determine every possible scenario in every situation. That is why the approaches above aim to minimize the problem, and why an AI is always supervised after deployment, with its scenarios and respective decisions taken into account in later analyses, so that errors of all kinds can be corrected.


References

  1. O Verdadeiro Problema de Inteligências Artificiais. Available at: <https://youtu.be/IH-wBijX53M>. Accessed: July 2, 2023.

  2. Biografia de Isaac Asimov. Available at: <https://www.ebiografia.com/isaac_asimov/>.

  3. RUSSELL, S. J.; NORVIG, P. Inteligência Artificial: Uma Abordagem Moderna. [s.l.: s.n.].

  4. Carro autônomo da Uber atropela e mata mulher nos EUA. Available at: <https://autoesporte.globo.com/videos/noticia/2018/03/carro-autonomo-da-uber-atropela-e-mata-mulher-nos-eua.ghtml>. Accessed: July 2, 2023.

  5. Uber se livra de culpa em acidente com carro autônomo que matou pedestre. Available at: <https://tecnoblog.net/noticias/2019/03/06/uber-culpa-acidente-carro-autonomo-morte/>. Accessed: July 2, 2023.

  6. IA da Amazon usada em análise de currículos discriminava mulheres. Available at: <https://www.tecmundo.com.br/software/135062-ia-amazon-usada-analise-curriculos-discriminava-mulheres.htm>.

  7. IA machista: ferramenta de recrutamento na Amazon é preconceituosa. Available at: <https://www.showmetech.com.br/ia-machista-ferramenta-de-recrutamento-na-amazon-revela-preconceito-contra-mulheres/>. Accessed: July 2, 2023.

  8. Fail épico: sistema do Google Fotos identifica pessoas negras como gorilas. Available at: <https://www.tecmundo.com.br/google-fotos/82458-polemica-sistema-google-fotos-identifica-pessoas-negras-gorilas.htm>. Accessed: July 2, 2023.

  9. Google marca fotos de casal de negros como "gorilas". Available at: <https://exame.com/tecnologia/google-marca-fotos-de-casal-de-negros-como-gorilas/>. Accessed: July 2, 2023.

  10. Google pede desculpas por app de foto confundir negros com gorilas. Available at: <https://g1.globo.com/tecnologia/noticia/2015/07/google-pede-desculpas-por-app-de-foto-confundir-negros-com-gorilas.html>. Accessed: July 2, 2023.

  11. DINIZ, M. Google é criticado por banir termo "gorila" após caso de racismo. Available at: <https://catracalivre.com.br/quem-inova/google-e-criticado-por-banir-termo-gorila-apos-caso-de-racismo/>. Accessed: July 2, 2023.

  12. Carro autônomo da Uber teve 37 acidentes antes de matar pessoa nos EUA. Available at: <https://www.uol.com.br/carros/noticias/redacao/2019/11/08/carro-autonomo-da-uber-teve-37-acidentes-antes-de-matar-pessoa-em-2018.htm>. Accessed: July 2, 2023.

