Special Issue "The Perils of Artificial Intelligence"

A special issue of Philosophies (ISSN 2409-9287).

Deadline for manuscript submissions: 1 December 2021.

Special Issue Information

Dear Colleagues,

The history of robotics and artificial intelligence (AI) is, in many ways, also the history of humanity’s attempts to control such technologies. From the Golem of Prague to the military robots of modernity, the debate continues as to what degree of independence such entities should have and how to make sure that they do not turn on us, their inventors. Numerous recent advancements in the research, development, and deployment of intelligent systems are well publicized, but the safety and security issues related to AI are insufficiently addressed. This Special Issue is intended to mitigate this fundamental problem. It will comprise papers from leading AI safety researchers addressing different aspects of the AI control problem as it relates to the development of safe and secure artificial intelligence, along with the philosophical issues surrounding this topic. We solicit papers ranging from near-term issues such as bias, transparency, and fairness to long-term concerns such as the value alignment, controllability, predictability, and explainability of superintelligent systems. We also welcome manuscripts presenting non-traditional and contrarian views of AI safety and security.

We invite the submission of contributions on any aspect of AI safety and security, including:

  • adversarial machine learning (ML)
  • AI failures
  • bad actors/malevolence
  • boxing/confinement
  • controllability
  • drives/goals
  • ethics/law
  • explainability
  • friendliness theory
  • openness/secrecy
  • off-switch
  • predictability
  • reward engineering
  • self-improvement
  • security
  • singularity/superintelligence
  • validation/verification
  • value alignment
  • weaponized/military
  • wireheading

Dr. Roman V. Yampolskiy
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Philosophies is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • AI safety
  • AI security
  • controllability
  • value alignment
  • friendly AI
  • AI ethics

Published Papers (6 papers)


Research

Article
Provably Safe Artificial General Intelligence via Interactive Proofs
Philosophies 2021, 6(4), 83; https://doi.org/10.3390/philosophies6040083 - 07 Oct 2021
Abstract
Methods are currently lacking to prove artificial general intelligence (AGI) safety. An AGI ‘hard takeoff’ is possible, in which a first-generation AGI₁ rapidly triggers a succession of more powerful AGIₙ that differ dramatically in their computational capabilities (AGIₙ ≪ AGIₙ₊₁). No proof exists that AGI will benefit humans, nor of a sound value-alignment method. Numerous paths toward human extinction or subjugation have been identified. We suggest that probabilistic proof methods are the fundamental paradigm for proving safety and value alignment between disparately powerful autonomous agents. Interactive proof systems (IPS) describe mathematical communication protocols wherein a Verifier queries a computationally more powerful Prover and reduces the probability of the Prover deceiving the Verifier to any specified low probability (e.g., 2⁻¹⁰⁰). IPS procedures can test AGI behavior-control systems that incorporate hard-coded ethics or value-learning methods. Mapping the axioms and transformation rules of a behavior-control system to a finite set of prime numbers allows validation of ‘safe’ behavior via IPS number-theoretic methods. Many other representations are needed for proving various AGI properties. Multi-prover IPS, program-checking IPS, and probabilistically checkable proofs further extend the paradigm. In toto, IPS provides a way to reduce AGIₙ–AGIₙ₊₁ interaction hazards to an acceptably low level.
(This article belongs to the Special Issue The Perils of Artificial Intelligence)
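The exponential deception bound quoted in the abstract follows from a standard property of interactive proofs: independent verification rounds compound. A minimal sketch, assuming each round catches a cheating Prover with some fixed probability (the function name and parameters are illustrative, not from the paper):

```python
# Illustrative sketch: soundness amplification in an interactive proof system.
# If each independent verification round catches a cheating Prover with
# probability catch_prob, then the probability that the Prover escapes
# detection in every one of n rounds is (1 - catch_prob) ** n.
from fractions import Fraction

def deception_bound(n_rounds: int, catch_prob: Fraction = Fraction(1, 2)) -> Fraction:
    """Upper bound on the probability that a cheating Prover survives
    all n_rounds independent verification rounds undetected."""
    return (1 - catch_prob) ** n_rounds

# The 2^-100 bound cited in the abstract is reached after 100 rounds
# when each round catches a cheating Prover with probability 1/2:
print(deception_bound(100) == Fraction(1, 2 ** 100))  # prints True
```

Exact rational arithmetic (`Fraction`) is used so that very small probabilities such as 2⁻¹⁰⁰ are represented without floating-point underflow.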

Article
Understanding and Avoiding AI Failures: A Practical Guide
Philosophies 2021, 6(3), 53; https://doi.org/10.3390/philosophies6030053 - 28 Jun 2021
Abstract
As AI technologies increase in capability and ubiquity, AI accidents are becoming more common. Drawing on normal accident theory, high reliability theory, and open systems theory, we create a framework for understanding the risks associated with AI applications. This framework is designed to direct attention to pertinent system properties without requiring unwieldy amounts of accuracy. We also use AI safety principles to quantify the unique risks of increased intelligence and human-like qualities in AI. Together, these two fields give a more complete picture of the risks of contemporary AI. By focusing on system properties near accidents instead of seeking a root cause of accidents, we identify where attention should be paid to safety for current-generation AI systems.

Article
AI Ethics and Value Alignment for Nonhuman Animals
Philosophies 2021, 6(2), 31; https://doi.org/10.3390/philosophies6020031 - 13 Apr 2021
Abstract
This article concerns a specific but so far neglected peril of AI: that AI systems may pose existential risks, as well as suffering risks, for nonhuman animals. The AI value-alignment problem has now been acknowledged as both critical for AI safety and very hard. However, attempts so far have aimed only at aligning the values of AI systems with human values. It is argued here that alignment ought to be extended to the values of nonhuman animals, since it would be speciesism not to do so. The article focuses on the two subproblems of value extraction and value aggregation, discusses challenges for the integration of the values of nonhuman animals, and explores approaches to how AI systems could address them.
Article
Transdisciplinary AI Observatory—Retrospective Analyses and Future-Oriented Contradistinctions
Philosophies 2021, 6(1), 6; https://doi.org/10.3390/philosophies6010006 - 15 Jan 2021
Abstract
In recent years, artificial intelligence (AI) safety has gained international recognition in light of heterogeneous safety-critical and ethical issues that risk overshadowing the broad beneficial impacts of AI. In this context, the implementation of AI observatory endeavors represents one key research direction. This paper motivates the need for an inherently transdisciplinary AI observatory approach integrating diverse retrospective and counterfactual views. We delineate aims and limitations while providing hands-on advice drawing on concrete practical examples. Distinguishing between unintentionally and intentionally triggered AI risks with diverse socio-psycho-technological impacts, we exemplify a retrospective descriptive analysis followed by a retrospective counterfactual risk analysis. Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety. As a further contribution, we discuss differentiated and tailored long-term directions through the lens of two disparate modern AI safety paradigms, which for simplicity we call artificial stupidity (AS) and eternal creativity (EC). While both AS and EC acknowledge the need for a hybrid cognitive-affective approach to AI safety and overlap with regard to many short-term considerations, they differ fundamentally in the nature of multiple envisaged long-term solution patterns. By compiling the relevant underlying contradistinctions, we aim to provide future-oriented incentives for constructive dialectics in practical and theoretical AI safety research.

Article
Facing Immersive “Post-Truth” in AIVR?
Philosophies 2020, 5(4), 45; https://doi.org/10.3390/philosophies5040045 - 15 Dec 2020
Abstract
In recent years, prevalent global societal issues related to fake news, fakery, misinformation, and disinformation have been brought to the fore, leading to the construction of descriptive labels such as “post-truth” for the supposedly new emerging era. The (mis)use of technologies such as AI and VR has been argued to potentially fuel this new loss of “ground truth”, for instance via the ethically relevant deepfake phenomenon and the creation of realistic fake worlds that presumably undermine experiential veracity. Indeed, unethical and malicious actors could harness tools at the intersection of AI and VR (AIVR) to craft what we call immersive falsehood: fake immersive-reality landscapes deliberately constructed for malicious ends. This short paper analyzes the ethically relevant nature of the background against which such malicious designs in AIVR could exacerbate the intentional proliferation of deceptions and falsities. We offer a reappraisal expounding that while immersive falsehood could manipulate and severely jeopardize the inherently affective constructions of social reality and considerably complicate falsification processes, humans may inhabit neither a post-truth nor a post-falsification age. Finally, we provide incentives for future AIVR safety work, ideally contributing to a future era of technology-augmented critical thinking.
Article
An AGI Modifying Its Utility Function in Violation of the Strong Orthogonality Thesis
Philosophies 2020, 5(4), 40; https://doi.org/10.3390/philosophies5040040 - 01 Dec 2020
Abstract
An artificial general intelligence (AGI) might have an instrumental drive to modify its utility function to improve its ability to cooperate, bargain, promise, threaten, and resist and engage in blackmail. Such an AGI would necessarily have a utility function that was at least partially observable and that was influenced by how other agents chose to interact with it. This instrumental drive would conflict with the strong orthogonality thesis, since the modifications would be influenced by the AGI’s intelligence. AGIs in highly competitive environments might converge to having nearly the same utility function, one optimized to favorably influence other agents through game theory. Nothing in our analysis weakens arguments concerning the risks of AGI.