Unpredictability of AI: On the Impossibility of Accurately Predicting All Actions of a Smarter Agent
Abstract
The young field of AI Safety is still in the process of identifying its challenges and limitations. In this paper, we formally describe one such impossibility result, namely the Unpredictability of AI. We prove that it is impossible to precisely and consistently predict the specific actions a smarter-than-human intelligent system will take to achieve its objectives, even if the system's terminal goals are known. Finally, we discuss the implications of Unpredictability for AI Safety.