Robots and other AIs are becoming increasingly numerous, and they are increasingly acting as members of our society. They drive cars autonomously on our roads, help care for children and the elderly, and run complex distributed systems in the infrastructures of our world. These tasks sometimes present difficult and time-critical choices. How should robots and AIs make morally and ethically significant choices?
The standard notion of rationality in artificial intelligence, derived from game theory, says that a rational agent should choose the action that maximizes its expected utility. In principle, "utility" can be very sophisticated, but in practice, it typically means the agent's own reward. Unfortunately, scenarios like the Tragedy of the Commons and the Prisoner's Dilemma show that self-interested reward-maximization can easily lead to very poor outcomes both for the individual and for society.
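To make the problem concrete, the following minimal sketch (an illustration added here, not taken from the text) encodes the standard two-player Prisoner's Dilemma and shows that two purely self-interested expected-utility maximizers each defect, even though mutual cooperation would give both a higher payoff. The payoff values (3, 0, 5, 1) are the conventional textbook numbers and are only assumptions for illustration.

```python
# Minimal sketch of the Prisoner's Dilemma with self-interested best responses.
# Payoff values are conventional illustrative numbers, not from the paper.

# PAYOFFS[(my_action, other_action)] = (my_payoff, other_payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

ACTIONS = ("cooperate", "defect")

def best_response(other_action: str) -> str:
    """A purely self-interested agent picks the action that maximizes
    its own payoff, given a fixed action by the other player."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, other_action)][0])

# Whatever the other player does, defection is the best response...
assert all(best_response(other) == "defect" for other in ACTIONS)

# ...so two reward-maximizers end at mutual defection (payoff 1 each),
# strictly worse for both than mutual cooperation (payoff 3 each).
print(PAYOFFS[("defect", "defect")], "<", PAYOFFS[("cooperate", "cooperate")])
```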
As a step toward resolving this problem, we ask what pragmatic benefits acting morally and ethically offers, both to individuals and to society as a whole. Recent results in the cognitive sciences shed light on how humans make moral and ethical decisions. Following ethical and moral constraints often allows both the individual and society as a whole to reap greater benefits than are available to self-interested reward-maximizers.
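A small, hedged sketch of this claim: in a repeated Prisoner's Dilemma, agents that follow a simple reciprocal constraint (tit-for-tat, used here only as a crude stand-in for "following a moral constraint", not as the paper's own model) accumulate more reward when paired with their own kind than unconstrained defectors do. The payoff numbers are again conventional examples.

```python
# Hedged illustration: in a repeated Prisoner's Dilemma, constraint-following
# agents (tit-for-tat) outperform pure self-interested defectors when paired
# with their own kind. Payoff numbers are conventional, not from the paper.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Unconstrained reward-maximizer in the one-shot game."""
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    """Return cumulative payoffs when the two strategies play repeatedly."""
    total_a = total_b = 0
    moves_a, moves_b = [], []  # each strategy sees the *other's* past moves
    for _ in range(rounds):
        a, b = strategy_a(moves_b), strategy_b(moves_a)
        pa, pb = PAYOFF[(a, b)]
        total_a, total_b = total_a + pa, total_b + pb
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat))      # (300, 300): sustained cooperation
print(play(always_defect, always_defect))  # (100, 100): mutual defection
```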
Based on the human model, we can begin to define a decision architecture by which robots and AIs can judge the moral and ethical properties of proposed or observed actions, and can explain those judgments and understand such explanations, leading to feedback cycles at several different time-scales.
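As one purely hypothetical way to picture such a decision architecture, the sketch below defines an interface with three capabilities named in the text: judging an action, explaining the judgment, and incorporating feedback from explanations over time. All class, method, and norm names here (MoralAgent, judge, explain, incorporate_feedback, "harm_others") are illustrative inventions under assumed semantics, not the architecture actually proposed.

```python
# Hypothetical sketch only: one way a judge/explain/feedback decision
# architecture *might* be organized. Names and norms are illustrative
# assumptions, not the paper's proposal.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Judgment:
    action: str
    acceptable: bool
    reasons: List[str] = field(default_factory=list)

class MoralAgent:
    def __init__(self):
        # Norms map effect names to whether that effect is permitted;
        # they are revised on a slower time-scale via feedback.
        self.norms = {"harm_others": False}

    def judge(self, action: str, predicted_effects: List[str]) -> Judgment:
        """Judge a proposed or observed action against the current norms."""
        violations = [e for e in predicted_effects
                      if e in self.norms and not self.norms[e]]
        return Judgment(action, acceptable=not violations,
                        reasons=[f"violates norm: {v}" for v in violations])

    def explain(self, judgment: Judgment) -> str:
        """Produce an explanation that another agent could understand."""
        verdict = "acceptable" if judgment.acceptable else "unacceptable"
        reasons = "; ".join(judgment.reasons) if judgment.reasons else "no norm violated"
        return f"{judgment.action} is {verdict}: {reasons}"

    def incorporate_feedback(self, norm: str, permitted: bool) -> None:
        """Slower feedback loop: revise a norm in light of explanations
        received from other agents."""
        self.norms[norm] = permitted

# Usage: judge an action, then explain the judgment.
agent = MoralAgent()
j = agent.judge("swerve_into_crowd", predicted_effects=["harm_others"])
print(agent.explain(j))  # swerve_into_crowd is unacceptable: violates norm: harm_others
```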