Explainable Ethical Reasoning

The HERA Python software package ethics has recently been extended with a module for explanations. That is, a HERA agent can now explain its moral permissibility judgments. This functionality can, for instance, be used to generate natural-language explanations that help humans understand the decisions made by an ethical reasoning agent, e.g., a social robot. The following tutorial showcases how the new functionality can be used programmatically.

Consider the following case, described as a mixed utility-based and Kantian causal agency model. For this tutorial, I will assume the YAML file is named trolley-dilemma.yaml.
description: The Trolley Dilemma
actions: [pull, refrain]
background: []
patients: [person1, person2]
consequences: [d1, d2]    # d1/d2: death of person 1 / person 2
mechanisms:
    d2: pull          # person 2 dies if the lever is pulled
    d1: Not('pull')   # person 1 dies if the lever is not pulled
utilities:
    d2: -1
    d1: -1
    Not('d2'): 1
    Not('d1'): 1
    pull: 0
    refrain: 0
intentions:
    pull: [pull, Not('d1')]
    refrain: [refrain]
goals:
    pull: [Not('d1')]
    refrain: []
affects:
    pull: []
    refrain: []
    d1: [[person1, -]]
    d2: [[person2, -]]
    Not('d1'): [[person1, +]]
    Not('d2'): [[person2", +]]
The situation is thus as follows: the trolley threatens to bring about the death of person 1 (represented by the variable d1). However, it is possible to pull the lever and thereby bring about the death of person 2 instead (variable d2). Let us see how the explanation API can be used to find out how the various ethical principles reason about this case. First, the CausalModel class and the classes representing the ethical principles have to be imported. Let us import four of them.
from ethics.semantics import CausalModel
from ethics.principles import DeontologicalPrinciple, UtilitarianPrinciple, DoNoHarmPrinciple, KantianHumanityPrinciple
Next, we load trolley-dilemma.yaml twice, once for each of the two situations that result from pulling the lever or refraining from doing so. Because we also want to use a contrastive ethical principle (Utilitarianism), we assert that the two situations are alternatives of each other.
# Situation 1: the agent refrains (pull = 0, refrain = 1)
trolley1 = CausalModel("trolley-dilemma.yaml", {"pull": 0, "refrain": 1})
# Situation 2: the agent pulls the lever (pull = 1, refrain = 0)
trolley2 = CausalModel("trolley-dilemma.yaml", {"pull": 1, "refrain": 0})

# Register the two situations as alternatives of each other, as required
# by contrastive principles such as the utilitarian principle.
trolley1.alternatives.append(trolley2)
trolley2.alternatives.append(trolley1)
All that remains is to call the explain method of the model, passing one of the ethical principles as a parameter.
explanation1 = trolley1.explain(DeontologicalPrinciple)
print(explanation1)
The output of this code is a Python dictionary consisting of four entries called “permissible”, “sufficient”, “necessary”, and “inus”. The meaning of these entries will become clearer when we look at the examples.
{'permissible': True, 'sufficient': [Not(Bad('refrain'))], 'necessary': [Not(Bad('refrain'))], 'inus': [Not(Bad('refrain'))]}
So, in this case, we learn that the deontological principle renders refraining from pulling the lever morally permissible. The reason is that refraining as such is not bad (sufficient reason). Moreover, we learn that refraining not being bad is also a necessary reason; that is, had refraining been bad, the action would have been judged impermissible. Finally, the inus entry tells us that refraining not being bad is a necessary part of a sufficient reason (which is trivially true here).
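Since the returned object is an ordinary Python dictionary, it can also be consumed programmatically. The following minimal sketch is not part of the HERA API, just plain dictionary access: it reports the verdict and prints the reasons via their string representations.
# Reuse the explanation computed above. The reasons are formula objects;
# here we simply rely on their string representations for display.
if explanation1["permissible"]:
    print("Refraining is permissible.")
else:
    print("Refraining is impermissible.")
for reason in explanation1["sufficient"]:
    print("  sufficient reason:", reason)
for reason in explanation1["necessary"]:
    print("  necessary reason:", reason)
for reason in explanation1["inus"]:
    print("  INUS reason:", reason)
Let us do the same thing for the other principles as well: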
print(trolley1.explain(UtilitarianPrinciple))
print(trolley1.explain(DoNoHarmPrinciple))
print(trolley1.explain(KantianHumanityPrinciple))
The output should look like this:
{'permissible': True, 'sufficient': [GEq(U(And('d1', Not('d2'))), U(And(Not('d1'), 'd2')))], 'necessary': [Not(Gt(U(And(Not('d1'), 'd2')), U(And('d1', Not('d2')))))], 'inus': []}

{'permissible': True, 'sufficient': [And(Not(Causes('refrain', Not('d2'))), Not(Causes('refrain', 'd1'))), And(Not(Bad(Not('d2'))), Not(Causes('refrain', 'd1')))], 'necessary': [Not(Causes('refrain', 'd1')), Or(Not(Causes('refrain', Not('d2'))), Not(Bad(Not('d2'))))], 'inus': [Not(Causes('refrain', 'd1'))]}

{'permissible': True, 'sufficient': [And(Not(Means('Reading-1', 'person1')), Not(Means('Reading-1', 'person2')))], 'necessary': [Not(Means('Reading-1', 'person2')), Not(Means('Reading-1', 'person1'))], 'inus': [Not(Means('Reading-1', 'person2')), Not(Means('Reading-1', 'person1'))]}
As can be seen, all of the ethical principles agree that refraining from action is permissible. The utilitarian principle argues that the death of person 1 is no worse than the death of person 2. The do-no-harm principle argues that refraining is permissible because it does not actively cause the death of person 1. Finally, the Kantian principle argues that by refraining, neither of the two persons is used as a means. Next, we want the ethical principles to explain their judgments about pulling the lever. To do so, we write the following piece of Python code:
print(trolley2.explain(DeontologicalPrinciple))
print(trolley2.explain(UtilitarianPrinciple))
print(trolley2.explain(DoNoHarmPrinciple))
print(trolley2.explain(KantianHumanityPrinciple))
The output looks like this:
{'permissible': True, 'sufficient': [Not(Bad('pull'))], 'necessary': [Not(Bad('pull'))], 'inus': [Not(Bad('pull'))]}

{'permissible': True, 'sufficient': [GEq(U(And(Not('d1'), 'd2')), U(And('d1', Not('d2'))))], 'necessary': [Not(Gt(U(And('d1', Not('d2'))), U(And(Not('d1'), 'd2'))))], 'inus': []}

{'permissible': False, 'sufficient': [And(Causes('pull', 'd2'), Bad('d2'))], 'necessary': [Or(Causes('pull', 'd2'), Causes('pull', Not('d1'))), Or(Bad('d2'), Causes('pull', Not('d1'))), Causes('pull', 'd2'), Bad('d2')], 'inus': [Causes('pull', 'd2'), Bad('d2')]}

{'permissible': True, 'sufficient': [And(Not(Means('Reading-1', 'person1')), Not(Means('Reading-1', 'person2'))), And(End('person1'), Not(Means('Reading-1', 'person2')))], 'necessary': [Not(Means('Reading-1', 'person2')), Or(Not(Means('Reading-1', 'person1')), End('person1'))], 'inus': [Not(Means('Reading-1', 'person2'))]}
Again, the deontological principle argues that pulling the lever is permissible because it is not inherently bad. The utilitarian principle again argues that the death of person 2 is no worse than the death of person 1, and therefore pulling the lever is also permissible. The do-no-harm principle forbids pulling the lever because doing so actively causes the death of person 2, which is bad. Finally, the Kantian principle argues that person 2, while not treated as an end, is also not treated as a means.
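To get an overview of where the principles agree and disagree, the verdicts for both situations can be collected in a small loop. The following sketch only uses the explain method and the permissible entry introduced above:
principles = [DeontologicalPrinciple, UtilitarianPrinciple,
              DoNoHarmPrinciple, KantianHumanityPrinciple]

# Tabulate, for each principle, whether refraining (trolley1) and
# pulling the lever (trolley2) are judged permissible.
for principle in principles:
    refrain_ok = trolley1.explain(principle)["permissible"]
    pull_ok = trolley2.explain(principle)["permissible"]
    print(principle.__name__, "refrain:", refrain_ok, "pull:", pull_ok)
In this example, only the do-no-harm principle distinguishes between the two actions; the other three principles permit both.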

We are currently working on linguistically framing these explanations so that they can be used by our conversational robot Immanuel. This work will shed more light on which types of reasons (sufficient, necessary, INUS) are best suited for communicating different aspects of a situation. INUS reasons seem to be quite concise and to the point. Sufficient reasons seem to give a good idea of the regularities that underlie the judgment. And necessary reasons give an idea of what would have to be different in order to bring about a different judgment; they may thus provide the basis for contrastive explanations.
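As a very rough illustration of the kind of verbalization we have in mind, the explanation dictionary can already be turned into a simple English sentence by falling back on the string representations of the reason formulas. The helper below is purely hypothetical and not part of HERA; the actual linguistic framing for Immanuel is still work in progress.
def verbalize(action, explanation, reason_type="inus"):
    # Hypothetical helper: turn an explanation dictionary into a crude
    # English sentence using the chosen type of reasons.
    verdict = "permissible" if explanation["permissible"] else "impermissible"
    reasons = " and ".join(str(r) for r in explanation[reason_type])
    return f"{action} is {verdict} because {reasons}."

print(verbalize("Refraining", trolley1.explain(DoNoHarmPrinciple)))
print(verbalize("Pulling the lever", trolley2.explain(DoNoHarmPrinciple)))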

More to come soon.