Assurance Cases with Acceptance Test-Driven Development
- Marc P. Hauer (TU Kaiserslautern)
The goal of an Assurance Case is to provide an argumentation framework that helps to reason about and document why the provided evidence supports a claim under given assumptions. Assurance Cases are already heavily used for assuring the safety of algorithmic decision-making (ADM) systems, for example in autonomous vehicles, but interest is also growing in other areas. For example, assurance cases are discussed as a supporting framework for certification processes in the standardization roadmap AI 2.0 of the German standardization council.
The main task of an Assurance Case is to justify why and under which assumptions evidence implies a claim. To this end, the main claim is decomposed into sub-claims, each of which either rests on further, hierarchically structured sub-claims or can be derived directly from evidence. Each decomposition of a claim is made explicit by an argument or reasoning step that explains the idea behind the decomposition. Furthermore, all assumptions needed to conclude that the sub-claims imply the claim are made explicit and connected to the argument. To make an argument easier to understand, contextual information can be attached to it as well.
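The structure described above can be sketched as a small tree data structure. This is only a minimal illustration, not the notation used in the related publication: the class names `Claim` and `Evidence`, their fields, and the example claim texts are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    description: str
    holds: bool  # e.g. the outcome of a test or a review (hypothetical field)


@dataclass
class Claim:
    statement: str
    argument: str = ""  # reasoning step that explains the decomposition
    assumptions: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    sub_claims: List["Claim"] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)

    def is_supported(self) -> bool:
        """A decomposed claim needs all sub-claims supported;
        a leaf claim needs at least one piece of evidence, all of which hold."""
        if self.sub_claims:
            return all(c.is_supported() for c in self.sub_claims)
        return bool(self.evidence) and all(e.holds for e in self.evidence)

    def all_assumptions(self) -> List[str]:
        """Collect the assumptions of this claim and of everything below it."""
        collected = list(self.assumptions)
        for c in self.sub_claims:
            collected.extend(c.all_assumptions())
        return collected


# Hypothetical example: one main claim, decomposed into one sub-claim.
main_claim = Claim(
    statement="The system treats applicants fairly",
    argument="Fairness is argued separately for each protected attribute",
    assumptions=["Gender is the only protected attribute considered"],
    sub_claims=[
        Claim(
            statement="Acceptance rates differ by at most 5 points between genders",
            assumptions=["The test data is representative"],
            evidence=[Evidence("Automated acceptance test passed", holds=True)],
        )
    ],
)

print(main_claim.is_supported())
print(main_claim.all_assumptions())
```

Listing the collected assumptions next to the verdict reflects the point made above: the claim is only considered supported under the assumptions attached to the argument.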
The idea of Acceptance Test-Driven Development (ATDD) is to gather a group of stakeholders with different perspectives and hold structured meetings with them to develop example situations of how the software should work. The developers then derive formal acceptance criteria from the examples and formulate tests based on them. Once all tests are specified, the actual development starts. This makes ATDD part of the test-first philosophy, like test-driven development, but applied to high-level system development. The Agile world knows various nuances of and different names for the concept (e.g., specification by example, behavior-driven development).
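As a rough illustration of the step from an example situation to an executable test, the following sketch turns a stakeholder statement such as "acceptance rates should not differ by more than 5 percentage points between groups" into a check. The criterion, the 0.05 threshold, the function names, and the applicant data are all hypothetical and chosen only for this sketch. The check is written against an arbitrary `decide` function, so it can be specified before the actual system exists.

```python
from typing import Any, Callable, Dict, List

Applicant = Dict[str, Any]


def acceptance_rates(decide: Callable[[Applicant], bool],
                     applicants: List[Applicant],
                     attribute: str) -> Dict[Any, float]:
    """Share of positive decisions for each value of the given attribute."""
    totals: Dict[Any, int] = {}
    accepted: Dict[Any, int] = {}
    for applicant in applicants:
        group = applicant[attribute]
        totals[group] = totals.get(group, 0) + 1
        accepted[group] = accepted.get(group, 0) + int(decide(applicant))
    return {group: accepted[group] / totals[group] for group in totals}


def meets_gap_criterion(decide: Callable[[Applicant], bool],
                        applicants: List[Applicant],
                        attribute: str = "group",
                        max_gap: float = 0.05) -> bool:
    """Acceptance criterion: the largest difference in acceptance rates
    between any two groups must not exceed max_gap (assumed threshold)."""
    rates = acceptance_rates(decide, applicants, attribute)
    return max(rates.values()) - min(rates.values()) <= max_gap


# Hypothetical example data agreed on with stakeholders.
applicants = [
    {"group": "A", "score": 70},
    {"group": "A", "score": 40},
    {"group": "B", "score": 65},
    {"group": "B", "score": 30},
]


def decide_by_score(applicant: Applicant) -> bool:
    """Stand-in for the system under development."""
    return applicant["score"] >= 50


def decide_by_group(applicant: Applicant) -> bool:
    """Deliberately unfair stand-in, used to show a failing test."""
    return applicant["group"] == "A"


print(meets_gap_criterion(decide_by_score, applicants))
print(meets_gap_criterion(decide_by_group, applicants))
```

The choice of metric and threshold is exactly the kind of decision that the stakeholder meetings would have to settle; the code only makes the agreed criterion checkable.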
Together, we expect to have a framework that allows testing and arguing for the fulfillment of soft requirements (like fairness, ethics, diversity, explanations, …) wherever clear legal or normative requirements still seem to be a long way off and there is room for interpretation, or for manipulation using carefully chosen metrics. The proposed approach (see details in the paper below) is not a deterministic process and will thus not lead to a unique result. Nor can it solve the principal problem of how to define soft requirements in a quantified and widely accepted manner. Nevertheless, it describes a pragmatic approach to arrive at a well-documented argument about when and under which assumptions a system is deemed “good enough” to be used.
By modelling as an Assurance Case the argumentation about why the solutions, and the fulfillment of the chosen tests, confirm the achievement of the objectives, the rationales for decisions can be documented and disclosed for an external review or audit, for example in case of a lawsuit. By employing the approach before development, automating the tests, and documenting the results, this process has the potential to provide long-term protection against unwanted changes, for example through further training or through errors introduced when changing the code.
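One way to picture the automation and documentation mentioned above is a small routine that runs the agreed tests, stores one record per test, and compares two runs to spot criteria that used to hold but no longer do. This is a sketch under assumptions: the record format, the function names `run_and_record` and `regressions`, and the test name are invented for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, List


def run_and_record(tests: Dict[str, Callable[[], bool]]) -> List[dict]:
    """Run each named acceptance test and return one record per test."""
    timestamp = datetime.now(timezone.utc).isoformat()
    return [
        {"test": name, "passed": bool(check()), "run_at": timestamp}
        for name, check in tests.items()
    ]


def regressions(previous: List[dict], current: List[dict]) -> List[str]:
    """Names of tests that passed in the previous run but fail in the current one."""
    passed_before = {r["test"] for r in previous if r["passed"]}
    return [
        r["test"] for r in current
        if r["test"] in passed_before and not r["passed"]
    ]


# Hypothetical example: the same criterion checked before and after retraining.
before = run_and_record({"acceptance_rate_gap": lambda: True})
after = run_and_record({"acceptance_rate_gap": lambda: False})

# A record like this could be stored alongside the assurance case as evidence.
print(json.dumps(after, indent=2))
print(regressions(before, after))
```

Keeping the records from every run gives an auditor a history of when each criterion held, which is the documentation aspect described above.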
Additionally, if similar applications have to be audited again and again, assurance cases allow best practices to develop over time, which helps to make well-reasoned decisions in the context of AI-based applications.
Related publication: https://ieeexplore.ieee.org/abstract/document/9440188 (Preprint: https://aalab.informatik.uni-kl.de/publikationen/peer_review/PDFs/Hauer-AssuringFairness.pdf)
Last update: 2022.09.04, v0.1