5.3 Introduction of a model-agnostic testing process
In order to test the wide variety of AI-based systems in a uniform manner and to ensure the future viability of the testing process, the establishment of a model-agnostic testing process is suggested (cf. Chapter 4.1). The focus of the test should be to determine “whether”, and not “how”, an AI-based system functions properly. This approach makes the audit processes feasible, scalable and comparable.
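To illustrate the idea, the following minimal Python sketch treats the system under test as an opaque function from scenario inputs to decisions and judges only its observable behaviour. The scenario fields, the toy model and the pass criterion are purely illustrative assumptions, not part of the proposed process.

```python
from typing import Callable, Iterable

# A model-agnostic test treats the system under test as an opaque
# function from scenario inputs to decisions; only the observable
# behaviour is judged, never the internal architecture.
Model = Callable[[dict], dict]

def black_box_audit(model: Model,
                    scenarios: Iterable[dict],
                    passes: Callable[[dict, dict], bool]) -> float:
    """Return the fraction of scenarios the system handles correctly."""
    results = [passes(s, model(s)) for s in scenarios]
    return sum(results) / len(results)

# Illustrative usage with a toy collision-avoidance criterion.
scenarios = [
    {"own_speed_kn": 12.0, "target_bearing_deg": 10.0, "cpa_nm": 0.2},
    {"own_speed_kn": 8.0,  "target_bearing_deg": 95.0, "cpa_nm": 2.5},
]

def toy_model(s: dict) -> dict:
    # Stand-in for any AI-based system; its internals are never inspected.
    return {"evasive": s["cpa_nm"] < 0.5}

def expect_evasion_when_close(scenario: dict, decision: dict) -> bool:
    # Pass criterion: evade if and only if the closest point of
    # approach falls below 0.5 nm (an illustrative threshold).
    return decision["evasive"] == (scenario["cpa_nm"] < 0.5)

print(black_box_audit(toy_model, scenarios, expect_evasion_when_close))
```

Because the harness only exchanges inputs and outputs with the system, the same audit can be applied to any architecture, which is exactly the “whether”, not “how”, perspective described above.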
5.4 Formalization of the operational design domains of AI-based systems
To enable uniform testing, it is advised to rely on a standardised formalization both of the description of the operational design domain and of the measurement and evaluation of functional performance (cf. [13], [14], [15]).
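As an illustration, such a machine-readable operational design domain (ODD) description could be formalized as a typed schema. In the Python sketch below, all field names, limits and the coverage check are hypothetical assumptions rather than an existing standard.

```python
from dataclasses import dataclass, field

# Hypothetical schema for a machine-readable ODD description; the
# field names and value ranges are illustrative, not standardised.
@dataclass
class OperationalDesignDomain:
    name: str
    max_wind_speed_kn: float          # environmental limits
    max_significant_wave_m: float
    min_visibility_nm: float
    traffic_density: str              # e.g. "open_sea", "coastal", "port"
    permitted_operations: list[str] = field(default_factory=list)

    def covers(self, conditions: dict) -> bool:
        """Check whether observed conditions fall inside the ODD."""
        return (conditions["wind_speed_kn"] <= self.max_wind_speed_kn
                and conditions["wave_height_m"] <= self.max_significant_wave_m
                and conditions["visibility_nm"] >= self.min_visibility_nm)

odd = OperationalDesignDomain(
    name="coastal_transit",
    max_wind_speed_kn=25.0,
    max_significant_wave_m=2.5,
    min_visibility_nm=1.0,
    traffic_density="coastal",
    permitted_operations=["track_keeping", "collision_avoidance"],
)
print(odd.covers({"wind_speed_kn": 18.0, "wave_height_m": 1.2,
                  "visibility_nm": 4.0}))
```

A shared schema of this kind would let different manufacturers' systems be compared against the same formal description of their intended operating conditions.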
5.5 Development of an automated data processing
infrastructure
The technical realisation of the testing processes should be based on an automatable data processing infrastructure to ensure scalability and reproducibility. For the data procurement process, it is fundamental to rely on standardised operational design domain descriptions of the systems at hand (cf. [13], [14], [15]). Notably, the use of synthetic or augmented data is a promising way to obtain the necessary test data independently and at any time, without building up long-term data archives (cf. [17], [18], [19], [20], [21]). A crucial advantage of synthetic (or augmented) test data is that it yields novel test data which the manufacturer has not used before.
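A minimal sketch of what such on-demand synthetic test data generation could look like follows: test cases are sampled from the bounds of a standardised ODD description instead of being drawn from a stored dataset. The parameter names and ranges are illustrative assumptions.

```python
import random

# Sketch of on-demand synthetic scenario generation: test cases are
# sampled from the bounds of an ODD description rather than loaded
# from a long-term data archive.
def sample_scenarios(odd: dict, n: int, seed: int = 42) -> list[dict]:
    rng = random.Random(seed)   # fixed seed -> reproducible audits
    scenarios = []
    for _ in range(n):
        scenarios.append({
            "wind_speed_kn": rng.uniform(0.0, odd["max_wind_speed_kn"]),
            "wave_height_m": rng.uniform(0.0, odd["max_significant_wave_m"]),
            "visibility_nm": rng.uniform(odd["min_visibility_nm"], 10.0),
            "target_cpa_nm": rng.uniform(0.0, 5.0),
        })
    return scenarios

odd = {"max_wind_speed_kn": 25.0, "max_significant_wave_m": 2.5,
       "min_visibility_nm": 1.0}
for s in sample_scenarios(odd, 3):
    print(s)
```

Because the seed is fixed, an identical test set can be regenerated at any time, which supports reproducibility without long-term data storage, while the generated cases remain novel to the manufacturer.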
6 CONCLUSION
Current regulatory procedures are inadequate for assessing maritime AI-based systems (here referring to MASS), as shown in Chapter 2. New processes have to allow systems with a wide variety of architectures to be tested, verified and brought to market in a safe manner. It is therefore necessary to introduce concepts which can be implemented in parallel to existing procedures and measures without compromising innovation or safety. The authors therefore propose the introduction of a new module in the framework of the MED, labelled Module K, consisting of guidelines for the manufacturer of an AI-based system and for the regulating body responsible for verifying, testing and approving such a system. The guidelines include steps which should be performed to address the concerns arising from bringing these systems to market, whilst keeping the amount of required in-depth knowledge about their internal functions to a minimum, essentially allowing for a black-box testing procedure. The proposed methods are a basic outline of how such a methodology could be implemented to allow the verification of MASS. They can serve as a guideline for specifying future research and narrowing down the fields which must be investigated further.
7 FUTURE WORK
Despite the possibilities that large amounts of data offer for modelling complex dynamics and correlations, the application of AI with ML methods, especially deep learning, remains problematic. The quality and reliability of a model's decision-making processes, and hence of its results, depend directly on the selection of the algorithms and the quality of the datasets. Furthermore, the range of datasets available for testing the models is severely limited, making it difficult to generalise and to solve a problem using ML methods. One approach to addressing this issue is to establish methods and processes in the development phase of safety-critical applications that maintain safety and robustness after deployment. Processes and methods from other areas, e.g. computer vision applications, could be adapted by transferring findings to the maritime domain, as sketched below. Another important aspect is how to define and justify methods, processes and requirements for datasets and their procurement, since these are crucial for the development of robust systems based on AI, more specifically on ML.
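One example of such a transfer is the perturbation-based robustness testing common in computer vision: predictions on a maritime sensor image should stay stable under weather-like corruptions. In the sketch below, the fog model and the stand-in detector are illustrative assumptions, not an established maritime benchmark.

```python
import numpy as np

# Perturbation-based robustness check adapted from computer vision:
# the decision of a detector should not flip under increasing
# weather-like corruption of the input image.
def add_fog(image: np.ndarray, density: float) -> np.ndarray:
    """Blend the image towards a uniform grey 'fog' value."""
    fog = np.full_like(image, 0.8)
    return (1.0 - density) * image + density * fog

def toy_detector(image: np.ndarray) -> bool:
    # Stand-in for any vessel detector: fires on bright pixels.
    return image.max() > 0.6

def robustness_score(image: np.ndarray, densities) -> float:
    """Fraction of corruption levels that leave the decision unchanged."""
    baseline = toy_detector(image)
    stable = [toy_detector(add_fog(image, d)) == baseline for d in densities]
    return sum(stable) / len(stable)

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(64, 64))
print(robustness_score(image, densities=np.linspace(0.0, 0.9, 10)))
```

Defining which corruptions, severity ranges and acceptance thresholds are appropriate for maritime sensors is precisely the kind of question the future work described above would need to answer.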
ACKNOWLEDGEMENT
The study summarized in this paper was carried out within the experts network “Wissen - Können - Handeln” of the German Federal Ministry for Digital and Transport (BMDV) and was funded by the BMDV under grant number 0800Z12-1114/002/1061.
REFERENCES
[1] S. K. Brooks and N. Greenberg, “Mental Health and
Psychological Wellbeing of Maritime Personnel: A
Systematic Review,” BMC Psychology, vol. 10, no. 1, pp.
1–26, 2022.
[2] C. Berghoff, B. Biggio, E. Brummel, V. Danos, T. Doms, H. Ehrich, T. Gantevoort, B. Hammer, J. Iden, S. Jacob, H. Khlaaf, L. Komrowski, R. Kröwing, J. H. Metzen, M. Neu, F. Petsch, M. Poretschkin, W. Samek, H. Schäbe, A. V. Twickel, M. Vechev, T. Wiegand, and M. Fliehe, “Towards Auditable AI Systems,” Whitepaper, 2021.
[3] W. Samek and K.-R. Müller, “Towards Explainable Artificial Intelligence,” in Lecture Notes in Computer Science, vol. 11700, 2019, pp. 5–22.
[4] European Parliament and Council of the European Union, “Directive 2014/90/EU of the European Parliament and of the Council of 23 July 2014 on marine equipment and repealing Council Directive 96/98/EC (2014/90/EU),” pp. 146–185, 2014.
[5] European Commission, “Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts,” 2021.
[6] B. Rokseth, O. I. Haugen, and I. B. Utne, “Safety
Verification for Autonomous Ships,” MATEC Web of
Conferences, vol. 273, 2019.
[7] H. Ringbom, “Regulating Autonomous Ships—
Concepts, Challenges and Precedents,” Ocean