A proposed Evidential Reasoning (ER) Methodology for Quantitative Assessment of Non-Technical Skills (NTS) Amongst Merchant Navy Deck Officers in a Ship's Bridge Simulator Environment
F. Saeed
Higher Colleges of Technology, Abu Zabi, United Arab Emirates
A. Bury, S. Bonsall & R. Riahi
Liverpool John Moores University, Liverpool, United Kingdom
the International Journal on Marine Navigation and Safety of Sea Transportation
Volume 12, Number 3, September 2018
http://www.transnav.eu
DOI: 10.12716/1001.12.03.20
ABSTRACT: Ship's bridge simulators are very popular in the worldwide training and assessment of merchant navy deck officers. The examiners of simulator courses presently do not have a method to quantitatively assess the performance of a group or an individual. Some examiners use checklists and others use their gut feeling to grade competence. In this paper a novel methodology is established that uses the Evidential Reasoning algorithm to quantitatively assess the Non-Technical Skills (NTS) of merchant navy officers. To begin with, interviews were conducted with experienced deck officers to develop the taxonomy and behavioural markers that would be used in the assessment process. A random selection of students studying towards their Chief Officer's Certificate of Competency were recruited to have their NTS observed in a ship's bridge simulator. The participants' behaviour was rated against five criteria and the subsequent data was entered into the Evidential Reasoning algorithm to produce a crisp number. The results that were generated demonstrate that this approach provides a reliable method to quantitatively assess the NTS performance of merchant navy officers in a simulated bridge environment.
1 INTRODUCTION
NTS are those specific human competencies, such as leadership, teamwork, situation awareness and decision making, which affect the likelihood of human error occurring and the severity of its impact (Flin et al., 2003). The four main NTS are subdivided into two categories: social and cognitive. Social skills are those which are easily observable, i.e. leadership and teamworking. Cognitive skills are those which are difficult to observe, i.e. situation awareness and decision making (Flin et al., 2003).
Simulator training has proven to be very successful in the training of personnel for operating in high risk domains (Kozuba and Bondaruk, 2014; Wanger et al., 2013; Balci et al., 2014). Many safety critical industries, such as aviation and anaesthesia, have now adopted simulation as the recommended method of NTS training and its effectiveness has been tested in various pieces of research worldwide (Winter et al., 2012; Michael et al., 2014).
The technology has also been adopted for training and assessments in the maritime sector. The mathematical model of a ship created on a computer graphically displays the ship and its movement through the water in a nearly realistic manner and helps learners to learn effectively (Mohovic et al., 2012). The training provided through this medium has many benefits, such as the ability to practise navigating vessels through restricted waters, dealing with emergency or crisis situations, or using various navigational aids (Pelletier, 2006). The biggest advantage of providing training by simulator is the ability to create various scenarios in different meteorological conditions in different sea areas using different target ships (Sniegocki, 2005).
Simulator training is now being used as a compulsory training element of the Officer of the Watch (OOW) and Chief Mate's course. At the OOW level the course is called NAEST (O) (Navigation Aids and Equipment Simulator Training - Operational) and at chief mate's level NAEST (M) (Navigation Aids and Equipment Simulator Training - Management). The NAEST (O) course is a basic level course where the use of equipment, basic watchkeeping and navigation skills are taught to students undertaking the OOW course, whereas NAEST (M) is a management level course where advanced navigation skills are taught (Wall, 2015).
Presently simulator assessors do not have any method to quantitatively assess the NTS competence of deck officers. They normally use their gut feeling to gauge the competence of a candidate.
2 METHODS
The aim of this research is to develop a methodology for quantitatively assessing the NTS of merchant navy deck officers in a ship's bridge simulator. To achieve this, the following steps were undertaken:
1 Develop a taxonomy for deck officers' NTS. To assign a weight to each criterion, questionnaires were designed to capture the possible values for ranking each criterion through meetings and interviews with experienced deck officers. The ranks/weights assigned by the experts were aggregated by the AHP method.
2 Develop a behavioural markers' assessment framework based on the taxonomy of deck officers' NTS.
3 Simulator scenario developed and volunteer chief officer students recruited.
4 Simulator observations conducted with volunteer students and each behavioural marker (BM) was awarded a weight by the assessor.
5 ER algorithm and Utility Value (UV) method used to calculate the final crisp number of the performance.
3 DEVELOP A TAXONOMY FOR DECK OFFICERS' NTS (STEP 1)
To develop a taxonomy of deck officers' NTS, a series of interviews were conducted with experienced deck officers at management level to help identify the key skills to be included. A semi-structured method of interviewing was used to extract maximum information from the interviewee. The aim of each interview was to identify the non-technical aspects of a deck officer's role in a crisis situation on the bridge of a ship and the skills needed for this, e.g. thinking and team working skills, decision making, situation awareness and leadership.
The interview was divided into three parts:
- Part 1: Performance example. The interviewee was asked to describe a real case from his career that was particularly challenging and tested his NTS. The example could be a real critical incident/near miss or a normal case where experience and NTS significantly influenced the outcome. The interviewee was asked in advance if he could think of this example before the interview. This case was then discussed to identify the most significant NTS components.
- Part 2: Distinguishing skills. The interviewee was asked to think about the skills which are necessary for the effective performance of a deck officer involved in a crisis situation on the bridge of a ship.
- Part 3: Weighting task. The interviewee was asked to assign a weight to each of the NTS taxonomy elements.
Approximate times for the three interview parts were: Part 1, 45 minutes; Part 2, 15 minutes; Part 3, 15 minutes. All the information given was held in confidence and kept anonymous.
3.1 Pilot Interview
To support the development of the interview schedule, a pilot interview was undertaken with a senior deck officer. This took place at an early stage to help make minor changes to the interview questionnaire. This questionnaire was adapted from the study 'Identification and measurement of anaesthetists' NTS' (Fletcher et al., 2003b). The pilot interview was recorded and subsequently utilised by the research team to ensure that the necessary information was being obtained from the interviews.
3.2 Identifying Participants
The first criterion for the selection of the participants was that they must hold a Master Mariner Certificate of Competency. The other criterion for taking part in the study was that the interviewees volunteered to take part. Fletcher et al. (2003b) argue that those people who are very interested in human factors will be more inclined to volunteer and that this might lead to potential biases. However, given the sensitivity of the information being discussed, it would be unethical to interview unwilling participants. The researcher in this project visited the World Maritime University, Malmo, to conduct interviews with experienced master mariners pursuing further studies. The researcher aimed to conduct 10-15 interviews for this research and managed 12 interviews in total.
3.3 Data Analysis
Based on a review of the existing literature and with the help of the information collected from experienced seafarers through the interview process, a generic decision making model was generated (Figure 1). The data gathered during the interviews was carefully reviewed and a weight was assigned to each criterion using the mathematical decision making method known as the Analytical Hierarchy Process (AHP). The process of evaluating the weight of a criterion is presented in the following subsection.
Figure 1. Deck Officers' Non-technical Skills Taxonomy
3.3.1 The AHP method
The AHP was pioneered by Saaty and is often referred to as the Saaty method (Coyle, 2004). The method is popular and widely used in decision making and rating tasks. It is a multi-criteria decision making (MCDM) method that helps the decision maker to make the right decision in a complex situation (Ishizaka and Labib, 2009). AHP case applications range from choice of career through to planning a port development (Coyle, 2004).
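Riahi et al. (2012) used Saaty's quantified judgements on pairs of attributes $A_i$ and $A_j$, represented by an $n$ by $n$ matrix $D$. The entries $a_{ij}$ are defined by the following entry rules.
Rule 1. If $a_{ij} = \alpha$, then $a_{ji} = 1/\alpha$, $\alpha \neq 0$.
Rule 2. If $A_i$ is judged to be of equal relative importance as $A_j$, then $a_{ij} = a_{ji} = 1$.
$$D = (a_{ij}) = \begin{bmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 1/a_{12} & 1 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1n} & 1/a_{2n} & \cdots & 1 \end{bmatrix}$$
where $i, j = 1, 2, 3, \ldots, n$ and each $a_{ij}$ is the relative importance of attribute $A_i$ to attribute $A_j$.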
Having recorded the quantified judgements of comparison on pair $(A_i, A_j)$ as the numerical entry $a_{ij}$ in the matrix $D$, what is left is to assign to the $n$ contingencies $A_1, A_2, \ldots, A_n$ a set of numerical weights $\omega_1, \omega_2, \ldots, \omega_n$ that should reflect the recorded judgements. Generally, the weights $\omega_1, \omega_2, \ldots, \omega_n$ can be calculated by using the following equation:
$$\omega_k = \frac{1}{n} \sum_{j=1}^{n} \left( \frac{a_{kj}}{\sum_{i=1}^{n} a_{ij}} \right), \quad k = 1, 2, 3, \ldots, n \tag{1}$$
where $a_{ij}$ represents the entry of row $i$ and column $j$ in a comparison matrix of order $n$.
The weight vector of the comparison matrix will provide the priority order but it cannot confirm the consistency of the pairwise judgements. The AHP provides a measure of the consistency of the pairwise comparisons by computing a Consistency Ratio (CR) (Riahi et al., 2012). The CR is devised in such a way that a value of 0.10 or less is deemed consistent, whereas a decision maker should review the pairwise judgements if the resultant value is more than 0.10. The CR value is calculated according to the following equations:
$$CR = \frac{CI}{RI} \tag{2}$$
$$CI = \frac{\lambda_{max} - n}{n - 1} \tag{3}$$
$$\lambda_{max} = \frac{\sum_{j=1}^{n} \left[ \left( \sum_{k=1}^{n} \omega_k a_{jk} \right) / \omega_j \right]}{n} \tag{4}$$
where CI is the Consistency Index, RI is the average random index (Table 1), $n$ is the matrix order and $\lambda_{max}$ is the maximum eigenvalue of the $n$ by $n$ comparison matrix $D$, estimated here by equation 4.
The following numerical example shows the method of evaluating the weights of the main criteria (i.e. Situation Awareness, Decision Making, Leadership and Team Work) from an anonymous expert's judgements (Table 2).
Table 1. Value of RI versus matrix order (Saaty, 1990)
_______________________________________________
n     RI
_______________________________________________
1     0
2     0
3     0.58
4     0.9
5     1.12
6     1.24
7     1.32
8     1.41
9     1.45
10    1.49
_______________________________________________
$$D = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$$
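The matrix for the main criteria was obtained from Table 2 as follows, where SA is Situation Awareness, DM is Decision Making, LS is Leadership and TW is Team Work:
$$D = \begin{array}{c|cccc} & SA & DM & LS & TW \\ \hline SA & 1 & 1 & 1/3 & 2 \\ DM & 1 & 1 & 1 & 3 \\ LS & 3 & 1 & 1 & 3 \\ TW & 1/2 & 1/3 & 1/3 & 1 \end{array}$$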
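Weights of the main criteria are calculated using equation 1:
$$\omega_1 = \frac{1}{n} \left( \frac{a_{11}}{a_{11}+a_{21}+a_{31}+a_{41}} + \frac{a_{12}}{a_{12}+a_{22}+a_{32}+a_{42}} + \frac{a_{13}}{a_{13}+a_{23}+a_{33}+a_{43}} + \frac{a_{14}}{a_{14}+a_{24}+a_{34}+a_{44}} \right)$$
$$\omega_1 = \frac{1}{4} \left( \frac{1}{1+1+3+0.5} + \frac{1}{1+1+1+0.3333} + \frac{0.3333}{0.3333+1+1+0.3333} + \frac{2}{2+3+3+1} \right) = 0.207260$$
$$\omega_2 = \frac{1}{n} \left( \frac{a_{21}}{a_{11}+a_{21}+a_{31}+a_{41}} + \frac{a_{22}}{a_{12}+a_{22}+a_{32}+a_{42}} + \frac{a_{23}}{a_{13}+a_{23}+a_{33}+a_{43}} + \frac{a_{24}}{a_{14}+a_{24}+a_{34}+a_{44}} \right)$$
$$\omega_2 = \frac{1}{4} \left( \frac{1}{1+1+3+0.5} + \frac{1}{1+1+1+0.3333} + \frac{1}{0.3333+1+1+0.3333} + \frac{3}{2+3+3+1} \right) = 0.297538$$
$$\omega_3 = \frac{1}{n} \left( \frac{a_{31}}{a_{11}+a_{21}+a_{31}+a_{41}} + \frac{a_{32}}{a_{12}+a_{22}+a_{32}+a_{42}} + \frac{a_{33}}{a_{13}+a_{23}+a_{33}+a_{43}} + \frac{a_{34}}{a_{14}+a_{24}+a_{34}+a_{44}} \right)$$
$$\omega_3 = \frac{1}{4} \left( \frac{3}{1+1+3+0.5} + \frac{1}{1+1+1+0.3333} + \frac{1}{0.3333+1+1+0.3333} + \frac{3}{2+3+3+1} \right) = 0.388447$$
$$\omega_4 = \frac{1}{n} \left( \frac{a_{41}}{a_{11}+a_{21}+a_{31}+a_{41}} + \frac{a_{42}}{a_{12}+a_{22}+a_{32}+a_{42}} + \frac{a_{43}}{a_{13}+a_{23}+a_{33}+a_{43}} + \frac{a_{44}}{a_{14}+a_{24}+a_{34}+a_{44}} \right)$$
$$\omega_4 = \frac{1}{4} \left( \frac{0.5}{1+1+3+0.5} + \frac{0.3333}{1+1+1+0.3333} + \frac{0.3333}{0.3333+1+1+0.3333} + \frac{1}{2+3+3+1} \right) = 0.106755$$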
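The weight calculation above can be reproduced with a few lines of code. The following is a minimal Python sketch of equation 1 (column normalisation followed by row averaging) applied to the main-criteria matrix of the worked example; the function name `ahp_weights` is hypothetical and chosen only for illustration.

```python
# Pairwise comparison matrix for the main criteria (rows and columns: SA, DM, LS, TW),
# taken from the worked example in the text.
D = [
    [1.0, 1.0, 1.0 / 3.0, 2.0],
    [1.0, 1.0, 1.0, 3.0],
    [3.0, 1.0, 1.0, 3.0],
    [0.5, 1.0 / 3.0, 1.0 / 3.0, 1.0],
]


def ahp_weights(matrix):
    """Equation 1: divide each entry by its column sum, then average along each row."""
    n = len(matrix)
    column_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        sum(matrix[k][j] / column_sums[j] for j in range(n)) / n
        for k in range(n)
    ]


weights = ahp_weights(D)
for name, w in zip(["SA", "DM", "LS", "TW"], weights):
    print(f"{name}: {w:.6f}")
```

Because each normalised column sums to one, the resulting weights also sum to one, which gives a quick sanity check on any hand calculation.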
Table 2. Anonymous expert judgements
Goal: To select the most important non-technical skills for deck officers
Scale for each comparison: 1/9 1/8 1/7 1/6 1/5 1/4 1/3 1/2 (unimportant), 1 (equally important), 2 3 4 5 6 7 8 9 (important)
__________________________________________________________________________________________________
How important is ... | compared to | Marked value (x)
__________________________________________________________________________________________________
Situation Awareness | Decision Making | 1
Situation Awareness | Leadership | 1/3
Situation Awareness | Teamwork | 2
Decision Making | Leadership | 1
Decision Making | Teamwork | 3
Leadership | Teamwork | 3
__________________________________________________________________________________________________
The weight values are found as 0.207260 ($\omega_1$), 0.297538 ($\omega_2$), 0.388447 ($\omega_3$) and 0.106755 ($\omega_4$). The consistency ratio is calculated by using equations 2, 3 and 4.
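Based on equation 4, $\lambda_{max}$ was calculated as follows:
$$\omega_{1x} = (1 \times 0.207260) + (1 \times 0.297538) + (0.333333 \times 0.388447) + (2 \times 0.106755) = 0.847790$$
$$\omega_{2x} = (1 \times 0.207260) + (1 \times 0.297538) + (1 \times 0.388447) + (3 \times 0.106755) = 1.21351$$
$$\omega_{3x} = (3 \times 0.207260) + (1 \times 0.297538) + (1 \times 0.388447) + (3 \times 0.106755) = 1.62803$$
$$\omega_{4x} = (0.5 \times 0.207260) + (0.3333 \times 0.297538) + (0.3333 \times 0.388447) + (1 \times 0.106755) = 0.43905$$
$$\lambda_{max} = \frac{\dfrac{0.847790}{0.207260} + \dfrac{1.21351}{0.297538} + \dfrac{1.62803}{0.388447} + \dfrac{0.43905}{0.106755}}{4} = 4.118196$$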
The mean value for $\lambda_{max}$ is 4.118196. If any of the $\lambda_{max}$ values turns out to be less than $n$, which is 4 in this case, then there is an error in the calculation, which requires a thorough check.
The CI is calculated as follows:
$$CI = \frac{\lambda_{max} - n}{n - 1} = \frac{4.118196 - 4}{4 - 1} = 0.03939$$
Based on Table 1, the Random Index (RI) for 4 criteria is 0.9. As a result, the CR value was calculated as follows:
$$CR = \frac{CI}{RI} = \frac{0.03939}{0.9} = 0.04376$$
The CR value for the main criteria was found to be 0.04376. A CR value of less than or equal to 0.1 indicates that judgements are acceptable (Saaty, 1980). As a result, the consistency of the pairwise comparisons for the main criteria is acceptable. The same calculation technique was applied to obtain weights for each sub-criterion and to check the consistency of the expert opinions.
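The consistency check can be sketched in the same way. The following minimal Python example, offered as an illustration rather than the authors' implementation, applies equations 2 to 4 to the main-criteria matrix using the Random Index values of Table 1; the function names are hypothetical. Because it works with unrounded thirds, its results agree with the hand calculation only to about four decimal places.

```python
# Random Index values from Table 1 (Saaty, 1990), keyed by matrix order n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.9, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

# Main-criteria comparison matrix from the worked example (SA, DM, LS, TW).
D = [
    [1.0, 1.0, 1.0 / 3.0, 2.0],
    [1.0, 1.0, 1.0, 3.0],
    [3.0, 1.0, 1.0, 3.0],
    [0.5, 1.0 / 3.0, 1.0 / 3.0, 1.0],
]


def column_normalised_weights(matrix):
    """Equation 1: normalise each column, then average along each row."""
    n = len(matrix)
    column_sums = [sum(row[j] for row in matrix) for j in range(n)]
    return [sum(matrix[k][j] / column_sums[j] for j in range(n)) / n for k in range(n)]


def consistency_ratio(matrix):
    """Return (lambda_max, CI, CR) following equations 4, 3 and 2 (valid for n >= 3)."""
    n = len(matrix)
    weights = column_normalised_weights(matrix)
    # Weighted row sums, i.e. the matrix multiplied by the weight vector.
    weighted_rows = [sum(matrix[j][k] * weights[k] for k in range(n)) for j in range(n)]
    lambda_max = sum(weighted_rows[j] / weights[j] for j in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return lambda_max, ci, cr


lambda_max, ci, cr = consistency_ratio(D)
print(f"lambda_max = {lambda_max:.4f}, CI = {ci:.4f}, CR = {cr:.4f}")
print("acceptable" if cr <= 0.1 else "review the pairwise judgements")
```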
3.3.2 Geometric Mean Method
AHP was initially developed as a decision making tool for individual decision makers, but by the use of the geometric mean method the individual pairwise comparison matrices of any number of experts can be aggregated (Aull-Hyde et al., 2006) as follows:
$$\text{Geometric Mean}_{ij} = \left[ e_{1ij} \cdot e_{2ij} \cdot e_{3ij} \cdots e_{kij} \right]^{1/k} \tag{5}$$
where $e_{kij}$ is the $k$th expert's judgement on the pair of attributes $A_i$ and $A_j$.
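Equation 5 is applied entry by entry across the experts' matrices. The short Python sketch below illustrates this with two hypothetical 3 by 3 expert matrices (the values are invented for illustration and are not study data); `aggregate_judgements` is likewise a hypothetical name.

```python
import math


def aggregate_judgements(expert_matrices):
    """Equation 5: element-wise geometric mean of k pairwise comparison matrices."""
    k = len(expert_matrices)
    n = len(expert_matrices[0])
    return [
        [math.prod(m[i][j] for m in expert_matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical experts comparing three attributes (reciprocal matrices).
expert_1 = [
    [1.0, 2.0, 3.0],
    [1.0 / 2.0, 1.0, 1.0],
    [1.0 / 3.0, 1.0, 1.0],
]
expert_2 = [
    [1.0, 8.0, 3.0],
    [1.0 / 8.0, 1.0, 4.0],
    [1.0 / 3.0, 1.0 / 4.0, 1.0],
]

group = aggregate_judgements([expert_1, expert_2])
for row in group:
    print([round(value, 4) for value in row])
```

A useful property of the geometric mean is that the aggregated matrix remains reciprocal, so it can be passed directly to the weight and consistency calculations.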
3.3.3 Knowledge Representation
Data was collected by conducting interviews with 12 experienced senior deck officers in both the UK and Malmo, Sweden. Only eight participants' results were considered for this study, as the remaining four participants' weighting data was inconsistent according to the AHP Consistency Ratio check. Figure 2 shows the weights of all elements of the NTS.
Figure 2. Deck Officers' Non-technical Skills Taxonomy (with resultant weights)
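The screening of expert responses described above can be sketched as a simple filter on the Consistency Ratio. The Python example below is illustrative only: the two expert matrices are invented (one perfectly consistent, one deliberately circular), and the 0.10 threshold and the function names are assumptions based on the CR rule stated earlier, not a record of how the study's data was processed.

```python
# Random Index values from Table 1 (Saaty, 1990) for small matrix orders.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.9, 5: 1.12}


def consistency_ratio(matrix):
    """CR of a pairwise comparison matrix using equations 1 to 4."""
    n = len(matrix)
    ri = RANDOM_INDEX[n]
    if ri == 0.0:
        return 0.0  # matrices of order 1 or 2 are always consistent
    column_sums = [sum(row[j] for row in matrix) for j in range(n)]
    weights = [sum(matrix[k][j] / column_sums[j] for j in range(n)) / n for k in range(n)]
    lambda_max = sum(
        sum(matrix[j][k] * weights[k] for k in range(n)) / weights[j] for j in range(n)
    ) / n
    return ((lambda_max - n) / (n - 1)) / ri


def screen_experts(expert_matrices, threshold=0.10):
    """Keep only the experts whose judgements pass the CR threshold."""
    return [name for name, m in expert_matrices.items() if consistency_ratio(m) <= threshold]


# Hypothetical responses: expert_A is perfectly consistent, expert_B is circular.
experts = {
    "expert_A": [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]],
    "expert_B": [[1.0, 9.0, 1.0 / 9.0], [1.0 / 9.0, 1.0, 9.0], [9.0, 1.0 / 9.0, 1.0]],
}

retained = screen_experts(experts)
print(retained)
```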
4 DEVELOPMENT OF BEHAVIOURAL MARKERS (STEP 2)
Behavioural marker systems are used for the training and assessment of participants in simulators and were first developed in the aviation industry (Helmreich et al., 1999). Later on, other safety critical industries such as anaesthesia and nuclear power generation developed their own behavioural marker systems.
Klampfer et al. (2001) proposed the following for designing good behavioural marker systems:
- Validity: in relation to performance outcome.
- Reliability: inter-rater reliability, internal consistency.
- Sensitivity: in relation to levels of performance.
- Transparency: the observer understands the performance criteria against which they are being rated, availability of reliability and validity data.
- Usability: easy to train, simple framework, easy to understand, domain appropriate language, sensitive to rater workload, easy to observe.
Klampfer et al. (2001) further suggest that behavioural marker systems are limited because they "cannot capture every aspect of performance and behaviour" due to the:
- Limited occurrence of some behaviours such as conflict resolution.
- Limitation of human observers such as distraction or overload (e.g. in complex situations, or when observing large teams).
In developing the behavioural marker system for scrub practitioners' NTS (the SPLINTS system), Mitchell et al. (2013) established the following design criteria:
- Focus on the skills that are observable from behaviour.
- Be set as a hierarchical structure with three levels of description: category, element, and behaviour.
- Use active verbs for skills and understandable language for definitions.
- Show a simple structure and layout with a rating scale that fits on one page so that it can be easily used.
The behavioural marker assessment framework must, as far as possible, be designed to ensure that it is capable of capturing the fullest context of the environment in which the assessment is taking place (Gatfield, 2008). Behavioural markers are a valuable tool in assessing or observing a participant's technical skills and NTS in the real world or in the simulator.
A review of behavioural marker systems in use in other safety critical industries found that the aviation industry's NTS taxonomy and behavioural markers would make a good starting point for developing a system for use in the maritime industry. The taxonomy and behavioural markers were presented to each expert interviewee for their feedback.
The initial taxonomy and behavioural marker system had 26 elements and 4 categories. Based on the experts' opinions during the interviews, and since some elements such as "conflict resolution" were non-observable, 6 of the 26 elements were removed from the system to be applied.
The behavioural markers to be utilised in the assessment of deck officers' NTS were formed into a framework for ease of use in the observation stage of the study. As an example, the decision making NTS and its related behavioural markers are shown in Table 3. There are five levels of performance in this behavioural marker system. These range from very good practice to very poor practice. By using these behavioural markers an examiner is able to rate a student's performance in a ship's bridge simulator.
5 BRIDGE SIMULATOR STUDY (STEP 3)
The main aim of the bridge simulator study was to develop a method which could quantitatively assess the NTS of deck officers in a bridge simulator environment. To conduct this study, a set of volunteer students who had completed their course of study for the Chief Mate's Certificate of Competency were recruited to take part. LJMU ethical approval was obtained for the study and the students' consent was obtained.
The simulator performance was observed by the main researcher of this study, Dr Farhan Saeed, who is a master mariner with ten years' seagoing experience and fourteen years' experience of teaching and training deck officers. During the simulator observation, the researcher observed and rated participants' performance against the behavioural marker assessment framework (Tables 4, 5, 6 and 7).
5.1 Bridge simulator scenario
The following scenario was developed for the assessment of NTS of merchant navy deck officers in a bridge simulator environment:
The vessel was alongside the jetty in Southampton. The bridge team would have to pilot their own vessel and maintain all the records as agreed by the members. Each team would need to manoeuvre their own vessel with the use of a bow thruster (the team was not allowed to use tugs). There would be a number of inbound as well as outbound vessels during the departure. A grounded vessel in the vicinity of the Nab Tower, with a salvage operation underway, would request a wide berth.
Just after passing Fawley Terminal, Gyro No. 1 would start to drift at a rate of 1°/sec. Based on the position of the vessel at the time of passing there would be the possibility of interaction with large inbound container ships.
This exercise was designed to allow participants to demonstrate their teamwork, situational awareness, leadership, and decision making skills.
Table 3. Decision making elements and behavioural markers
__________________________________________________________________________________________________
Element | Very Good Practice | Good Practice | Acceptable Practice | Poor Practice | Very Poor Practice
__________________________________________________________________________________________________
Problem definition and diagnosis | Gather all information to identify problem | Gather sufficient information to identify problem | Gather just enough information to identify problem | Gather little information to identify problem | Failure to diagnose the problem
 | Review all causal factors with other crew members | Review enough causal factors with other crew members | Review some causal factors with other crew members | Review very few causal factors with other crew members | No discussion of probable cause
Option generation | States all alternative option | States enough alternative option | States some alternative option | States very few alternative option | Does not search for information
 | Asks crew members for all options | Asks crew members for enough options | Asks crew members for some options | Asks crew members for very few options | Does not ask crew for alternatives
Risk assessment and option selection | Considers and shares all estimated risk of alternative options | Considers and shares substantial estimated risk of alternative options | Considers and shares just enough estimated risk of alternative options with crew | Inadequate discussion of limiting factors | No discussion of limiting factors with crew
 | Confirms and states all selected options/agreed action | Confirms and states enough selected options/agreed action | Confirms and states some selected options/agreed action | Confirms and states very few selected options/agreed action | Does not inform crew of decision path being taken
Outcome review | Complete checking of outcome against plan | Substantial checking of outcome against plan | Average checking of outcome against plan | Little checking of outcome against plan | Fails to check selected outcome against plan
__________________________________________________________________________________________________
Table 4. Team working
__________________________________________________________________________________________________
Element | Very Good Practice | 5 4 3 2 1 | Very Poor Practice
__________________________________________________________________________________________________
Team building and maintaining | Fully encourages input and feedback from others | x | Keeps barriers between team members
Considering others | Take notice of the suggestions of other team members | x | Ignores suggestions of other team members
 | Considers condition of other team members into account | x | Does not take account of the condition of other team members
 | Provide detailed personal feedback | x | Show no reaction to other team members
Supporting others | Provide ample help to other team members in demanding situation | x | Do not help other team members in demanding situation
 | Offers very good assistance | x | Does not offer assistance
Communication | Establish total atmosphere for open communication | x | Blocks open communication
 | Communicates very effectively | x | Ineffective communication
Information sharing | Shares information among all team members | x | Does not share information properly among all team members
__________________________________________________________________________________________________
Table 5. Leadership and Managerial Skills
__________________________________________________________________________________________________
Element | Very Good Practice | 5 4 3 2 1 | Very Poor Practice
__________________________________________________________________________________________________
Use of authority and assertiveness | Takes full initiative to ensure crew involvement and task completion | x | Hinders or withholds crew involvement
 | Takes full control if situation requires | x | Does not show initiative for decision
 | Totally reflects on suggestions of others | x | Ignores suggestions of others
Providing and maintaining standards | Demonstrates complete will to achieve top performance | x | Does not care for performance effectiveness
Planning and coordination | Completely encourages crew participation in planning and task completion | x | Does not encourage crew participation in planning and task completion
 | Plan is well clearly stated and confirmed | x | Plan is not clearly stated and confirmed
 | Well clearly states goals and boundaries for task completion | x | Goals and boundaries remain unclear
Workload management | Completely notifies signs of stress and fatigue | x | Ignores signs of fatigue
 | Allots good time to complete tasks | x | Allots very little time to complete tasks
Prioritisation | Demonstrate very good prioritisation of tasks | x | Demonstrate no prioritisation of tasks
Task delegation | Delegates all tasks properly | x | Does not delegate tasks
Initial crisis management | Identifies initial crisis situation very quickly and respond accordingly | x | Does not identify initial crisis situation
__________________________________________________________________________________________________
Table 6. Situation Awareness
__________________________________________________________________________________________________
Element | Very Good Practice | 5 4 3 2 1 | Very Poor Practice
__________________________________________________________________________________________________
Awareness of bridge systems | Fully monitors and reports changes in systems' states | x | Does not monitor changes in systems' states
Awareness of external environment | Collects full information about environment (own ship's position, traffic and weather) | x | Does not collect information about environment (own ship's position, traffic and weather)
 | Shares complete key information about environment with team members | x | Does not share key information about environment with crew
Awareness of time | Fully discusses time constraints with other team members | x | Does not discuss time constraints with other CM
Situation assessment | Makes full assessment of changing situation | x | Does not make an assessment of changing situation
__________________________________________________________________________________________________
Table 7. Decision making
__________________________________________________________________________________________________
Element | Very Good Practice | 5 4 3 2 1 | Very Poor Practice
__________________________________________________________________________________________________
Problem definition and diagnosis | Gather all information to identify problem | x | Failure to diagnose the problem
 | Review all causal factors with other crew members | x | No discussion of probable cause
Option generation | States all alternative option | x | Does not search for information
 | Asks crew members for all options | x | Does not ask crew for alternatives
Risk assessment and option selection | Considers and shares all estimated risk of alternative options | x | No discussion of limiting factors with crew
 | Confirms and states all selected options/agreed action | x | Does not inform crew of decision path being taken
Outcome review | Complete checking of outcome against plan | x | Fails to check selected outcome against plan
__________________________________________________________________________________________________
6 NTS ASSESSMENT OF DECK OFFICER IN A BRIDGE SIMULATOR (STEP 4)
The following is a rundown of the participants' performance during the scenario established in step 3. Their performance was rated against the developed behavioural markers assessment framework (Tables 4, 5, 6 and 7).
The passage plan had already been prepared a day before the exercise. The group tested all bridge equipment and completed the checklists. The exercise started when the bridge team was ready. Initially they had some doubts about departing the berth without tugs. The use of the bow thruster helped them to depart without any problems. The vessel was manoeuvred slowly, left the berth and headed towards the channel. The vessel speed was 8 knots in the channel. The master was in overall command; the chief officer and OOW were performing navigation and communication duties respectively. At one point their vessel grounded and then refloated quickly. The gyro started drifting but the bridge team considered that the vessel was drifting due to tide/current. The OOW suggested that the drifting was due to gyro failure but the master did not investigate it further and it was assumed that the vessel was drifting due to heavy current. The master only realised the gyro failure once the large alteration of the vessel's course was observed (about half an hour after the initial drift). Action was then taken immediately by switching to the backup gyro and controlling the situation.
Gyro failure during the exercise was the key moment and it was expected that the bridge team would identify it and take corrective measures immediately. The group's poor performance was due to the team's lack of situation awareness and the master's over-reliance on the chief officer rather than taking control of the situation himself.
The students' behavioural marker ratings are tabulated in Tables 4, 5, 6 and 7. After feeding this input into the model (Figure 1: Deck Officers' NTS Taxonomy) and using the ER algorithm, an output result set was generated as shown in Table 8 and Figure 3.
Table 8. ER results of the group performance
_______________________________________________
Very Poor    35.39%
Poor         33.71%
Average      28.05%
Good         2.85%
Very Good    0.0%
_______________________________________________
Figure 3. ER results of the group performance
7 PERFORMANCE CALCULATION BY ER ALGORITHM AND UTILITY VALUE (STEP 5)
After rating the performance of deck officers on a rating scale of 1-5 (where 5 is very good practice and 1 is very poor practice), these ratings are fed into the ER formula to obtain an aggregate for each scale grade. Utility Value is used to obtain a final value of the performance of deck officers.
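To illustrate how a belief distribution can be reduced to a single crisp number, the sketch below applies a utility-weighted sum to the group results of Table 8. It assumes equally spaced (linear) utilities from 0 for Very Poor to 1 for Very Good; this section does not state the utility function actually used, so the utilities and the function names here are assumptions for illustration only.

```python
GRADES = ["Very Poor", "Poor", "Average", "Good", "Very Good"]

# Belief degrees of the group performance, taken from Table 8.
BELIEFS = [0.3539, 0.3371, 0.2805, 0.0285, 0.0]


def linear_utilities(n_grades):
    """Assumed utilities: equally spaced from 0 (worst grade) to 1 (best grade)."""
    return [i / (n_grades - 1) for i in range(n_grades)]


def crisp_value(beliefs, utilities):
    """Utility-weighted sum of the belief degrees."""
    return sum(b * u for b, u in zip(beliefs, utilities))


utilities = linear_utilities(len(GRADES))
score = crisp_value(BELIEFS, utilities)
print(f"crisp performance value: {score:.4f}")
```

Under this linear assumption the group's crisp value is about 0.246 on a 0 to 1 scale, which is in line with the poor performance described for the exercise.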
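The ER algorithm can be analysed and explained as follows (Riahi et al., 2012):
Let $R$ represent a set with five linguistic terms (i.e. very poor, poor, average, good and very good) with their associated belief degrees (i.e. $\beta$) and be synthesised by two subsets $R_1$ and $R_2$ from two different assessments. Then, for example, $R$, $R_1$ and $R_2$ can separately be expressed by:
$$R = \{\beta^1\ \text{Very Poor},\ \beta^2\ \text{Poor},\ \beta^3\ \text{Average},\ \beta^4\ \text{Good},\ \beta^5\ \text{Very Good}\}$$
$$R_1 = \{\beta_1^1\ \text{Very Poor},\ \beta_1^2\ \text{Poor},\ \beta_1^3\ \text{Average},\ \beta_1^4\ \text{Good},\ \beta_1^5\ \text{Very Good}\}$$
$$R_2 = \{\beta_2^1\ \text{Very Poor},\ \beta_2^2\ \text{Poor},\ \beta_2^3\ \text{Average},\ \beta_2^4\ \text{Good},\ \beta_2^5\ \text{Very Good}\}$$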
Suppose that the normalised relative weights of two
assessments in the evaluation process are given as
$w_1$ and $w_2$ ($w_1 + w_2 = 1$). $w_1$ and $w_2$ can be
estimated by using an AHP technique. Suppose that
$M_1^m$ and $M_2^m$ ($m = 1, 2, 3, 4, 5$) are individual
degrees to which the subsets $R_1$ and $R_2$ support the
hypothesis that the evaluation is confirmed to the five
linguistic terms. Then, $M_1^m$ and $M_2^m$ are obtained as:

$$M_1^m = w_1 \beta_1^m, \qquad M_2^m = w_2 \beta_2^m \tag{6}$$
Suppose that $H_1$ and $H_2$ are the individual
remaining belief values unassigned for $M_1^m$ and
$M_2^m$ ($m = 1, 2, 3, 4, 5$). Then $H_1$ and $H_2$ are
expressed as:

$$H_1 = \bar{H}_1 + \tilde{H}_1, \qquad H_2 = \bar{H}_2 + \tilde{H}_2 \tag{7}$$
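where $\bar{H}_n$ ($n = 1, 2$) represents the degree to which
the other assessor can play a role in the assessment,
and $\tilde{H}_n$ ($n = 1, 2$) is caused by the possible
incompleteness in the subsets $R_1$ and $R_2$. $\bar{H}_n$ and
$\tilde{H}_n$ ($n = 1, 2$) are described as:

$$\bar{H}_1 = 1 - w_1 = w_2, \qquad \bar{H}_2 = 1 - w_2 = w_1$$

$$\tilde{H}_1 = w_1 \Bigl(1 - \sum_{m=1}^{5} \beta_1^m\Bigr), \qquad \tilde{H}_2 = w_2 \Bigl(1 - \sum_{m=1}^{5} \beta_2^m\Bigr) \tag{8}$$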
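Suppose that $\beta^{m\prime}$ ($m = 1, 2, 3, 4$ or $5$) represents
the non-normalised degree to which the reliability
evaluation is confirmed to each of the five linguistic
terms as a result of the synthesis of the judgements
produced by assessors 1 and 2. Suppose that $H'_U$
represents the non-normalised remaining belief
unassigned after the commitment of belief to the five
linguistic terms because of the synthesis of the
judgements produced by assessors 1 and 2 (split, as
for $H_1$ and $H_2$, into the components $\bar{H}'_U$ and
$\tilde{H}'_U$). The ER algorithm is stated as:

$$\beta^{m\prime} = K \bigl(M_1^m M_2^m + M_1^m H_2 + M_2^m H_1\bigr)$$

$$\bar{H}'_U = K \bigl(\bar{H}_1 \bar{H}_2\bigr)$$

$$\tilde{H}'_U = K \bigl(\tilde{H}_1 \tilde{H}_2 + \tilde{H}_1 \bar{H}_2 + \tilde{H}_2 \bar{H}_1\bigr)$$

$$K = \Bigl[1 - \sum_{T=1}^{5} \sum_{\substack{R=1 \\ R \neq T}}^{5} M_1^T M_2^R\Bigr]^{-1} \tag{9}$$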
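After the above aggregation, the combined degrees
of belief are generated by assigning $\bar{H}'_U$ back to the
five linguistic terms using the normalisation process:

$$\beta^m = \frac{\beta^{m\prime}}{1 - \bar{H}'_U} \quad (m = 1, 2, 3, 4, 5), \qquad H_U = \frac{\tilde{H}'_U}{1 - \bar{H}'_U} \tag{10}$$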
where $H_U$ is the unassigned degree of belief
representing the extent of incompleteness in the
overall assessment. The above gives the process of
combining two subsets. If three subsets are required
to be combined, the result obtained from the
combination of any two subsets can be further
synthesised with the third subset using the above
algorithm. In a similar way, the judgements of
multiple assessors of lower-level criteria in the chain
system (i.e. components or subsystems) can be
combined.
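As a consistency check (added here for illustration and not part of the original derivation), one can verify that $K$ in equation (9) is exactly the constant that makes the combined masses sum to one. The notation assumes the split used in equations (7) and (8): $\bar{H}_n = 1 - w_n$ for the weight-related part and $\tilde{H}_n = w_n (1 - \sum_m \beta_n^m)$ for the incompleteness-related part, with $H_n = \bar{H}_n + \tilde{H}_n$.

For each assessment the masses already sum to one:

$$\sum_{m=1}^{5} M_n^m + H_n = w_n \sum_{m} \beta_n^m + (1 - w_n) + w_n \Bigl(1 - \sum_{m} \beta_n^m\Bigr) = 1, \qquad n = 1, 2.$$

Multiplying the two identities and grouping the products gives

$$1 = \sum_{m=1}^{5} \bigl(M_1^m M_2^m + M_1^m H_2 + M_2^m H_1\bigr) + H_1 H_2 + \sum_{T=1}^{5} \sum_{\substack{R=1 \\ R \neq T}}^{5} M_1^T M_2^R ,$$

and since $H_1 H_2 = \bar{H}_1 \bar{H}_2 + \tilde{H}_1 \tilde{H}_2 + \tilde{H}_1 \bar{H}_2 + \tilde{H}_2 \bar{H}_1$, it follows that

$$\sum_{m=1}^{5} \beta^{m\prime} + \bar{H}'_U + \tilde{H}'_U = K \Bigl(1 - \sum_{T=1}^{5} \sum_{\substack{R=1 \\ R \neq T}}^{5} M_1^T M_2^R\Bigr) = 1 .$$

Dividing by $1 - \bar{H}'_U$ as in equation (10) then yields $\sum_m \beta^m + H_U = 1$, so the combined belief degrees and the unassigned belief together always form a complete distribution.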
As an example, based on the ER algorithm, two
sets of quantitative data (e.g. $R_1$ and $R_2$) are
aggregated as follows:
$R_1$ stands for ‘Problem definition and diagnosis’
(sub-criterion of decision making) assessed for a
team performance (Tables 7 and 9).
$R_2$ stands for ‘Option generation’ (sub-criterion of
decision making) assessed for a team performance
(Tables 7 and 9).
Table 9: Sub-criteria for decision making
_______________________________________________
                 R1        R2
_______________________________________________
Very Poor        0         0.5
Poor             0.5       0.5
Average          0         0
Good             0.5       0
Very Good        0         0
Weight (w_n)     0.2447    0.2069
_______________________________________________
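$w_1 + w_2 = 0.2447 + 0.2069 = 0.4516$
Since the two weights do not sum to one, each is
divided by their sum, i.e. multiplied by
$1 / 0.4516 = 2.21435$:
Normalised weight $w_1 = 0.2447 \times 2.21435 = 0.54185$
Normalised weight $w_2 = 0.2069 \times 2.21435 = 0.45815$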
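$$\beta_1^1 = 0,\quad \beta_1^2 = 0.5,\quad \beta_1^3 = 0,\quad \beta_1^4 = 0.5,\quad \beta_1^5 = 0$$

$$\beta_2^1 = 0.5,\quad \beta_2^2 = 0.5,\quad \beta_2^3 = 0,\quad \beta_2^4 = 0,\quad \beta_2^5 = 0$$

$$\begin{aligned}
M_1^1 &= w_1 \beta_1^1 = 0.54185 \times 0 = 0 \\
M_1^2 &= w_1 \beta_1^2 = 0.54185 \times 0.5 = 0.27093 \\
M_1^3 &= w_1 \beta_1^3 = 0.54185 \times 0 = 0 \\
M_1^4 &= w_1 \beta_1^4 = 0.54185 \times 0.5 = 0.27093 \\
M_1^5 &= w_1 \beta_1^5 = 0.54185 \times 0 = 0
\end{aligned}$$

$$\begin{aligned}
M_2^1 &= w_2 \beta_2^1 = 0.45815 \times 0.5 = 0.22908 \\
M_2^2 &= w_2 \beta_2^2 = 0.45815 \times 0.5 = 0.22908 \\
M_2^3 &= w_2 \beta_2^3 = 0.45815 \times 0 = 0 \\
M_2^4 &= w_2 \beta_2^4 = 0.45815 \times 0 = 0 \\
M_2^5 &= w_2 \beta_2^5 = 0.45815 \times 0 = 0
\end{aligned}$$

$$\bar{H}_1 = 1 - w_1 = 1 - 0.54185 = 0.45815, \qquad \bar{H}_2 = 1 - w_2 = 1 - 0.45815 = 0.54185$$

$$\tilde{H}_1 = w_1 \bigl(1 - (\beta_1^1 + \beta_1^2 + \beta_1^3 + \beta_1^4 + \beta_1^5)\bigr) = 0.54185 \bigl(1 - (0 + 0.5 + 0 + 0.5 + 0)\bigr) = 0$$

$$\tilde{H}_2 = w_2 \bigl(1 - (\beta_2^1 + \beta_2^2 + \beta_2^3 + \beta_2^4 + \beta_2^5)\bigr) = 0.45815 \bigl(1 - (0.5 + 0.5 + 0 + 0 + 0)\bigr) = 0$$

$$H_1 = \bar{H}_1 + \tilde{H}_1 = 0.45815 + 0 = 0.45815, \qquad H_2 = \bar{H}_2 + \tilde{H}_2 = 0.54185 + 0 = 0.54185$$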
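$$K = \Bigl[1 - \sum_{T=1}^{5} \sum_{\substack{R=1 \\ R \neq T}}^{5} M_1^T M_2^R\Bigr]^{-1}$$

Written out row by row over $T$, the double sum is:

$$\begin{aligned}
K = \Bigl[1 - \bigl(\, & M_1^1 (M_2^2 + M_2^3 + M_2^4 + M_2^5) \\
 + {} & M_1^2 (M_2^1 + M_2^3 + M_2^4 + M_2^5) \\
 + {} & M_1^3 (M_2^1 + M_2^2 + M_2^4 + M_2^5) \\
 + {} & M_1^4 (M_2^1 + M_2^2 + M_2^3 + M_2^5) \\
 + {} & M_1^5 (M_2^1 + M_2^2 + M_2^3 + M_2^4) \,\bigr)\Bigr]^{-1}
\end{aligned}$$

Only $M_1^2$, $M_1^4$, $M_2^1$ and $M_2^2$ are non-zero, so
the rows for $T = 1, 3, 5$ vanish and

$$K = \bigl[1 - \bigl(0.27093 \times 0.22908 + 0.27093 \times (0.22908 + 0.22908)\bigr)\bigr]^{-1} = [1 - 0.18619]^{-1}$$

$$K = 1.2288$$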
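$$\bar{H}'_U = K \bigl(\bar{H}_1 \bar{H}_2\bigr) = 0.3050$$

$$\beta^{1\prime} = K \bigl(M_1^1 M_2^1 + M_1^1 H_2 + M_2^1 H_1\bigr) = 0.1289, \qquad \beta^1 = \frac{\beta^{1\prime}}{1 - \bar{H}'_U} = 0.18547$$

$$\beta^{2\prime} = K \bigl(M_1^2 M_2^2 + M_1^2 H_2 + M_2^2 H_1\bigr) = 0.3857, \qquad \beta^2 = \frac{\beta^{2\prime}}{1 - \bar{H}'_U} = 0.55496$$

$$\beta^{3\prime} = K \bigl(M_1^3 M_2^3 + M_1^3 H_2 + M_2^3 H_1\bigr) = 0, \qquad \beta^3 = \frac{\beta^{3\prime}}{1 - \bar{H}'_U} = 0$$

$$\beta^{4\prime} = K \bigl(M_1^4 M_2^4 + M_1^4 H_2 + M_2^4 H_1\bigr) = 0.1805, \qquad \beta^4 = \frac{\beta^{4\prime}}{1 - \bar{H}'_U} = 0.25971$$

$$\beta^{5\prime} = K \bigl(M_1^5 M_2^5 + M_1^5 H_2 + M_2^5 H_1\bigr) = 0, \qquad \beta^5 = \frac{\beta^{5\prime}}{1 - \bar{H}'_U} = 0$$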
The following result was obtained from the above
calculations:
$R_{12} = R_1 \oplus R_2$
_________________________
Very Poor    18.547%
Poor         55.496%
Average       0
Good         25.971%
Very Good     0
_________________________
The calculation is repeated for $R_3$ and $R_4$ and then
repeated again to aggregate $R_{12}$ (i.e. $R_1 \oplus R_2$)
and $R_{34}$ (i.e. $R_3 \oplus R_4$) to find the final value of
the ‘decision making’ element of the group.
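The hand calculation above can be cross-checked with a short script. The following is a minimal Python sketch of equations (6) to (10) for two subsets, applied to the Table 9 inputs. The function name `er_combine` and all variable names are illustrative assumptions; they are not taken from the paper or from the Intelligent Decision System software.

```python
import math

GRADES = ["Very Poor", "Poor", "Average", "Good", "Very Good"]


def er_combine(beta1, beta2, w1, w2):
    """Combine two belief distributions with the two-subset ER rule.

    beta1, beta2 -- belief degrees over the same ordered grades
    w1, w2       -- relative weights (normalised here so that w1 + w2 = 1)
    Returns (combined beliefs, unassigned belief H_U).
    """
    total = w1 + w2
    w1, w2 = w1 / total, w2 / total

    # Eq. (6): basic masses M_n^m = w_n * beta_n^m
    m1 = [w1 * b for b in beta1]
    m2 = [w2 * b for b in beta2]

    # Eq. (8): remaining mass due to the weights (bar) and to
    # incompleteness of each assessment (tilde)
    h1_bar, h2_bar = 1.0 - w1, 1.0 - w2
    h1_tilde = w1 * (1.0 - math.fsum(beta1))
    h2_tilde = w2 * (1.0 - math.fsum(beta2))
    # Eq. (7): total remaining mass per assessment
    h1, h2 = h1_bar + h1_tilde, h2_bar + h2_tilde

    # Eq. (9): K rescales away the conflicting mass (products with T != R)
    conflict = math.fsum(
        m1[t] * m2[r]
        for t in range(len(m1))
        for r in range(len(m2))
        if r != t
    )
    k = 1.0 / (1.0 - conflict)

    beta_raw = [k * (a * b + a * h2 + b * h1) for a, b in zip(m1, m2)]
    hu_bar = k * h1_bar * h2_bar
    hu_tilde = k * (h1_tilde * h2_tilde + h1_tilde * h2_bar + h2_tilde * h1_bar)

    # Eq. (10): normalisation
    beliefs = [b / (1.0 - hu_bar) for b in beta_raw]
    h_u = hu_tilde / (1.0 - hu_bar)
    return beliefs, h_u


# Inputs from Table 9 (weights are the raw AHP weights)
r1 = [0.0, 0.5, 0.0, 0.5, 0.0]
r2 = [0.5, 0.5, 0.0, 0.0, 0.0]
r12, h_u = er_combine(r1, r2, w1=0.2447, w2=0.2069)

for grade, belief in zip(GRADES, r12):
    print(f"{grade:10} {belief:.4f}")
print(f"Unassigned {h_u:.4f}")
```

Because the script carries unrounded intermediate values, its output should agree with the hand-calculated figures only to about three decimal places. Both inputs are complete assessments, so the unassigned belief is zero.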
7.1 Obtaining Utility Value
The main aim of using a utility approach was to
obtain a single crisp number for the top-level criterion
(the final result or goal) of each alternative in order to
rank them.
Let the utility of an evaluation grade $H_n$ be
denoted by $u(H_n)$, with $u(H_{n+1}) > u(H_n)$ if $H_{n+1}$
is preferred to $H_n$; $u(H_n)$ can be estimated using the
decision maker’s preferences. If no preference
information is available, it could be assumed that the
utilities of evaluation grades are equidistantly
distributed in a normalised utility space. The utilities
of evaluation grades that are equidistantly distributed
in a normalised utility space are calculated as

$$u(H_n) = \frac{V_n - V_{\min}}{V_{\max} - V_{\min}} \tag{11}$$
where $V_n$ is the ranking value of the linguistic term
$H_n$ that has been considered, $V_{\max}$ is the ranking
value of the most-preferred linguistic term $H_N$ and
$V_{\min}$ is the ranking value of the least-preferred
linguistic term $H_1$. The utility of the top-level or
general criterion $S(E)$ is denoted by $u(S(E))$. If
$\beta_H \neq 0$ (i.e. the assessment is incomplete,
$\beta_H = 1 - \sum_{n=1}^{N} \beta_n$) there is a belief interval
$[\beta_n, (\beta_n + \beta_H)]$, which provides the likelihood
that $S(E)$ is assessed to $H_n$. Without loss of
generality, suppose that the least-preferred linguistic
term having the lowest utility is denoted by $u(H_1)$
and the most-preferred linguistic term having the
highest utility is denoted by $u(H_N)$. Then the
minimum, maximum and average utilities are defined
as follows respectively (Riahi et al., 2012):

$$u_{\min}(S(E)) = \sum_{n=2}^{N} \beta_n u(H_n) + (\beta_1 + \beta_H)\, u(H_1)$$

$$u_{\max}(S(E)) = \sum_{n=1}^{N-1} \beta_n u(H_n) + (\beta_N + \beta_H)\, u(H_N)$$

$$u_{\text{average}}(S(E)) = \frac{u_{\min}(S(E)) + u_{\max}(S(E))}{2} \tag{12}$$
Obviously, if all the assessments are complete, then
$\beta_H = 0$ and the maximum, minimum and average
utilities of $S(E)$ will be the same. Therefore, $u(S(E))$
can be calculated as

$$u(S(E)) = \sum_{n=1}^{N} \beta_n u(H_n) \tag{13}$$

The above utilities are used only for characterising
an assessment and not for criteria aggregation.
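To illustrate how the interval in equation (12) behaves when an assessment is incomplete, the following Python sketch computes the minimum, maximum and average utilities. The function name `utility_interval` and the example belief values are hypothetical and chosen only for illustration; they are not data from the study. The utilities assume five equidistant grades, as given by equation (11).

```python
import math


def utility_interval(beliefs, utilities):
    """Return (u_min, u_max, u_average) following equation (12).

    beliefs   -- beta_1..beta_N, ordered from least to most preferred grade
    utilities -- u(H_1)..u(H_N) in the same order
    """
    if len(beliefs) != len(utilities):
        raise ValueError("beliefs and utilities must have the same length")

    # Unassigned belief beta_H = 1 - sum(beta_n); zero if the assessment is complete.
    beta_h = 1.0 - math.fsum(beliefs)

    expected = math.fsum(b * u for b, u in zip(beliefs, utilities))
    # Worst case: all unassigned belief is given to the least-preferred grade H_1.
    u_min = expected + beta_h * utilities[0]
    # Best case: all unassigned belief is given to the most-preferred grade H_N.
    u_max = expected + beta_h * utilities[-1]
    return u_min, u_max, (u_min + u_max) / 2.0


# Hypothetical incomplete assessment: the beliefs sum to 0.9, so beta_H = 0.1.
example_beliefs = [0.10, 0.20, 0.30, 0.20, 0.10]
equidistant_utilities = [0.0, 0.25, 0.5, 0.75, 1.0]

u_min, u_max, u_avg = utility_interval(example_beliefs, equidistant_utilities)
print(f"u_min={u_min:.3f}  u_max={u_max:.3f}  u_average={u_avg:.3f}")
```

In this sketch the width of the interval equals the unassigned belief multiplied by the utility range, so a complete assessment collapses the three values into the single number given by equation (13).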
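First, $u(H_n)$ values were calculated for the belief
grades (Very Good = 5, Good = 4, Average = 3,
Poor = 2, Very Poor = 1) using

$$u(H_n) = \frac{V_n - V_{\min}}{V_{\max} - V_{\min}}$$

$$\begin{aligned}
u(H_5) &= \frac{5 - 1}{5 - 1} = 1 \\
u(H_4) &= \frac{4 - 1}{5 - 1} = \frac{3}{4} = 0.75 \\
u(H_3) &= \frac{3 - 1}{5 - 1} = \frac{2}{4} = 0.5 \\
u(H_2) &= \frac{2 - 1}{5 - 1} = \frac{1}{4} = 0.25 \\
u(H_1) &= \frac{1 - 1}{5 - 1} = 0
\end{aligned}$$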
The following ER algorithm output values for the
group were used for the example calculation:
$\beta_1 = 0.3539$
$\beta_2 = 0.3371$
$\beta_3 = 0.2805$
$\beta_4 = 0.0285$
$\beta_5 = 0.000$
Total = 1.000
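If $\beta_1 + \beta_2 + \beta_3 + \beta_4 + \beta_5 = 1$, then the
following equation will be used:

$$u(S(E)) = \sum_{n=1}^{N} \beta_n u(H_n) = \beta_1 u(H_1) + \beta_2 u(H_2) + \beta_3 u(H_3) + \beta_4 u(H_4) + \beta_5 u(H_5)$$

$$u(S(E)) = 0.3539 \times 0 + 0.3371 \times 0.25 + 0.2805 \times 0.5 + 0.0285 \times 0.75 + 0 \times 1 = 0.2459$$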
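The utility calculation above can be reproduced with a few lines of Python. This is a minimal sketch of equations (11) and (13) under the equidistant-utility assumption; the function names `grade_utilities` and `expected_utility` are illustrative and do not come from the paper.

```python
import math


def grade_utilities(ranks):
    """Equation (11): map ranking values V_n onto a normalised [0, 1] scale."""
    v_min, v_max = min(ranks), max(ranks)
    return [(v - v_min) / (v_max - v_min) for v in ranks]


def expected_utility(beliefs, utilities):
    """Equation (13): u(S(E)) = sum of beta_n * u(H_n) for a complete assessment."""
    if not math.isclose(math.fsum(beliefs), 1.0, abs_tol=1e-6):
        raise ValueError("equation (13) assumes the belief degrees sum to 1")
    return math.fsum(b * u for b, u in zip(beliefs, utilities))


# Ranking values: Very Poor = 1 ... Very Good = 5
ranks = [1, 2, 3, 4, 5]
# Group belief degrees from Table 8, ordered Very Poor ... Very Good
group_beliefs = [0.3539, 0.3371, 0.2805, 0.0285, 0.0]

utilities = grade_utilities(ranks)
score = expected_utility(group_beliefs, utilities)

print(utilities)        # [0.0, 0.25, 0.5, 0.75, 1.0]
print(round(score, 4))  # 0.2459
```

The check inside `expected_utility` is a design choice of this sketch: it refuses incomplete assessments, for which the interval of equation (12) would be the appropriate measure instead.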
8 RESULT AND DISCUSSION
The deck officers’ NTS taxonomy was developed using
interviews and AHP, which provided the weights of
each skill and element. These weights were fed into
the ER algorithm when aggregating participants’
NTS performance in a bridge simulator environment.
The examiner observed the students’ NTS in a
ship’s bridge simulator by using behavioural markers,
and the assessment data was then aggregated using
the ER algorithm. As part of the ER calculations, a
utility value was obtained for the group’s NTS, which
provided a crisp number. The final group
performance value was found to be 24.59%.
A value of 24.59% is a poor result: on the normalised
utility scale it lies just below the utility of the ‘Poor’
grade (0.25). The discussion on how to improve a deck
officer’s performance in a crisis situation is outside
the focus of this paper, and further research may be
required to address this issue. What is important here
is that this method has made it possible to
quantitatively assess the NTS performance of
merchant navy deck officers in a bridge simulator and
to provide a crisp number.
Assessing students can be an intensive process for
an examiner. It would be completely unrealistic to
expect an examiner to perform the calculations for
observations on each criterion at the same time as
observing students’ performance. To overcome this
difficulty, the Intelligent Decision System for Multiple
Criteria Assessment software was used. It is
expected that this would also be the case with future
assessments. Observed values were entered into the
software to get a result quickly. In this case, to prove
the reliability of the results generated by this software,
the results were tested against manual calculations
(Section 7) and found to be accurate.
9 CONCLUSION
This methodology has now made it possible to
quantitatively assess the NTS of deck officers in a
bridge simulator. The necessary calculations can be
performed by the Intelligent Decision System for
Multiple Criteria Assessment software, as the
examiner may not have the skill set or time to perform
the calculations for each observation. The use of the
software makes it easy to input the values and obtain
the final results in a timely fashion.
ACKNOWLEDGEMENT
The material and data in this publication have been
obtained through the funding and support of the
International Association of Maritime Universities
(IAMU) and The Nippon Foundation in Japan.
REFERENCES
Aull-Hyde, R., Erdogan, S. and Duke, J. D. (2006) An experiment on the consistency of aggregated comparison matrices in AHP. European Journal of Operational Research, 171, pp. 290-295.
Balci, M. B. C., Tas, T., Hazar, A. I., Aydin, M., Onuk, O., Cakiroglu, B., Fikri, O., Ozkan, A. and Nuhoglu, B. (2014) Applicability and effectiveness of virtual reality simulator training in urology surgery: A double-blind randomised study. Noble medicus 29, 10(2), pp. 66-71.
Coyle, G. (2004) The Analytic Hierarchy Process (AHP). Practical Strategy. Open Access Material. AHP. Pearson Education Limited.
Fletcher, G., Flin, R. and McGeorge, P. (2003b) Interview study to identify anaesthetists’ non-technical skills. University of Aberdeen SCPMDE Project: RDNES/991/C.
Flin, R., Martin, L., Goeters, K., Hoermann, J., Amalberti, R., Valot, C. and Nijhuis, H. (2003) Development of the NOTECHS (Non-Technical Skills) system for assessing pilots’ CRM skills. Human Factors and Aerospace Safety, 3(2), pp. 95-117.
Gatfield, D. (2008) Behavioural markers for the assessment of competence in crisis management. PhD thesis, Southampton Solent University.
Helmreich, R. L., Merritt, A. C. and Wilhelm, J. A. (1999) The evolution of Crew Resource Management Training in commercial aviation. The International Journal of Aviation Psychology, 9(1), pp. 19-32.
Ishizaka, A. and Labib, A. (2009) Analytic Hierarchy Process and Expert Choice: Benefits and Limitation. OR Insight, 22(4), pp. 201-220.
Klampfer, B., Flin, R., Helmreich, R., Hausler, R., Fletcher, G., Field, P., Staender, S., Lauche, K., Dieckmann, A. and Amacher, A. (2001) Behavioural Markers Workshop. Group interaction in high risk environment (GIHRE) project. GIHRE-Aviation: Swiss Federal Institute of Technology (ETH) Zurich, Swiss training centre, 5-6 July 2001.
Kozuba, J. and Bondaruk, A. (2014) Flight simulator as an essential device supporting the process of shaping pilot’s situational awareness. International conference of scientific paper, AFASES 2014, Brasov, 22-24 May 2014, pp. 695-714.
Michael, M., Abboudi, H., Ker, J., Khan, M. S., Dasgupta, P. and Ahmed, K. (2014) Performance of technology-driven simulators for medical students: A systemic review. Journal of Surgical Research, 192, pp. 531-543.
Mitchell, L., Flin, R., Yule, S., Mitchell, J., Coutts, K. and Youngson, G. (2013) Development of a behavioural marker system for scrub practitioners’ non-technical skills (SPLINTS system). Journal of Evaluation in Clinical Practice, 19, pp. 317-323.
Mohovic, R., Rudan, I. and Mohovic, D. (2012) Problems during simulator training in ship handling education. Scientific Journal of Maritime Research, 26(1), pp. 191-199.
Pelletier, S. (2006) The role of navigation simulator technology in marine pilotage. International Maritime Pilotage Association 18th Congress, Havana, Cuba, 23rd November 2006, pp. 1-5.
Riahi, R., Bonsall, S., Jenkinson, I. and Wang, J. (2012) A Seafarer’s reliability assessment incorporating subjective judgements. Journal of Engineering for the Maritime Environment, 226(4), pp. 313-334.
Saaty, T. L. (1990) How to make decisions: The Analytic Hierarchy Process. European Journal of Operational Research, 48(1), pp. 9-26.
Sniegocki, H. (2005) Impact of the usage of visual simulator on the students training results. Conference paper, International Conference on modelling and simulation general application and models in engineering science, Gdynia Maritime University.
Wall, A. D. (2015) Subject Head, LJMU, Interview, 27th May 2015.
Wanger, R., Razek, V., Grafe, F., Berlarge, T., Janousek, J., Daehnert, I. and Weidenbach, M. (2013) Effectiveness of simulator-based echocardiography training of non-cardiologists in congenital heart diseases. Echocardiography, Wiley Periodicals, Inc. DOI: 10.1111/echo.12118, pp. 693-698.
Winter, J. C. F., Dodou, D. and Mulder, M. (2012) Training effectiveness of whole body flight simulator motion: A comprehensive meta-analysis. The International Journal of Aviation Psychology, 22(2), pp. 164-183.