the speed, are sent to the rudder control system. In the
process of collision avoidance of ships, the
information transmitted by multiple sensors and
equipment is continuously integrated, making sure
the collision avoidance scheme is adjusted in time
(Wang, 2007). The core of the current research is how
the decision-making module can satisfy the optimal
navigational operations in all types of extreme
offshore environments.
Therefore, in the risk assessment and early
warning research of unmanned ship navigation, it is
necessary to focus on unmanned ships in complex
navigation conditions (such as ports, straits, canals,
and other traffic-intensive waters), as well as on
ship collision avoidance, hydrometeorology, the
geographical environment, the traffic situation, and
other issues. This research is based on ship sensor
data acquisition and training optimization of
decision neural networks (Mazurowski, 2008). An
intelligent risk warning model and method suitable
for unmanned ships under complex navigation
conditions is developed to approach real-time
warning of ships (Scheffer, 2012).
In the intelligent decision-making research, an
intelligent fusion correlation analysis is carried out on
static and dynamic targets and navigation conditions
around unmanned ships. Intelligent theories, such as
deep learning, knowledge base, and situation
calculation, are applied. Research on intelligent
decision theory for ship navigation, based on ship
navigation system information and shore-based
support information, aims to break through the key
technologies of autonomous meteorological
navigation for ships. Technologies such as ship
collision avoidance, reef avoidance, anti-shelf
integration, and smart processing of navigation
information support autonomous decision-making in
ship navigation (Capraro, 2006).
To achieve an intelligent collision avoidance function
for unmanned ships in various environments, a
collision avoidance decision module based on deep
reinforcement learning is proposed to make
autonomous decisions under various conditions
(Mnih, 2015). In the cyber confrontation game, the
DeepMind team collects enough data for training;
however, in the real navigation environment, it is
difficult to obtain data covering a rich and varied
nautical environment. In particular, various types of
encounter ships have different points of observation
in different situations, and it is difficult to predict
their future path of navigation (Sarukkai, 2000). In
the process of calculating the global optimal solution,
the decision model is difficult to differentiate because
the actions are discrete; as a result, the global optimal
solution cannot converge. Therefore, this study
proposes a generative adversarial network (GAN)
model to solve the problem of neural network training
data, and the combination of GAN and deep
reinforcement learning to solve the convergence
problem of the optimal action under discrete action
unit conditions.
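
The differentiability issue mentioned above can be illustrated with a small sketch. It is not the method of this study: the action scores, the rewards, and the softmax relaxation are all hypothetical choices made only to show why a hard (argmax) action choice gives no gradient signal, whereas a smooth policy does.

```python
import math

def greedy_value(scores, rewards):
    """Reward of the action picked by argmax over the scores (discrete choice)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return rewards[best]

def soft_value(scores, rewards):
    """Expected reward under a softmax policy (an illustrative smooth relaxation)."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return sum(e / total * r for e, r in zip(exps, rewards))

def finite_diff(f, scores, rewards, index, eps=1e-4):
    """Central finite-difference derivative of f with respect to scores[index]."""
    up = list(scores)
    up[index] += eps
    down = list(scores)
    down[index] -= eps
    return (f(up, rewards) - f(down, rewards)) / (2 * eps)

# Hypothetical scores and rewards for three rudder actions (port, hold, starboard).
scores = [0.2, 1.0, -0.5]
rewards = [1.0, 0.0, -1.0]

grad_greedy = finite_diff(greedy_value, scores, rewards, 0)  # argmax does not change
grad_soft = finite_diff(soft_value, scores, rewards, 0)      # smooth, non-zero slope
print(grad_greedy, grad_soft)
```

A small change in a score leaves the argmax action unchanged, so the greedy value is locally flat and offers nothing for backpropagation to follow; the relaxed value responds smoothly to the same change.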
2 RELATED WORK
2.1 The principle of GAN
GAN is a method proposed by Goodfellow (2014) to
train generative models. A GAN comprises two
"adversarial" models: a generative model and a
discriminative model. The generative model (G) is
used to capture the data distribution, and the
discriminative model (D) is used to estimate the
probability that a sample is derived from real data
rather than the generated data. Both the generator and
the discriminator are commonly convolutional
networks or fully connected networks. The generator
generates a sample from a stochastic vector, and the
discriminator discriminates between the generated
sample and the real training set sample.
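
The two roles described above can be sketched in a few lines. This is a minimal, untrained illustration under strong simplifying assumptions: a single linear layer stands in for the generator and a logistic regression for the discriminator, whereas the text refers to convolutional or fully connected networks. The names `make_generator` and `make_discriminator` and the dimensions are hypothetical.

```python
import math
import random

def make_generator(noise_dim, sample_dim, rng):
    """Return G: a single linear layer mapping a noise vector to a sample."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(noise_dim)]
               for _ in range(sample_dim)]

    def generate(z):
        return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

    return generate

def make_discriminator(sample_dim, rng):
    """Return D: a logistic regression giving the probability a sample is real."""
    weights = [rng.gauss(0.0, 0.5) for _ in range(sample_dim)]

    def discriminate(x):
        logit = sum(w * xi for w, xi in zip(weights, x))
        return 1.0 / (1.0 + math.exp(-logit))

    return discriminate

rng = random.Random(0)  # fixed seed so the sketch is repeatable
G = make_generator(noise_dim=4, sample_dim=3, rng=rng)
D = make_discriminator(sample_dim=3, rng=rng)

z = [rng.gauss(0.0, 1.0) for _ in range(4)]  # stochastic input vector
fake = G(z)                                  # generated ("fake") sample
p_real = D(fake)                             # D's estimate that the sample is real
print(fake, p_real)
```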
This optimization process can be formulated as a
two-player minimax game problem. Both purposes
can be achieved through a backpropagation method.
A well-trained generation network can transform any
noise vector into a sample similar to the training set.
This noise can be seen as the encoding of the sample
in a low-dimensional space. The generator generates
meaningful data based on the stochastic vectors. In
contrast, the discriminator learns how to distinguish
real from generated data and then passes the learning
experience to the generator, thereby enabling the
generator to generate more workable data based on
the stochastic vectors. Such a trained generator can
have many uses, one of them being environmental
generation in automatic navigation.
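
For reference, the two-player minimax game above is usually written with the value function from Goodfellow (2014). The symbols $p_{\text{data}}$ (distribution of the real data) and $p_z$ (distribution of the stochastic input vector) follow that formulation and are not defined elsewhere in this section.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator D is trained to increase V by assigning high probability to real samples and low probability to generated ones, while the generator G is trained to decrease the second term by producing samples that D accepts as real.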
The specific process to obtain various target ships
is shown in Figure 1. First, a few stochastic vectors are
fed as input to the generator network, and fake data
are subsequently generated by the generator. These
fake data can correspond to a few ship state pictures
or to navigation data, such as AIS data of a nearby
ship encountered by the given ship or the path
planning data after the ship route is updated. We
input the fake data to the discriminator, and the
discriminator determines whether the input data are
real data or fake data generated by the generator. As
the similarity between the generated data and the real
data gradually increases, the discriminating ability
required of the discriminator increases accordingly.
The generator and the discriminator are thus in a
mutually competitive, adversarial relationship. Once
the generated data sufficiently mirror the real data,
the fake data output by the generator appear
sufficiently realistic, and the accuracy of the
discriminator is approximately 50%. This corresponds
to the target ship image data that are required in a
critical sea environment.
Figure 1. Applying GAN to generate various target ships with
different backgrounds
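
The 50% accuracy figure can be checked on a toy example. The sketch below uses the optimal discriminator from Goodfellow (2014), D*(x) = p_data(x) / (p_data(x) + p_g(x)), on a discrete distribution. The encounter categories and their probabilities are hypothetical and serve only as an illustration; real and generated samples are assumed to be presented equally often.

```python
def optimal_discriminator(p_real, p_fake):
    """D*(x) = p_real(x) / (p_real(x) + p_fake(x)) for each outcome x."""
    return {x: p_real[x] / (p_real[x] + p_fake[x]) for x in p_real}

def best_accuracy(p_real, p_fake):
    """Accuracy of D* when real and fake samples are equally likely."""
    d_star = optimal_discriminator(p_real, p_fake)
    acc = 0.0
    for x, d in d_star.items():
        if d >= 0.5:
            acc += 0.5 * p_real[x]  # D* says "real": correct on the real samples
        else:
            acc += 0.5 * p_fake[x]  # D* says "fake": correct on the fake samples
    return acc

# Hypothetical distribution of encounter types in the real data.
p_real = {"head-on": 0.5, "crossing": 0.3, "overtaking": 0.2}
# Hypothetical early generator that over-produces one encounter type.
p_early = {"head-on": 0.1, "crossing": 0.1, "overtaking": 0.8}

acc_early = best_accuracy(p_real, p_early)       # generator still easy to spot
acc_converged = best_accuracy(p_real, p_real)    # generator matches the real data
print(acc_early, acc_converged)
```

When the generated distribution matches the real one, D* equals 0.5 everywhere, so even the best possible discriminator does no better than chance.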