In recent years, generative models known as "diffusion models" have become increasingly popular, and for good reason.
The world has seen what diffusion models can do, such as outperforming GANs on image synthesis, thanks to a handful of landmark papers published in 2020 and 2021.
Practitioners most recently saw diffusion models at work in DALL-E 2, OpenAI's image generation model released last month.
Given this recent success, many machine learning practitioners are no doubt curious about the inner workings of diffusion models.
In this post, we look at the theoretical underpinnings of diffusion models, their design, their benefits, and more. Let's go.
What Is a Diffusion Model?
Let's begin by seeing why this kind of model is called a diffusion model.
The term diffusion comes from thermodynamics in physics. A system is out of equilibrium when a substance, such as a scent, is highly concentrated in one place.
Diffusion must take place for the system to reach equilibrium. The scent molecules spread through the system from the region of high concentration, making the system uniform throughout.
Eventually everything becomes homogeneous because of diffusion.
Diffusion models are inspired by this non-equilibrium thermodynamic setting. They use a Markov chain, which is a sequence of random variables in which the value of each variable depends only on the state of the previous one.
Taking an image, we add a small amount of noise to it at each step of the forward diffusion phase.
After storing the noisy image, we create the next image in the sequence by adding more noise.
This process is repeated many times. Repeating it enough times yields an image of pure noise.
So how do we generate an image from pure noise?
The diffusion process is reversed using a neural network. The same network, with the same weights, is used at every step of the reverse diffusion process to produce the image at step t-1 from the image at step t.
To simplify the task further, instead of having the network predict the image itself, one can have it predict the noise at each step, which is then removed from the image.
In either case, the neural network architecture must be chosen so that it preserves the dimensionality of the data.
A Deeper Dive into Diffusion Models
A diffusion model consists of a forward process (also known as the diffusion process), in which a datum (usually an image) is gradually noised, and a reverse process (also known as the reverse diffusion process), in which noise is transformed back into a sample from the target distribution.
When the noise added at each step is small enough, the transitions of the sampling chain in the forward process can be set to conditional Gaussians. Combining this fact with the Markov assumption gives a simple parameterization of the forward process:
$$q(x_{1:T} \mid x_0) := \prod_{t=1}^{T} q(x_t \mid x_{t-1}), \qquad q(x_t \mid x_{t-1}) := \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big)$$
where $\beta_1, \ldots, \beta_T$ is a variance schedule (either learned or fixed) which ensures, for sufficiently large T, that $x_T$ is nearly an isotropic Gaussian.
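The forward chain above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `forward_step` and `forward_chain`, the toy 64x64 "image", and the constant schedule of 500 steps with beta = 0.02 are not from the text and were chosen purely to make the example small.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

def forward_chain(x0, betas, rng):
    """Run the Markov chain x_0 -> x_1 -> ... -> x_T and return every state."""
    states = [x0]
    for beta_t in betas:
        # Each state depends only on the previous one (Markov property).
        states.append(forward_step(states[-1], beta_t, rng))
    return states

rng = np.random.default_rng(0)
x0 = np.ones((64, 64))          # toy "image" (assumption, for illustration)
betas = np.full(500, 0.02)      # constant schedule (assumption, for illustration)
states = forward_chain(x0, betas, rng)
x_T = states[-1]                # approximately N(0, I) after enough steps
```

After enough steps the original signal is scaled down to almost nothing and the accumulated noise has close to unit variance, which is what "nearly an isotropic Gaussian" means in practice.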
The reverse process is where the magic of diffusion models happens. During training, the model learns to reverse the diffusion process in order to generate new data. Starting from pure Gaussian noise,
$$p(x_T) := \mathcal{N}(x_T;\ 0,\ I),$$
the model learns the joint distribution $p_\theta(x_{0:T})$ as
$$p_\theta(x_{0:T}) := p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t), \qquad p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big)$$
where the time-dependent parameters of the Gaussian transitions are learned. Note in particular that the Markov formulation means a given reverse diffusion transition depends only on the previous timestep (or the following timestep, depending on how you look at it):
$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big)$$
Model Training
A diffusion model is trained by finding the reverse Markov transitions that maximize the likelihood of the training data. In practice, training is equivalent to minimizing the variational upper bound on the negative log likelihood:
$$\mathbb{E}\big[-\log p_\theta(x_0)\big] \le \mathbb{E}_q\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right] = \mathbb{E}_q\left[-\log p(x_T) - \sum_{t \ge 1} \log \frac{p_\theta(x_{t-1} \mid x_t)}{q(x_t \mid x_{t-1})}\right] =: L$$
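For reference, the bound L is usually rewritten as a sum of per-timestep terms, which is where names such as $L_T$ come from. The decomposition below is the one given in the DDPM paper by Ho et al.; it is reproduced here as a convenience, not derived in this post.

```latex
L = \mathbb{E}_q\Big[
      \underbrace{D_{\mathrm{KL}}\big(q(x_T \mid x_0)\,\|\,p(x_T)\big)}_{L_T}
    + \sum_{t>1} \underbrace{D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\big)}_{L_{t-1}}
    \underbrace{{}- \log p_\theta(x_0 \mid x_1)}_{L_0}
\Big]
```

The first term involves only the forward process and the fixed prior, the middle terms compare Gaussians to Gaussians, and the last term is the likelihood of the data under the final reverse transition.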
Models
With the mathematical foundation of our objective function in place, we now need to decide how to implement our diffusion model. The only choice required for the forward process is the variance schedule, whose values generally increase over the course of the process.
For the reverse process, the main choices are the parameterization of the Gaussian distribution and the model architecture.
The only requirement on the architecture is that its input and output have the same dimensionality. This highlights how much freedom diffusion models allow.
Below, we go into more detail on these choices.
Forward Process
We must specify the variance schedule for the forward process. We set the variances to time-dependent constants and ignore the possibility that they could be learned. The schedule is linear, running from
$\beta_1 = 10^{-4}$ to $\beta_T = 0.02$.
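A minimal sketch of this linear schedule follows. The number of steps `T = 1000` is an assumption (the text does not state it), and `alphas`, `alpha_bar`, and `q_sample` are names introduced here for illustration. The helper relies on a standard property of Gaussian chains: with $\bar\alpha_t = \prod_{s \le t} (1 - \beta_s)$, the state $x_t$ can be drawn directly from $x_0$ as $\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ without iterating through the intermediate steps.

```python
import numpy as np

T = 1000                               # number of diffusion steps (assumption)
betas = np.linspace(1e-4, 0.02, T)     # linear schedule from beta_1 to beta_T
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)         # alpha_bar[t-1] = prod_{s<=t} (1 - beta_s)

def q_sample(x0, t, noise):
    """Sample x_t directly from x_0 for a 1-indexed timestep t."""
    a = alpha_bar[t - 1]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))            # toy data (assumption)
x_T = q_sample(x0, T, rng.standard_normal(x0.shape))
```

With these values `alpha_bar[-1]` is very close to zero, so almost none of the original signal survives in `x_T`.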
Because the variance schedule is fixed, the term $L_T$ of the variational bound (the KL divergence between $q(x_T \mid x_0)$ and the prior $p(x_T)$) is constant with respect to our learnable parameters, so we can ignore it during training regardless of the specific values chosen.
Reverse Process
We now go over the choices needed to define the reverse process. Recall that we defined the reverse Markov transitions as Gaussian:
$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big)$$
We must now decide on the functional forms of $\mu_\theta$ and $\Sigma_\theta$. Although there are more sophisticated ways to parameterize $\Sigma_\theta$, we simply set
$$\Sigma_\theta(x_t, t) = \sigma_t^2 I$$
$$\sigma_t^2 = \beta_t$$
In other words, we treat the multivariate Gaussian as a product of independent Gaussians that share the same variance, a variance that can change over time. These variances are set to match the variance schedule of the forward process.
With this formulation, we have:
$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big) := \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\big)$$
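Because the covariance is $\sigma_t^2 I$, sampling one reverse transition amounts to adding independent noise with variance $\beta_t$ to every coordinate of the predicted mean. A small sketch, assuming the mean has already been computed (here a zero array stands in for $\mu_\theta(x_t, t)$; the function name is hypothetical):

```python
import numpy as np

def sample_reverse_transition(mean, beta_t, rng):
    """Draw x_{t-1} ~ N(mean, sigma_t^2 I) with sigma_t^2 = beta_t."""
    sigma_t = np.sqrt(beta_t)
    # Isotropic covariance: every coordinate gets independent noise of equal scale.
    return mean + sigma_t * rng.standard_normal(mean.shape)

rng = np.random.default_rng(0)
mean = np.zeros(100_000)   # stand-in for mu_theta(x_t, t); a trained model would predict this
samples = sample_reverse_transition(mean, beta_t=0.02, rng=rng)
```

The empirical variance of `samples` should be close to the chosen `beta_t`.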
This leads to the alternative loss function shown below, which the authors found to give more stable training and better results:
$$L_{\text{simple}}(\theta) := \mathbb{E}_{t, x_0, \epsilon}\Big[\big\|\epsilon - \epsilon_\theta\big(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t\big)\big\|^2\Big]$$
where $\alpha_t := 1 - \beta_t$, $\bar\alpha_t := \prod_{s=1}^{t} \alpha_s$, and $\epsilon \sim \mathcal{N}(0, I)$.
The authors also draw a connection between this formulation of diffusion models and score-matching generative models based on Langevin dynamics. Much as wave-based and matrix-based quantum mechanics were developed independently and concurrently and turned out to be two equivalent formulations of the same phenomena, diffusion models and score-based models may be two sides of the same coin.
Network Architecture
Although our simplified loss function is designed to train a model $\epsilon_\theta$, we have not yet settled on the architecture of that model. Keep in mind that the only requirement is that its input and output have the same dimensionality.
Given this constraint, it is perhaps unsurprising that U-Net-like architectures are commonly used to implement image diffusion models.
The path along the reverse process consists of many transformations under continuous conditional Gaussian distributions. Recall, however, that the goal of the reverse process is to produce an image made up of integer pixel values. We therefore need a way to obtain discrete (log) likelihoods for every possible pixel value across all pixels.
This is done by making the last transition of the reverse diffusion chain an independent discrete decoder. The likelihood of a given image $x_0$ given $x_1$ is:
$$p_\theta(x_0 \mid x_1) = \prod_{i=1}^{D} \int_{\delta_-(x_0^i)}^{\delta_+(x_0^i)} \mathcal{N}\big(x;\ \mu_\theta^i(x_1, 1),\ \sigma_1^2\big)\,dx$$
$$\delta_+(x) = \begin{cases} \infty & \text{if } x = 1 \\ x + \frac{1}{255} & \text{if } x < 1 \end{cases} \qquad \delta_-(x) = \begin{cases} -\infty & \text{if } x = -1 \\ x - \frac{1}{255} & \text{if } x > -1 \end{cases}$$
where the superscript i denotes extraction of a single coordinate and D is the number of dimensions in the data.
The goal here is to determine how likely each integer value is for a given pixel, given the distribution over possible values for that pixel in the slightly noised image at time t=1.
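The decoder can be illustrated by integrating a Gaussian over the bin that belongs to each pixel value. The sketch below assumes pixel values have been scaled linearly from {0, ..., 255} to [-1, 1], which is what makes the bin half-width 1/255; the function names and the 0.999 thresholds used to detect the two edge values are choices made for this example.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF, applied elementwise (handles +/- infinity)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def pixel_probability(x0, mean, sigma):
    """Probability of each discretised pixel value x0 in [-1, 1] under N(mean, sigma^2)."""
    x0 = np.asarray(x0, dtype=float)
    upper = np.where(x0 > 0.999, np.inf, x0 + 1.0 / 255.0)     # delta_plus
    lower = np.where(x0 < -0.999, -np.inf, x0 - 1.0 / 255.0)   # delta_minus
    return normal_cdf((upper - mean) / sigma) - normal_cdf((lower - mean) / sigma)

def decoder_log_likelihood(x0, mean, sigma):
    """Sum of per-coordinate log probabilities, i.e. log p(x_0 | x_1)."""
    probs = pixel_probability(x0, mean, sigma)
    return np.sum(np.log(np.clip(probs, 1e-12, None)))   # clip avoids log(0)

levels = np.arange(256) / 127.5 - 1.0     # all 256 pixel values mapped to [-1, 1]
probs = pixel_probability(levels, mean=0.3, sigma=0.1)
```

Because the 256 bins tile the whole real line, the probabilities in `probs` sum to one, so the decoder defines a proper discrete distribution over pixel values.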
Final Objective
According to the authors, the best results came from predicting the noise component of an image at a given timestep. In the end, they use the following objective:
$$L_{\text{simple}}(\theta) := \mathbb{E}_{t, x_0, \epsilon}\Big[\big\|\epsilon - \epsilon_\theta\big(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t\big)\big\|^2\Big]$$
The following figure summarizes the training and sampling procedures for our diffusion model:
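In code, the two procedures might look roughly like the sketch below. It is not the authors' implementation: `eps_model` is a hypothetical callable standing in for a trained network, the schedule length `T = 50` is chosen only to keep the example fast, and the reverse mean $\frac{1}{\sqrt{\alpha_t}}\big(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\big)$ is the noise-prediction form used in the DDPM paper, which this post does not derive.

```python
import numpy as np

def make_schedule(T=1000, beta_1=1e-4, beta_T=0.02):
    """Linear variance schedule plus the derived alpha and alpha_bar arrays."""
    betas = np.linspace(beta_1, beta_T, T)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def simple_loss(eps_model, x0, t, eps, alpha_bar):
    """Single-sample estimate of L_simple for a 1-indexed timestep t and noise eps."""
    a = alpha_bar[t - 1]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps      # noised input
    return np.sum((eps - eps_model(x_t, t)) ** 2)       # squared error on the noise

def sample(eps_model, shape, betas, alphas, alpha_bar, rng):
    """Start from x_T ~ N(0, I) and apply the reverse transitions down to x_0."""
    x = rng.standard_normal(shape)
    for t in range(len(betas), 0, -1):
        # No noise is added on the final step.
        z = rng.standard_normal(shape) if t > 1 else np.zeros(shape)
        eps_hat = eps_model(x, t)
        mean = (x - betas[t - 1] / np.sqrt(1.0 - alpha_bar[t - 1]) * eps_hat) / np.sqrt(alphas[t - 1])
        x = mean + np.sqrt(betas[t - 1]) * z             # sigma_t^2 = beta_t
    return x

rng = np.random.default_rng(0)
betas, alphas, alpha_bar = make_schedule(T=50)
zero_model = lambda x_t, t: np.zeros_like(x_t)   # hypothetical stand-in for a trained network
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))         # toy data (assumption)
eps = rng.standard_normal(x0.shape)
loss = simple_loss(zero_model, x0, t=10, eps=eps, alpha_bar=alpha_bar)
generated = sample(zero_model, x0.shape, betas, alphas, alpha_bar, rng)
```

In actual training, `t` and `eps` would be drawn at random for each example and the loss minimized with gradient descent over the parameters of `eps_model`. With the untrained placeholder used here, `generated` is of course just noise.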
Benefits of Diffusion Models
As noted above, research on diffusion models has grown rapidly in recent times. Inspired by non-equilibrium thermodynamics, diffusion models currently produce state-of-the-art image quality.
Beyond cutting-edge image quality, diffusion models offer a number of other benefits, such as not requiring adversarial training.
The difficulties of adversarial training are well documented, so when a non-adversarial alternative offers comparable performance and training efficiency, it is usually the better choice.
Diffusion models also offer the advantages of scalability and parallelizability when it comes to training efficiency.
Although diffusion models can seem to produce results out of thin air, those results rest on many careful and interesting mathematical choices and details, and best practices are still evolving.
Conclusion
In conclusion, researchers have demonstrated high-quality image synthesis using diffusion probabilistic models, a class of latent variable models inspired by ideas from nonequilibrium thermodynamics.
They have achieved a great deal thanks to state-of-the-art results and non-adversarial training, and given how young the field is, further advances can be expected in the coming years.
In particular, diffusion models have proven essential to the performance of cutting-edge models such as DALL-E 2.
The full research can be found here.