In recent years, generative models known as "diffusion models" have grown considerably in popularity, and for good reason.
The world has seen what diffusion models can do, such as outperforming GANs on image synthesis, thanks to a handful of influential papers published in 2020 and 2021.
Practitioners most recently saw diffusion models used in DALL-E 2, OpenAI's image generation model, which was released last month.
Given this recent success, many machine learning practitioners are undoubtedly curious about the inner workings of diffusion models.
In this post, we look at the theoretical foundations of diffusion models, their design, their advantages, and more. Let's go.
What is a Diffusion Model?
Let's start by working out why this kind of model is called a diffusion model.
In physics classes, diffusion is a term associated with thermodynamics. A system is not in equilibrium if there is a high concentration of some substance, such as a scent, in one place.
Diffusion must take place for the system to reach equilibrium. The scent molecules spread through the whole system from the region of high concentration, making the system uniform throughout.
Eventually everything becomes homogeneous because of diffusion.
Diffusion models are inspired by this non-equilibrium thermodynamic setting. They use a Markov chain, a sequence of random variables in which each value depends only on the state of the previous one.
Taking an image, we successively add a certain amount of noise to it during the forward diffusion phase.
After storing the noisy image, we create the next image in the sequence by introducing additional noise.
This process is repeated many times. Repeating it often enough yields an image of pure noise.
So how can we generate an image from this noisy picture?
The diffusion process is reversed using a neural network. The same network with the same weights is used at every step of the reverse diffusion process to produce the image at step $t-1$ from the image at step $t$.
To simplify the task, instead of having the network predict the image, one can have it predict the noise at each step, which must then be removed from the image.
In either case, the neural network architecture must be chosen so that it preserves the dimensionality of the data.
A Deeper Dive into Diffusion Models
A diffusion model consists of a forward process (also called the diffusion process), in which a datum (usually an image) is progressively noised, and a reverse process (also known as the reverse diffusion process), in which noise is transformed back into a sample from the target distribution.
When the noise level is sufficiently low, conditional Gaussians can be used to define the sampling transitions in the forward process. Combining this fact with the Markov assumption gives a simple parameterization of the forward process:
$$q(x_{1:T} \mid x_0) := \prod_{t=1}^{T} q(x_t \mid x_{t-1}), \qquad q(x_t \mid x_{t-1}) := \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\right)$$
where $\beta_1, \ldots, \beta_T$ is a variance schedule (either learned or fixed) which ensures, for sufficiently large $T$, that $x_T$ is nearly an isotropic Gaussian.
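The forward chain above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `forward_step` and `forward_chain`, the toy constant image, and the choice of 1000 linearly spaced variances are all assumptions made for the example.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

def forward_chain(x0, betas, rng):
    """Run the whole Markov chain and return [x_0, x_1, ..., x_T]."""
    xs = [x0]
    for beta_t in betas:
        # Each state depends only on the previous one (Markov property).
        xs.append(forward_step(xs[-1], beta_t, rng))
    return xs

rng = np.random.default_rng(0)
x0 = np.ones((64, 64))                  # toy "image": every pixel equals 1
betas = np.linspace(1e-4, 0.02, 1000)   # assumed schedule with T = 1000
xs = forward_chain(x0, betas, rng)
x_T = xs[-1]

# After enough steps the sample should look like standard Gaussian noise:
# mean close to 0 and standard deviation close to 1.
print("mean:", x_T.mean(), "std:", x_T.std())
```

With a schedule like this, almost nothing of the original image survives in `x_T`, which is what allows the reverse process to start from pure noise.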
The reverse process is where the magic of diffusion models happens. During training, the model learns to reverse the diffusion process in order to generate new data. Starting from pure Gaussian noise, $p(x_T) := \mathcal{N}(x_T;\ 0,\ I)$, the model learns the joint distribution $p_\theta(x_{0:T})$ as
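$$p_\theta(x_{0:T}) := p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t), \qquad p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$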
where the time-dependent parameters of the Gaussian transitions are learned. In particular, note that under the Markov formulation a given reverse transition distribution depends only on the previous timestep (or the following timestep, depending on how you look at it):
$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$
Training the Model
A diffusion model is trained by finding the reverse Markov transitions that maximize the likelihood of the training data. In practice, training amounts to minimizing the variational upper bound on the negative log-likelihood:
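$$\mathbb{E}\left[-\log p_\theta(x_0)\right] \le \mathbb{E}_q\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right] = \mathbb{E}_q\left[-\log p(x_T) - \sum_{t \ge 1} \log \frac{p_\theta(x_{t-1} \mid x_t)}{q(x_t \mid x_{t-1})}\right] =: L$$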
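For reference, the bound $L$ is commonly rewritten as a sum of KL divergences between Gaussians. The decomposition below follows the one given in the DDPM paper by Ho et al. The labels $L_T$, $L_{t-1}$ and $L_0$ are that paper's notation and are introduced here only as a reading aid.

```latex
L = \mathbb{E}_q\Big[
      \underbrace{D_{\mathrm{KL}}\big(q(x_T \mid x_0) \,\|\, p(x_T)\big)}_{L_T}
    + \sum_{t > 1} \underbrace{D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0) \,\|\, p_\theta(x_{t-1} \mid x_t)\big)}_{L_{t-1}}
    \underbrace{{} - \log p_\theta(x_0 \mid x_1)}_{L_0}
    \Big]
```

Here $L_T$ compares the fully noised data with the prior, each $L_{t-1}$ compares one learned reverse transition with the true posterior of the forward process, and $L_0$ is the reconstruction term for the final step.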
Model Choices
Having established the mathematical foundations of our objective function, we now need to decide how to implement our diffusion model. The only decision required for the forward process is defining the variance schedule, whose values generally increase over the course of the process.
For the reverse process, we need to choose the Gaussian distribution parameterization and the model architecture.
The only requirement on the architecture is that its input and output have the same dimensionality. This underlines the high degree of flexibility that diffusion models afford.
Below, we go into more detail on these choices.
Forward Process
For the forward process, we need to specify the variance schedule. We set the variances to fixed, time-dependent constants and ignore the fact that they could be learned. One choice is a linear schedule from $\beta_1 = 10^{-4}$ to $\beta_T = 0.02$.
Because the variance schedule is fixed, $L_T$ (the term of the bound $L$ that compares $q(x_T \mid x_0)$ with the prior $p(x_T)$) is a constant with respect to our learnable parameters, so we can ignore it during training regardless of the specific values chosen.
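A fixed linear schedule of this kind is easy to write down. The sketch below assumes $T = 1000$ steps, which is not stated in this post, and also computes the quantities $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s \le t} \alpha_s$, standard DDPM notation that reappears in the simplified loss.

```python
import numpy as np

T = 1000                              # assumed number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)    # beta_1 ... beta_T, linearly spaced
alphas = 1.0 - betas                  # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)       # alpha_bar_t = product of alpha_1..alpha_t

# alpha_bar_t is the fraction of the original signal variance left at step t.
# It starts near 1 and decays towards 0, so x_T is almost pure noise.
print("first and last beta:", betas[0], betas[-1])
print("alpha_bar at t=1 and t=T:", alpha_bars[0], alpha_bars[-1])
```

Since none of these arrays depend on learnable parameters, they can be computed once before training starts.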
Reverse Process
We now go through the choices needed to define the reverse process. Recall that we defined the reverse Markov transitions as Gaussians:
$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$
We now need to specify the functional forms of $\mu_\theta$ and $\Sigma_\theta$. Although there are more sophisticated ways to parameterize $\Sigma_\theta$, we simply set
$$\Sigma_\theta(x_t, t) = \sigma_t^2 I, \qquad \sigma_t^2 = \beta_t$$
In other words, we assume that the multivariate Gaussian is a product of independent Gaussians with identical variance, a variance value that can change over time. These variances are set to match the variance schedule of the forward process.
Given this new formulation, we have:
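$$p_\theta(x_{t-1} \mid x_t) := \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right) := \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\right)$$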
This leads to an alternative loss function, shown below, which the authors found to give more stable training and better results:
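$$L_{\text{simple}}(\theta) := \mathbb{E}_{t, x_0, \epsilon}\left[\left\lVert \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\ t\right) \right\rVert^2\right]$$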
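To make the simplified objective concrete, here is a minimal NumPy sketch of one Monte Carlo estimate of it. It assumes the standard DDPM definitions $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s \le t} \alpha_s$, a linear schedule with $T = 1000$, and a hypothetical placeholder `zero_eps_model` standing in for the trained network $\epsilon_\theta$. A real model would be a neural network trained by gradient descent on this quantity.

```python
import numpy as np

T = 1000                                   # assumed number of steps
betas = np.linspace(1e-4, 0.02, T)         # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)       # alpha_bar_t for t = 1..T

def q_sample(x0, t, eps):
    """Draw x_t directly from x_0: sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps.

    t is a 0-based index into the schedule (t = 0 means the first step).
    """
    a_bar = alpha_bars[t]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps

def simple_loss(eps_model, x0, rng):
    """One-sample estimate of L_simple for a single training example."""
    t = rng.integers(0, T)                 # uniformly random timestep
    eps = rng.standard_normal(x0.shape)    # the noise the model must recover
    x_t = q_sample(x0, t, eps)
    eps_pred = eps_model(x_t, t)
    return np.sum((eps - eps_pred) ** 2)   # squared L2 norm of the error

def zero_eps_model(x_t, t):
    """Hypothetical stand-in for eps_theta that always predicts zero noise."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
x0 = np.zeros((8, 8))                      # toy training example
print("loss estimate:", simple_loss(zero_eps_model, x0, rng))
```

Note that the noisy input `x_t` is produced in a single step from `x0`, so training never has to simulate the whole forward chain.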
The authors also draw a connection between this formulation of diffusion models and score-matching generative models based on Langevin dynamics. Much like the independent and parallel development of wave-based and matrix-based quantum mechanics, which revealed two equivalent formulations of the same phenomena, it appears that diffusion models and score-based models may be two sides of the same coin.
Network Architecture
Although our simplified loss function aims to train a model $\epsilon_\theta$, we have not yet decided on the architecture of this model. Keep in mind that the only requirement is that the model's input and output have the same dimensionality.
Given this constraint, it is perhaps unsurprising that U-Net-like architectures are commonly used to implement image diffusion models.
The path along the reverse process consists of many transformations under continuous conditional Gaussian distributions. Recall, however, that the goal of the reverse process is to produce an image made up of integer pixel values. We therefore need a way to obtain discrete (log) likelihoods for each possible pixel value across all pixels.
This is done by making the final transition of the reverse diffusion chain an independent discrete decoder, which gives the likelihood of an image $x_0$ given $x_1$:
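$$p_\theta(x_0 \mid x_1) = \prod_{i=1}^{D} \int_{\delta_-(x_0^i)}^{\delta_+(x_0^i)} \mathcal{N}\left(x;\ \mu_\theta^i(x_1, 1),\ \sigma_1^2\right) dx$$
$$\delta_+(x) = \begin{cases} \infty & \text{if } x = 1 \\ x + \frac{1}{255} & \text{if } x < 1 \end{cases} \qquad \delta_-(x) = \begin{cases} -\infty & \text{if } x = -1 \\ x - \frac{1}{255} & \text{if } x > -1 \end{cases}$$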
where the superscript $i$ denotes the extraction of one coordinate and $D$ is the number of dimensions in the data.
The goal at this point is to determine how likely each integer value is for a given pixel, given the distribution over possible values for that pixel in the slightly noised image at time $t=1$.
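The discrete decoder can be illustrated with the Gaussian CDF from Python's standard library. The sketch below assumes that 8-bit pixel values $0, \ldots, 255$ have been rescaled linearly to $[-1, 1]$, so neighbouring values are $2/255$ apart. The helper names (`bin_edges`, `pixel_prob`, `decoder_log_likelihood`) are made up for this example.

```python
import math
import numpy as np

def gaussian_cdf(x, mean, sigma):
    """CDF of N(mean, sigma^2) at x; works for x = +/- infinity too."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def bin_edges(x):
    """Integration limits (delta_minus, delta_plus) for a pixel value x in [-1, 1]."""
    upper = math.inf if x >= 1.0 else x + 1.0 / 255.0
    lower = -math.inf if x <= -1.0 else x - 1.0 / 255.0
    return lower, upper

def pixel_prob(x, mean, sigma):
    """Probability mass the Gaussian assigns to the bin around pixel value x."""
    lower, upper = bin_edges(x)
    return gaussian_cdf(upper, mean, sigma) - gaussian_cdf(lower, mean, sigma)

def decoder_log_likelihood(x0, mean, sigma):
    """log p(x0 | x1): sum of per-coordinate log probabilities (independent pixels)."""
    total = 0.0
    for xi, mi in zip(np.ravel(x0), np.ravel(mean)):
        # Clip tiny probabilities to avoid log(0) far out in the tails.
        total += math.log(max(pixel_prob(float(xi), float(mi), sigma), 1e-12))
    return total

# The 256 bins tile the real line, so the probabilities of all possible
# values of one pixel add up to 1 for any predicted mean.
levels = [k / 127.5 - 1.0 for k in range(256)]
print(sum(pixel_prob(v, 0.3, 0.05) for v in levels))

x0 = np.array([[-1.0, 0.0], [0.2, 1.0]])   # toy 2x2 "image" in [-1, 1]
mean = x0.copy()                           # pretend the predicted mean is perfect
print(decoder_log_likelihood(x0, mean, sigma=0.01))
```

The two outermost bins extend to infinity so that predictions falling outside $[-1, 1]$ are still assigned to the nearest valid pixel value.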
Final Objective
According to the authors, the best results came from predicting the noise component of an image at a given timestep. In the end, they use the following objective:
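$$L_{\text{simple}}(\theta) := \mathbb{E}_{t, x_0, \epsilon}\left[\left\lVert \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\ t\right) \right\rVert^2\right]$$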
The training and sampling procedures for our diffusion model are summarized in the following figure:
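The sampling side can also be sketched in code. The loop below follows the ancestral sampling procedure of the DDPM paper with $\sigma_t^2 = \beta_t$. The formula used for the mean, $\mu_\theta = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t)\right)$, comes from that paper rather than from this post. The `oracle_eps` function is a hypothetical stand-in for a trained network: it returns the exact noise for a toy dataset containing a single known image, which makes it possible to check that the loop recovers that image.

```python
import numpy as np

T = 1000                                    # assumed number of steps
betas = np.linspace(1e-4, 0.02, T)          # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)

def sample(eps_model, shape, betas, rng):
    """Generate one sample by running the reverse chain from x_T down to x_0."""
    alphas = 1.0 - betas
    a_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)                  # x_T ~ N(0, I)
    for t in reversed(range(len(betas))):           # 0-based: T-1, ..., 0
        # No noise is added on the very last step.
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        eps_pred = eps_model(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - a_bars[t]) * eps_pred) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z            # sigma_t^2 = beta_t
    return x

# Toy setting: the "dataset" is a single constant image, so the true noise
# in x_t can be computed in closed form instead of being learned.
target = 0.5 * np.ones((4, 4))

def oracle_eps(x_t, t):
    """Hypothetical perfect noise predictor for the one-image toy dataset."""
    return (x_t - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

generated = sample(oracle_eps, target.shape, betas, np.random.default_rng(1))
print(np.abs(generated - target).max())
```

Training mirrors this loop in reverse: pick a random timestep, noise a training image to that step in closed form, and take a gradient step on the squared error between the true and predicted noise.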
Benefits of Diffusion Models
As mentioned above, the amount of research on diffusion models has multiplied recently. Inspired by non-equilibrium thermodynamics, diffusion models currently deliver state-of-the-art image quality.
Beyond cutting-edge image quality, diffusion models offer a range of other benefits, such as not requiring adversarial training.
The difficulties of adversarial training are well documented, so when a non-adversarial alternative offers comparable performance and training efficiency, it is usually the better choice.
Diffusion models also offer benefits in scalability and parallelizability when it comes to training efficiency.
Although diffusion models can seem to produce results out of thin air, those results rest on a number of careful and interesting mathematical choices and details, and best practices are still being developed.
Conclusion
In conclusion, researchers have demonstrated high-quality image synthesis using diffusion probabilistic models, a class of latent variable models inspired by ideas from non-equilibrium thermodynamics.
These models have achieved a great deal thanks to their state-of-the-art results and non-adversarial training, and given how young the field is, further advances can be expected in the coming years.
In particular, diffusion models have proven essential to the performance of leading models such as DALL-E 2.
Here you can access the full research.