Advanced analytics and machine learning programs are driven by data, but getting access to that data can be difficult for researchers because of privacy and business constraints and regulations.
Synthetic data, which can be shared and used in ways that real data cannot, is a new approach worth pursuing. However, this new strategy is not without risks or drawbacks, so it is important that businesses consider carefully where they apply their resources.
In the current AI era, we can again say that data is the new oil, but only a select few are sitting on a gusher. So many people are making their own fuel, one that is both affordable and effective. It is known as synthetic data.
In this post, we will take a detailed look at synthetic data: why you should use it, how to generate it, what makes it different from real data, which use cases it can serve, and more.
So, what is Synthetic Data?
When real datasets fall short in quality, volume, or variety, synthetic data can be used to train AI models in place of real historical data.
When existing data does not meet business needs, or carries privacy risks when used to develop machine learning models, test software, or the like, synthetic data can be an important tool for AI efforts.
Simply put, synthetic data is frequently used in place of real data. More precisely, it is artificially annotated data produced by simulations or computer algorithms.
Synthetic data is information created artificially by a computer program rather than as a result of real-world events. Companies can add synthetic data to their training data to cover every use case and edge case, reduce data collection costs, or satisfy privacy regulations.
Synthetic data is now more accessible than ever thanks to improvements in processing power and data storage options such as the cloud. It improves the creation of AI solutions that are more useful to all end users, which is undoubtedly a positive development.
How important is synthetic data and why should you use it?
When training AI models, developers frequently need large datasets with accurate labels. Neural networks perform more accurately when trained on more diverse data.
Collecting and labeling these large datasets, which can contain hundreds or even millions of items, can however be unreasonably time-consuming and expensive. The cost of producing training data can be greatly reduced by using synthetic data. For example, a training image that costs $5 when purchased from a data labeling provider may cost only $0.05 if it is generated artificially.
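To make the price gap concrete, here is a small back-of-the-envelope calculation using the two per-image prices quoted above. The dataset size of 100,000 images is a made-up assumption for illustration, and real pricing will rarely be a flat per-image rate.

```python
# Per-image prices from the text, stored in cents to avoid float rounding.
LABELED_COST_CENTS = 500    # $5.00 per image from a labeling provider
SYNTHETIC_COST_CENTS = 5    # $0.05 per synthetically generated image


def dataset_cost_dollars(num_images: int, cost_cents: int) -> float:
    """Total cost in dollars for num_images at a flat per-image price."""
    return num_images * cost_cents / 100


num_images = 100_000  # hypothetical dataset size, not from the article
labeled = dataset_cost_dollars(num_images, LABELED_COST_CENTS)
synthetic = dataset_cost_dollars(num_images, SYNTHETIC_COST_CENTS)
print(f"labeled: ${labeled:,.0f}, synthetic: ${synthetic:,.0f}, "
      f"ratio: {labeled / synthetic:.0f}x")
```

Under these assumptions the synthetic route is about two orders of magnitude cheaper, before counting the engineering effort needed to build the generator.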
Synthetic data can ease privacy concerns associated with potentially sensitive data produced in the real world, while also reducing costs.
Compared with real data, which cannot accurately reflect the full range of facts about the real world, it can help reduce bias. By supplying rare events that represent plausible possibilities but are hard to obtain from genuine data, synthetic data can offer greater diversity.
Synthetic data may be a good fit for your project for the reasons listed below:
1. Model robustness
Get access to more varied data for your models without having to collect it. With synthetic data, you can train your model on variations of the same person with different hairstyles, facial hair, glasses, head poses, and so on, as well as varying skin tone, ethnic features, bone structure, freckles, and other traits, to generate distinct faces and make the model more robust.
2. Edge cases are taken into account
Machine learning algorithms favor balanced datasets. Think back to our facial recognition example. If companies generated synthetic data of darker-skinned faces to fill the gaps in their data, the accuracy of their models would improve (and indeed, some businesses do exactly this), and they would produce a more ethical model. With the help of synthetic data, teams can cover every use case, including edge cases where data is scarce or nonexistent.
3. It can be obtained faster than "real" data
Teams can generate large volumes of synthetic data quickly. This is especially helpful when the real data depends on events that occur rarely. While gathering data for a self-driving car, for example, teams may find it hard to collect enough real-world data on severe road conditions because of their rarity. To speed up the laborious annotation process, data scientists can set up algorithms to label the synthetic data automatically as it is produced.
4. It protects users' private information
Companies can face security challenges when handling sensitive data, depending on the industry and the type of data. Personal health information (PHI), for example, is frequently included in patient data in the healthcare industry and must be handled with the utmost security.
Because synthetic data does not include information about real people, privacy issues are reduced. Consider using synthetic data as an alternative if your team must adhere to specific data privacy laws.
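The edge-case and automatic-labeling points in the list above can be sketched together. The example below is a deliberately naive illustration, not a production generator: it fills the gap for an under-represented class by adding small random noise to the few samples that exist, and each new point is labeled by construction. The function name `synthesize_minority`, the `jitter` parameter, and the toy dataset are all hypothetical. Note that jittering real samples does not by itself give the privacy benefit described in point 4.

```python
import random


def synthesize_minority(samples, label, n_new, jitter=0.05, seed=0):
    """Create n_new labeled synthetic points by jittering existing samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        point = [x + rng.gauss(0.0, jitter) for x in base]
        synthetic.append((point, label))  # label is known by construction
    return synthetic


# Hypothetical toy dataset: 8 "common" samples and only 2 "rare" ones.
common = [([0.1 * i, 0.2 * i], "common") for i in range(8)]
rare = [([5.0, 5.1], "rare"), ([5.2, 4.9], "rare")]

gap = len(common) - len(rare)
rare_features = [features for features, _ in rare]
balanced = common + rare + synthesize_minority(rare_features, "rare", gap)

counts = {}
for _, label in balanced:
    counts[label] = counts.get(label, 0) + 1
print(counts)
```

A real pipeline would replace the jitter step with a simulator or a trained generative model, but the shape is the same: generate until the rare class is adequately represented, and record the label at generation time instead of annotating afterwards.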
Real data vs. synthetic data
In the real world, real data is collected or measured. This kind of data is generated instantly whenever someone uses a smartphone, laptop, or computer, wears a smartwatch, visits a website, or makes an online transaction.
Surveys, both online and offline, can also be used to provide real data. Synthetic data, by contrast, is generated in digital environments. Apart from the fact that it is not derived from any real-world events, synthetic data is constructed so that it successfully mimics real data in its fundamental properties.
The idea of using synthetic data as a substitute for real data is very promising, because it can supply the training data that machine learning models require. It is not certain, however, that synthetic data can resolve every issue that arises in the real world.
Use cases
Synthetic data is useful for a variety of commercial purposes, including model training, model validation, and testing of new products. We will list a few sectors that have led the way in applying it to machine learning:
1. Healthcare
Because of the sensitivity of its data, the healthcare sector is well suited to the use of synthetic data. Teams can use synthetic data to capture the physiologies of every type of patient that might exist, thereby helping to diagnose illnesses more quickly and accurately.
Google's melanoma detection model is an intriguing illustration of this: it incorporates synthetic data of people with darker skin tones (an area that is sadly underrepresented in clinical data) so that the model can perform well across all skin types.
2. Automotive
Companies building self-driving cars frequently use simulators to evaluate performance. When the weather is bad, for example, collecting real road data can be dangerous or difficult.
Relying on live tests with real cars on the road is not a good idea, because there are too many variables to account for across all the different driving situations.
3. Data portability
To be able to share their training data with others, organizations need reliable and secure methods. Masking personally identifiable information (PII) before making a dataset public is another interesting application of synthetic data. Sharing scientific research data, medical data, sociological data, and data from other domains that may contain PII in this way is referred to as privacy-preserving synthetic data.
4. Security
Synthetic data makes organizations more secure. Returning to our facial recognition example, you may be familiar with the term "deep fakes," which describes artificially created photos or videos. Businesses can produce deep fakes to test their own facial recognition and security systems. Synthetic data is also used in video surveillance to train models quickly and at lower cost.
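As a rough sketch of the data portability use case above, the snippet below replaces PII fields with randomly generated stand-ins before a dataset is shared. The function name `mask_pii`, the field names, and the records are hypothetical. This is only an illustration of the idea: masking direct identifiers does not on its own guarantee anonymity, since remaining fields (such as age) can still help re-identify a person.

```python
import random
import string


def mask_pii(records, pii_fields, seed=0):
    """Return copies of records with PII fields replaced by random stand-ins."""
    rng = random.Random(seed)
    masked = []
    for record in records:
        clean = dict(record)  # copy so the original record is left untouched
        for field in pii_fields:
            if field in clean:
                token = "".join(rng.choices(string.ascii_lowercase, k=8))
                clean[field] = f"{field}_{token}"
        masked.append(clean)
    return masked


# Hypothetical patient-style records; all values are made up.
records = [
    {"name": "Alice Example", "email": "alice@example.com", "age": 34},
    {"name": "Bob Example", "email": "bob@example.com", "age": 51},
]
shareable = mask_pii(records, pii_fields=("name", "email"))
print(shareable)
```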
Synthetic data and machine learning
To build a robust and reliable model, machine learning algorithms need a significant amount of data to process. Without synthetic data, producing that volume of data would be a challenge.
It can be especially important in areas such as computer vision or image processing, where generating synthetic data early on facilitates model development. A recent development in the field of image recognition is the use of Generative Adversarial Networks (GANs). A GAN typically comprises two networks: a generator and a discriminator.
While the discriminator network aims to separate the real photos from the fake ones, the generator network works to produce synthetic images that closely resemble real-world images.
In machine learning, GANs are part of the neural network family, and both networks keep learning and improving as they are trained against each other.
When generating synthetic data, you have the option to change the environment and the nature of the data as needed to improve model performance. While accurate labels for synthetic data come easily, because the annotations are produced along with the data, accurately labeling real-world data can sometimes be very expensive.
How do you generate synthetic data?
The methods used to create a synthetic data collection are as follows:
Based on statistical distributions
The strategy here is to draw numbers from a distribution, or to observe real statistical distributions, in order to create synthetic data that looks comparable. In some cases real data may be entirely unavailable.
A data scientist who has a thorough understanding of the statistical distribution of the real data can produce a dataset containing a random sample from any such distribution. The normal distribution, exponential distribution, chi-square distribution, and lognormal distribution are just a few examples of statistical distributions that can be used to do this.
The data scientist's level of expertise in this situation will have a significant impact on the accuracy of the trained model.
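A minimal sketch of the distribution-based approach, using only the Python standard library: estimate the parameters of a normal distribution from a handful of observations, then draw as many synthetic values as needed. The "real" measurements, the function names, and the choice of a normal distribution are assumptions for illustration; in practice the distribution family has to be chosen by someone who understands the data.

```python
import random
import statistics


def fit_normal(real_values):
    """Estimate mean and standard deviation from observed values."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def sample_normal(mu, sigma, n, seed=0):
    """Draw n synthetic values from Normal(mu, sigma)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "real" measurements (e.g. heights in cm); values are made up.
real = [162.0, 170.5, 168.2, 175.1, 159.8, 181.3, 166.4, 172.9]
mu, sigma = fit_normal(real)
synthetic = sample_normal(mu, sigma, n=1000)

print(f"real mean={mu:.1f}, std={sigma:.1f}")
print(f"synthetic mean={statistics.mean(synthetic):.1f}, "
      f"std={statistics.stdev(synthetic):.1f}")

# The same Random API offers the other distributions named in the text:
rng = random.Random(1)
exponential_draw = rng.expovariate(1.0)        # exponential, rate 1.0
lognormal_draw = rng.lognormvariate(0.0, 1.0)  # lognormal, mu=0, sigma=1
chi_square_draw = rng.gammavariate(2.0, 2.0)   # chi-square with 4 d.o.f.
```

The chi-square draw relies on the fact that a chi-square distribution with k degrees of freedom is a gamma distribution with shape k/2 and scale 2.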
Based on a model
This technique builds a model that explains the observed behavior and then uses that model to generate random data. In essence, this involves fitting the real data to a known distribution. Businesses can then use the Monte Carlo method to create synthetic data.
Distributions can also be fitted using machine learning models such as decision trees. Data scientists must keep a close eye on the predictions, however, because decision trees tend to overfit owing to their simplicity and the way they grow in depth.
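The fit-then-simulate idea can be sketched as follows. Here the "model" is assumed to be a straight line with Gaussian noise, fitted by ordinary least squares, and the Monte Carlo step simply draws random inputs and pushes them through the fitted model. The observations and function names are hypothetical, and the residual spread is only a rough estimate from six points.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope, intercept, statistics.stdev(residuals)


def simulate(slope, intercept, noise_std, x_range, n, seed=0):
    """Monte Carlo draw of n synthetic (x, y) pairs from the fitted model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.uniform(*x_range)
        y = slope * x + intercept + rng.gauss(0.0, noise_std)
        pairs.append((x, y))
    return pairs


# Hypothetical observations (e.g. ad spend vs. sales); values are made up.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
slope, intercept, noise_std = fit_line(xs, ys)
synthetic = simulate(slope, intercept, noise_std, (min(xs), max(xs)), n=500)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, noise={noise_std:.2f}")
```

Sampling inputs only within the observed range is a deliberate choice: a fitted model says little about regions where no real data was seen.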
Using deep learning
Deep learning models based on a Variational Autoencoder (VAE) or a Generative Adversarial Network (GAN) are two ways of generating synthetic data. VAEs are unsupervised machine learning models.
They consist of encoders, which compress and condense the original data, and decoders, which work from this compressed data to produce a representation of the real data. Keeping the input and output data as similar as possible is the basic goal of a VAE. GAN models, or adversarial networks, consist of two competing neural networks.
The first network, known as the generator network, is responsible for producing synthetic data. The second network, the discriminator network, compares the generated data with real data in an effort to determine whether a dataset is fake. The discriminator alerts the generator when it detects a fake dataset.
The generator then adjusts the next batch of data it feeds to the discriminator. As a result, the discriminator gets better over time at spotting fake datasets, and the generator gets better at producing convincing ones. This type of model is frequently used in the financial sector for fraud detection and in the healthcare sector for medical imaging.
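The generator and discriminator contest described above is commonly written as the minimax objective from the original GAN formulation (Goodfellow et al., 2014). The symbols are not from this article: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real data, and $p_z$ the noise distribution the generator samples from.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximize $V$ by assigning high probability to real samples and low probability to generated ones, while the generator tries to minimize it by producing samples that the discriminator scores as real.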
Data augmentation is a different technique that data scientists employ to produce more data. It should not be mistaken for synthetic data, however. Simply put, data augmentation is the act of adding new data to an already existing dataset.
An example is creating many images from a single image by adjusting its orientation, brightness, zoom, and so on. Sometimes a real dataset is used with only the personal information removed. This is data anonymization, and such a dataset likewise should not be regarded as synthetic data.
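To illustrate the augmentation example above, the sketch below derives several variants from one tiny grayscale image represented as a list of rows. The image and helper names are made up for illustration; real projects would normally use an image library, but the transformations are the same in spirit.

```python
def flip_horizontal(image):
    """Mirror each row of a 2D grayscale image (list of lists)."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Add delta to every pixel, clamped to the 0..255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Hypothetical 2x3 grayscale image; pixel values are made up.
image = [
    [10, 50, 200],
    [30, 90, 250],
]
augmented = [
    flip_horizontal(image),
    adjust_brightness(image, 40),
    rotate_90_clockwise(image),
]
for variant in augmented:
    print(variant)
```

Every variant is still derived from the one real image, which is why augmentation extends an existing dataset but does not count as synthetic data in the sense used in this post.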
Challenges and limitations of synthetic data
Although synthetic data has various benefits that can help firms with their data science activities, it also has certain limitations:
- Data reliability: It is common knowledge that every machine learning or deep learning model is only as good as the data it is fed. The quality of synthetic data in this context is strongly tied to the quality of the input data and of the model used to generate it. It is crucial to make sure there are no biases in the source data, as these can be clearly reflected in the synthetic data. Moreover, the quality of the data should be validated and verified before it is used to make any predictions.
- Requires knowledge, effort, and time: While creating synthetic data can be easier and cheaper than creating real data, it still requires knowledge, time, and effort.
- Replicating outliers: A perfect replica of real-world data is not possible; synthetic data can only approximate it. Some outliers present in real data may therefore not be covered by synthetic data, and outliers can matter more than typical data points.
- Controlling output and ensuring quality: Synthetic data is meant to replicate real-world data, so manual verification of the data becomes essential. For complex datasets generated automatically by algorithms, it is crucial to verify the accuracy of the data before feeding it into a machine learning or deep learning model.
- User acceptance: Because synthetic data is a novel concept, not everyone will be ready to trust predictions made with it. To increase user acceptance, it is therefore first necessary to raise awareness of the use of synthetic data.
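One simple way to approach the validation and quality-control points in the list above is to compare the distribution of a synthetic column with the real one. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical cumulative distributions) with the standard library. The datasets are random stand-ins, and a single statistic on one column is only a first check, not a full quality assessment.

```python
import bisect
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for value in a + b:
        cdf_a = bisect.bisect_right(a, value) / len(a)
        cdf_b = bisect.bisect_right(b, value) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


rng = random.Random(0)
# Hypothetical stand-ins: "real" data plus one good and one poor synthetic set.
real = [rng.gauss(50.0, 10.0) for _ in range(500)]
good_synthetic = [rng.gauss(50.0, 10.0) for _ in range(500)]
poor_synthetic = [rng.gauss(65.0, 10.0) for _ in range(500)]

print(f"good: {ks_statistic(real, good_synthetic):.3f}")
print(f"poor: {ks_statistic(real, poor_synthetic):.3f}")
```

A value near 0 means the two samples are distributed similarly, while a value near 1 means they barely overlap. A check like this will not catch missing outliers or broken relationships between columns, so it complements manual review and does not replace it.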
Future
The use of synthetic data has increased greatly over the past ten years. While it saves companies time and money, it is not without its drawbacks. It lacks outliers, which occur naturally in real data and are crucial to the accuracy of some models.
It is also worth noting that the quality of synthetic data often depends on the input data used to create it. Bias in the input data can quickly propagate into the synthetic data, so choosing high-quality data as a starting point should not be overlooked.
Finally, it requires additional output control, including comparing the synthetic data with human-annotated real data to make sure that discrepancies are not introduced. Despite these obstacles, synthetic data remains a promising field.
It helps us build novel AI solutions even when real-world data is unavailable. Most importantly, it enables businesses to build products that are more inclusive and that reflect the diversity of their customers.
In a data-driven future, synthetic data is set to help data scientists carry out novel and creative tasks that would be challenging to complete with real-world data alone.
Conclusion
In some cases, synthetic data can alleviate a data shortage or the lack of relevant data within a business or organization. We also looked at which strategies can help in generating synthetic data and who can benefit from it.
We also discussed some of the difficulties that come with working with synthetic data. For making business decisions, real data will always be preferred. However, synthetic data is the next best option when real raw data is not accessible for analysis.
It should be kept in mind, however, that generating synthetic data requires data scientists with a solid grounding in data modeling. A thorough understanding of the real data and its context is also crucial. This is essential to ensure that the generated data is as accurate as possible.