Researchers and data scientists often run into situations where they either have no real data or cannot use it because of confidentiality or privacy considerations.
To address this problem, synthetic data generation is used to produce a substitute for real data.
An algorithm needs a suitable substitute for the real data to work well, and that substitute should also be realistic in character. You can use such data to preserve privacy, test systems, or generate training data for machine learning algorithms.
Let's explore synthetic data generation in detail and see why it matters in the age of AI.
What is Synthetic Data?
Synthetic data is annotated data produced by computer simulations or algorithms rather than collected from the real world. It is an artificially generated replica of real data.
Advanced AI algorithms can learn the patterns and dimensions of a dataset. Once trained, they can create an unlimited amount of synthetic data that statistically represents the original training data.
A variety of methods and technologies can help us create synthetic data, and you can use it in many different applications.
Data generation software typically requires:
- Metadata of the data store for which the synthetic data is to be created.
- A strategy for generating plausible but fictitious values. Examples include value lists and regular expressions.
- Full awareness of all data relationships, both those declared at the database level and those enforced at the application-code level.
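The three requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `customers` and `orders` tables, the value lists, and the `from_template` helper are all hypothetical, and the simple `#`/`?` template stands in for a full regular-expression generator, which the standard library does not provide.

```python
import random
import re
import string

# Hypothetical value lists for a made-up "customers" table.
FIRST_NAMES = ["Ava", "Liam", "Noor", "Sipho", "Mei"]
CITIES = ["Durban", "Lagos", "Lyon", "Osaka"]

def from_template(template, rng):
    """Fill a template: '#' becomes a digit, '?' an uppercase letter."""
    out = []
    for ch in template:
        if ch == "#":
            out.append(rng.choice(string.digits))
        elif ch == "?":
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)
    return "".join(out)

def make_customer(customer_id, rng):
    return {
        "customer_id": customer_id,
        "first_name": rng.choice(FIRST_NAMES),            # value list
        "city": rng.choice(CITIES),                       # value list
        "account_no": from_template("??-####-##", rng),   # pattern
    }

def make_orders(customers, n_orders, rng):
    # Relationship: every order must reference an existing customer_id.
    ids = [c["customer_id"] for c in customers]
    return [
        {"order_id": i,
         "customer_id": rng.choice(ids),
         "amount": round(rng.uniform(5, 500), 2)}
        for i in range(1, n_orders + 1)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
customers = [make_customer(i, rng) for i in range(1, 6)]
orders = make_orders(customers, 10, rng)

# The pattern the generated account numbers are expected to satisfy.
ACCOUNT_RE = re.compile(r"[A-Z]{2}-\d{4}-\d{2}")
```

Generating the parent table first and drawing foreign keys from it is one simple way to respect a declared relationship; rules that live only in application code would have to be encoded by hand in the same manner.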
It is equally necessary to validate the model and compare the behavioral characteristics of the real data with those of the data the model generates.
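One way to make that comparison concrete is to put summary statistics of the real and generated values side by side. The sketch below is an assumption about how such a check could look, not a prescribed procedure: it compares means and standard deviations and computes a two-sample Kolmogorov-Smirnov statistic by hand. The `real`, `good`, and `bad` lists are simulated stand-ins.

```python
import random
import statistics
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def compare(real, synthetic):
    return {
        "mean_diff": abs(statistics.fmean(real) - statistics.fmean(synthetic)),
        "stdev_diff": abs(statistics.stdev(real) - statistics.stdev(synthetic)),
        "ks": ks_statistic(real, synthetic),
    }

rng = random.Random(0)
real = [rng.gauss(50, 10) for _ in range(2000)]    # stand-in for real data
good = [rng.gauss(50, 10) for _ in range(2000)]    # generator with the right shape
bad = [rng.uniform(0, 100) for _ in range(2000)]   # generator with the wrong shape

report_good = compare(real, good)
report_bad = compare(real, bad)
```

A small KS value suggests the two samples share a distribution for that one feature; it says nothing about relationships between features, which need their own checks.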
These fake datasets carry all the value of the real thing but none of the sensitive data. It is like a delicious, calorie-free cake. It closely mirrors the real world.
As a result, you can use it in place of real-world data.
Importance of Synthetic Data
Synthetic data can have properties tailored to specific needs or conditions that would not be available in real-world data. It is useful when test data is scarce or when privacy is a top priority.
AI-generated datasets are flexible, secure, and easy to store, exchange, and discard. Data synthesis is suitable for subsetting and for enhancing the original data.
As a result, it is well suited for use as test data and as AI training data.
- Training ML-based self-driving cars from Uber and Tesla.
- In the medical and healthcare industries, evaluating particular diseases and cases for which no real data exists.
- Fraud detection and prevention is critical in the financial sector. With synthetic data, you can investigate new fraud cases.
- Amazon trains Alexa's language system using synthetic data.
- American Express uses synthetic financial data to improve fraud detection.
Types of Synthetic Data
Synthetic data is generated randomly with the aim of hiding sensitive private information while retaining the statistical information about the features of the real data.
It comes in three main types:
- Fully synthetic data
- Partially synthetic data
- Hybrid synthetic data
1. Fully Synthetic Data
This data is entirely synthetic and contains no original data.
Typically, a data generator of this kind identifies the density functions of the features in the real data and estimates their parameters. Privacy-protected series are then generated randomly for each feature from the estimated density functions.
If only a few features of the real data are selected for replacement, the protected series of those features are mapped to the remaining features of the real data so that the protected and real series are ranked in the same order.
Bootstrap techniques and multiple imputation are two traditional methods for generating fully synthetic data.
Because the data is entirely synthetic and contains no real data, this technique offers excellent privacy protection at the cost of the data's truthfulness.
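The density-based procedure described above can be sketched as follows. The assumptions are significant: each feature is modeled as an independent normal distribution, which ignores correlations between features, and the `age` and `income` columns are hypothetical stand-ins for real records.

```python
import random
import statistics

def fit_marginals(rows):
    """Estimate a normal density (mean, stdev) for each numeric feature."""
    return {
        key: statistics.NormalDist.from_samples([r[key] for r in rows])
        for key in rows[0]
    }

def sample_fully_synthetic(marginals, n, rng):
    """Draw every feature from its estimated density; no real value is reused."""
    return [
        {key: rng.gauss(dist.mean, dist.stdev) for key, dist in marginals.items()}
        for _ in range(n)
    ]

rng = random.Random(7)
# Simulated stand-in for a real dataset with two numeric features.
real = [
    {"age": rng.gauss(40, 12), "income": rng.gauss(52000, 9000)}
    for _ in range(1000)
]

marginals = fit_marginals(real)
synthetic = sample_fully_synthetic(marginals, 1000, rng)
```

Because every value is drawn from a fitted density, no original record appears in the output, which is where the privacy benefit comes from; the independence assumption is also an example of how truthfulness gets lost.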
2. Partially Synthetic Data
This data uses synthetic values only to replace the values of a few sensitive features.
In this case, real values are replaced only when there is a high risk of disclosure. The replacement protects privacy in the newly generated data.
Multiple imputation and model-based methods are used to generate partially synthetic data. These methods can also be used to impute missing values in real-world data.
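As a rough illustration of the partial approach, the sketch below replaces only one column that is assumed to be sensitive and leaves everything else untouched. The choice of `income` as the sensitive feature, the sample records, and the simple normal model are all assumptions made for the example; a real pipeline would more likely condition the replacement on the other features.

```python
import random
import statistics

SENSITIVE = ("income",)  # hypothetical choice of high-disclosure-risk columns

def partially_synthesize(rows, sensitive, rng):
    """Replace only the sensitive columns with model-based draws."""
    models = {
        col: statistics.NormalDist.from_samples([r[col] for r in rows])
        for col in sensitive
    }
    out = []
    for r in rows:
        new = dict(r)  # copy so the real records are not modified
        for col in sensitive:
            new[col] = round(rng.gauss(models[col].mean, models[col].stdev), 2)
        out.append(new)
    return out

real = [
    {"age": 34, "city": "Durban", "income": 41000.0},
    {"age": 51, "city": "Lagos", "income": 67500.0},
    {"age": 29, "city": "Lyon", "income": 38200.0},
    {"age": 45, "city": "Osaka", "income": 59900.0},
    {"age": 38, "city": "Durban", "income": 47300.0},
]

rng = random.Random(11)
partial = partially_synthesize(real, SENSITIVE, rng)
```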
3. Hybrid Synthetic Data
Hybrid synthetic data combines both real and synthetic data.
For each random record of real data, a close record in the synthetic data is chosen, and the two are then combined to produce hybrid data. It has the advantages of both fully and partially synthetic data.
It therefore provides strong privacy preservation with higher utility than the other two, but at the cost of additional memory and processing time.
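The matching step can be sketched as a nearest-neighbor search. How the two records are then combined is not specified here, so the weighted average below is an assumption, as are the `age` and `score` features. Note that unscaled Euclidean distance lets the feature with the largest range dominate the match.

```python
import math

def nearest(record, candidates, keys):
    """Return the candidate closest to record in Euclidean distance."""
    point = [record[k] for k in keys]
    return min(candidates, key=lambda c: math.dist(point, [c[k] for k in keys]))

def hybridize(real, synthetic, keys, weight=0.5):
    """Blend each real record with its nearest synthetic record (assumed rule)."""
    hybrid = []
    for r in real:
        s = nearest(r, synthetic, keys)
        hybrid.append({k: weight * r[k] + (1 - weight) * s[k] for k in keys})
    return hybrid

real = [{"age": 30.0, "score": 0.2}, {"age": 60.0, "score": 0.9}]
synthetic = [
    {"age": 58.0, "score": 0.8},
    {"age": 33.0, "score": 0.3},
    {"age": 45.0, "score": 0.5},
]

hybrid = hybridize(real, synthetic, ["age", "score"])
```

The search compares every real record with every synthetic one, which illustrates the extra memory and processing time this type of data requires.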
Synthetic Data Generation Techniques
The idea of machine-generated data has been popular for many years. It is now maturing.
Here are some of the methods used to generate synthetic data:
1. Distribution-Based
When no real data exists but the data analyst has a thorough understanding of what the dataset's distribution would look like, they can generate a random sample from any distribution, including Normal, Exponential, Chi-square, t, lognormal, and Uniform.
The value of the synthetic data in this approach depends on how well the analyst understands the particular data environment.
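For illustration, each of the distributions named above can be sampled with Python's standard `random` module. The parameter values are arbitrary assumptions, and because the module has no built-in chi-square or t sampler, the sketch builds both from standard normal draws.

```python
import random
import statistics

rng = random.Random(123)
N = 5000

def chi_square(df):
    # Sum of df squared standard normal draws.
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))

def student_t(df):
    # Standard normal divided by sqrt(chi-square / df).
    return rng.gauss(0, 1) / (chi_square(df) / df) ** 0.5

samples = {
    "normal": [rng.gauss(170, 8) for _ in range(N)],
    "exponential": [rng.expovariate(1 / 30) for _ in range(N)],  # mean 30
    "chi_square": [chi_square(4) for _ in range(N)],
    "t": [student_t(10) for _ in range(N)],
    "lognormal": [rng.lognormvariate(0, 0.5) for _ in range(N)],
    "uniform": [rng.uniform(0, 1) for _ in range(N)],
}

# Mean and standard deviation of each generated sample.
summary = {
    name: (statistics.fmean(xs), statistics.stdev(xs))
    for name, xs in samples.items()
}
```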
2. Fitting Real-World Data to a Known Distribution
When real data is available, businesses can generate synthetic data by identifying the best-fit distribution for that data.
If they want to fit real data to a known distribution and know the distribution parameters, they can use the Monte Carlo method to generate it.
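A minimal sketch of this workflow, assuming the real data is well described by a lognormal distribution: estimate the parameters from the observations, then draw as many Monte Carlo samples as needed. The `real` list is simulated here as a stand-in for observed values, and the threshold of 30 is arbitrary.

```python
import math
import random
import statistics

def fit_lognormal(data):
    """Estimate (mu, sigma) of a lognormal from the logs of the data."""
    logs = [math.log(x) for x in data]
    return statistics.fmean(logs), statistics.stdev(logs)

def monte_carlo_sample(mu, sigma, n, rng):
    """Draw n values from the fitted lognormal distribution."""
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

rng = random.Random(2024)
real = [rng.lognormvariate(3.0, 0.4) for _ in range(500)]  # stand-in for observed data

mu_hat, sigma_hat = fit_lognormal(real)
synthetic = monte_carlo_sample(mu_hat, sigma_hat, 10_000, rng)

# Monte Carlo estimate of a quantity of interest: the share of values above 30.
share_above_30 = sum(x > 30 for x in synthetic) / len(synthetic)
```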
Although the Monte Carlo method can help businesses find the best fit available, the best fit may still not be good enough for the company's synthetic data needs.
In such cases, businesses may consider using machine learning models to fit the distributions.
Machine learning techniques such as decision trees enable organizations to model non-classical distributions, which may be multi-modal and lack the common characteristics of recognized distributions.
Using this machine-learning-fitted distribution, businesses can generate synthetic data that is highly correlated with the real data.
However, machine learning models are at risk of overfitting, which causes them to fail to fit new data or predict future observations.
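To show what fitting a non-classical, multi-modal distribution can look like, the sketch below uses an equal-width histogram as a piecewise-constant density. This is a deliberately simpler stand-in for the tree-based models mentioned above, not the same technique, and the bimodal `real` data is simulated.

```python
import random

def fit_histogram(data, bins):
    """Count how many observations fall into each equal-width bin."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        i = min(int((x - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[i] += 1
    return lo, width, counts

def sample_histogram(model, n, rng):
    """Pick a bin in proportion to its count, then a uniform point inside it."""
    lo, width, counts = model
    idx = rng.choices(range(len(counts)), weights=counts, k=n)
    return [lo + (i + rng.random()) * width for i in idx]

rng = random.Random(5)
# Two well-separated modes that no single classical distribution describes.
real = [rng.gauss(20, 2) for _ in range(1500)] + [rng.gauss(60, 3) for _ in range(1500)]

model = fit_histogram(real, bins=40)
synthetic = sample_histogram(model, 3000, rng)
low_mode = sum(x < 40 for x in synthetic) / len(synthetic)
```

The number of bins plays the role of model complexity: with very many bins the model starts memorizing individual observations, which is the overfitting risk described above.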
3. Deep Learning
Deep generative models such as the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN) can generate synthetic data.
Variational Autoencoder
A VAE is an unsupervised method in which an encoder compresses the original dataset and passes the compressed data to a decoder.
The decoder then produces an output that is a representation of the original dataset.
Training the system involves maximizing the correspondence between the input data and the output data.
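For reference, the training objective in the standard VAE formulation is usually written as the evidence lower bound below. The notation is an assumption introduced for this note: $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ is the decoder, and $p(z)$ is the prior over the compressed representation $z$.

```latex
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards outputs that reconstruct the input well, and the second keeps the encoder's distribution close to the prior, so that new synthetic records can be produced by sampling $z$ from $p(z)$ and decoding it.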
Generative Adversarial Network
A GAN trains a model iteratively using two networks, a generator and a discriminator.
The generator creates a synthetic dataset from a set of random sample data.
The discriminator compares the synthetically created data with the real dataset using predefined conditions.
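In the original GAN formulation, the interplay between the two networks is commonly written as the minimax objective below. The notation is an assumption introduced for this note: $G$ is the generator, $D$ is the discriminator, $p_{\text{data}}$ is the distribution of the real data, and $p_z$ is the distribution of the random input.

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\text{data}}}\!\left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_z}\!\left[ \log\!\left( 1 - D(G(z)) \right) \right]
```

The discriminator tries to assign high scores to real records and low scores to generated ones, while the generator tries to produce records the discriminator cannot tell apart from real data.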
Synthetic Data Providers
Structured Data
The platforms listed below provide synthetic data derived from tabular data.
It replicates real-world data stored in tables and can be used for behavioral, predictive, or transactional analysis.
- Faka i-AI: A provider of a system for creating synthetic data that uses Generative Adversarial Networks and differential privacy.
- Betterdata: A provider of a privacy-preserving synthetic data solution for AI, data sharing, and product development.
- Diveplane: The provider of Geminai, a system for creating 'twin' datasets with the same statistical properties as the original data.
Unstructured Data
The platforms listed below work with unstructured data, providing synthetic data assets and services for training and testing vision algorithms.
- Datagen: Provides 3D-simulated training data for Visual AI learning and development.
- Neurolabs: A provider of a synthetic data platform for computer vision.
- Parallel Domain: A provider of a synthetic data platform for autonomous-system training and testing use cases.
- Cognata: A simulation supplier for ADAS and autonomous vehicle developers.
- Bifrost: Provides synthetic data APIs for creating 3D environments.
Challenges
Synthetic data has a long history in artificial intelligence, and although it has many advantages, it also has significant drawbacks that you need to address when working with it.
Here are some of them:
- Many errors may creep in when the complexity of real data is copied into synthetic data.
- Its malleable nature makes it prone to bias.
- Algorithms trained on simplified synthetic representations of data may have hidden performance flaws that surface only when they are run on real data.
- Replicating all the relevant attributes of real-world data can be difficult. It is also possible that some important features are overlooked along the way.
Conclusion
Synthetic data generation is clearly drawing people's attention.
This method may not be a one-size-fits-all answer for every data generation scenario.
In addition, the process may require AI/ML expertise and must be able to handle complex real-world scenarios, such as creating interrelated data that suits a particular domain.
Nevertheless, it is a new technology that fills the gap where other privacy-enabling technologies fall short.
Today, synthetic data generation may need to coexist with data masking.
In the future, there may be greater convergence between the two, leading to a more comprehensive data generation solution.
Share your thoughts in the comments!