In a typical data science project, data scientists and machine learning engineers handle a significant amount of data of many different kinds. Many models are built with different architectures and features, along with numerous iterations of parameter tuning to reach acceptable performance.
In a situation like this, every change to the data and every adjustment to the model-building process needs to be tracked and measured to see what worked and what did not. It is also important to be able to go back to a previous version and inspect it.
Data Version Control (DVC), which helps manage the data, the underlying models, and reproducible results, is one such technology that lets us keep track of all of this.
In this post, we will take a close look at data version control and the best tools to use. Let's get started.
What is Data Version Control?
Versioning is required for all production systems: it gives you a single place to access the most up-to-date data. Any resource that is modified frequently, especially by several users at the same time, needs an audit mechanism that tracks every change.
A version control system is responsible for keeping everyone on the team on the same page. It ensures that everyone is working on the latest version of a file and, more importantly, that everyone can collaborate on the same project at the same time.
With the right tools, you can do this with minimal effort!
If you use a reliable data version management strategy, you will have consistent datasets and a thorough archive of all your research. Data versioning tools are essential to your workflow if you care about reproducibility, traceability, and ML model lineage.
They help you obtain a version of an artifact, such as a hash of a dataset or model, that you can use to identify and compare it. This data version is usually logged in your metadata management solution so that your model training is versioned and repeatable.
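To make the idea of "a hash as a version" concrete, here is a minimal sketch in Python. It is not taken from any of the tools below; the function name `dataset_version` and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def dataset_version(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a SHA-256 hex digest of the file's bytes (hypothetical helper)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large datasets do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical bytes get the same identifier, and any change to the data produces a different one, which is what makes the digest usable for identifying and comparing versions.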
Best Data Version Control tools
Now it is time to look at the best data version control solutions available, which you can use to keep track of every part of your code and data.
1. Git LFS
Git LFS is a free, open-source project. Inside Git, large files such as audio samples, videos, databases, and images are replaced with text pointers, and the file contents are stored on a remote server such as GitHub.com or GitHub Enterprise.
It lets you use Git to version large files (up to a few GB in size), host more in your Git repositories by using external storage, and clone and fetch repositories with large files more quickly. When it comes to data management, this is a good lightweight solution. You do not need any additional commands, storage systems, or toolkits to keep working with Git.
It limits the amount of data you download, which means that cloning and fetching from repositories with large files is faster. The pointers are small text files that refer to the content held in LFS.
As a result, when you push your repo to the main repository, it updates quickly and takes up less space.
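The sketch below shows roughly what such a pointer looks like. The three-line layout (`version`, `oid sha256:...`, `size`) follows the published Git LFS pointer format; the helper name `lfs_pointer` is hypothetical, and in practice the Git LFS client writes these pointers for you.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build pointer text in the Git LFS v1 layout for the given file bytes."""
    oid = hashlib.sha256(content).hexdigest()  # content hash identifies the object
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# A multi-megabyte file is represented in Git by a pointer of about 130 bytes.
pointer = lfs_pointer(b"\x00" * 5_000_000)
```

Because only this small pointer is committed, the repository history stays compact while the real bytes live in external storage.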
Pros
- Integrates easily into most companies' development workflows.
- There are no extra permissions to manage, because it uses the same permissions as the Git repository.
Cons
- Git LFS requires dedicated servers to store your data. As a result, your data science teams become locked in, and your engineering workload increases.
- It is highly specialized, so you may need a variety of other tools for the subsequent steps of the data science workflow.
Pricing
Free for everyone to use.
2. LakeFS
LakeFS is an open-source data versioning solution that stores data in S3 or GCS and offers a Git-like branching and commit model that scales to petabytes.
This branching model makes your data lake ACID compliant by letting changes happen in isolated branches that can be created, merged, and rolled back atomically and instantly.
LakeFS lets teams build data lake operations that are repeatable, atomic, and versioned.
You interact with your data lake through a Git-like branching and version control model that scales to petabytes of data, and version control can be managed even at exabyte scale.
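The branch, commit, merge, and revert cycle described above can be sketched with a toy in-memory model. This is not the LakeFS API: the class `ToyLake`, its method names, and the "source wins" merge rule are all assumptions chosen to keep the example short.

```python
class ToyLake:
    """In-memory stand-in for a versioned data lake (illustration only)."""

    def __init__(self):
        # Each branch holds a list of snapshots; a snapshot maps path -> content.
        self.commits = {"main": [{}]}

    def head(self, branch):
        return self.commits[branch][-1]

    def create_branch(self, name, source="main"):
        # A new branch starts from a copy of the source branch's latest snapshot.
        self.commits[name] = [dict(self.head(source))]

    def commit(self, branch, changes):
        snapshot = dict(self.head(branch))
        snapshot.update(changes)
        self.commits[branch].append(snapshot)

    def merge(self, source, target="main"):
        # Naive merge: source values win. The target gains one new snapshot,
        # so readers see either the old state or the fully merged state.
        merged = dict(self.head(target))
        merged.update(self.head(source))
        self.commits[target].append(merged)

    def revert(self, branch):
        # Roll back by dropping the latest snapshot, keeping the initial one.
        if len(self.commits[branch]) > 1:
            self.commits[branch].pop()
```

Work done on an experiment branch never touches `main` until `merge` is called, and a bad merge can be undone with a single `revert`.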
Pros
- Git-like operations, including branching, committing, merging, and reverting.
- Pre-commit and pre-merge hooks can be used for data CI/CD checks.
- Provides advanced capabilities such as ACID transactions on top of simple cloud storage like S3 and GCS, while remaining format agnostic.
- Revert changes to data instantly.
- Scales easily, so it can accommodate very large data lakes. Version control can be provided for both development and production environments.
Cons
- LakeFS is a relatively new product, so its features and documentation may change more quickly than those of older solutions.
- Because it focuses on data versioning, you will need a variety of additional tools for the other steps of the data science workflow.
Pricing
Free for everyone to use.
3. DVC
Data Version Control is a free, open-source data versioning solution built for data science and machine learning applications. It is a tool that lets you define your pipeline in any language.
By handling large files, datasets, machine learning models, code, and so on, the tool makes machine learning models shareable and reproducible. It follows Git's lead in providing a simple command line that can be set up in a few steps.
Despite what its name suggests, DVC is not only about data versioning. It also helps teams manage pipelines and machine learning models.
Finally, DVC helps improve the consistency and reproducibility of your team's models. Instead of complicated file suffixes and comments in code, use Git branches to try out new ideas. To keep your bearings, use automated metric tracking instead of paper and pencil.
To move consistent bundles of machine learning models, data, and code into production, onto remote machines, or onto a colleague's desktop, you can use push and pull commands instead of ad-hoc scripts.
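One reason pipelines become reproducible is that a stage only needs to run again when its inputs change. The sketch below illustrates that general idea with dependency hashes stored in a lock dictionary. It is a simplified assumption about how such a check can work, not DVC's actual implementation; `fingerprint`, `run_stage`, and the `lock` dictionary are hypothetical names.

```python
import hashlib


def fingerprint(deps):
    """Combine dependency names and content hashes into one digest."""
    digest = hashlib.sha256()
    for name in sorted(deps):  # sorted so the result does not depend on dict order
        digest.update(name.encode())
        digest.update(hashlib.sha256(deps[name]).digest())
    return digest.hexdigest()


def run_stage(name, deps, command, lock):
    """Run `command` only when the dependency fingerprint differs from `lock`."""
    current = fingerprint(deps)
    if lock.get(name) == current:
        return False  # inputs unchanged, so the stage is skipped
    command()
    lock[name] = current  # remember what the stage was last run against
    return True
```

Here `deps` maps a dependency name to its bytes. In a real project, the lock information would be written to a file and committed to Git alongside the code.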
Pros
- It is lightweight, open source, and works with the major cloud platforms and storage types.
- Flexible, format and framework agnostic, and easy to implement.
- The evolution of every ML model can be traced back to its source code and data.
Cons
- Pipeline management and DVC version control are tightly coupled. If your team already uses another data pipeline product, there will be redundancy.
- Because DVC is lightweight, your team may need to build some features by hand to make it more user-friendly.
Pricing
Free for everyone to use.
4. Delta Lake
Delta Lake is an open-source storage layer that improves data lake reliability. It supports ACID transactions and scalable metadata management, in addition to streaming and batch data processing.
It works with the Apache Spark APIs and sits on top of your existing data lake. Delta Sharing is billed as the industry's first open protocol for secure data sharing, making it easy to exchange data with other organizations regardless of their computing platforms.
Delta Lake can handle petabytes of data with ease. Metadata is stored in the same way as data, and users can retrieve it with the DESCRIBE DETAIL command. Delta Lake has a single architecture that can read both streaming and batch data.
Upserts are easy to perform with Delta. These upserts, or merges, into a Delta table are comparable to SQL MERGE statements. You can use them to merge data from another data frame into your table and perform updates, inserts, and deletes.
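The merge semantics described above (update matched rows, insert unmatched ones, delete flagged ones) can be sketched in plain Python on lists of dictionaries. This illustrates the logic only and does not use Delta Lake or Spark; the function name `merge_into` and the `_delete` marker are hypothetical.

```python
def merge_into(target, updates, key="id"):
    """Apply `updates` to `target` rows matched on `key` (toy upsert)."""
    by_key = {row[key]: dict(row) for row in target}
    for row in updates:
        if row.get("_delete"):
            by_key.pop(row[key], None)    # matched and flagged: delete
        elif row[key] in by_key:
            by_key[row[key]].update(row)  # matched: update in place
        else:
            by_key[row[key]] = dict(row)  # not matched: insert
    return sorted(by_key.values(), key=lambda r: r[key])


table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
changes = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}, {"id": 1, "_delete": True}]
result = merge_into(table, changes)
# result == [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]
```

A real Delta table performs the equivalent operation transactionally, so readers never observe a half-applied merge.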
Pros
- Many capabilities, such as ACID transactions and robust metadata management, become available on top of your current data storage solution.
- Delta Lake can now efficiently handle tables with billions of partitions and files at petabyte scale.
- Reduces the need for manual data version management and other data concerns, letting developers focus on building products on top of their data lakes.
Cons
- Because it was designed to work with Spark and big data, Delta Lake is often overkill for most tasks.
- It requires a dedicated data format, which limits its flexibility and makes it incompatible with your existing formats.
Pricing
Free for everyone to use.
5. Dolt
Dolt is a SQL database that supports forking, cloning, branching, merging, pushing, and pulling in the same way a Git repository does. To improve the user experience of a version-controlled database, Dolt lets data and schema change in tandem.
It is a great tool for you and your colleagues to collaborate on. You can connect to Dolt the same way you would to any other MySQL database and run queries or change the data using SQL commands.
When it comes to data versioning, Dolt is one of a kind. Dolt is a database, unlike other solutions that simply version data. Although the software is currently in its early stages, there are plans to make it fully compatible with Git and MySQL in the near future.
All of the commands you know from Git work with Dolt as well. Git versions files; Dolt versions tables. Using the command-line interface, you can import CSV files, commit your changes, push them to a remote, and merge your teammate's changes.
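"Versioning tables" means that changes are compared row by row rather than line by line. The sketch below shows a toy row-level diff between two snapshots of a table keyed by primary key. It is an illustration of the concept, not Dolt's implementation, and `diff_tables` is a hypothetical name.

```python
def diff_tables(old, new):
    """Compare two snapshots (dicts of primary key -> row) row by row."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    modified = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "modified": modified}


before = {1: {"name": "a"}, 2: {"name": "b"}}
after = {2: {"name": "B"}, 3: {"name": "c"}}
changes = diff_tables(before, after)
```

A diff like this is what makes merging a colleague's branch meaningful for tabular data: conflicts can be detected per row instead of per file.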
Pros
- Lightweight and partly open source.
- Compared with more obscure options, it offers a SQL interface, which makes it accessible to data analysts.
Cons
- Compared with other database versioning approaches, Dolt is still a maturing product.
- Because Dolt is a database, you have to migrate your data into it to get the benefits.
Pricing
Everyone is welcome to use the community edition. The platform does not publish premium pricing; instead, you need to contact the vendor.
6. Pachyderm
Pachyderm is a free version control system for data science with a large feature set. Pachyderm Enterprise is a powerful data science platform designed for large-scale collaboration in highly secure environments.
Pachyderm is one of the few full data science platforms on this list. Its goal is to provide a platform that manages the entire data cycle and makes it easy to reproduce the results of machine learning models. In this context, Pachyderm is known as "the Docker of data". It packages up your execution environment using Docker containers, which makes it easy to reproduce the same outputs.
Data scientists and DevOps teams can deploy models with confidence thanks to the combination of versioned data and Docker. Because of its efficient storage system, petabytes of structured and unstructured data can be maintained while storage costs are kept low.
Across every stage of the pipeline, file-based versioning provides a complete audit trail for all data and artifacts, including intermediate outputs. Many of the tool's capabilities are built on these pillars, which help teams get the most out of it.
Pros
- Because it is based on containers, your data environments are portable and easy to migrate between cloud providers.
- Robust, with the ability to scale from small to very large systems.
Cons
- There are many moving parts, such as the Kubernetes server required to run Pachyderm's free edition, so the learning curve is steep.
- Pachyderm can be difficult to integrate into a company's existing infrastructure because of its many technical components.
Pricing
You can start using the platform with the community edition; for the enterprise edition, you need to contact the vendor.
7. Neptune
Model-building metadata is managed by an ML metadata store, which is an essential part of the MLOps stack. Neptune serves as a central metadata store for any MLOps workflow.
You can log, view, and compare thousands of machine learning models in one place. It includes features such as experiment tracking, a model registry, and model monitoring, along with a collaborative interface. It integrates with more than 25 different tools and libraries, including several model training and hyperparameter tuning tools.
You can sign up for Neptune without using your credit card. A Gmail account is enough.
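To show what a central metadata store does at its simplest, here is a toy in-memory version that records parameters and metrics per run and picks the best run. It is not Neptune's client API; the class `RunStore` and its methods are hypothetical and exist only to illustrate logging and comparing experiments in one place.

```python
class RunStore:
    """Toy experiment metadata store (illustration only)."""

    def __init__(self):
        self.runs = {}

    def log(self, run_id, params, metrics):
        # Keep copies so later changes by the caller do not alter the record.
        self.runs[run_id] = {"params": dict(params), "metrics": dict(metrics)}

    def best(self, metric, higher_is_better=True):
        """Return the id of the run with the best value for `metric`."""
        pick = max if higher_is_better else min
        return pick(self.runs, key=lambda run_id: self.runs[run_id]["metrics"][metric])


store = RunStore()
store.log("run-1", {"lr": 0.1}, {"accuracy": 0.80})
store.log("run-2", {"lr": 0.01}, {"accuracy": 0.90})
best_run = store.best("accuracy")
```

A hosted service adds persistence, dashboards, and team access on top of this basic record-and-compare loop.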
Pros
- Integrating with any pipeline, workflow, codebase, or framework is easy.
- Real-time visualization, a simple API, and fast support.
- With Neptune, you can create a "backup" of all your experiment data in one place and retrieve it later.
Cons
- Although it is not open source, the individual version is probably sufficient for private use, though that access is limited to one month.
- There are a few minor design flaws.
Pricing
You can start using the platform with the Individual plan, which is free for everyone. The paid tier starts at $150 per month.
Conclusion
In this post, we discussed the best data versioning tools. As we have seen, each tool has its own set of features. Some are free, while others require payment. Some are better suited to small businesses, while others fit large enterprises.
As a result, you should choose the best software for your purposes after weighing the pros and cons. We recommend trying the free version before buying a premium product.