If you are a Python programmer, or you are looking for a powerful tool to bring machine learning into a production system, Scikit-learn is a library you need to look at.
Scikit-learn is well documented and easy to use, whether you are new to machine learning, want to get up and running quickly, or want to use the most up-to-date ML research tool.
As a high-level library, it lets you define a predictive data model in a few lines of code and then fit that model to your data. It is flexible and works well with other Python libraries such as Matplotlib for charting, NumPy for array vectorization, and pandas for data handling.
In this guide, you will learn what it is, how you can use it, and its pros and cons.
What is Scikit-learn?
Scikit-learn (also known as sklearn) provides a diverse set of statistical and machine learning models. Unlike many modules, sklearn is developed largely in Python rather than C. Despite being developed in Python, sklearn's efficiency is attributed to its use of NumPy for high-performance linear algebra and array operations.
Scikit-learn was created as part of a Google Summer of Code project and has since made life easier for millions of Python-centric data scientists around the world. This part of the series focuses on presenting the library and on one element in particular: dataset transformation, a crucial step that must be taken before developing a prediction model.
The library is built on SciPy (Scientific Python), which must be installed before you can use scikit-learn. This stack includes the following:
- NumPy: The base Python package for n-dimensional arrays
- SciPy: The fundamental package for scientific computing
- Pandas: Data structures and analysis
- Matplotlib: A powerful 2D/3D plotting library
- SymPy: Symbolic mathematics
- IPython: An enhanced interactive console
Uses of the Scikit-learn library
Scikit-learn is an open-source Python package with sophisticated data analysis and data mining features. It comes with a plethora of built-in algorithms to help you get the most out of your data science projects. The Scikit-learn library is used in the following ways.
1. Regression
Regression analysis is a statistical technique for analyzing and understanding the relationship between two or more variables. The process of performing a regression analysis helps determine which factors are relevant, which can be ignored, and how they interact. Regression techniques can, for example, be used to better understand the behavior of stock prices.
Regression algorithms include:
- Linear Regression
- Ridge Regression
- Lasso Regression
- Decision Tree Regression
- Random Forest
- Support Vector Machines (SVM)
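To make the difference between the first three algorithms in the list concrete, here is a minimal sketch that fits plain, Ridge, and Lasso regression to the same data. The synthetic data and the alpha values are assumptions chosen for illustration, not recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data (an assumption for illustration): y depends on the first
# two features only; the third feature is unrelated to y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),  # L2 penalty shrinks coefficients toward zero
    "lasso": Lasso(alpha=0.1),  # L1 penalty can push coefficients to exactly zero
}

coefficients = {}
for name, model in models.items():
    model.fit(X, y)
    coefficients[name] = model.coef_
    print(name, np.round(model.coef_, 3))
```

All three estimators share the same fit/predict interface, so swapping one regression algorithm for another is usually a one-line change.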
2. Classification
Classification is a supervised learning method that uses training data to determine the category of new observations. A classification algorithm learns from a given dataset or set of observations and then assigns additional observations to one of several classes or groups. It can, for example, be used to classify email messages as spam or not spam.
Classification algorithms include the following:
- Logistic Regression
- K-Nearest Neighbors
- Support Vector Machine
- Decision Tree
- Random Forest
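As an illustration of one classifier from the list above, the following sketch trains K-Nearest Neighbors. The bundled iris dataset stands in for a real task such as spam filtering, and n_neighbors=5 is simply the library default, stated explicitly.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris is used here only as a convenient stand-in dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Each new observation is assigned the majority class of its 5 nearest
# training points.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

knn_accuracy = knn.score(X_test, y_test)
print(f"Test accuracy: {knn_accuracy:.2f}")
```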
3. Clustering
The clustering algorithms in Scikit-learn are used to automatically group data with similar properties into sets. Clustering is the process of grouping a set of objects so that those in the same group are more similar to each other than to those in other groups. Customer data, for example, can be segmented based on location.
Clustering algorithms include the following:
- DBSCAN
- K-Means
- Mini-Batch K-Means
- Spectral Clustering
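A short sketch of K-Means from the list above, run on synthetic two-dimensional points. The blob data and the choice of three clusters are assumptions made for the example; with real data the number of clusters usually has to be chosen by inspection or a validation metric.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic points drawn around three centers (illustrative data only).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# n_init=10 reruns the algorithm from 10 random starts and keeps the best.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(kmeans.cluster_centers_)  # one row of coordinates per cluster
print(labels[:10])              # cluster index assigned to the first 10 points
```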
4. Model selection
Model selection algorithms provide methods for comparing, validating, and choosing the best parameters and models to use in your data science work. Given data, model selection is the problem of choosing a statistical model from a set of candidate models. In the simplest cases, a pre-existing collection of data is considered. However, the task may also involve the design of experiments so that the data obtained is well suited to the model selection problem.
Model selection modules that can improve accuracy by tuning parameters include:
- Cross-validation
- Grid search
- Metrics
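The three items above often appear together. The sketch below cross-validates one fixed model and then runs a grid search that uses cross-validated accuracy as its metric. The SVC estimator, the iris data, and the parameter grid are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Cross-validation: score one fixed model on 5 different train/test folds.
cv_scores = cross_val_score(SVC(kernel="rbf", C=1.0), X, y, cv=5)
print("Fold accuracies:", cv_scores)

# Grid search: try every parameter combination, each scored by 5-fold CV.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best mean CV accuracy:", search.best_score_)
```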
5. Dimensionality Reduction
Dimensionality reduction is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains the meaningful properties of the original data, ideally close to its intrinsic dimension. The number of random variables to analyze decreases when the dimensionality is reduced. Outlying data, for example, may be left out to make visualizations more efficient.
Dimensionality reduction algorithms include the following:
- Feature selection
- Principal Component Analysis (PCA)
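As a sketch of PCA from the list above, the following projects the four-feature iris data onto two components. Standardizing first is a common choice assumed here because PCA is sensitive to feature scale.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)  # 150 samples, 4 features

# Put all features on the same scale before projecting.
X_scaled = StandardScaler().fit_transform(X)

# Keep the 2 directions that capture the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)  # (150, 2)
print(pca.explained_variance_ratio_)  # share of variance kept per component
```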
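Installing Scikit-learn
NumPy, SciPy, Matplotlib, IPython, SymPy, and pandas should be installed before working with Scikit-learn as described here; of these, Scikit-learn itself strictly depends only on NumPy and SciPy. They can be installed with pip from the console, which works the same way on Windows, macOS, and Linux.
With the required libraries in place, Scikit-learn itself can be installed the same way; its package name on PyPI is scikit-learn.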
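For reference, the usual pip commands are shown as comments below, followed by a small sketch that reports which of the packages are already present. The helper name installed_versions is hypothetical and not part of any library.

```python
# Typical console commands (run in a terminal, not in Python):
#   pip install numpy scipy matplotlib ipython sympy pandas
#   pip install scikit-learn
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["numpy", "scipy", "matplotlib", "ipython", "sympy", "pandas", "scikit-learn"]


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```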
Features
Scikit-learn, sometimes known as sklearn, is a Python toolkit for implementing machine learning models and statistical modeling. We can use it to build many machine learning models for regression, classification, and clustering, along with statistical tools for evaluating these models. It also includes dimensionality reduction, feature selection, feature extraction, ensemble methods, and built-in datasets. We will look at each of these features one at a time.
1. Importing Datasets
Scikit-learn ships with a number of small built-in datasets, such as the iris dataset, and can download others on demand, such as a house price dataset or the Titanic dataset. The main benefits of these datasets are that they are easy to understand and can be used to develop ML models quickly. They are well suited to beginners. Similarly, you can use sklearn to import additional datasets.
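A minimal sketch of loading one bundled dataset. It uses iris because that dataset ships inside the package and needs no download.

```python
from sklearn.datasets import load_iris

# load_iris returns a dictionary-like object holding the data and its metadata.
iris = load_iris()

print(iris.data.shape)     # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)  # names of the 4 measurement columns
print(iris.target_names)   # names of the 3 species being predicted
```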
2. Splitting the Dataset for Training and Testing
Sklearn includes the ability to split a dataset into training and testing portions. Splitting the dataset is required for an unbiased evaluation of prediction performance. We can specify how much of our data should go into the train and test sets. Using the train_test_split function, the dataset can be divided so that the train set contains 80% of the data and the test set contains 20%.
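The 80/20 split described above can be sketched as follows. The iris data and the random_state value are assumptions; any feature matrix and label vector of matching length would work.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# test_size=0.2 holds out 20% of the rows; random_state makes the shuffle
# reproducible from run to run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
```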
3. Linear Regression
Linear regression is a supervised machine learning technique. It performs a regression task: based on the independent variables, it models a target prediction value. It is mostly used to determine the relationship between variables and for forecasting. Different regression models differ in the kind of relationship they assume between the dependent and independent variables, and in the number of independent variables used. A linear regression model is easy to build with sklearn's LinearRegression class.
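A minimal sketch of fitting a linear regression with sklearn. The bundled diabetes dataset is an assumption chosen because it has a numeric target and needs no download.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)  # 10 numeric features, numeric target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = LinearRegression()
reg.fit(X_train, y_train)     # learn one coefficient per feature
y_pred = reg.predict(X_test)  # predict targets for unseen rows

r2 = r2_score(y_test, y_pred)
print("Coefficients:", reg.coef_)
print("Intercept:", reg.intercept_)
print("R^2 on the test set:", r2)
```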
4. Logistic Regression
Logistic regression is a common classification technique. It belongs to the same family as polynomial and linear regression and is part of the linear classifier family. The results of logistic regression are easy to interpret and quick to compute. Like linear regression, logistic regression is a supervised technique. The only difference is that the output variable is categorical. It can, for example, determine whether a patient has heart disease.
A variety of classification problems, such as spam detection, can be solved with logistic regression. Predicting diabetes, determining whether a customer will buy a particular product or switch to a competitor, and determining whether a user will click on a particular marketing link are a few examples among many.
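A sketch of a binary logistic regression in the spirit of the medical examples above. The bundled breast cancer dataset is used as a stand-in, and the scaling step and max_iter value are assumptions added to help the solver converge.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # labels are 0 or 1
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scale the features, then fit the classifier, as a single estimator.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# predict_proba returns one probability per class for each row.
probabilities = clf.predict_proba(X_test[:3])
log_reg_accuracy = clf.score(X_test, y_test)
print(probabilities)
print(f"Test accuracy: {log_reg_accuracy:.2f}")
```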
5. Decision Tree
The decision tree is one of the most powerful and widely used tools for classification and prediction. A decision tree is a flowchart-like tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
Decision trees are beneficial when the dependent variables do not have a linear relationship with the independent variables, i.e. when linear regression does not produce accurate results. The DecisionTreeRegressor class can be used in a similar way to apply a decision tree to regression.
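The point about non-linear relationships can be sketched by fitting both a straight line and a tree to a noisy sine curve. The synthetic data and max_depth=4 are assumptions, and the scores are computed on the training data only, so they show fit rather than generalization.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic non-linear data: one full period of a sine wave plus noise.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

# R^2 on the training data: the tree can follow the curve, the line cannot.
linear_r2 = linear.score(X, y)
tree_r2 = tree.score(X, y)
print(f"Linear regression R^2: {linear_r2:.2f}")
print(f"Decision tree R^2:     {tree_r2:.2f}")
```

Limiting max_depth keeps the tree from memorizing the noise; an unrestricted tree would reach a perfect training score without being a better model.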
6. Random Forest
A random forest is a machine learning technique for solving regression and classification problems. It uses ensemble learning, a technique that combines many classifiers to solve complex problems. A random forest is made up of a large number of decision trees. It can be used to classify loan applications, detect fraud, and anticipate disease outbreaks.
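A minimal random forest sketch. The breast cancer dataset is a stand-in for a task such as fraud detection, and n_estimators=100 is the library default written out explicitly.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# 100 decision trees, each trained on a bootstrap sample of the rows;
# the forest's prediction combines the votes of all trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

forest_accuracy = forest.score(X_test, y_test)
print("Number of trees:", len(forest.estimators_))
print(f"Test accuracy: {forest_accuracy:.2f}")
```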
7. Confusion Matrix
A confusion matrix is a table used to describe the performance of a classification model. The following four terms are used to examine a confusion matrix:
- True positive: The model predicted a positive outcome and was correct.
- True negative: The model predicted a negative outcome and was correct.
- False positive: The model predicted a positive outcome, but the actual outcome was negative.
- False negative: The model predicted a negative outcome, but the actual outcome was positive.
In sklearn, the confusion matrix is computed by the confusion_matrix function in sklearn.metrics.
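The four terms above map directly onto the cells of the matrix. The sketch below uses hand-written labels (an assumption for illustration) so that each count can be checked by eye.

```python
from sklearn.metrics import confusion_matrix

# Hand-written labels for illustration: 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels the matrix is laid out as
# [[TN, FP],
#  [FN, TP]]
matrix = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = matrix.ravel()

print(matrix)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1
```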
Pros
- It is easy to use.
- The Scikit-learn package is very versatile and useful, serving real-world purposes such as predicting consumer behavior, developing neuroimages, and so on.
- Users who want to integrate the algorithms with their own platforms will find detailed API documentation on the Scikit-learn website.
- Many authors, contributors, and a large worldwide online community support Scikit-learn and keep it up to date.
Cons
- It is not an ideal choice for deep learning.
Conclusion
Scikit-learn is a critical package for every data scientist to have a firm grasp of and some experience with. This guide should help you with data manipulation using sklearn. Scikit-learn has many more capabilities that you will discover as you progress through your data science journey. Share your thoughts in the comments.