If you are learning Python or looking for a powerful toolkit to bring machine learning into a production system, Scikit-learn is a library you need to explore.
Scikit-learn is well documented and easy to use, whether you are new to machine learning, want to get up and running quickly, or want an up-to-date ML research tool.
As a high-level library, it lets you build a predictive data model in just a few lines of code and then fit that model to your data. It is flexible and works well with other Python libraries such as Matplotlib for charting, NumPy for array vectorization, and pandas for working with tabular data.
In this guide, you will learn what it is, how to use it, and its pros and cons.
What is Scikit-learn?
Scikit-learn (also known as sklearn) provides a diverse set of statistical and machine learning models. Unlike many modules, sklearn is written largely in Python rather than C (with some performance-critical parts in Cython). Even so, its efficiency comes from its use of NumPy for high-performance linear algebra and array operations.
Scikit-learn was created as part of a Google Summer of Code project and has since made life easier for millions of Python data scientists around the world. This part of the series introduces the library and focuses on one thing: dataset transformations, an essential step to take before building a predictive model.
The library is built on SciPy (Scientific Python), which must be installed before you can use scikit-learn. The SciPy stack includes the following:
- NumPy: The base n-dimensional array package for Python
- SciPy: The fundamental package for scientific computing
- Pandas: Data structures and analysis
- Matplotlib: A powerful library for 2D/3D plotting
- Sympy: Symbolic mathematics
- IPython: An enhanced interactive console
Applications of the Scikit-learn library
Scikit-learn is an open-source Python package with sophisticated data analysis and data mining features. It comes with a large number of built-in algorithms to help you get the most out of your data science projects. The library is used in the following ways.
1. Regression
Regression analysis is a statistical technique for analysing and understanding the relationship between two or more variables. It helps determine which factors are relevant, which can be ignored, and how they interact. Regression techniques might, for example, be used to better understand the behaviour of stock prices.
Regression algorithms include:
- Linear Regression
- Ridge Regression
- Lasso Regression
- Decision Tree Regression
- Random Forest
- Support Vector Machines (SVM)
2. Classification
Classification is a supervised learning technique that uses training data to identify the category of new observations. A classification algorithm learns from a given dataset or set of observations and then assigns new observations to one of several classes or groups. It can, for example, be used to classify email messages as spam or not spam.
Classification algorithms include:
- Logistic Regression
- K-Nearest Neighbours
- Support Vector Machine
- Decision Tree
- Random Forest
3. Clustering
Clustering algorithms in Scikit-learn are used to automatically group data with similar characteristics into sets. Clustering is the process of grouping a set of objects so that those in the same group are more similar to each other than to those in other groups. Customer data, for example, might be segmented by location.
Clustering algorithms include:
- DBSCAN
- K-Means
- Mini-Batch K-Means
- Spectral Clustering
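As a rough illustration of the clustering idea described above, the following minimal sketch runs K-Means on two synthetic blobs. The data, the choice of two clusters, and the variable names are assumptions made purely for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated synthetic blobs (made-up data, for illustration only).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Ask K-Means for two groups; n_init restarts guard against a poor initialisation.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(kmeans.cluster_centers_)  # one centre per discovered group
```

No labels are supplied here: the algorithm only sees the coordinates and groups points by proximity to the learned centres.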
4. Model Selection
Model selection algorithms provide ways to compare, validate, and choose the best parameters and models for data science projects. Given data, model selection is the problem of choosing a statistical model from a set of candidate models. In the simplest cases, an existing dataset is considered. However, the task may also involve designing experiments so that the data collected is well suited to the model selection problem.
Model selection modules that can improve accuracy through parameter tuning include:
- Cross-validation
- Grid search
- Metrics
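The tools listed above might be combined as in the sketch below, which cross-validates one fixed model and then grid-searches a single parameter. The choice of the iris data, a K-Nearest Neighbours classifier, and the candidate values for n_neighbors are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation of one fixed model: one accuracy score per fold.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print("mean CV accuracy:", scores.mean())

# Grid search: try each candidate value and keep the best by CV score.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```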
5. Dimensionality Reduction
Dimensionality reduction is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Reducing dimensionality reduces the number of random variables to analyse. Extraneous data, for example, might be left out to make visualisations more efficient.
Dimensionality reduction algorithms include:
- Feature selection
- Principal Component Analysis (PCA)
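A minimal PCA sketch, assuming the built-in iris data and a target of two components (both chosen only for illustration), shows the projection from four features down to two:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # 150 samples, 4 features

# Project the data onto the two directions of greatest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X.shape, "->", X_2d.shape)
print(pca.explained_variance_ratio_)  # share of variance kept by each component
```

The explained variance ratio gives a quick check of how much information the reduced representation retains.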
Installing Scikit-learn
NumPy and SciPy must be installed before using Scikit-learn; Matplotlib, IPython, Sympy, and Pandas are useful companions from the same stack. They can be installed with pip from the console, which works on Windows, macOS, and Linux alike, for example: pip install numpy scipy matplotlib ipython sympy pandas
With the required libraries in place, Scikit-learn itself can be installed with: pip install scikit-learn
Features
Scikit-learn, sometimes known as sklearn, is a Python toolkit for implementing machine learning models and statistical modelling. We can use it to build a variety of machine learning models for regression, classification, and clustering, along with statistical tools for evaluating those models. It also includes dimensionality reduction, feature selection, feature extraction, ensemble methods, and built-in datasets. We will look at each of these features in turn.
1. Importing Datasets
Scikit-learn ships with a number of ready-made datasets, such as the iris dataset, and can fetch others, such as house-price data or the Titanic dataset. The main advantages of these datasets are that they are easy to understand and can be used to develop ML models right away. They are well suited to beginners. You can also use sklearn to import additional datasets.
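As a small sketch of loading one of the bundled datasets, the iris data can be read with load_iris. The variable name iris is simply a choice for this example.

```python
from sklearn.datasets import load_iris

# load_iris returns a Bunch object holding the data plus descriptive metadata.
iris = load_iris()

print(iris.data.shape)      # feature matrix: samples x features
print(iris.feature_names)   # names of the measured features
print(iris.target_names)    # names of the classes
```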
2. Splitting the Dataset for Training and Testing
Sklearn includes the ability to split a dataset into training and testing segments. Splitting the dataset is necessary for an unbiased evaluation of predictive performance. We can specify how much of our data goes into the train and test sets. Here the dataset is divided with a train-test split so that the train set holds 80% of the data and the test set holds 20%, using the train_test_split function.
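The 80/20 split described above could look like the following sketch. The iris data and the random_state value are assumptions added here so that the example is self-contained and reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# test_size=0.2 keeps 20% of the rows for testing and 80% for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```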
3. Linear Regression
Linear Regression is a supervised machine learning technique. It performs a regression task: based on independent variables, a regression model predicts a target value. It is mostly used to determine the relationship between variables and for forecasting. Regression models differ in the kind of relationship they assume between the dependent and independent variables and in the number of independent variables used. A Linear Regression model can be created with sklearn's LinearRegression class.
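A minimal sketch of fitting a linear model follows. The synthetic data (a noisy line with slope 3 and intercept 2) is made up for illustration, so that the recovered coefficients can be compared with known values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 3x + 2 plus a little noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)

model = LinearRegression()
model.fit(X, y)

print(model.coef_, model.intercept_)  # should be close to 3 and 2
print(model.predict([[4.0]]))         # prediction for a new input
```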
4. Logistic Regression
Logistic regression is a common classification technique. It belongs to the same family as polynomial and linear regression and is a linear classifier. Its results are easy to interpret and fast to compute. Like linear regression, logistic regression is a supervised technique; the only difference is that the output variable is categorical. It can, for example, determine whether or not a patient has heart disease.
A variety of classification problems, such as spam detection, can be solved using logistic regression. Other examples include predicting diabetes, determining whether a customer will buy a particular product or switch to a competitor, and determining whether a user will click on a particular marketing link.
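As one possible sketch of a binary classifier, the example below fits logistic regression to the bundled breast cancer dataset. The dataset, the scaling step, and the max_iter value are assumptions chosen for illustration rather than anything prescribed by this guide.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # binary target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scaling the features first helps the solver converge.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
proba = clf.predict_proba(X_test[:1])  # class probabilities for one sample
print(accuracy, proba)
```

The output is categorical, but predict_proba exposes the underlying probability for each class.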
5. Decision Tree
The decision tree is one of the most powerful and widely used techniques for classification and prediction. A decision tree is a tree structure that looks like a flowchart: each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
Decision trees are useful when the dependent variable does not have a linear relationship with the independent variables, that is, when linear regression does not produce accurate results. The DecisionTreeRegressor class can be used in the same way to apply a decision tree to regression.
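The sketch below fits both a classification tree and a regression tree. The iris data, the depth limit of 3, and the choice of petal width as a regression target are assumptions made only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Classification: predict the species from all four measurements.
tree_clf = DecisionTreeClassifier(max_depth=3, random_state=0)
tree_clf.fit(X_train, y_train)
accuracy = tree_clf.score(X_test, y_test)

# Regression: predict petal width (column 3) from the other three columns.
tree_reg = DecisionTreeRegressor(max_depth=3, random_state=0)
tree_reg.fit(X_train[:, :3], X_train[:, 3])
predicted_width = tree_reg.predict(X_test[:, :3])

print(accuracy, predicted_width[:5])
```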
6. Random Forest
Random forest is a machine learning technique for solving regression and classification problems. It uses ensemble learning, a technique that combines many classifiers to solve complex problems. A random forest is made up of a large number of decision trees. It may be used to classify loan applications, detect fraudulent behaviour, and anticipate disease outbreaks.
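A short sketch of the ensemble idea, assuming the iris data and 100 trees (both illustrative choices), might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# n_estimators sets how many decision trees make up the forest.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

accuracy = forest.score(X_test, y_test)
print(len(forest.estimators_))        # the individual fitted trees
print(forest.feature_importances_)    # relative importance of each feature
print(accuracy)
```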
7. Confusion Matrix
A confusion matrix is a table used to describe the performance of a classification model. The following four terms are used to read a confusion matrix:
- True Positive: The model predicted a positive outcome and was correct.
- True Negative: The model predicted a negative outcome and was correct.
- False Positive: The model predicted a positive outcome, but the actual outcome was negative.
- False Negative: The model predicted a negative outcome, but the actual outcome was positive.
In sklearn, a confusion matrix can be computed with the confusion_matrix function in sklearn.metrics.
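The four terms above can be read off a small worked example. The true and predicted labels below are made up for illustration; for binary labels 0 and 1, the matrix rows are the actual classes and the columns are the predicted classes.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)

# For binary labels, ravel() yields the counts in this order.
tn, fp, fn, tp = cm.ravel()
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```

Here three positives and three negatives are predicted correctly, with one false positive and one false negative.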
Pros
- It is easy to use.
- The Scikit-learn package is extremely flexible and useful, serving real-world purposes such as predicting consumer behaviour, creating neuroimages, and more.
- Users who want to connect the algorithms to their own platforms will find detailed API documentation on the Scikit-learn website.
- Many authors, contributors, and a large worldwide online community support Scikit-learn and keep it up to date.
Cons
- It is not an ideal choice for in-depth study; for instance, it is not designed for deep learning.
Conclusion
Scikit-learn is an essential package for every data scientist to have a firm grasp of and some experience with. This guide should help you with data manipulation using sklearn. There are many more Scikit-learn capabilities you will discover as you progress on your data science journey. Share your thoughts in the comments.