In today's society, data science matters a great deal.
So much so that data scientist has been crowned the "Sexiest Job of the 21st Century," even though nobody expected geeky jobs to become glamorous.
Because data is so valuable, data science is extremely popular right now.
Python, with its statistical analysis, data modeling, and readability, is one of the best programming languages for extracting value from that data.
Python keeps surprising its programmers when it comes to overcoming data science challenges. It is a widely used, object-oriented, open-source programming language with many additional features.
Python is supported by excellent data science libraries that programmers use every day to solve problems.
Here are the top Python libraries to consider:
1. Pandas
Pandas is a package designed to help developers work with "labeled" and "relational" data in an intuitive way. It is built on two main data structures: "Series" (one-dimensional, like a list of items) and "DataFrames" (two-dimensional, like a table with multiple columns).
Pandas supports converting data structures into DataFrame objects, handling missing data, adding and removing columns from a DataFrame, imputing missing values, and plotting data with histograms or box plots.
It also provides a number of tools for reading and writing data between in-memory data structures and several file formats.
In short, it is well suited to quick and simple data manipulation, data aggregation, reading and writing data, and data visualization. When you build a data science project, you will almost always rely on Pandas to manage and analyze your data.
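As a rough illustration of the operations listed above (filling missing values, adding and removing a column, aggregating), here is a minimal sketch. The city names, column names, and numbers are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Durban", "Cape Town", "Durban", "Pretoria"],
    "sales": [120.0, np.nan, 80.0, 100.0],
})

# Impute the missing value with the column mean (NaN is skipped when averaging).
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Add a derived column, then remove it again.
df["sales_k"] = df["sales"] / 1000
df = df.drop(columns=["sales_k"])

# Aggregate by label: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

The same DataFrame could then be written out with a method such as `df.to_csv(...)` or plotted with `df.plot(...)`, depending on what the project needs.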
2. NumPy
NumPy (Numerical Python) is an excellent tool for scientific computing and for both basic and advanced array operations.
The library provides many useful features for working with n-dimensional arrays and matrices in Python.
It makes it easy to process arrays that hold values of the same data type and to perform arithmetic on whole arrays at once (vectorization). In practice, expressing mathematical operations on NumPy arrays improves performance and reduces execution time.
Support for multidimensional arrays in mathematical and logical operations is the library's core feature. NumPy functions can be used for indexing, sorting, and reshaping, and for representing images and sound waves as arrays of real numbers in multiple dimensions.
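To make vectorization, reshaping, indexing, and sorting concrete, here is a small sketch using a toy array. The variable names are arbitrary.

```python
import numpy as np

# A one-dimensional array of floats: [0, 1, 2, 3, 4, 5]
a = np.arange(6, dtype=np.float64)

# Reshape into 2 rows x 3 columns without copying the data.
b = a.reshape(2, 3)

# Vectorized arithmetic: applied to every element, no Python loop needed.
scaled = b * 2 + 1

# Reduce along an axis: one sum per row.
row_sums = b.sum(axis=1)

# Boolean indexing: keep only the elements greater than 2.
selected = b[b > 2]

# Sort ascending, then reverse for descending order.
sorted_desc = np.sort(a)[::-1]

print(scaled, row_sums, selected, sorted_desc, sep="\n")
```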
3. Matplotlib
In the Python world, Matplotlib is one of the most widely used libraries. It is used to produce static, animated, and interactive data visualizations. Matplotlib offers many charting and customization options.
Programmers can use it to draw scatter plots and histograms and to customize and configure graphs. The open-source library provides an object-oriented API for embedding plots in applications.
When using this library to produce complex visualizations, however, developers have to write more code than usual.
It is worth noting that popular plotting libraries work together with Matplotlib without conflict.
Among other places, it is used in Python scripts, the Python and IPython shells, Jupyter notebooks, and web application servers.
Line plots, bar charts, pie charts, histograms, scatter plots, error charts, power spectra, stem plots, and many other kinds of charts can be created with it.
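The object-oriented API mentioned above can be sketched as follows. This example assumes a headless environment, so it selects the non-interactive "Agg" backend and renders into an in-memory buffer; the random data is purely illustrative.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Illustrative random data with a fixed seed.
rng = np.random.default_rng(0)
values = rng.normal(size=200)

# One figure with two axes: a histogram and a scatter plot.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

counts, bin_edges, _ = ax_hist.hist(values, bins=10)
ax_hist.set_title("Histogram")

ax_scatter.scatter(values[:-1], values[1:], s=8)
ax_scatter.set_title("Scatter plot")

fig.tight_layout()

# Render the figure as PNG bytes; a file path would work here as well.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In a notebook or desktop session you would normally skip the backend line and call `plt.show()` instead of saving to a buffer.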
4. Seaborn
The Seaborn library is built on Matplotlib. Seaborn can be used to create attractive and informative statistical graphics with less effort than plain Matplotlib.
Seaborn includes an integrated, dataset-oriented API for exploring relationships among multiple variables, in addition to broad support for data visualization.
Seaborn offers an impressive number of visualization options, including time series plots, joint plots, violin plots, and more.
It uses semantic mapping and statistical aggregation to produce informative visualizations with deep insight. It includes a number of dataset-oriented plotting routines that work with data frames and arrays containing whole datasets.
Its visualizations include bar plots, histograms, scatter plots, plots with error bars, and other graphics. This Python visualization library also includes tools for choosing color palettes, which help reveal trends in a dataset.
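As a small sketch of the dataset-oriented style described above, the following draws a violin plot directly from a data frame and picks a color palette. The data frame is hypothetical, and the "Agg" backend is selected on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import pandas as pd
import seaborn as sns

# Hypothetical dataset: two groups with a handful of measurements each.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "value": [1.0, 2.0, 2.5, 3.0, 4.0, 2.0, 3.5, 4.0, 4.5, 6.0],
})

# Choose a named palette; the result is a list of RGB tuples.
palette = sns.color_palette("deep", n_colors=2)

# Columns are referenced by name; Seaborn handles grouping and density estimation.
ax = sns.violinplot(data=df, x="group", y="value")
ax.set_title("Value distribution per group")
```

Because Seaborn returns a regular Matplotlib axes object, anything Matplotlib can do to an axes (titles, limits, saving) still applies afterwards.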
5. Scikit-learn
Scikit-learn is one of the most widely used Python libraries for data modeling and model evaluation. It is among the most useful Python libraries, with many capabilities designed purely for modeling.
It covers a wide range of supervised and unsupervised machine learning algorithms, along with well-defined functions for ensemble learning and boosting.
Data scientists use it for standard machine learning and data mining tasks such as clustering, regression, model selection, dimensionality reduction, and classification. It comes with thorough documentation and performs admirably.
Scikit-learn can be used to build a variety of supervised and unsupervised machine learning models, such as classification, regression, support vector machines, random forests, nearest neighbors, naive Bayes, decision trees, clustering, and so on.
This Python machine learning library includes a variety of simple yet efficient tools for data analysis and data mining tasks.
To learn more, here is our guide to Scikit-learn.
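The typical workflow (split the data, fit a model, evaluate it) can be sketched like this. The choice of the bundled iris dataset and a random forest is only for illustration; any other estimator follows the same `fit`/`predict` pattern.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Small dataset bundled with scikit-learn: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# An ensemble model (random forest) trained on the training split.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Evaluate on data the model has not seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```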
6. XGBoost
XGBoost is a distributed gradient boosting toolkit designed for speed, flexibility, and portability. It implements machine learning algorithms under the gradient boosting framework. XGBoost is a fast and accurate tree boosting method that can solve many data science problems.
With the gradient boosting framework, the library can be used to build machine learning models.
It includes parallel tree boosting, which helps teams solve a wide range of data science problems. Another advantage is that developers can run the same code on Hadoop, SGE, and MPI.
It is also reliable in both distributed and memory-constrained settings.
7. TensorFlow
TensorFlow is a free, end-to-end AI platform with a large collection of tools, libraries, and resources. Anyone working on machine learning projects in Python should know TensorFlow.
It is an open-source symbolic math toolkit for numerical computation using data flow graphs, developed by Google. In a typical TensorFlow data flow graph, the nodes represent mathematical operations.
The edges of the graph, on the other hand, are multidimensional data arrays, also known as tensors, that flow between the nodes. It lets programmers distribute computation across one or more CPUs or GPUs on a desktop, mobile device, or server without changing the code.
The core of TensorFlow is written in C++. With TensorFlow, you can easily design and train machine learning models using high-level APIs such as Keras.
It also offers multiple levels of abstraction, letting you choose the best fit for your model. TensorFlow also lets you deploy machine learning models in the cloud, in the browser, or on your own device.
It is a highly effective tool for tasks such as object recognition, speech recognition, and more. It helps in building artificial neural networks that must handle many data sources.
Here is our quick reference on TensorFlow to learn more.
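The idea of tensors flowing through a graph of operations can be sketched with `tf.function`, which traces a Python function into a graph, and `tf.GradientTape`, which records operations for differentiation. The tensor values and the function name `predict` are arbitrary choices for this example.

```python
import tensorflow as tf

# Tensors: a constant 2x2 input and a trainable 2x1 weight matrix.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
w = tf.Variable([[0.5], [-1.0]])

@tf.function  # traces the Python function into a dataflow graph
def predict(inputs):
    # A single graph node: matrix multiplication of inputs and weights.
    return tf.matmul(inputs, w)

# Record the forward pass so the gradient can be computed afterwards.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(predict(x) ** 2)

# d(loss)/d(w), computed by automatic differentiation.
grad = tape.gradient(loss, w)
print(float(loss), grad.numpy())
```

The same code runs on CPU or GPU; TensorFlow places the operations on whatever device is available.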
8. Keras
Keras is a free and open-source, Python-based neural network toolkit for artificial intelligence, deep learning, and data science tasks. Neural networks are also used in data science to interpret perceptual data (images or audio).
It is a collection of tools for building models, graphing data, and evaluating data. It also includes pre-labeled datasets that can be imported and loaded quickly.
It is easy to use, versatile, and well suited to exploratory research. In addition, it lets you create fully connected, convolutional, pooling, recurrent, embedding, and other kinds of neural network layers.
These building blocks can be combined into a complete neural network for large datasets and problems. It is an excellent library for modeling and building neural networks.
It is simple to use and gives developers a great deal of flexibility. Keras can be slower than some other Python machine learning packages.
This is because it first builds a computational graph using the backend infrastructure and then uses that graph to run operations. Keras is remarkably expressive and flexible when it comes to innovative research.
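Stacking layers into a complete network can be sketched as below. This assumes Keras is accessed through the TensorFlow distribution (`from tensorflow import keras`); the layer sizes, synthetic data, and training settings are illustrative only.

```python
import numpy as np
from tensorflow import keras

# A small fully connected network: 8 inputs -> 16 hidden units -> 1 output.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compiling attaches the optimizer and loss used during training.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Synthetic binary classification data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# Train briefly; history.history holds one loss value per epoch.
history = model.fit(X, y, epochs=2, batch_size=16, verbose=0)
print(model.count_params(), history.history["loss"])
```

Other layer types, such as convolutional or recurrent layers, can be swapped into the same list in place of the `Dense` layers.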
9. PyTorch
PyTorch is a popular Python package for deep learning and machine learning. It is open-source, Python-based scientific computing software for applying deep learning and neural networks to large datasets.
Facebook makes heavy use of this toolkit to build neural networks that help with tasks such as face recognition and automatic tagging.
PyTorch is a platform for data scientists who want to complete deep learning tasks quickly. The tool allows tensor computations to be performed with GPU acceleration.
It is also used for other things, including building dynamic computational graphs and computing gradients automatically.
PyTorch is a good package that lets developers move easily from theory and research to training and deployment in machine learning and deep learning work, offering a great deal of flexibility and speed.
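Automatic gradient calculation and optional GPU acceleration can be sketched in a few lines. The tensor values are arbitrary, and the example falls back to the CPU when no CUDA device is present.

```python
import torch

# A tensor that tracks the operations applied to it.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The computational graph is built dynamically as the code runs.
y = (x ** 2).sum()

# Backpropagate: fills x.grad with dy/dx = 2 * x.
y.backward()
print(x.grad)

# Use a GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
z = x.detach().to(device) * 2
print(z.device)
```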
10. NLTK
NLTK (Natural Language Toolkit) is a well-known Python package for data scientists. Text tagging, tokenization, semantic reasoning, and other natural language processing tasks can be carried out with NLTK.
NLTK can also be used for more complex AI (artificial intelligence) tasks. NLTK was originally created to support various AI and machine learning teaching paradigms, such as language modeling and cognitive theory.
It now supports AI algorithm and learning model development in the real world. It has been widely adopted as a teaching tool and as an individual study tool, in addition to serving as a platform for prototyping and building research systems.
Classification, parsing, semantic reasoning, stemming, tagging, and tokenization are all supported.
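Tokenization and stemming can be sketched as follows. This example deliberately uses `wordpunct_tokenize` and `PorterStemmer`, which work without downloading any extra NLTK data; other tokenizers and taggers in NLTK may need a one-time `nltk.download(...)` step. The sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

text = "Scientists cleaning datasets."

# Split the text into word and punctuation tokens using a regular expression.
tokens = wordpunct_tokenize(text)

# Reduce each token to its stem (the stemmer also lowercases by default).
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```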
Conclusion
That concludes the top ten Python libraries for data science. Python's data science libraries are updated regularly as data science and machine learning grow in popularity.
There are many Python libraries for data science, and the user's choice depends largely on the kind of project they are working on.