Object detection is an image recognition task in which a neural network predicts which objects are present in an image and draws bounding boxes around them. More precisely, it means detecting and localizing the objects in an image that belong to a predefined set of classes.
Object detection (also known as object recognition) is an especially important subfield of computer vision, because tasks such as detection, identification, and localization are widely used in real-world scenarios.
The YOLO approach can help you carry out these tasks. In this article we take a close look at YOLO: what it is, how it works, its different variants, and more.
So, what is YOLO?
YOLO is a method for real-time object detection and recognition in images. The name is an acronym for You Only Look Once. Redmon et al. proposed the method in a paper that was first released in 2015 and presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) in 2016.
The paper received the OpenCV People's Choice Award. Unlike earlier object detection approaches, which repurposed classifiers to perform detection, YOLO proposes an end-to-end neural network that predicts bounding boxes and class probabilities simultaneously.
By taking a fundamentally different approach to object detection, YOLO produced state-of-the-art results and clearly outperformed earlier real-time object detection methods.
How YOLO works
The YOLO method divides the image into N grid cells of equal size (the original paper lays them out as an S×S grid, so N = S²). Each of these N cells is responsible for detecting and localizing the object it contains.
Each cell, in turn, predicts the coordinates of B bounding boxes relative to its own cell coordinates, along with the object label and the probability that an object is present in the cell. This greatly reduces computation, because both detection and recognition are handled by the cells of the image in a single pass.
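As a rough illustration of cell-relative predictions, the sketch below converts one predicted box back into pixel coordinates and computes how many values the network outputs per image. It assumes the YOLOv1 paper's parameterization (box center given as an offset inside the cell, width and height given as fractions of the whole image) and its PASCAL VOC settings S = 7, B = 2, C = 20; none of these values are stated in this article, and the function names are hypothetical.

```python
def decode_box(row, col, x, y, w, h, img_w=448, img_h=448, S=7):
    """Convert one cell-relative prediction to pixel corners (x1, y1, x2, y2).

    x, y: center offset inside cell (row, col), each in [0, 1]
    w, h: box size as a fraction of the full image
    """
    cx = (col + x) * img_w / S   # center in pixels
    cy = (row + y) * img_h / S
    bw, bh = w * img_w, h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)


def output_size(S=7, B=2, C=20):
    """Values predicted per image: every cell outputs B boxes
    (x, y, w, h, confidence) plus C class probabilities."""
    return S * S * (B * 5 + C)


box = decode_box(row=1, col=4, x=0.5, y=0.5, w=0.25, h=0.25)
total = output_size()
print(box, total)
```

With these assumed settings the whole prediction is a single fixed-size tensor, which is part of why one forward pass is enough.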
However, it also produces many duplicate predictions, because several cells may predict the same object with different bounding boxes. To deal with this, YOLO uses Non-Maximal Suppression, which discards the bounding boxes with lower probability scores.
YOLO does this by examining the probability score attached to each prediction and selecting the one with the highest score. It then suppresses the bounding boxes that overlap that box most strongly, as measured by Intersection over Union (IoU).
This step is repeated until only the final bounding boxes remain.
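The suppression loop described above can be sketched in a few lines of plain Python. This is a minimal, unoptimized illustration rather than YOLO's actual implementation; the box format (x1, y1, x2, y2) and the IoU threshold of 0.5 are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest-scoring box still in play
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

Here the second box nearly coincides with the first and is suppressed, while the third box does not overlap either and is kept.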
Different variants of YOLO
We will look at some of the most common versions of YOLO. Let's get started.
1. YOLOv1
The first version of YOLO was announced in 2015 in the publication "You Only Look Once: Unified, Real-Time Object Detection" by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi.
Thanks to its speed, accuracy, and ability to generalize, YOLO quickly came to dominate the object detection field and became the most widely used algorithm. Instead of treating object detection as a classification problem, the authors framed it as a regression problem over spatially separated bounding boxes and associated class probabilities, which they solved with a single neural network.
YOLOv1 processed images in real time at 45 frames per second, while a smaller variant, Fast YOLO, processed 155 frames per second and still achieved twice the mAP of other real-time detectors.
2. YOLOv2
A year later, in 2016, Joseph Redmon and Ali Farhadi released YOLOv2 (also known as YOLO9000) in the paper "YOLO9000: Better, Faster, Stronger."
The model's ability to predict as many as 9000 different object categories while still running in real time earned it the name 9000. Not only was the new version trained jointly on object detection and classification datasets, it also received Darknet-19 as its new base model.
Because YOLOv2 was also a great success and quickly became the next state-of-the-art object detection model, other engineers began experimenting with the algorithm and produced their own distinct versions of YOLO. Some of them are discussed at various points in this article.
3. YOLOv3
In the paper "YOLOv3: An Incremental Improvement," Joseph Redmon and Ali Farhadi published a new version of the algorithm in 2018. It was built on the Darknet-53 architecture. In YOLOv3, independent logistic classifiers replaced the softmax activation.
Binary cross-entropy loss was used during training. Darknet-19 was improved and renamed Darknet-53, which now has 53 convolutional layers. In addition, predictions were made at three different scales, which helped YOLOv3 improve its accuracy on small objects.
YOLOv3 was Joseph Redmon's last version of YOLO: he chose not to continue developing YOLO (or to keep working in computer vision at all) so that his work would not have a harmful impact on the world. YOLOv3 is now widely used as a starting point for building other object detection architectures.
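To see why independent logistic classifiers differ from softmax, the sketch below scores the same logits both ways and computes a binary cross-entropy loss. It is a plain-Python illustration under assumed inputs, not YOLOv3's training code; the three logits and the overlapping labels ("person", "woman", "car") are hypothetical.

```python
import math


def softmax(logits):
    """Class scores compete: the outputs always sum to 1."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Independent logistic score for a single class."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over independent per-class predictions."""
    losses = [
        -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        for p, t in zip(probs, targets)
    ]
    return sum(losses) / len(losses)


logits = [3.0, 2.5, -4.0]   # hypothetical scores for "person", "woman", "car"
targets = [1, 1, 0]         # the first two labels both apply to the same box

independent = [sigmoid(z) for z in logits]
competing = softmax(logits)
loss = binary_cross_entropy(independent, targets)
print(independent, competing, loss)
```

With sigmoids, both overlapping labels can receive a high score at once, whereas softmax forces them to share a total probability of 1.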
4. YOLOv4
Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao published "YOLOv4: Optimal Speed and Accuracy of Object Detection" in April 2020, the fourth iteration of the YOLO algorithm.
Weighted Residual Connections, Cross-Stage-Partial connections, cross mini-batch normalization, self-adversarial training, Mish activation, DropBlock, and CIoU loss were all introduced as part of the CSPDarknet53-based architecture.
YOLOv4 is a descendant of the YOLO family, but it was developed by different researchers (not Joseph Redmon and Ali Farhadi). Its architecture consists of a CSPDarknet53 backbone, spatial pyramid pooling with PANet path aggregation as the neck, and the YOLOv3 head.
As a result, compared with its parent, YOLOv3, YOLOv4 achieves 10% higher Average Precision and 12% better frames per second.
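Of the components listed above, the Mish activation is simple enough to show directly. The sketch below uses its published definition, mish(x) = x · tanh(softplus(x)), in plain Python; it is an illustration only and not the YOLOv4 code, which applies the function element-wise to tensors.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


values = [mish(x) for x in (-2.0, 0.0, 2.0)]
print(values)
```

Unlike ReLU, Mish is smooth and lets small negative values through, while behaving almost like the identity for large positive inputs.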
5. YOLOv5
YOLOv5 is an open-source project that comprises a family of object detection models and algorithms based on the YOLO model and pretrained on the COCO dataset.
It is a collection of compound-scaled object detection models trained on COCO, with straightforward support for test-time augmentation (TTA), model ensembling, hyperparameter evolution, and export to ONNX, CoreML, and TFLite. Because YOLOv5 does not implement or develop any novel techniques, no official paper was released. It is essentially a PyTorch extension of YOLOv3.
Ultralytics used this situation to announce a "new version of YOLO" under its own banner. Five pretrained models are available, and the YOLOv5 home page is clear, well structured, and well documented, with many tutorials and tips for training and using YOLOv5 models.
Limitations of YOLO
Although YOLO looks like an excellent way to solve object detection problems, it has a number of drawbacks. Because each grid cell can detect only one object, YOLO struggles to detect and separate small objects that appear in groups. Crowds of small objects, such as a line of ants, are hard for YOLO to detect and localize.
Compared with slower object detection methods such as Fast R-CNN, YOLO also tends to be less accurate.
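The one-object-per-cell limitation can be made concrete by counting how many object centers land in the same cell. The sketch below is a hypothetical illustration: the 448-pixel image, the 7×7 grid, and the object centers are assumed values, not taken from this article.

```python
from collections import Counter


def cell_of(cx, cy, img_size=448, S=7):
    """Grid cell (row, col) that contains a box center given in pixels."""
    scale = S / img_size
    row = min(int(cy * scale), S - 1)   # clamp centers on the far edge
    col = min(int(cx * scale), S - 1)
    return row, col


# Three small objects packed closely together, plus one isolated object.
centers = [(100, 100), (110, 105), (120, 98), (300, 300)]

counts = Counter(cell_of(cx, cy) for cx, cy in centers)
crowded = {cell: n for cell, n in counts.items() if n > 1}
print(crowded)
```

Under the one-object-per-cell assumption, at most one of the three clustered objects could be reported from that cell, so the other two would be missed.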
Get started with YOLOv5
If you would like to see YOLOv5 in action, take a look at the official GitHub repository and at YOLOv5 in PyTorch.
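As a starting point, the sketch below loads a pretrained model through PyTorch Hub and runs it on a single image. It assumes that `torch` is installed and that network access is available on first use, since the repository and weights are downloaded and cached; the `detect` helper and the choice of the small `yolov5s` model are illustrative, not part of the project.

```python
def detect(image_source, model_name="yolov5s"):
    """Run a pretrained YOLOv5 model from PyTorch Hub on one image.

    image_source may be a file path, URL, PIL image, or array.
    Returns a pandas DataFrame with one row per detection.
    """
    import torch  # imported lazily so this snippet loads without PyTorch

    model = torch.hub.load("ultralytics/yolov5", model_name, pretrained=True)
    results = model(image_source)
    return results.pandas().xyxy[0]   # box corners, confidence, class id, class name
```

Calling `detect("street.jpg")`, where `street.jpg` is a placeholder for one of your own images, should return a table of detected boxes with their confidence scores and class names.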
Conclusion
The initial release of YOLOv5 is very fast, performant, and easy to use. Although YOLOv5 does not add any novel model architecture to the YOLO family, it provides a new PyTorch training and deployment framework that advances the state of the art for object detectors.
In addition, YOLOv5 is very user-friendly and comes ready "out of the box" for use on custom objects.