Imagine you are trying to teach a robot to walk. Unlike teaching a computer to predict stock prices or classify images, we don't really have a large dataset we can use to train our robot.
Although it may come naturally to you, walking is actually a very complex action. Taking a single step typically involves dozens of different muscles working together. The effort and technique needed to walk from one place to another also depend on many factors, including whether you are carrying something and whether there is an incline or some other kind of obstacle.
In scenarios like these, we can use a method known as reinforcement learning, or RL. With RL, you define the goal you want your model to achieve and let the model gradually learn on its own how to reach it.
In this article, we'll explore the basics of reinforcement learning and how the RL framework can be applied to a variety of real-world problems.
What is reinforcement learning?
Reinforcement learning is a subfield of machine learning that focuses on finding solutions by rewarding desired behaviors and penalizing undesired ones.
Unlike supervised learning, reinforcement learning typically has no training dataset that provides the correct output for a given input. In the absence of training data, the algorithm must find a solution through trial and error. The algorithm, which we usually refer to as the agent, must find a solution on its own by interacting with the environment.
Researchers decide which outcomes are rewarded and which actions the algorithm is able to take. Every action the algorithm takes receives some form of feedback that indicates how well the algorithm is doing. Over the course of training, the algorithm eventually converges on an optimal solution to the given problem.
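The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LineWorld` environment, its `step` method, and the reward values are hypothetical and do not come from any particular library, and the agent here acts at random instead of learning.

```python
import random


class LineWorld:
    """Hypothetical toy environment: positions 0..4, with the goal at position 4."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        """Apply an action (-1 = left, +1 = right) and return (state, reward, done)."""
        self.position = min(4, max(0, self.position + action))
        done = self.position == 4
        reward = 10 if done else -1  # feedback values chosen by the designer
        return self.position, reward, done


def run_episode(env, rng, max_steps=1000):
    """Let a random agent interact with the environment and return the total reward."""
    total_reward = 0
    for _ in range(max_steps):
        action = rng.choice([-1, 1])             # trial and error, no training data
        _state, reward, done = env.step(action)  # the environment answers with feedback
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(LineWorld(), random.Random(0))
print(total)
```

A learning agent would replace the random choice with a rule that uses past rewards to prefer better actions.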
A Simple Example: 4 × 4 Grid
Let's look at a simple example of a problem we can solve with reinforcement learning.
Suppose we have a 4 × 4 grid as our environment. Our agent is placed at random in one of the squares, along with a few obstacles. The grid has three "pit" obstacles that must be avoided and a single "diamond" reward that the agent must reach. The full description of the environment is known as the environment's state.
In our RL model, the agent can move to any adjacent square as long as no obstacle blocks the way. The set of all valid actions in a given environment is known as the action space. The goal of our agent is to find the shortest path to the reward.
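As a rough sketch, the grid and its action space could be represented as follows. The pit and diamond coordinates are made up for illustration (the article does not specify them), and "adjacent" is assumed to mean the four squares directly up, down, left, and right.

```python
GRID_SIZE = 4
# Hypothetical layout: these coordinates are assumptions, not taken from the article.
PITS = {(1, 1), (2, 3), (3, 0)}
DIAMOND = (3, 3)

# Assumed moves: four directions, expressed as (row change, column change).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def valid_actions(state):
    """Return the moves that keep the agent on the grid and out of a pit."""
    row, col = state
    actions = []
    for name, (d_row, d_col) in MOVES.items():
        new_row, new_col = row + d_row, col + d_col
        on_grid = 0 <= new_row < GRID_SIZE and 0 <= new_col < GRID_SIZE
        if on_grid and (new_row, new_col) not in PITS:
            actions.append(name)
    return actions


print(valid_actions((0, 0)))  # ['down', 'right']
print(valid_actions((1, 0)))  # ['up', 'down']
```

From the corner square only two moves stay on the grid, and from `(1, 0)` the move to the right is excluded because it leads into an assumed pit.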
Our agent will use reinforcement learning to find the path to the diamond that requires the fewest steps. Each correct step adds to the agent's reward, and each incorrect step subtracts from it. The model computes the total reward once the agent reaches the diamond.
Now that we've defined the agent and the environment, we also need to define the rules for choosing the agent's next action based on its current state and the environment.
Policies and Rewards
In reinforcement learning, a policy is the strategy an agent uses to achieve its goals. The agent's policy determines how the agent should act given the current state of the agent and its environment.
The agent must evaluate the possible policies to determine which one is best.
In our simple example, landing on an empty square returns a value of -1. When the agent reaches the square with the diamond reward, it receives a value of 10. Using these values, we can compare different policies with a utility function U.
Let's compare the utility of two policies, A and B. Policy A reaches the diamond after crossing three empty squares, and Policy B after crossing five:
U(A) = -1 - 1 - 1 + 10 = 7
U(B) = -1 - 1 - 1 - 1 - 1 + 10 = 5
The result shows that Policy A is the better way to reach the reward. The agent will therefore choose Policy A over Policy B.
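The same comparison can be reproduced in code. The sketch below assumes, as the sums for U(A) and U(B) suggest, that Policy A crosses three empty squares and Policy B crosses five before reaching the diamond. The function and variable names are illustrative.

```python
STEP_REWARD = -1      # value returned for landing on an empty square
DIAMOND_REWARD = 10   # value returned for reaching the diamond


def utility(empty_squares_visited):
    """Sum the rewards along a path that ends at the diamond."""
    return empty_squares_visited * STEP_REWARD + DIAMOND_REWARD


# Number of empty squares each policy crosses before the diamond.
policies = {"A": 3, "B": 5}
scores = {name: utility(steps) for name, steps in policies.items()}
best = max(scores, key=scores.get)

print(scores)  # {'A': 7, 'B': 5}
print(best)    # A
```

Because every empty square costs 1, the policy with the shorter path always has the higher utility in this setup.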
Exploration vs. Exploitation
The exploration vs. exploitation trade-off in reinforcement learning is a dilemma the agent faces whenever it makes a decision.
Should the agent focus on searching for new paths or options, or should it keep using the options it already knows?
If the agent chooses to explore, it may find a better option, but it may also waste time and resources. On the other hand, if the agent chooses to exploit a solution it already knows, it may miss out on a better option.
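One common way to balance the two, which this article does not cover in detail, is an epsilon-greedy rule: with a small probability `epsilon` the agent explores by picking a random action, and otherwise it exploits the action with the highest estimated value. The sketch below is a minimal version of that idea; the action names and value estimates are hypothetical.

```python
import random


def epsilon_greedy(value_estimates, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the best-known one."""
    if rng.random() < epsilon:
        return rng.choice(list(value_estimates))          # explore
    return max(value_estimates, key=value_estimates.get)  # exploit


# Hypothetical estimates of how good each action is in the current state.
estimates = {"left": 0.2, "right": 1.5, "up": -0.4}
rng = random.Random(42)

choices = [epsilon_greedy(estimates, epsilon=0.1, rng=rng) for _ in range(1000)]
print(choices.count("right") / len(choices))
```

With `epsilon=0.1`, most choices go to the best-known action, while the occasional random pick gives the agent a chance to discover that another action is better than its current estimate.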
Applications
Here are some of the ways AI researchers have applied reinforcement learning to real-world problems:
Reinforcement Learning in Self-Driving Cars
Reinforcement learning has been applied to self-driving cars to improve their ability to drive safely and efficiently. The technology enables autonomous cars to learn from their mistakes and continually adjust their behavior to improve their performance.
For example, the London-based AI company Wayve has successfully applied a deep reinforcement learning model to autonomous driving. In their experiment, they used a reward function that maximizes the amount of time the car drives without the on-board driver providing input.
RL models also help cars make decisions based on their environment, such as avoiding obstacles or merging into traffic. These models must find a way to convert the complex environment surrounding the car into a representative state space that the model can understand.
Reinforcement Learning in Robotics
Researchers have also been using reinforcement learning to build robots that can learn complex tasks. With these RL models, robots can observe their environment and make decisions based on what they observe.
For example, research has been done on using reinforcement learning models to allow bipedal robots to learn how to walk on their own.
Researchers consider RL an important technique in the field of robotics. Reinforcement learning gives robotic agents a framework for learning complex actions that would otherwise be difficult to engineer by hand.
Reinforcement Learning in Gaming
RL models have also been used to learn to play video games. Agents can be set up to learn from their mistakes and continually improve their performance in the game.
Researchers have already created agents that can play games such as chess, Go, and poker. In 2013, DeepMind used deep reinforcement learning to allow a model to learn to play Atari games from scratch.
Many board games and video games have a limited action space and a concrete, well-defined goal. These traits work to the RL model's advantage. RL methods can quickly iterate over millions of simulated games to learn optimal strategies for winning.
Conclusion
Whether it's learning to walk or learning to play video games, RL models have proven to be useful AI frameworks for solving problems that require complex decision-making.
As the technology continues to evolve, both researchers and developers will keep finding new applications that take advantage of the model's ability to teach itself.
What practical applications do you think reinforcement learning could help with?