
AI Interview Questions Series 11: How do you improve RAG?

Improving RAG is not a matter of fixing a single component; it is a process of optimizing the entire pipeline. Below is a systematic set of improvement techniques across four areas: the data/indexing side, the retrieval side, the generation side, and the evaluation side, plus some hands-on experience you can bring up in an interview.


1. Data/indexing-side improvements (raising the quality of the "knowledge base")

This area is often overlooked, yet it delivers the fastest payoff.

| Improvement area | Problem | Approach | Impact metric |
|---|---|---|---|
| Document parsing | Tables and flowcharts in PDFs get dropped, or the text comes out garbled and out of order. | Use better parsing libraries (e.g. unstructured, or pypdf in layout-preserving mode); extract tables with pandas and convert them to Markdown. | Recall rate +5~15% |
| Chunk size | Chunks that are too small lose context (e.g. "revenue growth this year", where "this year" is ambiguous); chunks that are too large add retrieval noise. | Test different chunk sizes (256/512/768 tokens) with 10~20% overlap; for long documents, split on semantic boundaries (section/heading) rather than at a fixed length. | Hit rate / faithfulness |
| Metadata enrichment | The right paragraph is retrieved, but its source or date is unknown, or domain-based filtering is needed. | Attach metadata to every chunk: source (filename/URL), timestamp, page_num, doc_type. Apply filters at query time (e.g. doc_type == 'legal'). | Filtering precision |
| Embedding model choice | General-purpose embeddings underperform in specialized domains (medical, code, legal). | Use domain-tuned models (BGE-large-zh, GTE-Qwen2-7B-instruct), or fine-tune your own embedding model (e.g. with a triplet loss). | Retrieval MRR@10 +10~20% |
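The chunking-with-overlap and per-chunk-metadata ideas above can be sketched in a few lines. This is a minimal illustration, not a production splitter: whitespace-separated words stand in for real tokenizer tokens, and the `source`/`doc_type` values are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap_ratio: float = 0.15) -> list[dict]:
    """Split text into overlapping chunks; words stand in for tokenizer tokens."""
    words = text.split()
    # A 10~20% overlap means each window starts 80~90% of a chunk later.
    step = max(1, int(chunk_size * (1 - overlap_ratio)))
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if not window:
            break
        chunks.append({
            "text": " ".join(window),
            # Metadata attached per chunk, as the table recommends (hypothetical values).
            "source": "report.pdf",
            "page_num": None,
            "doc_type": "finance",
        })
        if start + chunk_size >= len(words):
            break
    return chunks

docs = chunk_text("word " * 1200, chunk_size=512, overlap_ratio=0.15)
print(len(docs))  # → 3
```

In practice you would split on section/heading boundaries first and only fall back to fixed windows inside very long sections, as the table suggests.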

2. Retrieval-side improvements (making the "lookup" more accurate)

Retrieval determines the quality of the "reference material" handed to the LLM.

| Improvement area | Problem | Approach | Impact |
|---|---|---|---|
| Hybrid search | Vector search cannot match exact terms (e.g. product code ABC-123); keyword search cannot handle synonyms. | Run vector (semantic) search and BM25 (keyword) search in parallel, then combine them by weighted score (e.g. 0.7*vector + 0.3*BM25) or by reranking. | Recall rate +10~25% |
| Reranking | The top results from vector search are not necessarily the best ordered; the tenth result may be the most relevant. | Use a cross-encoder model (e.g. BGE-reranker-v2, Cohere Rerank) to rerank the candidates (e.g. the top 20), then keep the top-K. | Hit rate improves substantially (especially top-1) |
| Query rewriting | The user's query is vague, or in multi-turn conversations the referent is unclear ("What's its price?"). | Use an LLM to rewrite the raw query into a retrieval-friendly form (e.g. "What is the price of the iPhone 15?"), or fill in context from the conversation history. | Recall rate +5~15% |
| HyDE | The user's query is very short or abstract (e.g. "Explain photosynthesis"), so direct retrieval works poorly. | First have the LLM generate a hypothetical answer, then use that answer to retrieve documents. | Works well in open domains, but not for precise factual queries |
| Top-K tuning | A small K may miss key information; a large K increases token usage and noise. | Test K=3/5/10 and observe the trade-off between recall rate and answer faithfulness. | Cost vs. quality trade-off |
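The weighted fusion in the hybrid-search row can be sketched as follows. This is a minimal illustration under the assumption that each retriever returns a dict of document IDs to raw scores; min-max normalization puts the two incompatible score scales on common ground before the 0.7/0.3 weighting.

```python
def fuse_scores(vector_scores: dict[str, float],
                bm25_scores: dict[str, float],
                w_vec: float = 0.7, w_kw: float = 0.3) -> list[tuple[str, float]]:
    """Weighted fusion of vector and BM25 results after min-max normalization."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
        return {doc: (s - lo) / span for doc, s in scores.items()}

    v, k = normalize(vector_scores), normalize(bm25_scores)
    # A document missing from one retriever simply contributes 0 from that side.
    fused = {doc: w_vec * v.get(doc, 0.0) + w_kw * k.get(doc, 0.0)
             for doc in set(v) | set(k)}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: vector search favors doc_a, BM25 exact-matches doc_c.
vector = {"doc_a": 0.92, "doc_b": 0.85, "doc_c": 0.40}
bm25 = {"doc_c": 12.0, "doc_a": 3.0}
ranked = fuse_scores(vector, bm25)  # doc_a ranks first under 0.7/0.3 weighting
```

The alternative mentioned in the table is to skip the weighting entirely and let a cross-encoder reranker order the union of both candidate lists.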

3. Generation-side improvements (making the LLM use the reference material well)

Even when retrieval is accurate, a poor prompt or a hallucinating model makes it all worthless.

| Improvement area | Problem | Approach | Impact |
|---|---|---|---|
| Prompt engineering | The LLM ignores the retrieved context, or makes things up. | Give an explicit instruction: "Answer only based on the provided sources. If the information is insufficient or irrelevant, reply 'Not enough information'." Add few-shot examples showing how to cite a source. | Faithfulness +20~40% |
| Context compression | The retrieved content is too long (exceeds the model's context window), or is mostly noise. | Use LLMLingua or selective context compression to keep only the most relevant sentences before passing the context to the LLM. | Reduces the risk of information loss |
| LLM upgrade | Small models (7B) struggle with complex reasoning or long-context recall. | Switch to stronger models (GPT-4o, Claude 3.5 Sonnet, Qwen2.5-72B). | Reasoning accuracy improves substantially |
| Streaming with citations | Users cannot verify the reliability of the answer. | During generation, have the LLM emit [citation:1] markers tied to source numbers; afterwards, attach links to the original passages. | User trust + auditability |
| Refusal calibration | The model makes things up when it should refuse, or says it doesn't know when it should answer. | Set a similarity threshold: if the top-1 chunk's cosine similarity to the question is below 0.7, remind the LLM that "the retrieved information is not relevant". | Lowers the hallucination rate |
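The refusal-calibration row can be sketched as a guard placed in front of prompt construction. This is a minimal illustration: `build_prompt` and its wording are hypothetical, and the 0.7 cutoff is the example threshold from the table, not a universal constant.

```python
import math

REFUSAL_THRESHOLD = 0.7  # example cosine-similarity cutoff from the table above

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def build_prompt(question: str, top_chunk: str, similarity: float) -> str:
    """Prepend a low-relevance warning when the best chunk is below the threshold."""
    note = ""
    if similarity < REFUSAL_THRESHOLD:
        note = ("Note: the retrieved information may not be relevant. "
                "If it does not answer the question, reply 'Not enough information'.\n")
    return (f"{note}Source:\n{top_chunk}\n\n"
            f"Question: {question}\nAnswer only based on the source above.")
```

Calibrating the threshold against your own test set matters: too high and the model refuses answerable questions, too low and it hallucinates from irrelevant context.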

4. Evaluation and iteration (knowing where to improve)

Without measurement there is no improvement.

| Improvement area | Approach | Metrics / notes |
|---|---|---|
| Building a test set | Prepare 100~300 real user questions + reference answers + the IDs of the documents that should be retrieved. | Should cover different difficulty levels and different intents. |
| Automated evaluation | Use RAGAS (Faithfulness, Answer Relevance, Context Recall) or TruLens. | Three core metrics: faithfulness, answer relevance, context recall. |
| Human review | Each week, review 20 bad cases and classify the error type (retrieval failure / generation error / missing from the knowledge base). | Prioritize improvements by impact. |
| A/B testing | In production, test different retrieval strategies on different user cohorts (e.g. BM25 vs hybrid search). | Online metrics: user satisfaction, unresolved-query rate. |
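With a test set that pairs each question with its gold document IDs, the hit rate quoted throughout this article is straightforward to compute. A minimal sketch, assuming each case records `relevant_ids` (gold) and `retrieved_ids` (what the retriever returned, in rank order); the IDs below are hypothetical.

```python
def hit_rate_at_k(test_set: list[dict], k: int = 5) -> float:
    """Fraction of questions whose gold doc ID appears in the top-K retrieved IDs."""
    hits = sum(
        1 for case in test_set
        if set(case["relevant_ids"]) & set(case["retrieved_ids"][:k])
    )
    return hits / len(test_set)

# Hypothetical evaluation cases: gold IDs plus ranked retrieval output.
cases = [
    {"relevant_ids": ["d1"], "retrieved_ids": ["d9", "d1", "d4"]},
    {"relevant_ids": ["d2"], "retrieved_ids": ["d7", "d8", "d3"]},
    {"relevant_ids": ["d5"], "retrieved_ids": ["d5"]},
]
print(hit_rate_at_k(cases, k=3))  # 2 of the 3 questions hit
```

Running this before and after each pipeline change (as the continuous-evaluation practice below describes) makes regressions visible immediately.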

5. "Hands-on experience" you can mention in an interview (bonus points)

"On my RAG project, the hit rate was initially only 67%. I did three things:
1. Switched document chunking from a fixed 1024 tokens to structure-aware chunking (by section and heading); the hit rate rose to 74%;
2. Added hybrid search (vector + BM25) plus a small rerank model; the hit rate rose to 83%;
3. Hardened the prompt and forced the model to output '[No relevant information found]' when appropriate; the hallucination rate dropped from 22% to under 5%.

On top of that, we built a continuous evaluation pipeline: before every change we compare RAGAS scores on 200 questions to confirm there is no regression."


Overall summary: a complete RAG improvement map

Data layer ─→ Document cleaning, chunking improvements, metadata enrichment, domain embeddings
Retrieval layer ─→ Hybrid search, reranking, query rewriting, HyDE, Top-K tuning
Generation layer ─→ Prompt hardening, instruction constraints, compression, citations, refusal threshold
Evaluation layer ─→ Test set, RAGAS, human review, A/B testing
