{"id":686,"date":"2025-01-29T07:35:31","date_gmt":"2025-01-29T07:35:31","guid":{"rendered":"https:\/\/janusai.pro\/?p=686"},"modified":"2025-01-29T07:37:05","modified_gmt":"2025-01-29T07:37:05","slug":"i-distilled-deepseek-r1s-reasoning-ability-knowledge-into-qwen2-and-the-results-were-really-explosive","status":"publish","type":"post","link":"https:\/\/janusai.pro\/ro\/i-distilled-deepseek-r1s-reasoning-ability-knowledge-into-qwen2-and-the-results-were-really-explosive\/","title":{"rendered":"I distilled DeepSeek-R1's reasoning ability into Qwen2, and the results were really explosive!!!"},"content":{"rendered":"<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"%E2%85%A0_What_is_knowledge_distillation\"><\/span><strong>\u2160. <\/strong>What is knowledge distillation?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Knowledge distillation is a model compression technique that transfers knowledge from a large, complex model (the teacher model) to a small one (the student model). <\/p>\n\n\n\n<p>The core idea is that the teacher model teaches the student model through its outputs (such as probability distributions or reasoning processes), and the student model improves its performance by learning from these outputs. 
<\/p>\n\n\n\n<p>This method is especially well suited to resource-constrained devices such as mobile phones or embedded devices.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"IICore_concepts\"><\/span>\u2161. Core concepts<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"21_Template_design\"><\/span>2.1 Template design<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Template: a structured format used to standardize the model's output. For example:\n<ul class=\"wp-block-list\">\n<li>&lt;think&gt;: marks the start of the reasoning process.<\/li>\n\n\n\n<li>&lt;\/think&gt;: marks the end of the reasoning process.<\/li>\n\n\n\n<li>&lt;answer&gt;: marks the start of the final answer.<\/li>\n\n\n\n<li>&lt;\/answer&gt;: marks the end of the final answer.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Purpose:\n<ul class=\"wp-block-list\">\n<li>Clarity: like the cue words in a fill-in-the-blank question, the template tells the model \"the thinking process goes here, and the answer goes there\".<\/li>\n\n\n\n<li>Consistency: all outputs follow the same structure, which simplifies later processing and analysis.<\/li>\n\n\n\n<li>Readability: humans can easily tell the reasoning process from the answer, which improves the user experience.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"22_Reasoning_trajectory_The_%E2%80%9Cthinking_chain%E2%80%9D_of_the_models_solution\"><\/span>2.2 Reasoning trajectory: the \"chain of thought\" behind the model's solution<span 
class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reasoning trajectory: the detailed steps the model generates while solving a problem, which expose its chain of logic.<\/li>\n\n\n\n<li>Example:<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"759\" height=\"290\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/b8eff676-f9d7-436c-9ee7-1e423242825d.png\" alt=\"\" class=\"wp-image-689\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/b8eff676-f9d7-436c-9ee7-1e423242825d.png 759w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/b8eff676-f9d7-436c-9ee7-1e423242825d-300x115.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/b8eff676-f9d7-436c-9ee7-1e423242825d-18x7.png 18w\" sizes=\"auto, (max-width: 759px) 100vw, 759px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"23_Rejection_sampling_Filtering_good_data_from_%E2%80%9Ctrial_and_error\"><\/span>2.3 Rejection sampling: filtering good data out of \"trial and error\"<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Rejection sampling: generate several candidate answers and keep only the good ones, much like working a problem out on scratch paper in an exam and then copying only the correct answer onto the answer sheet.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"%E2%85%A2Generation_of_distilled_data\"><\/span>\u2162. Generating the distillation data<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The first step in knowledge distillation is to generate high-quality \"teaching data\" for the small models to learn from.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Data_sources\"><\/span><strong>Data 
sources<\/strong>:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>About 75% reasoning data generated by <a href=\"https:\/\/huggingface.co\/deepseek-ai\/DeepSeek-R1\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">DeepSeek-R1<\/a><\/li>\n\n\n\n<li>About 25% general-task data from DeepSeek-V3.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Distillation_data_generation_process\"><\/span><strong>Distillation data generation process<\/strong>:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Rule-based filtering<\/strong>: automatically checks whether the answer is correct (for example, whether a math answer agrees with the formula).<\/li>\n\n\n\n<li><strong>Readability check<\/strong>: removes outputs that mix languages (for example, Chinese and English together) or contain overly long paragraphs.<\/li>\n\n\n\n<li><strong>Template-guided generation<\/strong>: requires DeepSeek-R1 to output reasoning trajectories that follow the template.<\/li>\n\n\n\n<li><strong>Rejection sampling filtering<\/strong>:<\/li>\n\n\n\n<li><strong>Data integration<\/strong>: 800,000 high-quality samples were produced in the end, about 600,000 reasoning samples and about 200,000 general samples.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"%E2%85%A3Distillation_process\"><\/span>\u2163. 
Distillation process<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Teacher_and_student_roles\"><\/span>Teacher and student roles:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>DeepSeek-R1 as the teacher 
model;<\/li>\n\n\n\n<li>Qwen-series models as the student models.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_steps\"><\/span>Training steps:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>First, data input: feed the question part of each of the 800,000 samples into the Qwen model and have it generate a complete reasoning trajectory (thinking process + answer) that follows the template. This step is very important.<\/p>\n\n\n\n<p>Next, loss calculation: compare the output generated by the student model with the teacher model's reasoning trajectory, and align the text sequences through supervised fine-tuning (SFT). If you are not sure what SFT is, I hope you will look the term up to learn more.<\/p>\n\n\n\n<p>Finally, parameter updates: optimize the Qwen model's parameters through backpropagation so that its output approximates the teacher model's.<\/p>\n\n\n\n<p>Repeating this training process several times ensures sufficient knowledge transfer, which is the original goal of the training. The example below demonstrates this, and we hope it makes the idea clear.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"%E2%85%A4_Example_demonstration\"><\/span>\u2164. 
Example demonstration<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The article demonstrates the effect of distillation on a concrete equation-solving task:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reference output of the teacher model:<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"771\" height=\"328\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/3a53b6a8-36d2-4251-ab0f-8646d7646352.png\" alt=\"\" class=\"wp-image-690\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/3a53b6a8-36d2-4251-ab0f-8646d7646352.png 771w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/3a53b6a8-36d2-4251-ab0f-8646d7646352-300x128.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/3a53b6a8-36d2-4251-ab0f-8646d7646352-768x327.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/3a53b6a8-36d2-4251-ab0f-8646d7646352-18x8.png 18w\" sizes=\"auto, (max-width: 771px) 100vw, 771px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Qwen-7B output before distillation:<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"766\" height=\"178\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/51c44a52-01a0-474a-8d47-5483613286fb.png\" alt=\"\" class=\"wp-image-688\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/51c44a52-01a0-474a-8d47-5483613286fb.png 766w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/51c44a52-01a0-474a-8d47-5483613286fb-300x70.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/51c44a52-01a0-474a-8d47-5483613286fb-18x4.png 18w\" sizes=\"auto, (max-width: 766px) 100vw, 766px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Qwen-7B output after distillation:<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"759\" 
height=\"260\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/61c7fb80-d903-4339-971c-9613b5ac199c.png\" alt=\"\" class=\"wp-image-687\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/61c7fb80-d903-4339-971c-9613b5ac199c.png 759w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/61c7fb80-d903-4339-971c-9613b5ac199c-300x103.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/01\/61c7fb80-d903-4339-971c-9613b5ac199c-18x6.png 18w\" sizes=\"auto, (max-width: 759px) 100vw, 759px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Optimized result: a structured reasoning process is generated, and the answer matches the teacher model's.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"%E2%85%A5_Summary\"><\/span>\u2165. Summary<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Through knowledge distillation, the reasoning ability of DeepSeek-R1 is transferred efficiently to the Qwen series of small models. The process centers on templated output and rejection sampling. With structured data generation and refined training, small models can also handle complex reasoning tasks in resource-constrained scenarios. This technique is a useful reference for lightweight deployment of AI models.<\/p>","protected":false},"excerpt":{"rendered":"<p>\u2160. What is knowledge distillation? Knowledge distillation is a model compression technique that transfers knowledge from a large, complex model (the teacher model) to a small one (the student model). 
The core idea is that the teacher model teaches the student model through its outputs (such as probability distributions or reasoning processes), and...<\/p>","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-686","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/posts\/686","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/comments?post=686"}],"version-history":[{"count":2,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/posts\/686\/revisions"}],"predecessor-version":[{"id":692,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/posts\/686\/revisions\/692"}],"wp:attachment":[{"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/media?parent=686"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/categories?post=686"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/janusai.pro\/ro\/wp-json\/wp\/v2\/tags?post=686"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}