{"id":847,"date":"2025-02-04T16:27:27","date_gmt":"2025-02-04T16:27:27","guid":{"rendered":"https:\/\/janusai.pro\/?p=847"},"modified":"2025-02-04T16:27:28","modified_gmt":"2025-02-04T16:27:28","slug":"how-good-is-deepseeks-janus-pro","status":"publish","type":"post","link":"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/","title":{"rendered":"How good is DeepSeek's Janus-Pro?"},"content":{"rendered":"<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" ><\/div>\n<p>DeepSeek-R1 was released on the eve of the Spring Festival. With its pure RL approach, it draws on the major innovations of CoT and outperforms <a href=\"https:\/\/openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">ChatGPT<\/a> in mathematics, code, and logical reasoning.<\/p>\n\n\n\n<p>In addition, thanks to its open model weights, low training costs, and cheap API pricing, DeepSeek became a hit across the internet, and even caused the share prices of NVIDIA and ASML to drop sharply for a time.<\/p>\n\n\n\n<p>DeepSeek also released Janus-Pro, an updated version of its multimodal large model Janus. It inherits the previous generation's unified architecture for multimodal understanding and generation, and improves the training strategy, the scale of the training data, and the model size, which yields stronger performance.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"427\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02.png\" 
alt=\"\" class=\"wp-image-850\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-300x119.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-1024x405.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-768x304.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-18x7.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"522\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481.png\" alt=\"\" class=\"wp-image-854\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-300x145.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-1024x495.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-768x371.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-18x9.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle table of contents\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span 
class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Janus-Pro\" >Janus-Pro<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Model_architecture\" >Model architecture<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Training_strategy\" >Training strategy<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Training_data_scaling\" >Training data scaling<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" 
href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Model_scaling\" >Model scaling<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/janusai.pro\/cs\/how-good-is-deepseeks-janus-pro\/#Model_evaluation\" >Model evaluation<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Janus-Pro\"><\/span>Janus-Pro<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro<\/a> is a unified multimodal large language model (MLLM) that handles both multimodal understanding tasks and generation tasks: it can understand the content of an image and also generate images from text.<\/p>\n\n\n\n<p>It decouples the visual encoders for multimodal understanding and generation (that is, different tokenizers are used for the image-understanding input and for the image-generation input and output) and processes both with a single unified autoregressive transformer.<\/p>\n\n\n\n<p>As an advanced multimodal understanding and generation model, it is an upgraded version of the earlier Janus model.<\/p>\n\n\n\n<p>In Roman mythology, Janus is a two-faced guardian god who symbolizes duality and transition. The two faces also hint that the Janus model can both understand and generate images, which makes the name a good fit. 
So what exactly is new in Pro?<\/p>\n\n\n\n<p>Janus, a small 1.3B model, was more of a preview than an official release. It explored unified multimodal understanding and generation but had many problems, such as unstable image generation, large deviations from user instructions, and insufficient detail.<\/p>\n\n\n\n<p>The Pro version optimizes the training strategy, enlarges the training dataset, and offers a larger 7B model to choose from, alongside a 1B model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_architecture\"><\/span>Model architecture<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro and Janus<\/a> are identical in terms of model architecture. 
(At only 1.3B, Janus unifies multimodal understanding and generation.)<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"571\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006.png\" alt=\"\" class=\"wp-image-851\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-300x159.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-1024x541.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-768x406.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-18x10.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>The core design principle is to decouple visual encoding so that a single model can support both multimodal understanding and generation. Janus-Pro encodes the raw image and text inputs separately, extracts high-dimensional features from them, and processes those features with a unified autoregressive transformer.<\/p>\n\n\n\n<p>For multimodal image understanding, SigLIP encodes the image features (the blue encoder in the figure above), while the generation task uses a VQ tokenizer to discretize the image (the yellow encoder in the figure above). 
Finally, all feature sequences are fed into the LLM for processing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_strategy\"><\/span>Training strategy<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro makes further improvements to the training strategy. The original Janus used a three-stage training strategy: Stage I trains the input adaptors and the image generation head for image understanding and image generation, Stage II performs unified pretraining, and Stage III fine-tunes the understanding encoder on that basis. (The Janus training strategy is shown in the figure below.)<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"381\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a.png\" alt=\"\" class=\"wp-image-849\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-300x106.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-1024x361.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-768x271.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-18x6.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>However, this strategy follows the PixArt approach of splitting the 
text-to-image training in Stage II, which results in low computational efficiency.<\/p>\n\n\n\n<p>To address this, Janus-Pro extends the Stage I training time and adds training on ImageNet data, so that the model can effectively model pixel dependencies while the LLM parameters stay frozen. In Stage II, the ImageNet data is dropped and training uses text-image pair data directly, which improves training efficiency. In addition, the Stage III data ratio (multimodal : text-only : text-to-image data) is adjusted from 7:3:10 to 5:1:4, which improves multimodal understanding while preserving visual generation capability.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_data_scaling\"><\/span>Training data scaling<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro also scales up the training data used for Janus, for both multimodal understanding and visual generation.<\/p>\n\n\n\n<p>Multimodal understanding: the Stage II pretraining data follows DeepSeek-VL2 and adds approximately 90 million new samples, including image caption data (such as YFCC) and data for table, chart, and document understanding (such as Docmatix).<\/p>\n\n\n\n<p>The Stage III supervised fine-tuning additionally introduces meme understanding data, Chinese conversational data, and more, to improve the 
model's multi-task performance and dialogue ability.<\/p>\n\n\n\n<p>Visual generation: the previous version used real-world data of low quality and high noise, which hurt the stability and aesthetics of the generated images.<\/p>\n\n\n\n<p>Janus-Pro adds approximately 72 million synthetic aesthetic data samples, bringing the ratio of real to synthetic data to 1:1. Experiments showed that the synthetic data speeds up model convergence and markedly improves the stability and aesthetic quality of the generated images.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_scaling\"><\/span>Model scaling<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro scales the model size up to 7B, whereas the previous version of Janus used a 1.5B DeepSeek-LLM to validate the effectiveness of decoupling visual encoding. 
Experiments show that a larger LLM significantly speeds up convergence for both multimodal understanding and visual generation, which further confirms the strong scalability of the approach.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"864\" height=\"352\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b.png\" alt=\"\" class=\"wp-image-848\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b.png 864w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-300x122.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-768x313.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-18x7.png 18w\" sizes=\"auto, (max-width: 864px) 100vw, 864px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"536\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba.png\" alt=\"\" class=\"wp-image-852\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-300x149.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-1024x508.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-768x381.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-18x9.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>The experiments use DeepSeek-LLM as the base language model (1.5B and 7B, 
supporting a maximum sequence length of 4096). For the multimodal understanding task, SigLIP-Large-Patch16-384 is used as the vision encoder. For generation, the encoder has a codebook of size 16384 and downsamples images by a factor of 16. The understanding and generation adaptors are both two-layer MLPs.<\/p>\n\n\n\n<p>Stage II training uses an early-stopping strategy, halting at 270K steps. All images are resized to a uniform resolution of 384 \u00d7 384, and sequence packing is used to improve training efficiency. Janus-Pro is trained and evaluated with HAI-LLM. The 1.5B and 7B versions were trained on 16 and 32 nodes respectively (8 \u00d7 Nvidia A100 40GB per node), taking 9 and 14 days respectively.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_evaluation\"><\/span>Model evaluation<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro was evaluated separately on multimodal understanding and on generation. Overall, its understanding is somewhat weaker, though still strong among open-source models of the same size (a guess is that it is largely limited by the fixed input resolution and by its OCR capabilities).<\/p>\n\n\n\n<p>Janus-Pro-7B scored 79.2 on the MMBench benchmark, which is close to the level of top-tier open-source models (InternVL2.5 and Qwen2-VL at the same size score around 82). 
Still, this is a solid improvement over the previous generation of Janus.<\/p>\n\n\n\n<p>For image generation, the improvement over the previous generation is even more pronounced, and the results rank as excellent among open-source models. Janus-Pro's GenEval score (0.80) also exceeds that of models such as DALL-E 3 (0.67) and Stable Diffusion 3 Medium (0.74).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"827\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349.png\" alt=\"\" class=\"wp-image-853\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-300x230.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-1024x784.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-768x588.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-16x12.png 16w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"744\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55.png\" alt=\"\" class=\"wp-image-855\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-300x207.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-1024x705.png 1024w, 
https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-768x529.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-18x12.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>","protected":false},"excerpt":{"rendered":"<p>DeepSeek-R1 was released on the eve of the Spring Festival. With its pure RL approach, it draws on the major innovations of CoT and outperforms ChatGPT in mathematics, code, and logical reasoning. In addition, thanks to its open-source model weights, low training costs, and cheap API pricing, DeepSeek became a hit across the internet, even...<\/p>","protected":false},"author":2,"featured_media":704,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-847","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/posts\/847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/comments?post=847"}],"version-history":[{"co
unt":1,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/posts\/847\/revisions"}],"predecessor-version":[{"id":856,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/posts\/847\/revisions\/856"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/media\/704"}],"wp:attachment":[{"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/media?parent=847"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/categories?post=847"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/janusai.pro\/cs\/wp-json\/wp\/v2\/tags?post=847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}