{"id":847,"date":"2025-02-04T16:27:27","date_gmt":"2025-02-04T16:27:27","guid":{"rendered":"https:\/\/janusai.pro\/?p=847"},"modified":"2025-02-04T16:27:28","modified_gmt":"2025-02-04T16:27:28","slug":"how-good-is-deepseeks-janus-pro","status":"publish","type":"post","link":"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/","title":{"rendered":"How good is DeepSeek's Janus-Pro?"},"content":{"rendered":"<p>On the eve of the Spring Festival, the DeepSeek-R1 model was released. Trained with a pure reinforcement learning (RL) approach and building on the innovations of chain-of-thought (CoT) reasoning, it outperforms <a href=\"https:\/\/openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">ChatGPT<\/a> in mathematics, coding, and logical reasoning.<\/p>\n\n\n\n<p>In addition, thanks to its open-source model weights, low training costs, and cheap API prices, DeepSeek swept the internet and for a time even sent the share prices of NVIDIA and ASML plunging.<\/p>\n\n\n\n<p>Amid this surge in popularity, DeepSeek also released Janus-Pro, an upgraded version of its Janus multimodal large model. It inherits the previous generation's unified architecture for multimodal understanding and generation, and it optimizes the training strategy while scaling up the training data and the model size, delivering stronger performance.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" 
width=\"1080\" height=\"427\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02.png\" alt=\"\" class=\"wp-image-850\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-300x119.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-1024x405.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-768x304.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/56e80359-198e-4faf-981a-54b7dfe49f02-18x7.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"522\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481.png\" alt=\"\" class=\"wp-image-854\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-300x145.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-1024x495.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-768x371.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/af7da2cf-a17d-4ac3-95ba-42252fe1a481-18x9.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs 
ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Contents\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Janus-Pro\" >Janus-Pro<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Model_architecture\" >Model architecture<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Training_strategy\" >Training strategy<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Training_data_scaling\" >Training data scaling<\/a><\/li><li class='ez-toc-page-1 
ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Model_scaling\" >Model scaling<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/janusai.pro\/hu\/how-good-is-deepseeks-janus-pro\/#Model_evaluation\" >Model evaluation<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Janus-Pro\"><\/span>Janus-Pro<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro<\/a> is a unified multimodal large language model (MLLM) that handles both multimodal understanding and generation tasks: it can understand the content of an image and also generate images from text.<\/p>\n\n\n\n<p>It decouples the visual encoders for multimodal understanding and generation (that is, it uses different tokenizers for the image-understanding input and for the image-generation input and output) and processes them with a single unified autoregressive transformer.<\/p>\n\n\n\n<p>As an advanced multimodal understanding and generation model, it is an upgraded version of the earlier Janus model.<\/p>\n\n\n\n<p>In Roman mythology, Janus is a two-faced guardian god who symbolizes contradiction and transition. His two faces also suggest that the Janus model can both understand and generate images, which makes the name very fitting. 
So what exactly does the Pro version upgrade?<\/p>\n\n\n\n<p>Janus, a small 1.3B model, is more of a preview than an official release. It explores unified multimodal understanding and generation but has a number of problems, such as unstable image generation, large deviations from user instructions, and insufficient detail.<\/p>\n\n\n\n<p>The Pro version optimizes the training strategy, enlarges the training dataset, and offers a larger 7B model alongside a 1B model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_architecture\"><\/span>Model architecture<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro and Janus<\/a> are identical in terms of model architecture. (Only 1.3B! 
Janus unifies multimodal understanding and generation)<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"571\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006.png\" alt=\"\" class=\"wp-image-851\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-300x159.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-1024x541.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-768x406.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/60356ab0-3c6e-4017-9eba-7ee44e0a1006-18x10.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>The core design principle is to decouple visual encoding in order to support both multimodal understanding and generation. Janus-Pro encodes the raw image and text inputs separately, extracts high-dimensional features, and processes them with a unified autoregressive transformer.<\/p>\n\n\n\n<p>For multimodal image understanding, SigLIP encodes the image features (the blue encoder in the figure above), while for generation a VQ tokenizer discretizes the image (the yellow encoder in the figure above). 
Finally, all feature sequences are fed into the LLM for processing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_strategy\"><\/span>Training strategy<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro makes several improvements to the training strategy. The original Janus used a three-stage training strategy: Stage I trains the input adaptors and the image generation head for image understanding and image generation, Stage II performs unified pretraining, and Stage III fine-tunes the understanding encoder on that basis. (The Janus training strategy is shown in the figure below.)<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"381\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a.png\" alt=\"\" class=\"wp-image-849\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-300x106.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-1024x361.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-768x271.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/dbf6954f-1a18-4572-a452-ec995c8af71a-18x6.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>However, this strategy follows the PixArt approach of splitting the text-to-image training in 
Stage II, which results in low computational efficiency.<\/p>\n\n\n\n<p>To address this, the Stage I training time was extended and training on ImageNet data was added, so that the model can effectively model pixel dependencies with the LLM parameters frozen. In Stage II, the ImageNet data was dropped and text-image pair data was used directly for training, which improves training efficiency. In addition, the data ratio in Stage III was adjusted (multimodal : text-only : text-to-image data, from 7:3:10 to 5:1:4), improving multimodal understanding while preserving visual generation capability.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_data_scaling\"><\/span>Training data scaling<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro also scales up Janus's training data for both multimodal understanding and visual generation.<\/p>\n\n\n\n<p>Multimodal understanding: The Stage II pretraining data follows DeepSeek-VL2 and adds about 90 million new samples, including image caption data (such as YFCC) and table, chart, and document understanding data (such as Docmatix).<\/p>\n\n\n\n<p>The Stage III supervised fine-tuning stage additionally introduces meme understanding data, Chinese conversational data, and more, 
to improve the model's performance in multi-task processing and dialogue.<\/p>\n\n\n\n<p>Visual generation: The previous version used real-world data of low quality and high noise, which hurt the stability and aesthetics of text-to-image generation.<\/p>\n\n\n\n<p>Janus-Pro introduces about 72 million synthetic aesthetic data samples, bringing the ratio of real to synthetic data to 1:1. Experiments showed that synthetic data speeds up model convergence and significantly improves the stability and aesthetic quality of the generated images.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_scaling\"><\/span>Model scaling<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro scales the model up to 7B, whereas the previous version of Janus used a 1.5B DeepSeek-LLM to validate the effectiveness of decoupled visual encoding. 
Experiments show that the larger LLM significantly speeds up convergence for both multimodal understanding and visual generation, further confirming the strong scalability of the approach.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"864\" height=\"352\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b.png\" alt=\"\" class=\"wp-image-848\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b.png 864w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-300x122.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-768x313.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/a19590e2-1805-493d-85e3-09c9b8e2274b-18x7.png 18w\" sizes=\"auto, (max-width: 864px) 100vw, 864px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"536\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba.png\" alt=\"\" class=\"wp-image-852\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-300x149.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-1024x508.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-768x381.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/c78ed17c-6e07-43ef-bfda-ae287f597bba-18x9.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<p>In the experiments, DeepSeek-LLM (1.5B and 7B, with a maximum supported sequence length of 
4096) serves as the base language model. For multimodal understanding, SigLIP-Large-Patch16-384 is used as the vision encoder; the generation encoder has a codebook of size 16,384 and downsamples images by a factor of 16; and both the understanding and generation adaptors are two-layer MLPs.<\/p>\n\n\n\n<p>Stage II training uses early stopping at 270K steps, all images are resized to a uniform 384\u00d7384 resolution, and sequence packing is used to improve training efficiency. Janus-Pro is trained and evaluated with HAI-LLM. The 1.5B\/7B versions were trained for 9\/14 days on 16\/32 nodes (8\u00d7Nvidia A100 40GB per node).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_evaluation\"><\/span>Model evaluation<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Janus-Pro was evaluated separately on multimodal understanding and on generation. Overall, its understanding is perhaps slightly weak, though it still ranks among the better open-source models of the same size (my guess is that it is largely limited by the fixed input resolution and its OCR capabilities).<\/p>\n\n\n\n<p>Janus-Pro-7B scored 79.2 on the MMBench benchmark, close to the level of top-tier open-source models (InternVL2.5 and Qwen2-VL at the same size score around 82). 
Still, it is a solid improvement over the previous generation of Janus.<\/p>\n\n\n\n<p>In image generation, the improvement over the previous generation is even more significant, and the result is excellent among open-source models. Janus-Pro's score on the GenEval benchmark (0.80) also exceeds that of models such as DALL-E 3 (0.67) and Stable Diffusion 3 Medium (0.74).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"827\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349.png\" alt=\"\" class=\"wp-image-853\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-300x230.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-1024x784.png 1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-768x588.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/47aa92e1-b474-4874-956e-db210da9d349-16x12.png 16w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1080\" height=\"744\" src=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55.png\" alt=\"\" class=\"wp-image-855\" srcset=\"https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55.png 1080w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-300x207.png 300w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-1024x705.png 
1024w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-768x529.png 768w, https:\/\/janusai.pro\/wp-content\/uploads\/2025\/02\/38de369b-7f1f-4159-83a7-5f411e816d55-18x12.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/figure>","protected":false},"excerpt":{"rendered":"<p>On the eve of the Spring Festival, the DeepSeek-R1 model was released. Trained with a pure reinforcement learning (RL) approach and building on the innovations of chain-of-thought (CoT) reasoning, it outperforms ChatGPT in mathematics, coding, and logical reasoning. In addition, thanks to its open-source model weights, low training costs, and cheap API prices, DeepSeek swept the internet and even...<\/p>","protected":false},"author":2,"featured_media":704,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-847","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/posts\/847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/comments?post=847"}],"version-history":[{"count":1,"href":
"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/posts\/847\/revisions"}],"predecessor-version":[{"id":856,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/posts\/847\/revisions\/856"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/media\/704"}],"wp:attachment":[{"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/media?parent=847"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/categories?post=847"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/janusai.pro\/hu\/wp-json\/wp\/v2\/tags?post=847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}