{"id":746,"date":"2025-01-30T13:05:33","date_gmt":"2025-01-30T13:05:33","guid":{"rendered":"https:\/\/janusai.pro\/?p=746"},"modified":"2025-01-30T13:05:35","modified_gmt":"2025-01-30T13:05:35","slug":"the-complete-explanation-from-deepseek-janus-to-janus-pro","status":"publish","type":"post","link":"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/","title":{"rendered":"The complete explanation: from DeepSeek Janus to Janus-Pro!"},"content":{"rendered":"<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" ><\/div>\n<p>Take-home message: Janus is a simple, unified, and extensible model for multimodal understanding and generation that decouples the visual encoding used for understanding from the visual encoding used for generation, which reduces potential conflicts between the two tasks. It can be extended to additional input modalities in the future. Janus-Pro builds on this foundation by optimizing the training strategy (including more training steps and adjusted data ratios), adding more data (including synthetic data), and scaling the model up (to 7 billion parameters), which improves the model's multimodal understanding and its text-to-image instruction-following ability.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/rmy9ct2fln.feishu.cn\/space\/api\/box\/stream\/download\/asynccode\/?code=Mjg4MjEwYjVlNzk0YTgyMTc0NDJlODQ4MTU2ZmRjYTVfWnhaaVEyZlEwUHFrUHNUeGNCOWpCRU1EVDN0QktBMUxfVG9rZW46SkVQZmJmSEhqb1g4YTJ4MVNYdmNPT2oybmVmXzE3MzgyNDIwMzc6MTczODI0NTYzN19WNA\" alt=\"\"\/><\/figure>\n\n\n\n<p><a href=\"https:\/\/github.com\/deepseek-ai\/JanusJanus\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Code repository<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/deepseek-ai\/Janus\/blob\/main\/janus_pro_tech_report.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro technical report<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro<\/a> is an enhanced version of the earlier work Janus, with (1) an optimized training strategy, (2) expanded training data, and (3) larger model sizes. With these improvements, Janus-Pro makes significant advances in multimodal understanding and text-to-image instruction following, while also improving the stability of text-to-image generation. 
Before unpacking Janus-Pro, let's review Janus.<\/p>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Contents\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Reviewing_Janus\" >Reviewing Janus<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Janus_training_is_divided_into_3_phases\" >Janus training is divided into 3 phases:<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Phase_1\" >Phase 1<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Phase_2\" >Phase 2<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Phase_3\" >Phase 3<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Training_Objectives\" >Training Objectives<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Reasoning\" >Inference<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Possible_extensions\" >Possible extensions<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Janus-Pro_Upgrade\" >Janus-Pro Upgrade<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Main_Improvements\" >Main Improvements<\/a><ul class='ez-toc-list-level-4' ><li class='ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Training_Strategy\" >Training Strategy<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-12\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Data_Scale\" >Data Scale<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-13\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Model_Scale\" >Model Scale<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-14\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Experimental_details\" >Experimental details<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-15\" href=\"https:\/\/janusai.pro\/tr\/the-complete-explanation-from-deepseek-janus-to-janus-pro\/#Insufficient\" >Limitations<\/a><\/li><\/ul><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Reviewing_Janus\"><\/span>Reviewing Janus<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The predecessor, Janus, is an autoregressive framework for unified multimodal understanding and generation that decouples visual encoding between the two tasks. 
For multimodal understanding, designs typically follow LLaVA and use a visual encoder as a bridge that lets a large language model understand images. For generation, designs are usually based on diffusion models, and some on autoregressive methods. Some approaches try to unify multimodal understanding and generation with a single Transformer, which typically uses a single visual encoder to process the inputs of both tasks.<\/p>\n\n\n\n<p>However, the representations required for multimodal understanding and for generation differ. In multimodal understanding, the visual encoder aims to extract high-level semantic information (for example, object categories or visual attributes); the output involves not only extracting information from the image but also complex semantic reasoning, so the encoder focuses mainly on high-dimensional semantic representations. The generation task is mainly concerned with producing local details and keeping the image globally consistent, so it requires low-dimensional encoded representations of spatial structure and texture detail. 
Unifying the representations of both tasks in the same space can lead to conflicts.<\/p>\n\n\n\n<p>Janus contains 2 independent visual encoding pathways, one for multimodal understanding and one for generation, which brings two benefits: 1) it mitigates the conflicts that arise from the different levels of granularity that understanding and generation require, and 2) it is flexible and scalable: because the pathways are decoupled, understanding and generation can each adopt the latest encoding techniques in their own domains, and in the future inputs such as point clouds, EEG signals, or audio data could be fed in and processed by a unified Transformer.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/rmy9ct2fln.feishu.cn\/space\/api\/box\/stream\/download\/asynccode\/?code=OTE3ZjkyNWQ5MmUwNDQzM2VjN2VlNWYwZjAxYTVmZGRfMXpJMWVObDBKOHYxTVJqeEw2S0pHT2hGU3RuVHdnWVdfVG9rZW46UDQyQ2Jrb0Myb1h0bjR4TFBrV2NRS29GbkRmXzE3MzgyNDIwMzc6MTczODI0NTYzN19WNA\" alt=\"\"\/><\/figure>\n\n\n\n<p>For text understanding, text is converted into discrete IDs with the LLM's built-in tokenizer;<\/p>\n\n\n\n<p>For multimodal understanding, high-dimensional semantic features are extracted from images with the SigLIP encoder (author's note: Cosmos also uses the SigLIP encoder in its Guardrails section), and the extracted features are mapped into the LLM's text feature space by an adaptor (a 2-layer MLP);<\/p>\n\n\n\n<p>The long side is resized to 384 pixels and the short side is padded to 384 pixels with RGB(127, 127, 127);<\/p>\n\n\n\n<p>For visual generation, the image is converted into discrete IDs with a VQ tokenizer, and each ID is mapped into the LLM's text feature space by an adaptor (a 2-layer MLP);<\/p>\n\n\n\n<p>The short side is resized to 384 pixels and the long side is cropped to 384 pixels;<\/p>\n\n\n\n<p>Overall training was carried out on 16 nodes, each with 8 Nvidia A100 GPUs;<\/p>\n\n\n\n<p>For both visual generation and multimodal understanding, the image feature sequence and the text feature sequence are concatenated as the input to the LLM (the paper uses DeepSeek-LLM 1.3B);<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>The LLM's built-in prediction head is used for text prediction in both pure text understanding and multimodal understanding, while a randomly initialized prediction head is used for image prediction in visual generation. 
The whole model follows an autoregressive framework, with no need for specially designed attention masks.<\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Janus_training_is_divided_into_3_phases\"><\/span><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus training<\/a> is divided into 3 phases:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Phase_1\"><\/span>Phase 1<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p><strong>Train the adaptors and the image head<\/strong> to create connections between linguistic and visual elements in the embedding space, so that the LLM can understand entities in images and gains an initial visual generation ability;<\/p>\n\n\n\n<p>For multimodal understanding, 1.25 million paired image-text caption samples from ShareGPT4V are used, in the format: ;<\/p>\n\n\n\n<p>For visual generation, 1.2 million samples from ImageNet-1k are used, in the format: ;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Phase_2\"><\/span>Phase 2<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p><strong>Unified pretraining<\/strong> uses a multimodal corpus so that the model learns both multimodal understanding and generation. Plain text data, multimodal understanding data, and visual generation data are all used in this phase. 
Simple visual generation training on ImageNet-1k comes first, followed by general text-to-image data to improve the model's open-domain visual generation;<\/p>\n\n\n\n<p>Plain text data: the DeepSeek-LLM pretraining corpus;<\/p>\n\n\n\n<p>Interleaved image-text data: the WikiHow and WIT datasets;<\/p>\n\n\n\n<p>Image caption data: images from multiple sources, with some images re-captioned using open-source multimodal models; the data is formatted as question-answer pairs, for example, Describe the image in detail.;<\/p>\n\n\n\n<p>Table and chart data: the corresponding table and chart data from DeepSeek-VL, in the format ;<\/p>\n\n\n\n<p>Visual generation data: image-caption pairs from multiple datasets, plus 2 million in-house samples;<\/p>\n\n\n\n<p>During training, only the first sentence of the caption is used, at random, with 25% probability;<\/p>\n\n\n\n<p>ImageNet samples appear only in the first 120K training steps; images from the other datasets appear in the following 60K steps;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Phase_3\"><\/span>Phase 3<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p><strong>Supervised fine-tuning<\/strong>, in which the pretrained model is fine-tuned on instruction-tuning data to strengthen its instruction-following and dialogue abilities. 
All parameters are fine-tuned except the generation encoder. System and user prompts are masked so that only the answers are supervised. To make sure Janus is proficient in both multimodal understanding and generation, the model is not fine-tuned separately for specific tasks. Instead, we use a mixture of text-only dialogue data, multimodal understanding data, and visual generation data to ensure versatility across scenarios;<\/p>\n\n\n\n<p>Text understanding: uses data from specific sources;<\/p>\n\n\n\n<p>Multimodal understanding: uses data from multiple sources for instruction tuning;<\/p>\n\n\n\n<p>Visual generation: uses a subset of the image-text pairs from some of the Phase 2 datasets, plus 4 million in-house samples;<\/p>\n\n\n\n<p>The data format is: User: \\n Assistant: ;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/rmy9ct2fln.feishu.cn\/space\/api\/box\/stream\/download\/asynccode\/?code=M2I3MWQ5MjQyNTM5NjIyZTkyMjdlODgwMDg5NzIwYzJfSGVTUnVzb0I3bEREQXBkMEJGN0lqT0JBaEVUWEQwS05fVG9rZW46Vm9OMWJzYnNsbzRGR1R4YlJrNWNad1psblhjXzE3MzgyNDIwMzc6MTczODI0NTYzN19WNA\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_Objectives\"><\/span>Training Objectives<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Janus is an autoregressive model trained with the cross-entropy loss. For plain text understanding and multimodal understanding tasks, the loss is computed on the text sequence. 
For visual generation tasks, the loss is computed only on the image sequence. To keep the design simple, no task-specific loss weights are assigned.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Reasoning\"><\/span>Inference<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Plain text understanding and multimodal understanding use next-token prediction: tokens are sampled sequentially from the predicted distribution. Image generation uses classifier-free guidance.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Possible_extensions\"><\/span>Possible extensions<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>For multimodal understanding, 1) a stronger visual encoder could be chosen, and 2) dynamic high-resolution techniques could be used;<\/p>\n\n\n\n<p>For visual generation, 1) finer-grained encoders could be chosen, 2) loss functions designed specifically for visual generation could be used, and 3) causal attention could be combined with parallel methods;<\/p>\n\n\n\n<p>More modalities: the ability to integrate 3D point clouds, haptics, EEG, and other input modalities;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Janus-Pro_Upgrade\"><\/span><a href=\"https:\/\/huggingface.co\/deepseek-ai\/Janus-Pro-7B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Janus-Pro Upgrade<\/a><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>With limited training data and a relatively small model capacity (1B), Janus falls short in some respects, such as weak image generation from short prompts and inconsistent text-to-image quality. The architecture of Janus-Pro is the same as that of Janus, as the figure below shows:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/rmy9ct2fln.feishu.cn\/space\/api\/box\/stream\/download\/asynccode\/?code=NDY0ZWM0NTJiOTNlYTE4MWI4NmMwNGE4Mjc3NmYyMDJfc1FEMHVOMHo1OUM0ZVhoakJtU1lZQXdZNTd4NVFXRzhfVG9rZW46RjJrTGI3VVlqb0IxS3N4aHVVN2NxUWxJbnZkXzE3MzgyNDIwMzc6MTczODI0NTYzN19WNA\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Main_Improvements\"><\/span>Main Improvements<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Training_Strategy\"><\/span>Training Strategy<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Phase 1: increase the number of training steps and train fully on ImageNet;<\/p>\n\n\n\n<p>Phase 2: stop using ImageNet and train directly on regular text-to-image data;<\/p>\n\n\n\n<p>Phase 3: change the dataset ratios used in fine-tuning, shifting the ratio of multimodal data, plain text data, and text-to-image data from 7:3:10 to 5:1:4;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Data_Scale\"><\/span>Data Scale<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Multimodal understanding<\/p>\n\n\n\n<p>Phase 2: add 90 million samples, including YFCC for image captioning and Docmatix for table, chart, and document understanding;<\/p>\n\n\n\n<p>Phase 3: add further datasets from DeepSeek-VL2, such as MEME understanding;<\/p>\n\n\n\n<p>Visual generation: real-world data can be of low quality, which leads to unstable text-to-image generation and poor aesthetic output. Janus-Pro uses 72 million synthetic aesthetic samples, so that real and synthetic data are mixed at a 1:1 ratio in the unified pretraining phase (Phase 2);<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Model_Scale\"><\/span>Model Scale<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Scale the model up to 7 billion parameters;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Experimental_details\"><\/span>Experimental details<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Compared with Janus, the experimental details of Janus-Pro are essentially the same. 
The difference is that the larger model was trained on more cluster nodes (32 instead of 16).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/rmy9ct2fln.feishu.cn\/space\/api\/box\/stream\/download\/asynccode\/?code=NDM1YTM1ZDliNDUwYzAzNzg4MTNiNjUzYWZlZjVhZjhfZGI5ZWloREhYV29OZUxiaEVFc0dhN1dMTDhGdG5ZSnNfVG9rZW46STA0amJtbVlhb0NySk94NkRKNmNqNDVybmdiXzE3MzgyNDIwMzc6MTczODI0NTYzN19WNA\" alt=\"\"\/><\/figure>\n\n\n\n<p>Janus-Pro training hyperparameters<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Insufficient\"><\/span>Limitations<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>For multimodal understanding, the input resolution is limited to 384\u00d7384, which hurts performance on fine-grained visual tasks. For text-to-image generation, the low resolution leads to a lack of detail in the generated results.<\/p>","protected":false},"excerpt":{"rendered":"<p>Take-home message: Janus is a simple, unified, and extensible model for multimodal understanding and generation that decouples the visual encoding used for understanding from the visual encoding used for generation, which reduces potential conflicts between the two tasks. It can be extended to additional input modalities in the future. 
Janus-Pro builds on this foundation by optimizing the training strategy (including longer training).<\/p>","protected":false},"author":2,"featured_media":684,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-746","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/posts\/746","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/comments?post=746"}],"version-history":[{"count":1,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/posts\/746\/revisions"}],"predecessor-version":[{"id":747,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/posts\/746\/revisions\/747"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/media\/684"}],"wp:attachment":[{"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/media?parent=746"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/categories?post=746"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/janusai.pro\/tr\/wp-json\/wp\/v2\/tags?post=746"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated"
:true}]}}