When an LLM is given a prompt such as "226-68=", it answers "158". This post shows that the model arrives at that answer in a far stranger way than we might imagine [Nikankin+ ICLR 2025].
First, the setup. We do not use chain of thought; we consider the case where the model outputs the answer directly, for example "158" for the prompt "226-68=".
As a running example, take Llama3-8B. The Llama3 tokenizer assigns a single token to each number from 0 to 1000, so when "226-68=" is given as input, the next token "158" is chosen as the most probable one among tokens such as "0", "1", ..., "157", "158", "159", ..., "1000".
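The finding of Yaniv Nikankin and colleagues [Nikankin+ ICLR 2025] is that Llama3-8B solves arithmetic like this by evaluating a large number of coarse conditions on the answer and the inputs, and then piling up the results.
For example, when the prompt template "{op1} - {op2}" is given as input (with {op1} and {op2} filled in by concrete numbers), the following has been observed:
- Neuron 12439 in layer 24 fires when the result of {op1} - {op2} lies between 150 and 180 → when it fires, the output probabilities of the tokens "150", "151", "152", ..., "179", "180" increase
- Neuron 1582 in layer 30 fires when the result of {op1} - {op2} is 8 mod 10 → when it fires, the output probabilities of the tokens "8", "18", "28", ..., "998" increase
and so on.
For example, when "226-68=" is given as input, the result 158 lies between 150 and 180, so neuron 12439 in layer 24 fires and the output probabilities of the tokens "150", "151", "152", ..., "179", "180" go up. The result 158 is also 8 mod 10, so neuron 1582 in layer 30 fires and the output probabilities of the tokens "8", "18", "28", ..., "998" go up.
At this point the probabilities of tokens such as "150" and "998" increase as well. But those tokens are boosted only a handful of times, whereas the true answer "158" is boosted every time, so once the contributions of all neurons are accumulated, the token "158" stands out with the highest probability.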
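The accumulation described above can be sketched with a toy model. This is only an illustration, not the paper's code: the set of range windows and moduli below is hypothetical, each heuristic adds a boost of exactly 1, and the vocabulary is simply the integers 0 to 1000.

```python
import numpy as np

VOCAB = np.arange(0, 1001)  # one token per integer 0..1000, as in the text


def range_boost(lo, hi):
    """Boost of 1 for every token whose value lies in [lo, hi]."""
    return ((VOCAB >= lo) & (VOCAB <= hi)).astype(float)


def mod_boost(n, m):
    """Boost of 1 for every token whose value is congruent to m modulo n."""
    return (VOCAB % n == m).astype(float)


def accumulate_logits(answer):
    """Sum the boosts of every toy heuristic whose condition the answer satisfies."""
    logits = np.zeros(len(VOCAB))
    # Hypothetical range neurons: windows [lo, lo + 30] starting every 10.
    for lo in range(0, 1000, 10):
        if lo <= answer <= lo + 30:
            logits += range_boost(lo, lo + 30)
    # Hypothetical modular neurons for a few small moduli.
    for n in (2, 5, 10, 100):
        logits += mod_boost(n, answer % n)
    return logits


logits = accumulate_logits(226 - 68)
best = int(VOCAB[np.argmax(logits)])
print("argmax token:", best)
print("boost for 158, 150, 998:", logits[158], logits[150], logits[998])
```

No single heuristic in this sketch pins down the answer, yet tokens like "150" and "998" only collect part of the boosts while "158" collects all of them.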

No individual neuron solves the arithmetic exactly; each one merely evaluates a coarse condition. But countless coarse conditions pile up, and the true answer emerges in relief. The authors call this mechanism a bag of heuristics.
In general, a neuron that fires only when {op1}, {op2}, or the result fits a particular pattern is called a heuristic neuron. The following types of heuristic neurons are considered:
- Range heuristic: the value lies in the range [a, b].
- Modulo heuristic: value mod n = m holds.
- Pattern heuristic: the value matches a specific regular expression, such as 1.2.
- Identical-operand heuristic: {op1} = {op2} holds.
- Multi-result heuristic (used only for division): the value belongs to a set S, where S is a set of 2 to 4 elements.
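As a rough illustration, the five heuristic types listed above can be written as simple predicates. The function names and signatures here are hypothetical and chosen only for this sketch; they are not taken from the paper's code.

```python
import re


def range_heuristic(value, a, b):
    """True when the value lies in the closed range [a, b]."""
    return a <= value <= b


def modulo_heuristic(value, n, m):
    """True when value mod n equals m."""
    return value % n == m


def pattern_heuristic(value, regex):
    """True when the decimal string of the value fully matches the regex."""
    return re.fullmatch(regex, str(value)) is not None


def identical_operands_heuristic(op1, op2):
    """True when both operands are the same number."""
    return op1 == op2


def multi_result_heuristic(value, s):
    """True when the value belongs to the small set s (2 to 4 elements)."""
    return value in s


result = 226 - 68
print(range_heuristic(result, 150, 180))   # range condition from the text
print(modulo_heuristic(result, 10, 8))     # modulo condition from the text
print(pattern_heuristic(152, r"1.2"))      # "1", any character, "2"
```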
To check whether a given neuron is a heuristic neuron, feed in many different prompts of the form "{op1} - {op2}", record the cases in which the neuron fires strongly, and measure how well those cases agree with each candidate pattern.
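A minimal sketch of this matching procedure follows. Everything in it is an assumption made for illustration: the activations are synthetic rather than recorded from a model, the score is an intersection-over-union between the prompts where the neuron fires and the prompts that satisfy a candidate pattern, and the firing threshold (half of the maximum activation) is arbitrary. The paper's actual scoring rule may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Every prompt "{op1} - {op2}" with operands in 0..299, arranged as a grid.
op1, op2 = np.meshgrid(np.arange(300), np.arange(300), indexing="ij")
result = op1 - op2

# Synthetic stand-in for one neuron's recorded activations:
# strong when the result is in [150, 180], plus small noise.
activations = np.where((result >= 150) & (result <= 180), 1.0, 0.0)
activations = activations + rng.normal(0.0, 0.05, size=result.shape)


def match_score(activations, pattern_mask, rel_threshold=0.5):
    """Intersection-over-union of 'neuron fires' and 'pattern holds'."""
    fires = activations > rel_threshold * activations.max()
    union = np.logical_or(fires, pattern_mask).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(fires, pattern_mask).sum() / union)


# Candidate patterns, each expressed as a boolean mask over the prompt grid.
candidates = {
    "result in [150, 180]": (result >= 150) & (result <= 180),
    "result mod 10 == 8": result % 10 == 8,
    "op1 == op2": op1 == op2,
}

scores = {name: match_score(activations, mask) for name, mask in candidates.items()}
best_pattern = max(scores, key=scores.get)
print(best_pattern, scores)
```

With a real model, `activations` would be replaced by the neuron's recorded activation for each prompt; the rest of the sketch stays the same.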

The contribution of each neuron to the output can be computed with a method called the logit lens [nostalgebraist 2020].

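A transformer stacks attention mechanisms and multilayer perceptrons (MLPs) through residual connections. In other words, the input $x$ to the transformer's final linear layer is the sum $x = \sum_l (a_l + m_l)$ of the attention output $a_l$ and the MLP output $m_l$ of every layer $l$, and the final linear layer multiplies this by a matrix $W \in \mathbb{R}^{|V| \times d}$, where $|V|$ is the vocabulary size, to compute the logit of each token. In the usual computation, all layers are summed first and the token probabilities are computed afterwards. Viewed differently, however, each layer's output $h$ can be interpreted as pushing the token logits up by $Wh$ at that point. In particular, the $i$-th hidden neuron of a two-layer MLP is connected to the $i$-th column $v_i$ of the second-layer parameter matrix, so when this neuron fires with activation $z_i$, the token logits are pushed up by $z_i W v_i$. This tells us concretely how much each neuron contributes to the output.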
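The decomposition above can be checked numerically on a toy example. This is a sketch under simplifying assumptions: the weights are random, the dimensions are made up, and the normalization layer that a real model such as Llama3 applies before the final linear layer is ignored. The names `W_out`, `W_U`, and `acts` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_mlp, vocab = 16, 64, 1001

# Second-layer MLP matrix: column i is the output direction of hidden neuron i.
W_out = rng.normal(size=(d_model, d_mlp))
# Final linear layer mapping the residual stream to one logit per token.
W_U = rng.normal(size=(vocab, d_model))
# Hypothetical hidden activations after the nonlinearity (non-negative here).
acts = np.maximum(rng.normal(size=d_mlp), 0.0)


def neuron_logit_contribution(i):
    """Logit change for every token caused by hidden neuron i alone."""
    return acts[i] * (W_U @ W_out[:, i])


# Usual order: add up the MLP output first, then project to logits.
total_logits = W_U @ (W_out @ acts)

# Logit-lens order: project each neuron separately, then add up.
summed = np.sum([neuron_logit_contribution(i) for i in range(d_mlp)], axis=0)

print(np.allclose(total_logits, summed))

# Tokens that neuron 0 promotes the most in this toy setup.
top_tokens = np.argsort(neuron_logit_contribution(0))[-5:][::-1]
print(top_tokens)
```

Because the projection is linear, the two orders give the same logits, which is what allows a per-neuron reading of the output.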
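This framework also lets us analyze why an LLM gets a calculation wrong. Llama3-8B occasionally makes arithmetic mistakes. Comparing the neuron firing patterns on correctly answered prompts with those on incorrectly answered prompts shows that, when the model is wrong, the heuristic neurons push up the logit of the correct token by a smaller amount.

As a result, the correct token does not stand out clearly enough, and the model ends up with the wrong answer.
Arithmetic tends to be regarded as a trivial task, but even on a task this simple, LLMs solve it in ways we would never expect, and looking into it reveals a lot.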
Conclusion
The paper analyzes arithmetic as a representative reasoning task, but LLMs may well reason in a similar way on reasoning tasks in general.
Even when your conversation with ChatGPT appears to reach a convincing conclusion, behind the scenes the AI may be producing that conclusion in a creepy way like this.
I hope this post gives you an occasion to think about the reasoning abilities of LLMs.
About the author
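New articles and slides are announced at @joisino_ (Twitter). Please follow along.
Ryoma Sato (佐藤 竜馬)
Completed the doctoral program at the Graduate School of Informatics, Kyoto University. Ph.D. in Informatics. Currently an assistant professor at the National Institute of Informatics. Author of the books 『深層ニューラルネットワークの高速化』 (Speeding Up Deep Neural Networks), 『グラフニューラルネットワーク』 (Graph Neural Networks), and 『最適輸送の理論とアルゴリズム』 (Theory and Algorithms of Optimal Transport).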