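Colossal-AI speeds up training of a 70-billion-parameter large model by 195%, which should substantially lower the cost of developing and applying large models. According to the HPC-AI Technology website, Colossal-AI, described there as the world's largest and most active large-model development toolkit and community, has released a new iteration. It provides out-of-the-box LLaMA2 training, fine-tuning, and inference solutions for 8 to 512 GPUs, accelerates 70-billion-parameter training by 195%, and offers a one-stop cloud platform solution. When training or fine-tuning LLaMA2-7B on 8 GPUs, Colossal-AI reaches about 54% hardware utilization (MFU, model FLOPs utilization), an industry-leading level. For pre-training tasks, Colossal-AI still maintains good performance thanks to its system optimizations and scalability, with a 195% training speedup. The high performance of the Colossal-AI LLaMA-2 training and fine-tuning solution comes from system optimizations such as the new heterogeneous memory management system Gemini and high-performance operators (including Flash attention 2). The new Gemini provides interfaces that are highly scalable, robust, and easy to use. Meanwhile, Colossal-AI's ShardFormer provides out-of-the-box parallelism and operator optimization that can be enabled with only a few lines of code and performs well on both a single machine and large-scale clusters. In addition, to further improve development and deployment efficiency, the Colossal-AI team has combined these system advantages with computing power to offer the Colossal-AI cloud platform, which provides low-cost compute and out-of-the-box mainstream AI applications. Users only need to upload the relevant data to train a personalized private model without writing code and then deploy the trained model with one click. We believe this iteration of Colossal-AI substantially lowers the cost of model training and deployment and is likely to be an important catalyst for the accelerated development of the large-model industry.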
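To make the MFU figure above more concrete, the following is a minimal Python sketch of how such a utilization number is commonly estimated. It rests on assumptions that are not in the source: the widely used rule of thumb of roughly 6 FLOPs per parameter per trained token, a peak of 312 TFLOPS per GPU (the published dense BF16 peak of an NVIDIA A100, since the note does not name the GPU), and a hypothetical throughput of 32,000 tokens per second chosen only for illustration. The function name is also hypothetical.

```python
def model_flops_utilization(n_params, tokens_per_second, n_gpus, peak_flops_per_gpu):
    """Estimate MFU as achieved training FLOPs/s divided by aggregate peak FLOPs/s.

    Assumes ~6 FLOPs per parameter per token (forward + backward pass),
    which ignores attention FLOPs and is only an approximation.
    """
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    peak_flops_per_second = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Illustrative inputs only: the throughput and GPU peak are assumptions, not reported data.
mfu = model_flops_utilization(
    n_params=7e9,                # LLaMA2-7B
    tokens_per_second=32_000,    # hypothetical cluster-wide throughput
    n_gpus=8,
    peak_flops_per_gpu=312e12,   # assumed A100 dense BF16 peak
)
print(f"Estimated MFU: {mfu:.1%}")
```

Under these assumed inputs the estimate lands close to the roughly 54% level quoted above; with a different GPU or measured throughput the result would change accordingly.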
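AI may be able to replace humans in the feedback loop of reinforcement learning, and large-model training speed could rise sharply as a result. Reinforcement learning from human feedback (RLHF) is effective at aligning large language models (LLMs) with human preferences, but collecting high-quality human preference labels is a key bottleneck. Bai et al. explored using AI to train the reward model used for reinforcement learning fine-tuning, but their work did not directly compare the effectiveness of human and AI feedback, so the question of whether reinforcement learning from AI feedback (RLAIF) can be a suitable substitute for RLHF remained open. Recently, Google Research published the paper "RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback", which directly compares human and AI feedback and finds that RLAIF can reach human-level performance, offering a potential solution to the scalability limits of RLHF. In this study, given a piece of text and two candidate responses, the researchers use an off-the-shelf LLM to assign a preference label. They then train a reward model (RM) on these LLM preferences with a contrastive loss. Finally, they use this RM to provide rewards and fine-tune a policy model with reinforcement learning. The researchers then evaluate AI and human feedback with three metrics: AI labeler alignment, pairwise accuracy, and win rate. The RLAIF and RLHF policies are preferred by humans over the supervised fine-tuning (SFT) baseline 71% and 73% of the time, respectively, and the two win rates are not statistically significantly different. When asked to compare RLAIF and RLHF outputs directly, humans prefer the two at roughly equal rates (that is, a 50% win rate). We believe these results show that RLAIF does not depend on human annotation and scales well, so it has the potential to replace RLHF. If AI can stand in for humans in reinforcement learning, the training speed of large models could rise sharply in the future.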
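The reward-model step and the win-rate metric described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the loss shown is one common pairwise form, the negative log-sigmoid of the reward margin, and the paper's exact contrastive loss may differ. The function names and the sample judgments are hypothetical.

```python
import math


def pairwise_preference_loss(reward_preferred, reward_rejected):
    """Return -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the reward model scores the response that the
    labeler (human or AI) preferred higher than the rejected one.
    """
    margin = reward_preferred - reward_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def win_rate(judgments):
    """Fraction of head-to-head comparisons in which the evaluated policy won."""
    return sum(1 for won in judgments if won) / len(judgments)


# A reward model that agrees with the label is penalized less than one that disagrees.
print(pairwise_preference_loss(2.0, 0.0) < pairwise_preference_loss(0.0, 2.0))

# Hypothetical judgments: 71 wins out of 100 comparisons against a baseline.
print(win_rate([True] * 71 + [False] * 29))
```

In this framing, RLAIF and RLHF differ only in where the preference labels come from. The loss and the evaluation stay the same, which is what makes the 71% versus 73% comparison meaningful.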
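Large models have been shown to be able, in principle, to teach robots to respond to vague instructions, and the development of embodied intelligence may continue to deepen. Recently, research teams from Google DeepMind and the University of Tokyo jointly published the paper "SayTap: Language to Quadrupedal Locomotion". The interactive system it proposes (SayTap) uses a large language model to translate natural-language instructions into low-level control signals for a quadruped robot, and these instructions can be quite vague. Large language models (LLMs) have advanced rapidly and have already shown potential for high-level planning. However, LLMs still struggle to understand low-level commands such as joint angle targets or motor torques, especially for legged robots, which are inherently unstable and require high-frequency control signals. As a result, most existing work assumes that the LLM has been given a high-level API that determines the robot's behavior, which fundamentally limits the expressiveness of the system.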
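In this study, the researchers propose using foot contact patterns as an interface that connects human commands in natural language with a locomotion controller that outputs the low-level commands. This yields an interactive system for quadruped robots (SayTap) that lets users flexibly craft a variety of locomotion behaviors. The research team contributes an LLM prompt design, a reward function, and a method for exposing the controller to the feasible distribution of contact patterns.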
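To illustrate what a foot contact pattern could look like as an interface, here is a minimal Python sketch that builds a binary contact schedule for four feet. The encoding is an assumption made for illustration only: the foot labels, the phase-offset and duty-factor parameters, and the function name are hypothetical and are not taken from the paper.

```python
FEET = ("FL", "FR", "RL", "RR")  # front-left, front-right, rear-left, rear-right


def contact_pattern(phase_offsets, duty_factor, steps_per_cycle, n_steps):
    """Return {foot: [0 or 1, ...]}, where 1 means the foot is on the ground.

    Each foot follows the same periodic cycle, shifted by its phase offset,
    and is in contact for the first `duty_factor` fraction of the cycle.
    """
    pattern = {}
    for foot in FEET:
        row = []
        for t in range(n_steps):
            phase = (t / steps_per_cycle + phase_offsets[foot]) % 1.0
            row.append(1 if phase < duty_factor else 0)
        pattern[foot] = row
    return pattern


# A trot-like gait: diagonal foot pairs touch down together.
TROT_OFFSETS = {"FL": 0.0, "FR": 0.5, "RL": 0.5, "RR": 0.0}
trot = contact_pattern(TROT_OFFSETS, duty_factor=0.5, steps_per_cycle=8, n_steps=8)

for foot in FEET:
    print(foot, "".join(str(bit) for bit in trot[foot]))
```

A representation like this is compact enough for a language model to emit as text and structured enough for a low-level controller to track, which is the bridging role the paragraph above assigns to contact patterns.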
×îÖյĽá¹ûÏÔʾ£¬ËùÌá³öµÄ·½·¨Ê¹ËÄ×ã»úÆ÷È˼ÈÄܹ»×ñÑÖ±½ÓºÍ¾«È·µÄÃüÁͬʱ»¹ÄÜ×ñÑ×ÔÈ»ÓïÑÔÖеķǽṹ»¯ºÍÄ£ºýÖ¸Á´Ó¶ø´Ù½øÈË»ú½»»¥¡£ÀýÈ磬µ±Ñо¿Õ߸ø³ö¡°ºÃÏûÏ¢£¬ÎÒÃÇҪȥҰ²ÍÁË£¡¡±µÄÖ¸Áîʱ£¬»úÆ÷È˱íÏÖ³öÉÏ´ÚÏÂÌøµÄÐÐΪ¡£µ±Ñо¿Õ߸ø³ö¡°±íÏֵõØÃæºÃÏñºÜÈȵÄÑù×Ó¡±µÄÃüÁîʱ£¬»úÆ÷ÈË¿ìËÙµØÒƶ¯£¬½Å¼¸ºõ²»×ŵء£ÕâЩ·´Ó¦´ó¶àÓëÔ¤ÆÚÒ»Ö¡£ÎÒÃÇÈÏΪ£¬ÕâÒ»Ñо¿½á¹ûչʾÁËδÀ´»úÆ÷ÈËÓ¦ÓõĹãÀ«¿ÉÄÜÐÔ£¬ÀýÈ糡¾°±íÑÝ¡¢ÈËÀà°éÂÂÉõÖÁ¹¤ÒµºÍ¼ÒÍ¥ÖÐÐí¶à¸ü¾ß´´ÔìÐÔµÄÈÎÎñ£¬°éËæ´óÄ£Ð͵ijÖÐøµü´ú£¬Î´À´»úÆ÷È˺ÍÀí½âÈËÀàÒâͼµÄÄÜÁ¦Ò²ÓÐÍû³ÖÐøÌáÉý£¬¾ßÉíÖÇÄÜ·¢Õ¹Ò²ÓÐÍû³ÖÐøÉ¡£
Risk warning: AI technology may develop more slowly than expected, and industry adoption may fall short of expectations.