Abstract
Drawing on an in-depth analysis of patent landscapes, academic literature, and industry developments, this report advances a central thesis: the robotics industry is undergoing a paradigm shift from "precision-mechanics-first" design toward software-defined hardware. The rise of generative AI and vision-language-action (VLA) models has not only upgraded robots' cognitive abilities (the "brain") but has fundamentally changed the design philosophy of the mechanical structure (the "body"): control precision that once required expensive high-precision components can now be achieved through real-time neural-network compensation and simulation. The report further examines how emerging training paradigms such as augmented/virtual reality (AR/VR) and zero-shot learning address the data-scarcity bottleneck, and, in light of Taiwan's 2025 "AI New Technology: Smart Robot Program," analyzes supply-chain restructuring opportunities under current geopolitics.
The "Body" Reshaped by AI: Software-Defining the Mechanical Structure
For the past four decades, industrial robots have been developed under a "mechanical precision first" principle. To achieve micron-level repeatability in welding, assembly, and material handling, engineers worked to eliminate the nonlinearities in the mechanical structure: gear backlash, friction, and structural elasticity. This drove hardware costs up, and reliance on harmonic drives and high-stiffness castings became the industry standard. With the broad penetration of generative AI in 2025, however, this logic is being overturned.
Derwent Innovation patent data (see the filing-trend chart1) show that the focus of robotics technology has expanded in recent years from "body" mechanisms (B25J) to "brain" computation (G06). This is not a mere software upgrade but a phenomenon of "cognitive coupling": advances in AI control algorithms allow hardware design to return to simplicity and organic form. Through deep reinforcement learning (DRL) and end-to-end neural networks, modern robots can "learn" to compensate for mechanical imperfections, so that lower-cost actuators can still deliver high-performance motion control.
This shift has given rise to the concept of "Physical AI." NVIDIA CEO Jensen Huang stated in 2025 that Physical AI is the core of the next AI wave: models must not only understand language and images but also understand physical laws and act directly in the three-dimensional world. In this setting, hardware is no longer the bottleneck that limits software; rather, software endows hardware with capabilities beyond its physical limits. Robot makers can substitute lower-cost planetary or cycloidal gearboxes for expensive harmonic drives, and in some applications even use cheaper servo motors, relying on AI to correct their response latency and nonlinearities. This is one of the key technical arguments behind Tesla's claim that the Optimus humanoid can be brought down to a USD 20,000-30,000 cost point: trading compute for mechanical precision.
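The "compute for precision" trade can be made concrete with a toy example. The sketch below simulates a joint with gear backlash, then fits a linear inverse model from logged command/position data; this least-squares model is a deliberately minimal stand-in for the neural-network compensators described above, and the deadband value is an illustrative assumption, not a measured figure. The learned feedforward noticeably shrinks tracking error on the cheap joint:

```python
import numpy as np

DEADBAND = 0.05  # rad of gear slack in the toy actuator (illustrative value)

def joint_with_backlash(cmd, out):
    """Toy actuator: the output only moves once the gear slack is taken up."""
    if cmd > out + DEADBAND:
        return cmd - DEADBAND
    if cmd < out - DEADBAND:
        return cmd + DEADBAND
    return out

# 1) Excite the joint and log (commanded, measured) pairs.
t = np.linspace(0, 4 * np.pi, 2000)
cmds = np.sin(t)
outs, y = [], 0.0
for c in cmds:
    y = joint_with_backlash(c, y)
    outs.append(y)
outs = np.array(outs)

# 2) Fit an inverse model: which command reaches a given target position,
#    given the direction of travel?  (A least-squares stand-in for the
#    learned compensators described in the text.)
moving = np.abs(np.diff(outs)) > 1e-9
X = np.stack([outs[1:][moving], np.sign(np.diff(cmds))[moving]], axis=1)
w = np.linalg.lstsq(X, cmds[1:][moving], rcond=None)[0]

# 3) Deploy: track a new trajectory with and without the compensator.
def track(targets, compensate):
    y, prev, errs = 0.0, targets[0], []
    for tgt in targets:
        d = np.sign(tgt - prev) if tgt != prev else 0.0
        cmd = w[0] * tgt + w[1] * d if compensate else tgt
        y = joint_with_backlash(cmd, y)
        errs.append(abs(y - tgt))
        prev = tgt
    return float(np.mean(errs))

targets = 0.5 * np.sin(np.linspace(0, 6 * np.pi, 1500))
raw = track(targets, compensate=False)
comp = track(targets, compensate=True)
```

The fitted weights recover the slack (w[1] converges to the deadband), so the compensator pre-loads the gears in the direction of travel, which is exactly what the feedforward schemes in the backlash literature (refs. 10-12, 18) do with learned rather than hand-derived models.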
The Evolution of the AI "Brain": From Executing Commands to Understanding Physics
Traditional robot systems treat perception, planning, and control as independent modules. This "pipeline" architecture is stable, but large amounts of semantic and physical detail are lost as information passes between modules. VLA models instead adopt an end-to-end architecture that maps sensory input directly to action output.
● Multimodal fusion and semantic understanding: VLA models (such as Google's RT-2 or Figure's Helix) process visual signals and natural-language instructions jointly. The robot no longer needs to reduce a "cup" to the geometric coordinates of a cylinder; it understands that a cup affords grasping and holding water (its affordances). Research published via IEEE and arXiv shows that, through large-scale pretraining, VLA models generalize remarkably well to unseen scenes.
● The end-to-end advantage: by eliminating intermediate symbolic conversions, VLA models preserve the high-dimensional features of raw sensor data. For an irregularly shaped object, a traditional algorithm might reduce it to a bounding box and fail the grasp, whereas a VLA model can implicitly infer the best contact points and force directions from the object's texture and lighting cues.
● On the development of the robot "brain" (perception, understanding, decision-making, planning), Tesla's recent technology roadmap is instructive. Although the company previously filed little in this area (see the technology-distribution chart1), its path diverges from the traditional mechanism-centric approach: Tesla has transplanted its self-driving AI "brain" directly into a humanoid robot, with Optimus sales expected to begin in 2026 and strengths in visual perception, task understanding, and learning. Since OpenAI redefined the ceiling for "understanding, planning, and generation," several companies worth watching have emerged that may not sell robots themselves but define how robots think and learn, including NVIDIA's Isaac robotics AI platform and Brain Corp's BrainOS (autonomous navigation and task planning), which has been adopted by SoftBank's Whiz and several commercial cleaning/service robots.
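The end-to-end interface described above can be sketched in a few lines. This is not any real model's architecture: the encoder matrices below are random and untrained, and the text embedding is a hashing trick, standing in for the pretrained vision and language backbones of an actual VLA system such as RT-2 or Helix. The point is the contract: pixels plus an instruction go in, a continuous action comes out, with no bounding boxes or symbolic object poses in between.

```python
import numpy as np

rng = np.random.default_rng(42)

# Random, untrained stand-ins for pretrained encoder weights.
W_VIS = rng.normal(0, 0.02, (64 * 64 * 3, 128))  # vision encoder
W_TXT = rng.normal(0, 0.02, (256, 128))          # language encoder
W_ACT = rng.normal(0, 0.02, (256, 7))            # action head (7-DoF arm)

def embed_text(instruction: str) -> np.ndarray:
    """Hashing-trick bag-of-words vector (placeholder for a language model)."""
    v = np.zeros(256)
    for tok in instruction.lower().split():
        v[hash(tok) % 256] += 1.0
    return v

def vla_policy(image: np.ndarray, instruction: str) -> np.ndarray:
    """End-to-end mapping: pixels + words -> joint-space action.
    No intermediate symbolic representation is ever materialized."""
    z_vis = np.tanh(image.reshape(-1) @ W_VIS)
    z_txt = np.tanh(embed_text(instruction) @ W_TXT)
    z = np.concatenate([z_vis, z_txt])   # fused multimodal feature
    return np.tanh(z @ W_ACT)            # normalized action in (-1, 1)

frame = rng.random((64, 64, 3))          # one RGB camera frame
action = vla_policy(frame, "pick up the cup")
```

In a trained system the three weight blocks are learned jointly from demonstration data, which is what lets texture and lighting cues influence the grasp directly rather than being discarded at a bounding-box stage.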
Future Training Methods: AR/VR and Zero-Shot Learning
How smart the AI "brain" can become depends on the quantity and quality of its data, and high-quality robot manipulation data is extremely scarce. The industry is therefore turning to three innovative data-acquisition and training paradigms: AR/VR teleoperation, sim-to-real simulation training, and zero-shot learning.
AR/VR: A Data Bridge from "Seeing" to "Doing"
Humans are the best teachers robots have, but traditional teach pendants and joysticks cannot convey the dexterity of human fingers. AR/VR has unexpectedly become a revolutionary tool for collecting human-robot interaction data.
● High-fidelity teleoperation: researchers use the precise hand and head tracking of AR/VR headsets to let an operator "inhabit" a robot from an egocentric view, or to record human hands manipulating objects directly. Data collected this way captures the fine detail of human manipulation (the wrist angle when turning a screw, the force applied when squeezing a soft object), resolving the hardest data bottleneck in robot learning: dexterous manipulation.
● ARMADA and Mobile ALOHA: Apple's ARMADA system and Stanford's Mobile ALOHA project both demonstrate how VR equipment can rapidly collect hundreds of hours of high-quality demonstrations. The data is used to train VLA models so that robots can master complex chores such as folding laundry and cooking through imitation learning.
● Data at scale: compared with expensive motion-capture labs, AR/VR headsets make distributed data collection possible. In the future, millions of users worldwide may unknowingly contribute massive manipulation datasets for robot training simply by playing VR games.
Sim2Real and Digital Twins: Evolving Inside the Matrix
Because training physical robots is slow and risks damaging them, simulation has become AI's primary training ground.
● Isaac Sim and Omniverse: NVIDIA's Isaac Sim platform uses a GPU-accelerated physics engine to simulate tens of thousands of robots across different scenes simultaneously. This not only accelerates training (one simulated hour can equal days of real time) but also enables domain randomization: friction, lighting, object masses, and other parameters are varied randomly in simulation, forcing the AI to learn highly robust, general policies.
● Zero-shot sim-to-real transfer: this is the current technical frontier. With sufficiently high-fidelity physics and sensor rendering, researchers can now deploy models trained purely in simulation directly onto real robots, with no fine-tuning on real data. This matters especially for robots destined for environments where real data is hard to obtain, such as space, the deep sea, or nuclear plants.
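Domain randomization itself is only a few lines of logic. The parameter names and ranges below are illustrative assumptions, not Isaac Sim's actual API, though real simulators expose equivalent physics and rendering knobs. The key design point is that the policy never observes the sampled values; it must succeed across the whole distribution, which is what makes zero-shot deployment to the (unknown) real parameters plausible.

```python
import random

# Illustrative randomization ranges -- not a real simulator's API.
RANDOMIZATION = {
    "friction":          (0.4, 1.2),      # surface friction coefficient
    "object_mass_kg":    (0.05, 0.5),     # mass of the manipulated object
    "light_intensity":   (200.0, 2000.0), # scene illumination
    "camera_jitter_deg": (0.0, 3.0),      # extrinsic calibration noise
}

def sample_episode_params(rng: random.Random) -> dict:
    """Draw one randomized world configuration per training episode.
    The policy trains across the distribution but never sees these values."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANDOMIZATION.items()}

rng = random.Random(7)
episodes = [sample_episode_params(rng) for _ in range(1000)]
```

Each episode would reset the simulated scene with one sampled configuration before rolling out the policy; widening a range trades some nominal-case performance for robustness.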
Key patents from the past five years on image learning with neural networks are summarized below:
2020-2025 G06V 10/82 High-Impact Patents
G06V 10/82: image or video recognition or understanding using neural networks (this survey is restricted to robotics-related filings)
Note: the high-impact patents in this report are identified using Derwent Innovation's impact metrics.
US20240312219A1 — Method for performing temporal fusion with respect to machine learning model for autonomous or semi-autonomous vehicle, involves generating outputs based on temporally fused feature map and performing operations by machine based on outputs. (DWPI title); patent family filed by NVIDIA: US, DE (estimated expiry 2043-03-15); in force
US20220072707A1 — Method of creating grip database for use by robot, involves performing physical environment simulation to generate grasping database, and identifying grasping pose to be applied to clump of objects using one of quality grasping motions. (DWPI title); patent family filed by FANUC: US, JP, DE, CN (estimated expiry 2041-08-24); in force
US20210209785A1 — Method of determining sizes of objects, involves associating bounding region identifying first object and determining estimated three-dimensional position and estimated size of first object detected in image using bounding region and map point. (DWPI title); patent family filed by QUALCOMM: US, EP, JP, KR, BR, CN, PH, TW, IN, TH (estimated expiry 2041-05-15); in force
US20230139772A1 — Method for three-dimensional surface estimation, involves accessing a first data, where the first data has simulated image data and at least one of classification data corresponding to the simulated image or range data. (DWPI title); patent family filed by NVIDIA: US, JP, DE, CN (estimated expiry 2042-04-04); in force
US20220101047A1 — Background filtering-based data augmentation method for increasing robustness of trained neural networks involves training neural network to perform prediction task using generated second image containing object with second background. (DWPI title); patent family filed by NVIDIA: US, JP, DE, CN (estimated expiry 2041-06-24); in force
US20220284624A1 — Processor for use in system for performing three-dimensional (3D) pose estimation in robotics applications, has circuits that is configured to provide image data represent first portion of object in field of view of sensor in environment. (DWPI title); patent family filed by NVIDIA: US (estimated expiry 2042-01-22); in force
US20220266453A1 — Autonomous robotic welding system, has controller for identifying seam on part in workspace based on multiple images, planning path for robot to follow when welding seam, and instructing robot to weld seam according to planned path. (DWPI title); patent family filed by PATH ROBOTICS: US, EP, CA, JP, KR, MX (estimated expiry 2042-02-24); in force
US20220092415A1 — Computer-implemented method for training deep state space model using machine learning, involves decoding approximated latent state vectors using emission model to provide synthetic observations, and outputting trained deep state model. (DWPI title); patent family filed by Robert Bosch: US, EP, CN (estimated expiry 2044-04-10); in force
US11553636B1 — Method for controlling actions of robot for conducting agricultural tasks by using spacing-aware plant detection model, involves outputting control signal for robotic action based on output from plant detection model, and conducting robotic action for agricultural task in response to control signal. (DWPI title); patent family filed by FARMWISE LABS: US (estimated expiry 2042-06-21); in force
Company Spotlight
AIMMO Inc. (에이모) is a South Korean AI data-services company founded in 2016. Its AIMMO Core platform supports the full data pipeline of collection, annotation, cleaning, and optimization, supplying high-quality, readily usable training data for AI models. Applications span multiple industries, including autonomous driving (image and sensor data processing for vehicle perception and decision models), robotics (data processing for environment understanding and behavior recognition), smart factories, and smart cities.
● AIMMO was selected for the "2025 Emerging AI+X TOP 100" list, organized by Korea's artificial-intelligence industry association to recognize AI companies with potential for future industrial innovation. AIMMO was recognized in the "AI development environment" and "AI data infrastructure" categories. (Venturesquare, 2025/1/14)
● AIMMO-related patents:
KR2025120476A — Method for updating real-time machine learning model, involves generating and transmitting event using object recognition and driving scenario machine learning models, and recognizing object using the object recognition learning model. (DWPI title); patent family: KR (estimated expiry 2044-02-01); in force
KR2025119873A — Method for generating driving scenario machine learning model of autonomous vehicle using generative adversarial network machine learning process in facility management, involves generating attribute map of true driving scenario. (DWPI title); patent family: KR (estimated expiry 2044-02-01); in force
KR2781494B1 — Method for generating object recognition machine learning model from video image, involves recognizing object feature from video image, and generating object recognition machine learning model by using graph neural network learning process. (DWPI title); patent family: KR (estimated expiry 2044-02-01); in force
KR202415649A — Method for providing virtual video image based on driving scenario machine learning model, involves receiving virtual object and location information selected by user, and generating object recognition embedding vector. (DWPI title); patent family: KR (estimated expiry 2044-02-01); in force
Domestic Research Teams and Projects through 2025
Industrial Technology Research Institute ▶ 胡竹生
● AI system development program for smart robotics and manufacturing applications
National Taiwan University ▶ 郭重顯、黃漢邦、藍兆杰、林沛群、包傑奇、許閔傑、王偉彥
● Taiwan Bot: an intelligent humanoid robot system for human-robot collaboration
● Development of a robot gripper with multi-dimensional force sensing and control
● Development of a multi-robot spatial-cognition and motion-planning system
● Development of Taiwanese Sign Language interaction and a full-humanoid sign-language robot
● Dynamic running of legged robots over multiple terrains through integrated sensing, decision-making, and motion generation
● Complex motion planning for intelligent humanoid robots
● A robot reasoning framework based on multi-perception cross-attention models and its AI accelerator chip design
● A general-purpose generative AI system across robot morphologies
● A self-balancing dual-arm robot that can jump over obstacles and pick and place delicate objects
● Development and field demonstration of a cable-driven mobile automatic-charging robot system
National Taiwan University of Science and Technology ▶ 林柏廷、林紀穎、莊景崴、溫振廷、顏家鈺
● Multi-human, multi-robot collision-avoidance motion planning based on multimodal perception and AI path generation
● Generating interference-free spatial linkage mechanisms via design optimization and deep neural networks
● Development of a transverse-bar grasping robot with active grasp-posture compensation and compound motion modes
● Digital-twin-based robot-assisted trabecular meshwork surgery technology
● Physics-informed neural-network tactile servo control and dynamic modeling for robots
● Distributed autonomous ground-aerial robot task allocation and collaboration
National Cheng Kung University ▶ 劉彥辰、劉建聖、藍兆杰、郭秉寰、李祖聖
● Development and integration of AI-driven autonomous outdoor agricultural robots
● Development of a multifunctional autonomous weeding robot system for unstructured outdoor environments
● Intelligent deployment of collaborative robots and development of unmanned machining systems
● Haptic-feedback remote human-machine interaction control in a cyber-physical architecture
● Collaborative robots with highly dynamic impedance response for intelligent human-robot cooperation
● Collaboration strategies and motion-control systems for heterogeneous robots
● Development of an intelligent humanoid robot and its application in human-robot collaboration settings
● Development of a generative-AI service robot and its application in companionship and home services
National Tsing Hua University ▶ 張禎元、陳榮順、黃靖欹、葉廷仁
● Imitation learning: hand-eye-force compound intelligent robot manipulation and assembly
● Pick-and-place in cluttered environments with heterogeneous multi-robot coordination
● A digital-twin-based virtual-reality framework for remote intelligent medical-robot collaboration and automation
● Balance, training, and navigation of bipedal robots via hybrid kinematic-dynamic control
● Dual-arm robotic deburring based on human skill transfer
● Multi-robot, multi-task path planning and object-retrieval systems
Source: Government Research Bulletin (GRB)
References
1. Chart prepared by the company.
2. The silent revolution of heavy-duty robots in mechanical engineering: Why AI is now making the difference for the strongest robots - Xpert.Digital, https://xpert.digital/en/heavy-duty-robots-in-mechanical-engineering/
3. Transforming Robotics with Next-Gen Integrated Motion Devices - A Synapticon and Nexperia Case Study - Wevolver, https://www.wevolver.com/article/transforming-robotics-with-next-gen-integrated-motion-devices-a-synapticon-and-nexperia-case-study
4. Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications - IEEE Xplore, https://ieeexplore.ieee.org/iel8/6287639/10820123/11164279.pdf
5. How Vision-Language-Action Models are Revolutionizing Robotic Control - Medium, https://medium.com/@LawrencewleKnight/how-vision-language-action-models-are-revolutionizing-robotic-control-a627bbc0c249
6. Towards Accessible Physical AI: LoRA-Based Fine-Tuning of VLA Models for Real-World Robot Control - arXiv, https://arxiv.org/html/2512.11921v1
7. Helix: A Vision-Language-Action Model for Generalist Humanoid Control - Figure AI, https://www.figure.ai/news/helix
8. The Helix robot demonstrate a new level of dexterity in handling everyday objects, https://www.rdworldonline.com/say-goodbye-to-chores-the-helix-robot-demonstrate-a-new-level-of-dexterity-in-handling-everyday-objects/
9. A Complete Review Of Tesla's Optimus Robot - Brian D. Colwell, https://briandcolwell.com/a-complete-review-of-teslas-optimus-robot/
10. (PDF) A Gear Backlash Identification and Compensation Method to Enhance Industrial Robot Repeatability - ResearchGate, https://www.researchgate.net/publication/398643236_A_Gear_Backlash_Identification_and_Compensation_Method_to_Enhance_Industrial_Robot_Repeatability
11. Experimental assessment and feedforward control of backlash and stiction in industrial serial robots for low-speed operations - Taylor & Francis Online, https://www.tandfonline.com/doi/full/10.1080/0951192X.2022.2090609
12. Robotic Motor Backlash: A Key Factor in Precision Mechanical Control - CubeMars, https://www.cubemars.com/robotic-motor-backlash.html
13. Tesla robot price in 2026: Everything you need to know about Optimus - Standard Bots, https://standardbots.com/blog/tesla-robot
14. Software-defined hardware in the age of AI | McKinsey, https://www.mckinsey.com/features/mckinsey-center-for-future-mobility/our-insights/software-defined-hardware-in-the-age-of-ai
15. Generative AI for Robotics: Revolutionizing the Future of Automation - Synergy Labs, https://www.synlabs.io/post/generative-ai-for-robotics-revolutionizing-the-future-of-automation
16. Review article: enhancing the power of artificial intelligence in mechanical design, https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/39/e3sconf_transsiberia2023_03042.pdf
17. The Role of Artificial Intelligence in the Design of Mechanical Parts - Incognito Inventions, https://incognitoinventions.com/2025/05/the-role-of-artificial-intelligence-in-the-design-of-mechanical-parts/
18. Backlash-Compensated Active Disturbance Rejection Control of Nonlinear Multi-Input Series Elastic Actuators, http://biomechatronics.ca/publications/publies/DeBoonICRA2020.pdf
19. 1x vs Figure: Two different ways of bringing humanoids to market, https://humanoid.guide/1x-vs-figure/
20. Tesla AI Day: Optimus Bot Was Better Than Anyone Expected | Towards Data Science, https://towardsdatascience.com/tesla-ai-day-optimus-bot-was-better-than-anyone-expected-b271409b71f9/
21. Tesla Optimus Robot: Engineering Breakdown and Real-World Applications, https://thinkrobotics.com/blogs/indepths/tesla-optimus-robot-engineering-breakdown-and-real-world-applications
22. AI & Robotics | Tesla, https://www.tesla.com/AI
23. ARMADA: Augmented Reality for Robot Manipulation and Robot-Free Data Acquisition, https://machinelearning.apple.com/research/armada-augmented-reality
24. Apple details combined training method for humanoid robots - AppleInsider, https://appleinsider.com/articles/25/05/21/apple-used-human-instructors-with-apple-vision-pros-to-train-humanoid-robots
25. Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning, https://arxiv.org/html/2407.03162v1
26. XR Dataset for Physical AI Training - General Discussion - NVIDIA Developer Forums, https://forums.developer.nvidia.com/t/xr-dataset-for-physical-ai-training/329056
27. Teaching Robots with an Apple Vision Pro and Synthetic Data - NVIDIA GR00T Mimic with @LycheeAI - YouTube, https://www.youtube.com/watch?v=tTqD-HA74PU
28. Training Sim-to-Real Transferable Robotic Assembly Skills over Diverse Geometries, https://developer.nvidia.com/blog/training-sim-to-real-transferable-robotic-assembly-skills-over-diverse-geometries/
29. Digital Twin (DT)-CycleGAN: Enabling Zero-Shot Sim-to-Real Transfer of Visual Grasping Models - IEEE Xplore, https://ieeexplore.ieee.org/document/10064001/
30. Closing the Reality Gap: Zero-Shot Sim-to-Real Deployment for Dexterous Force-Based Grasping and Manipulation - arXiv, https://arxiv.org/html/2601.02778v1
31. Optimizing the quantum stack: a machine learning approach - Chalmers Research, https://research.chalmers.se/publication/543087/file/543087_Fulltext.pdf
32. AlphaGo Zero - Wikipedia, https://en.wikipedia.org/wiki/AlphaGo_Zero
33. Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans, https://ieeexplore.ieee.org/document/10610288/
34. [D] What is the difference between few-, one- and zero-shot learning? : r/MachineLearning, https://www.reddit.com/r/MachineLearning/comments/boitjj/d_what_is_the_difference_between_few_one_and/
35. The End of Tabula Rasa: How Pre-Trained World Models are Redefining Reinforcement Learning - Unite.AI, https://www.unite.ai/the-end-of-tabula-rasa-how-pre-trained-world-models-are-redefining-reinforcement-learning/
36. Crossing the Gap: A Deep Dive into Zero-Shot Sim-to-Real Transfer for Dynamics, https://www.robot-learning.uk/crossing-the-gap
37. Taiwan to launch five-year robots plan - Taipei Times, https://www.taipeitimes.com/News/front/archives/2025/05/16/2003836965
38. Ten billion in funding: the National Development Fund bets on "smart robots" - what is the policy, and who benefits? - Vocus, https://vocus.cc/article/69119c2cfd8978000179b060
39. Premier Cho: four strategies to develop new AI technologies and make Taiwan an island of artificial intelligence, https://ncsd.ndc.gov.tw/Fore/News_detail/c5ad62dd-8fa9-4048-95cb-7d943818d6b7
40. Smart robot program to drive development of new AI tech - NSTC, https://ostp.nstc.gov.tw/NewsContent.aspx?id=48
41. AI Robot Industry Alliance Formed: Six Major Associations Collaborate with Industrial Technology Research Institute, Institute for Information Industry, and Leading Manufacturers | Taiwan Industry Updates | CENS.com, https://www.cens.com/cens/html/en/news/news_inner_62704.html
42. Taiwan's robotics grand alliance launches tomorrow: HIWIN, Hota, Solomon, Tongtai and others team up - Yahoo News Taiwan, https://tw.news.yahoo.com/%E6%97%97%E8%89%A6%E6%89%8B%E6%A9%9F%E6%99%B6%E7%89%87%E6%96%B0%E6%99%82%E4%BB%A3-%E9%AB%98%E9%80%9A%E8%81%AF%E7%99%BC%E7%A7%91%E9%96%8B%E6%89%933%E5%A5%88%E7%B1%B3%E5%A4%A7%E6%88%B0-%E8%AA%B0%E5%8B%9D%E5%87%BA-061942341.html
43. NTU ME ASR Lab. Humanoid Robot Demonstration Video for Future Tech Pavilion 2025, Taipei. - YouTube, https://www.youtube.com/watch?v=Sq2_3S9mlVE
44. Taiwan university develops robot that can ride scooter | Taiwan News | Oct. 14, 2025, https://taiwannews.com.tw/news/6219613
45. NTU Collaborates with National Science and Technology Council to Develop AI Robotic Dogs - College of Engineering, National Taiwan University, https://www.eng.ntu.edu.tw/en/news_in.aspx?id=2137&chk=6a00ae0b-1654-427b-be4f-bca6a2cb90cc&mid=72
46. AIMMO Germany GmbH - ASAM eV, https://www.asam.net/members/detail/aimmo/