RNEWS — Robot NEWS

How to use

Each card carries a 2–3 sentence LLM-written summary of the source — read the gist here, click the title for the original. Summaries can occasionally be inaccurate, so verify details at the source.

Search by keyword, or click chips to filter. Method and Platform filters stack (e.g. #VLA + #Humanoid = VLA work on humanoids); Source and Time filters apply on top of them.
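
The stacking rule can be read as a conjunction of filters. The following is a minimal sketch, not the site's code: the `Card` fields, the `matches` helper, and the sample cards are all hypothetical, and it assumes that several chips picked in the same row must all match, which the text above does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    # Hypothetical card model; the field names are assumptions for illustration.
    title: str
    source: str                                   # e.g. "news", "github", "paper"
    methods: set = field(default_factory=set)     # e.g. {"VLA", "RL"}
    platforms: set = field(default_factory=set)   # e.g. {"Humanoid"}


def matches(card, methods=(), platforms=(), source=None, keyword=""):
    """Return True only if the card passes every active filter (AND semantics)."""
    if not set(methods) <= card.methods:          # every chosen Method chip must be present
        return False
    if not set(platforms) <= card.platforms:      # every chosen Platform chip must be present
        return False
    if source is not None and card.source != source:
        return False
    # An empty keyword matches everything.
    return keyword.lower() in card.title.lower()


cards = [
    Card("VLA policy for humanoid loco-manipulation", "paper", {"VLA"}, {"Humanoid"}),
    Card("VLA benchmark for tabletop arms", "github", {"VLA"}, {"Manipulator"}),
    Card("Humanoid factory ramps up", "news", set(), {"Humanoid"}),
]

# #VLA + #Humanoid keeps only VLA work on humanoids.
hits = [c.title for c in cards if matches(c, methods=["VLA"], platforms=["Humanoid"])]
print(hits)
```

Passing `source="news"` or a keyword narrows the same result further, which mirrors how Source and Time apply on top of the chip filters.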

Sort by — Score (default): Code & Papers rank by relevance (topic match, released code, GitHub stars/forks, real-robot results, recency); News is grouped by priority tier (High → Mid → Low), newest within each. Created = newest first, Pushed = most recently updated, Stars = most-starred. (When the News source is selected, Pushed/Stars are hidden.)
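
The sort modes above can be sketched as plain key functions. This is an illustrative sketch under stated assumptions: the dictionary keys (`tier`, `published`, `score`, `created`, `pushed`, `stars`), the shortened titles, and the repo entries are hypothetical, and the real relevance score is not reproduced here, only its use as a sort key.

```python
from datetime import date

# Tier order used for News under the default Score sort.
TIER_RANK = {"High": 0, "Mid": 1, "Low": 2}


def sort_news(items):
    """Group by priority tier (High, then Mid, then Low), newest first within each tier."""
    return sorted(
        items,
        key=lambda it: (TIER_RANK[it["tier"]], -it["published"].toordinal()),
    )


def sort_repos(items, mode="Score"):
    """Sort Code & Papers descending by the field behind the chosen mode."""
    field_for_mode = {
        "Score": "score",      # assumed precomputed relevance value
        "Created": "created",  # newest first
        "Pushed": "pushed",    # most recently updated first
        "Stars": "stars",      # most-starred first
    }
    return sorted(items, key=lambda it: it[field_for_mode[mode]], reverse=True)


news = [
    {"title": "Pilot Institute at Commercial UAV Expo", "tier": "Mid", "published": date(2026, 5, 27)},
    {"title": "Doozy Robotics seed funding", "tier": "High", "published": date(2026, 5, 21)},
    {"title": "Torc Robotics joins Mila", "tier": "High", "published": date(2026, 5, 27)},
]
ordered = [n["title"] for n in sort_news(news)]
print(ordered)

repos = [
    {"name": "repo-a", "score": 0.4, "stars": 900},
    {"name": "repo-b", "score": 0.9, "stars": 120},
]
print([r["name"] for r in sort_repos(repos, "Stars")])
```

Because tier comes first in the key, a High item from 2026-05-21 still sorts ahead of a Mid item from 2026-05-27, which is why older High cards appear above newer Mid cards in the News list.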

Priority — the colored bar on each card's left edge reflects how relevant the item is (research-topic match, released code, real-robot results, recency, GitHub stars):

High — strong signal: code or real-robot results and a close fit to robot-learning topics.
Mid — moderately relevant; worth a glance.
Low — weak or only loosely related.

Survey — pick Survey in the Source row to see only curated awesome-list / paper-list / survey / notes GitHub repos (purple SURVEY badge). They're maps of the field rather than runnable code, so they don't carry a High/Mid badge and the GitHub source view excludes them so it stays code-only.
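
One way to picture the Survey split is as two complementary views over the same GitHub items. This is a hypothetical sketch: the `kind` and `is_survey` keys, the `source_view` helper, and the sample entries are assumptions made for illustration, not the site's data model.

```python
def source_view(items, source):
    """Return the items shown for a given Source selection."""
    if source == "Survey":
        # Only curated list-style repos.
        return [it for it in items if it["kind"] == "github" and it.get("is_survey", False)]
    if source == "GitHub":
        # Code-only view: survey repos are excluded.
        return [it for it in items if it["kind"] == "github" and not it.get("is_survey", False)]
    # Other sources (e.g. "News") match on kind directly.
    return [it for it in items if it["kind"] == source.lower()]


items = [
    {"title": "awesome-robot-learning", "kind": "github", "is_survey": True},
    {"title": "some-policy-training-code", "kind": "github", "is_survey": False},
    {"title": "Humanoid factory ramps up", "kind": "news"},
]

print([it["title"] for it in source_view(items, "Survey")])
print([it["title"] for it in source_view(items, "GitHub")])
```

Every GitHub item lands in exactly one of the two views, so the code view stays free of list-style repos without dropping them from the index.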

701 of 701 items

News & articles (90)

Industry pulse — company announcements, funding, demos, journalism. Most relevant first, newest within each tier.

Torc Robotics Announces First-Ever Autonomous-Trucking Partnership at Mila to Advance Physical AI news High

Published 2026-05-27

#Other

Torc Robotics is joining Mila’s Montreal AI ecosystem as its first autonomous-trucking partner, giving the company direct access to Mila researchers, faculty, and students. The partnership is aimed at advancing physical AI for self-driving trucks, tying Torc’s freight autonomy work to one of Canada’s major academic AI hubs.

FANUC America Showcases Physical AI and AI-Enabled Robotics Demos at Automate 2026 news High

Published 2026-05-27

#Other

FANUC America is using Automate 2026 to show a set of factory robotics demos centered on “physical AI,” including collaborative robots, high-payload industrial arms, vision-guided systems, and digital tools. The demos are framed around making manufacturing cells more flexible and scalable while improving precision, suggesting FANUC is pushing AI-enabled perception and control from isolated features into broader production workflows.

Figure ramps up humanoid robot manufacturing at unprecedented speed news High

Sam Francis · Published 2026-05-27

#Humanoid

Figure AI says it is moving humanoid robot production beyond staged demos and prototypes into repeatable manufacturing, with the goal of building hundreds of units reliably and economically. The significance is less about a single new robot behavior than about whether the company can industrialize humanoids at a pace that would make real deployment and fleet learning plausible.

Figure partners with Catalyst Brands to deploy humanoid robots in logistics operations news High

Sam Francis · Published 2026-05-27

#Humanoid

Figure AI signed a commercial agreement with Catalyst Brands to deploy humanoid robots in the retailer’s distribution and logistics network, starting at a Reno, Nevada distribution center. The robots will target physically demanding supply-chain work, making this a concrete commercial test of humanoids in retail logistics rather than a lab demonstration.

China to assign digital ID numbers to humanoid robots for lifecycle tracking news High

Sam Francis · Published 2026-05-27

#Humanoid

China is rolling out a national digital ID system that will give each humanoid robot a unique number covering its full lifecycle, from manufacturing and deployment through recycling or disposal. The move ties robot traceability to safety oversight and sector standardization, signaling that authorities expect humanoids to become common enough that they need infrastructure more like regulated machines than ordinary consumer devices.

Hyundai Rotem pushes K-defense physical AI technology sovereignty...wins key national development projects for unmanned robots news High

전미준 기자 · Published 2026-05-26

#Other

Hyundai Rotem has been selected for two Korean national R&D projects on Physical AI for unmanned defense robots, covering natural-language control of heterogeneous multi-robot teams and an integrated simulator with a modular robot system. The work matters because it pushes command interfaces beyond one-operator, one-robot control toward unified supervision of multiple unmanned platforms through spoken or written human commands, a core piece of domestic autonomy capability for K-defense systems.

Barclays: “Humanoid robots will fill 60% of China’s labor-force decline” news High

박찬 기자 · Published 2026-05-26

#Humanoid

Barclays Research projects that humanoid robots could offset as much as 60% of China’s working-age population decline over the next decade. The argument is that “physical AI” is moving beyond software into the real economy, making robotics a potentially major force in productivity, labor markets, geopolitics, and long-term asset allocation.

China’s EngineAI begins mass production of humanoid robot ‘T800’…“targeting 10,000 units a year” news High

유효정 · Published 2026-05-26

#Humanoid

China’s EngineAI has started formal operation of a humanoid robot factory and shipped the first units of its full-size general-purpose T800 model, which was announced in December 2025. The line is designed to produce one humanoid every 15 minutes, combining intelligent manufacturing with R&D and testing, and the company says it is scaling toward annual deliveries of 10,000 units.

3D-printable humanoid legs let robotics experiments run wild news High

Jeremy Hsu · Published 2026-05-26

#Humanoid

Hugging Face debuts $2,500 bipedal robot project for builders and researchers.

China to assign digital IDs to all humanoid robots..."full lifecycle tracking" news High

박찬 기자 · Published 2026-05-26

#Humanoid

China is building a national lifecycle-management platform for humanoid robots, giving every domestically produced humanoid a permanent digital ID that follows it from R&D and manufacturing through sales, retirement, disposal, and recycling. The move, launched by the MIIT-affiliated HEIS standardization committee, points to treating embodied AI hardware as regulated infrastructure, with traceability and accountability baked into the robot supply chain from the start.

US-based Rotaku opens preorders for a humanoid robot priced in the 4-million-won range news High

백승일 · Published 2026-05-26

#Humanoid

Rotaku.ai has opened preorders for Domo, a humanoid robot platform aimed at developers and researchers, with the entry-level Domo Basic priced at $2,999 and higher developer models at $3,998 and $9,899. The pitch is a relatively low-cost full-body control and AI learning platform rather than a demo-only robot, positioning it as an accessible entry point for humanoid robotics research and experimentation.

Rotaku launches Domo humanoid robot platform starting at $2,999 for developers news High

Sam Francis · Published 2026-05-26

#Manipulator #Humanoid

Rotaku has opened reservations for Domo, a compact humanoid robot platform starting at $2,999 for developers, makers, educators, and robotics teams. It is aimed at making real humanoid hardware more accessible for work on motion control, teleoperation, manipulation, human-robot interaction, and embodied AI, giving teams a lower-cost platform for experiments that usually require much more expensive robots.

Robotable unveils F&B dual-arm humanoid platform 'Zest' at the US ‘NRA Show’ news High

최지호 · Published 2026-05-25

#Manipulator #Bimanual #Humanoid

Robotable unveiled Zest, a dual-arm humanoid platform for food and beverage work, at the 2026 NRA Show in Chicago. Built on the open-source Openarm manipulator, it uses two 8-DOF arms and a multimodal sensor stack combining RGB, depth, thermal imaging, an electronic nose, and HD microphones, pointing toward restaurant robots that can perceive cooking environments through more than vision alone.

China’s Elite Robot announces new humanoid…specialized for precision component work news High

유효정 · Published 2026-05-25

#Manipulator #Humanoid

China’s Elite Robot introduced Centaur-G1, a wheeled humanoid aimed at both industrial and home settings, alongside an upgraded embodied large-model platform called Yuanqi Primo. The robot extends the company’s “one brain, multiple forms” strategy and is positioned for precision component work, including optical-module production, where mobility plus humanoid manipulation could matter on factory floors.

US-based Lightwheel books $100 million in first-quarter orders for physical AI robot infrastructure news High

이재구 · Published 2026-05-25

#Other

Lightwheel says it booked $100 million in first-quarter orders for physical-AI robot infrastructure, framing the demand as a shift from robotics experiments toward deployment-scale systems. The company builds simulation, synthetic-data, evaluation, and deployment tooling for training and scaling robots in real-world environments, so the order volume suggests industrial customers are starting to buy the infrastructure layer needed to operationalize physical AI.

"Housework is hard, but giving directions is a breeze"…UBTech unveils humanoid aimed at 'semi-structured spaces' news High

박찬 기자 · Published 2026-05-25

#Humanoid

UBTech unveiled Walker C1, an upgraded commercial humanoid aimed at semi-structured public spaces such as exhibitions, airports, and hotels, where it can guide and interact with visitors rather than handle messy household chores. The launch demo emphasized whole-body coordination, balance, and precise motion control by having the robot perform ballet movements from Swan Lake with human dancers, positioning it as a service robot whose mobility and presentation skills matter in real-world customer-facing settings.

Kawasaki launches Silicon Valley hub to accelerate deployment of physical AI news High

Sam Francis · Published 2026-05-22

#Other

Kawasaki Heavy Industries has opened the Kawasaki Physical AI Center San Jose in Silicon Valley to speed up development of AI systems that operate in the physical world, especially at the intersection of robotics, semiconductors, and industrial automation. The hub is meant to deepen Japan-US collaboration and move physical AI from lab prototypes toward deployable real-world applications.

Video Friday: Atlas Versus a Fridge news High

Evan Ackerman · Published 2026-05-22

#RL #Humanoid

IEEE Spectrum’s Video Friday highlights Boston Dynamics’ new Atlas lifting and maneuvering a mini-fridge, using reinforcement learning and whole-body control to brace against the object’s mass and inertia rather than treating manipulation as just a hand task. The clip is framed as a sign that humanoids are moving toward more adaptable industrial work, where balance, strength, unusual range of motion, and real-time control matter as much as the headline feat.

Gecko Robotics tests Ouster’s next-generation color lidar to enhance AI-powered infrastructure inspections news High

Sam Francis · Published 2026-05-22

#Other

Gecko Robotics is testing Ouster’s new Rev8 digital lidar sensors inside its Cantilever platform, adding color lidar data to the robots it uses for industrial infrastructure inspections. The move matters because richer sensing can improve navigation and give Gecko’s AI more detailed physical data about hard-to-access assets, potentially making inspection records more useful for maintenance and reliability decisions.

‘2026 Physical AI Industry Outlook Conference’ concludes successfully news High

박경일 · Published 2026-05-22

#RL #Humanoid

SeminarHub held the 2026 Physical AI Industry Outlook Conference in Seoul on May 19, with support from the Korea AI & Robot Industry Association and Robotidly. The event framed physical AI as a new competitive axis for manufacturing and robotics, covering humanoids, manufacturing-focused physical AI, autonomous manufacturing, AI semiconductors, and robot software platforms as AI moves from language and image processing into embodied systems that perceive and act in the real world.

Doozy Robotics launches global expansion to scale AI-powered humanoid workforce for factories news High

Sam Francis · Published 2026-05-22

#Humanoid

Singapore-based Doozy Robotics is expanding into the United States, the GCC, and Asia as it tries to scale its AI-powered humanoid workforce for industrial factory settings ahead of a planned Series A round. The announcement positions the company’s “Physical AI” humanoids as a route to autonomous factory labor, but the available text gives little detail on deployments, technical architecture, or measured performance.

Brain Corp and UC San Diego partner to advance the foundational intelligence layer for physical AI news High

Sam Francis · Published 2026-05-22

#Other

Brain Corp is expanding its collaboration with UC San Diego to develop semantic mapping and contextual intelligence for autonomous robots in commercial and industrial settings. The work targets the “foundational intelligence” layer for physical AI: giving robots richer understanding of places, objects, and operational context so they can navigate and act more reliably in complex real-world environments.

Humanoid secures partnership with manufacturing giant Bosch following a successful proof of concept news High

David Edwards · Published 2026-05-22

#Humanoid

Humanoid has partnered with Bosch after a March 2026 proof of concept showed its humanoid robot platform could handle a complex industrial workflow. The deal moves the UK robotics company from validation toward scaled production, with Bosch’s manufacturing expertise potentially helping turn the platform into deployable factory robots.

WIRobotics, backed by 95 billion won in investment, sets itself apart by advancing humanoids with 'human data' news High

장세민 기자 · Published 2026-05-21

#Humanoid

WIRobotics says its recent 95 billion won Series B was driven by core technology for safe human-robot interaction in dynamic environments. The company is positioning human data as its differentiator for advancing humanoids, suggesting a strategy centered on learning from real human behavior rather than only building robot hardware.

Open-Source Software Is Starting to Help Robots Think news High

Jackie Snow · Published 2026-05-21

#RL #VLA #Other

Open-source robotics is moving beyond the ROS-era plumbing of maps, paths, logging, and hardware interfaces into higher-level AI for reasoning and action, with Hugging Face, Nvidia, and Alibaba releasing models, datasets, and tooling for robot training and deployment. Nvidia’s Cosmos, GR00T, and Isaac stack targets synthetic data, task reasoning, and orchestration, while Hugging Face’s LeRobot has helped robotics datasets on the Hub grow from 1,145 at the end of 2024 to more than 58,000. The shift could lower the barrier to building capable robots much as open-source AI lowered the barrier to building AI applications, though the article notes that today’s biggest contributors have strong platform incentives rather than the mostly academic motivations behind ROS.

Humanoid Secures Partnership with Bosch Following a Successful POC news High

Published 2026-05-21

#Humanoid

Humanoid completed a proof of concept with Bosch in Bühl, Germany, where HMND 01 humanoid robots moved boxes autonomously from a conveyor to a trolley in a live, changing intralogistics setting. The result led to a partnership with Bosch, suggesting the robots were able to handle a practical warehouse-style transfer task well enough to move beyond a demo toward industrial collaboration.

Doozy Robotics Announces Global Expansion with Seed Funding to Scale Physical AI Industrial Workforce news High

Published 2026-05-21

#Humanoid

Doozy Robotics says it has raised seed funding to expand its Singapore-based humanoid robotics business globally, with live deployments already running across two continents and a qualified pipeline ahead of a planned Series A. The company is positioning its humanoids as a “physical AI” industrial workforce, so the significance is less a lab demo than an attempt to scale embodied AI into real operational settings.

Tracking the life of every humanoid… China assigns digital IDs news Mid

조예주 기자 · Published 2026-05-28

#Humanoid

Pilot Institute to Bring Drone Pilot Education to Every Attendee at Commercial UAV Expo 2026 news Mid

Published 2026-05-27

#Other

Pilot Institute will provide eight practitioner-led drone education sessions to all attendees at Commercial UAV Expo 2026, spanning business growth, thermography, construction safety, enterprise UAS operations, and related topics. The move broadens access to hands-on commercial drone training at the expo, making the event more useful for operators and organizations looking to turn UAV capabilities into safer, more scalable field workflows.

'Chongqing East Station,' singled out by Elon Musk...“built in 38 months with robots” news Mid

유효정 · Published 2026-05-27

#Other

China’s Chongqing East Station, a 1.22 million-square-meter high-speed rail hub opened in June 2025, was reportedly built in 38 months using robots in construction work. The claim drew wider attention after Elon Musk shared footage of the project, because comparable stations of this scale are typically described as taking 5–10 years to complete.

Kwangwoon University to host global robot competition ‘Robofest World Championship 2027’ news Mid

최지호 · Published 2026-05-27

#RL #Other

Kwangwoon University will host the 2027 Robofest World Championship in Seoul, bringing the international youth robotics competition outside the United States for the first time since it began in 1999. The event is scheduled for May 2027 on Kwangwoon’s campus, following an official agreement with Lawrence Technological University in Michigan, which signals a broader internationalization of a long-running educational robotics contest.

FORT Robotics Acquires Mapless AI to Expand Its Trust Platform with Remote Supervision and Active Safety Capabilities news Mid

Published 2026-05-27

#Other

FORT Robotics acquired Mapless AI to add supervised autonomy, remote supervision, and active safety capabilities to its Trust Platform. The move extends FORT beyond safe remote control toward human-in-the-loop oversight for physical AI systems, aiming to make autonomous robots and machines safer to deploy and monitor in real-world operations.

SoftBank prepares US listings for energy and robotics subsidiaries...joins the 'AI IPO' rush news Mid

박찬 기자 · Published 2026-05-27

#Other

SoftBank is reportedly preparing U.S. IPOs for SB Energy, its energy and infrastructure development arm, and Roze, its autonomous robotics subsidiary, with both listings potentially targeted as early as September. The move ties SoftBank’s AI strategy to the physical infrastructure behind it: power capacity, data-center-adjacent energy development, and robotics businesses that could benefit from the current rush of investor demand around AI-linked companies.

Handle with care: Soft robot gripper picks ripe fruit without bruising news Mid

Cornell University · Published 2026-05-27

#Other

Cornell researchers built a soft robotic gripper with stretchable fiber-optic sensors that can judge strawberry ripeness by touch, mimicking the way humans assess fruit firmness without damaging it. The gripper’s ability to handle ripe fruit gently while extracting useful tactile information points toward more practical robotic harvesting systems for delicate crops.

China’s Haier launches the world’s lightest 'AI sports exoskeleton robot' news Mid

유효정 · Published 2026-05-27

#Other

Haier has launched the W3, billed as the world’s lightest AI sports exoskeleton, using a carbon-fiber and titanium-alloy body that weighs 1.75 kg. It combines Haier’s AI gait algorithm 3.0 with multidimensional sensors to infer movement intent at millisecond scale, while dual high-torque motors and a high-energy battery provide up to 16 Nm of assistance per leg and reduce the user’s load by about 5 kg.

Inbolt Launches Vision-Enabled Robot Programming, Closing the Loop from CAD to Factory Floor news Mid

Published 2026-05-27

#Other

Inbolt launched a vision-enabled robot programming workflow that lets engineers generate robot paths directly from CAD models, then uses the Inbolt Vision Model at runtime to locate the physical part and execute the planned path on the factory floor. The pitch is that commissioning work that normally takes weeks of iterative adjustment can be collapsed into a single programming step, with live Automate 2026 demos including FANUC integrations and a broader US expansion.

Nanoloy Unveils RoboX news Mid

Published 2026-05-27

#Other

Nanoloy announced third-party validation of its high-energy battery platform aimed at drones, robotics, and autonomous systems. The news matters mainly as a commercialization signal: an India/Netherlands deep-tech battery company is positioning its chemistry for platforms where energy density, weight, and runtime directly constrain deployment.

Flexiv to Offer Exclusive Preview of Next-Generation Robots at ICRA news Mid

Published 2026-05-27

#Other

Flexiv is previewing its next-generation robots at ICRA, with a first-look demonstration centered on whole-body touch sensitivity before a planned commercial launch in Q3 2026. The news points to tactile sensing as the differentiator: robots that can perceive contact across the body, not just at the end effector, which could make physical interaction and manipulation safer and more adaptive.

Japan’s University of Tokyo develops next-generation wire-driven robot platform news Mid

백승일 · Published 2026-05-27

#Other

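WiXus, unveiled by the University of Tokyo’s JSK Robotics Laboratory, is a robot platform that combines a wheeled-leg structure with wire-driven actuation so that the legs it uses for locomotion can also work like arms to carry out tasks. Where existing wheeled-legged robots have focused mainly on fast, stable locomotion, WiXus uses the same hardware to target manipulation functions such as rescue work and tool use, narrowing the gap between mobile robots and working robots.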

China’s Unitree sees first-quarter net profit plunge 52% ahead of IPO news Mid

백승일 · Published 2026-05-27

#Humanoid

Unitree Robotics reported a sharp profitability decline just ahead of its Shanghai IPO review, with first-quarter 2026 revenue rising 68% year over year to 422.8 million yuan while net profit reportedly fell 52%. The mismatch between fast sales growth and weakening earnings highlights investor concerns that China’s humanoid robotics market is becoming crowded, costly, and harder to monetize despite strong demand signals.

Greenpeace robot stages deepest-ever seabed protest news Mid

Sam Francis · Published 2026-05-27

#Other

Greenpeace used an underwater robot during a scientific survey of vulnerable Arctic Mid-Ocean Ridge ecosystems to stage a seabed banner protest 2,300 meters below the surface. The action, billed as the deepest-ever seabed protest, was meant to push global leaders to heed scientific warnings about deep-sea ecosystems and their protection.

Virginia Tech researchers control soft robotics with ‘AI’s cousin’: ‘Reservoir computing’ news Mid

Sam Francis · Published 2026-05-27

#Other

Virginia Tech researchers are using reservoir computing, a lightweight relative of AI, to control soft robots whose flexible bodies are hard to model with conventional controllers. The approach treats the robot’s own deformable material dynamics as part of the computation, potentially making soft machines easier to steer in messy real-world tasks like harvesting delicate produce or search and rescue.

The power of the open-source community in robotics: Collaboration and shared innovation news Mid

Sam Francis · Published 2026-05-27

#Other

Open-source robotics has shifted from a small, fragmented specialist domain into a major engine of robot-AI development, lowering the cost and complexity of building mobile robots and shared software stacks. The article frames community collaboration and reusable platforms as the reason robotics can now move faster across research and industry, turning what once took years of custom engineering into more broadly accessible development.

Musk says US military suicide drones used Starlink in violation of SpaceX rules news Mid

Jon Brodkin · Published 2026-05-26

#Other

Elon Musk said U.S. military suicide drones were using Starlink connectivity rather than SpaceX’s military-focused Starshield service, which he described as a violation of company rules. He blamed a military contractor for the misuse, raising questions about how commercial satellite networks are being routed into battlefield autonomous systems and who is accountable when procurement or integration bypasses intended controls.

NūMove surpasses 20 semi-automatic mixed palletizer deployments news Mid

Sam Francis · Published 2026-05-26

#Other

NūMove Robotics & Vision says it now has 22 semi-automatic mixed palletizing systems running across nine U.S. states, up from its first deployment in 2021. The systems are being adopted by wholesale alcoholic beverage distributors, where mixed palletizing is a labor-intensive warehouse task that benefits from partial automation without requiring a fully automated facility.

“Bringing AI technology to rehabilitation robot research”…a forum for discussing rehabilitation therapy for people with disabilities news Mid

최지호 · Published 2026-05-26

#Other

Korea’s National Rehabilitation Center held an AI and rehabilitation robotics workshop on May 26 to discuss how recent AI advances can be incorporated into rehabilitation robot research. The focus was on translating AI into practical robotic therapies that could better support people with disabilities, as well as patients at risk of disability after illness or injury.

DARPA prepares robotic satellite servicing mission for launch in 2026 news Mid

Sam Francis · Published 2026-05-26

#Other

DARPA and industry partners are preparing to launch RSGS, a robotic servicing mission aimed at demonstrating on-orbit maintenance and upgrades for satellites in geosynchronous orbit, in summer 2026. The mission matters because GEO spacecraft are expensive, hard to reach, and traditionally treated as non-serviceable once deployed, so successful robotic repair or upgrade operations could extend satellite lifetimes and change how operators manage orbital infrastructure.

Redwire delivers lunar robotic arm prototype to European Space Agency news Mid

Sam Francis · Published 2026-05-26

#RL #Manipulator

Redwire has delivered MANUS, a prototype robotic manipulator for ESA’s Argonaut lunar lander program, developed with Added Value Solutions to support payload handling and unloading on future Moon missions. The delivery marks a concrete hardware step toward giving Europe’s lunar lander the ability to deploy and service payloads autonomously or semi-autonomously on the surface.

Unitree to undergo Shanghai Stock Exchange A-share listing review on the 1st of next month news Mid

유효정 · Published 2026-05-26

#Humanoid

Unitree is expected to face a Shanghai Stock Exchange listing review on June 1, 2026, moving toward an A-share IPO. The Hangzhou-based company, founded in 2016 with registered capital of 364 million yuan, is being framed in Chinese media as a potential first humanoid-robot specialist to list on China’s A-share market, making the review a useful signal for how public markets may value China’s robotics sector.

Hurotics supplies ‘H-Medi’ to the care robot research team at Hanyang University’s Department of Nursing news Mid

최지호 · Published 2026-05-26

#Other

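Hurotics has supplied H-Medi, a soft wearable robot for gait rehabilitation, to the care robot research team at Hanyang University’s Department of Nursing. While reviewing robot technologies that could be used in elderly care at nursing facilities, the team appears to have judged the soft, wearable H-Medi better suited to everyday wear and repetitive training than conventional exoskeletons built on rigid frames and joint axes.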

Allient Inc. to Demonstrate Advanced Motion Solutions at Robotics Summit & Expo 2026 news Mid

Published 2026-05-26

#Other

Hyundai Rotem wins two national R&D projects in a row for K-defense physical AI news Mid

박경일 · Published 2026-05-26

#Other

Hyundai Rotem has been selected for two Korean national R&D projects on physical AI for unmanned defense robots, one from the Ministry of Trade, Industry and Energy and one from the Agency for Defense Development. The projects focus on a natural-language command system for integrated control of heterogeneous multi-robot fleets, plus an integrated simulator and modular robot system, aiming to move operators beyond one-device-per-robot remote control toward language-driven coordination of multiple unmanned platforms.

Mitsubishi Electric and Chiba Institute of Technology to Co-Research and Develop Homegrown Physical AI news Mid

Published 2026-05-26

#Other

Mitsubishi Electric and Chiba Institute of Technology are launching a Co-Creation Center to jointly research and develop Japan-made physical AI for robotics, with an eye toward commercial deployments in both public- and private-sector settings. The effort matters less as a single technical result than as an industry-academic push to turn embodied AI and robot intelligence into practical domestic infrastructure, spanning research, development, and commercialization.

NdotLight to take part in ‘Robotics Summit & Expo’ opening on the 27th in Boston, US news Mid

최지호 · Published 2026-05-26

#Other

NdotLight is presenting its AI-based 3D CAD generation technology at the 2026 Robotics Summit & Expo in Boston, with CTO and co-founder Suntae Kim speaking on sim-ready asset generation and real use cases. The focus is on turning 3D CAD AI into assets that can be used directly in robotics simulation, which matters because usable simulated environments and objects are a practical bottleneck for robotics development.

Tesollo to take part in ‘Robotics Summit & Expo’ in the US and ‘IEEE ICRA 2026’ in Europe news Mid

최지호 · Published 2026-05-26

#VLA #Manipulator #Humanoid

Tesollo is taking its new DG-5F-S robot hand to Robotics Summit & Expo 2026 in Boston and IEEE ICRA 2026 in Vienna as it pushes into the global humanoid hand market. The demos are aimed at showing the hand in higher-level manipulation settings, including teleoperation, in-hand manipulation, and vision-language-action-based control, rather than just presenting it as a standalone gripper.

Why Network Security is Critical in the Age of Robotics and Automation news Mid

Sam Francis · Published 2026-05-26

#Other

Network security is being reframed for robotics and automation because connected robots, conveyors, sensors, and industrial controllers now carry commands that directly affect physical systems. The piece argues that as automation expands across warehouses and factories, compromised networks can move from data breaches to real-world disruption, making secure connectivity a core requirement for safe robotic operations rather than an IT afterthought.

Segway Navimow wins two Red Dot awards for next-generation robotic lawn mowers news Mid

Sam Francis · Published 2026-05-26

#Other

Segway Navimow’s X4 and H2 robotic lawn mowers received Red Dot Design Awards for product design, highlighting the company’s push into premium, user-focused smart gardening hardware. The announcement is light on technical detail, but it positions the next-generation Navimow line around refined industrial design and engineering for autonomous lawn care.

Machine Vision Systems Are Expanding the Need for Scalable Media Infrastructure news Mid

Sam Francis · Published 2026-05-26

#RL #Other

Machine vision is pushing industrial sites from simple camera deployments toward large-scale media infrastructure that can ingest, store, and analyze continuous visual data from factories, warehouses, logistics centers, and robotics platforms. The piece frames cameras and AI-assisted visual analysis as operational infrastructure for quality inspection, inventory tracking, predictive maintenance, and safety monitoring, with scalability becoming central as these systems spread across production environments.

Japan’s FANUC and Google form strategic partnership in physical AI news Mid

백승일 · Published 2026-05-25

#Other

FANUC is partnering with Google to bring Google’s latest AI technologies into industrial robot systems, aiming to push factory automation toward “physical AI” where robots can perceive their surroundings, make decisions, and execute tasks more autonomously. The collaboration matters because it links a major industrial robot supplier with generative AI capabilities, suggesting a move from tightly scripted automation toward robots that can adapt more flexibly on manufacturing floors.

Chinese robot rental platform 'BOTSHARE' raises funding again, valuation reaches 1.5 trillion won news Mid

유효정 · Published 2026-05-25

#Humanoid

China’s robot-rental platform BOTSHARE has raised Series A and A+ funding worth hundreds of millions of yuan, bringing its valuation close to 7 billion yuan, or about 1.55 trillion won. The company rents robots for cultural tourism, commercial performances, exhibitions, sales-service settings, and other event-heavy interaction scenarios, and its unicorn valuation signals how quickly humanoid-robot commercialization is turning rental platforms into a serious business layer.

MIT develops magnet-driven microrobot 'magno-bot' news Mid

백승일 · Published 2026-05-25

#Other

MIT researchers built a micrometer-scale soft robot, called a magno-bot, that can be actuated purely with external magnets, eliminating the need for onboard batteries or wires. The device has a lollipop-like structure, with a sub-millimeter rod and a tiny magnetized sphere, and can reportedly be moved with something as simple as a refrigerator magnet, pointing toward uses such as magnetic valves for opening and closing fluid flows.

Japan's SEAOS and Toyota L&F Kumamoto supply AMRs to a large corporation news Mid

백승일 · Published 2026-05-25

#Other

SEAOS, working with Toyota L&F Kumamoto, has supplied 14 additional TUGBOT2 autonomous mobile robots to a major Japanese manufacturer’s production sites. The expansion follows an earlier 2025 deployment of four AMRs that reportedly reduced operational workload, pointing to growing use of mobile robots in Japanese factories facing labor shortages. TUGBOT2 is CE-marked for human-robot shared environments and uses advanced sensors to detect and avoid obstacles safely.

Electromate Announces Availability of Dobot Educational Robots and Accessories in Canada news Mid

Published 2026-05-25

#Other

Electromate is now distributing Dobot’s educational robots and accessories across Canada, giving schools, training centers, and labs easier access to robot arms and supporting hardware for programming, automation, and mechatronics instruction. The news is mainly about availability rather than a technical advance, but it matters because it broadens access to a packaged robotics teaching ecosystem for Canadian robotics education and applied training.

Do Robotics Firms Need Microsoft 365 Backups news Mid

Sam Francis · Published 2026-05-25

#Other

The article argues that robotics firms risk losing important engineering context when Microsoft 365 accounts are deleted, because project knowledge often lives across emails, chats, and other informal collaboration trails rather than in neatly archived documents. It frames Microsoft 365 backup as continuity infrastructure for robotics teams, where recovering the surrounding communication history can matter as much as restoring files.

Robot.com turns autonomous robots into mobile advertising network with launch of R-ads platform news Mid

Sam Francis · Published 2026-05-25

#Other

Robot.com launched R-ads, a platform that turns autonomous robots into a measurable out-of-home advertising network. The company says it is building on more than 100 brand activations across 20-plus countries, positioning mobile robots as programmable media surfaces for venues, events, and public spaces where campaigns can be deployed and tracked at scale.

Deep Robotics unveils AI-based compact industrial wheel-leg robot ‘Lynx S10’: “Setting a new standard for lightweight intelligent work” news Mid

최광민 (reporter) · Published 2026-05-22

#Other

Deep Robotics has launched the Lynx S10, a compact industrial wheeled-legged robot built around embodied AI for field work such as power patrols, security monitoring, emergency response, and outdoor exploration. The company positions it as a lightweight but rugged platform combining high mobility, all-around perception, and industrial durability, aimed at bringing more capable autonomous robots into demanding inspection and operations settings.

How to Add AI to Your Robot in Minutes news Mid

Published 2026-05-22

#Other

The piece argues that adding AI to robots no longer has to mean months of custom integration work by experienced developers. It points to faster tooling that can make a robot AI-capable in minutes, framing the shift as a practical way to move from prototype complexity toward field-deployable robotic systems more quickly.

“Robots to manage water facility sites”: K-water pushes to overhaul its inspection system news Mid

최지호 · Published 2026-05-22

#Other

K-water is piloting quadruped inspection robots at regional water treatment plants starting this year to help or replace workers in high-risk, repetitive checks. The robots are aimed at underground spaces, night patrol routes, narrow passages, and stairs, where they can improve worker safety while collecting more quantitative facility-condition data for operations management.

Deep Robotics launches ‘Lynx S10’, a compact industrial quadruped wheel-leg robot news Mid

최지호 · Published 2026-05-22

#Other

Deep Robotics has launched the Lynx S10, a compact industrial wheeled-leg robot aimed at tight spaces and lighter-duty field work. The company positions it around agile movement, strong performance, all-around perception, and protective reliability, targeting uses such as power infrastructure patrols, security monitoring, and emergency search.

Weekly robot company stock price trends (05/18–05/22) news Mid

박경일 · Published 2026-05-22

#Other

Endorobotics wraps up successful exhibition at ‘ESGE Days’, Europe's largest endoscopy conference news Mid

최지호 · Published 2026-05-22

#Other

Endorobotics, a Korean startup developing robotic systems for endoscopic surgery, exhibited at ESGE Days 2026 in Milan, Europe’s major gastrointestinal endoscopy meeting. The news is mainly about international visibility: the company put its technology in front of gastroenterologists, researchers, and medical-device firms gathered around advanced endoscopy.

Plus One Robotics streams eight hours of live warehouse automation performance news Mid

Sam Francis · Published 2026-05-22

#Other

Plus One Robotics ran an eight-hour livestream of its AI-powered parcel induction system operating continuously in a warehouse-style setting. The demonstration was meant to show real operational behavior rather than a polished demo clip, giving viewers a clearer view of how large-scale parcel-handling robotics performs over sustained use.

NūMove surpasses 20 mixed palletizer deployments as US beverage distributors accelerate warehouse automation news Mid

Sam Francis · Published 2026-05-22

#Other

NūMove Robotics & Vision says its mixed palletizing systems have moved from early adoption to broader deployment in U.S. beverage distribution, with 22 semi-automatic units now operating across nine states after a first installation in 2021. The uptake by wholesale alcoholic beverage distributors points to growing demand for warehouse automation that can handle mixed-SKU pallet building, a labor-intensive bottleneck in beverage logistics.

2025 Robot Photo Contest winners news Mid

로봇신문사 · Published 2026-05-22

#Other

China's Deep Robotics passes IPO review for the Shanghai Stock Exchange STAR Market news Mid

유효정 · Published 2026-05-22

#Other

Deep Robotics, a Chinese quadruped robot company, has passed the Shanghai STAR Market IPO review, with China Securities sponsoring the listing. Its prospectus says the company turned profitable for the first time last year, driven by embodied-intelligence robot revenue of 322 million yuan, up 260% year over year.

US-based Faraday Future secures 38 billion won in funding for its global robotics business news Mid

이재구 · Published 2026-05-22

#RL#Humanoid

Faraday Future Intelligent Electric raised $25 million through convertible notes, adding to a $45 million financing announced in April. The Los Angeles-based EV company says the capital will support its AI strategy and help it move into delivery of humanoid and bionic robots, signaling a push to extend its robotics ambitions beyond vehicles.

Rajant Health (RHI) and Chord Robotics Expand Cowbell Platform to Enable Scalable, Multi-Domain Collaborative Autonomy news Mid

Published 2026-05-22

#Other

Rajant Health and Chord Robotics are expanding the Cowbell platform into Flying Cowbell, a mobility-native autonomy and distributed compute fabric meant to coordinate robotic and AI systems across air, ground, maritime, and other domains. The announcement frames Cowbell as infrastructure for scalable collaborative autonomy, where compute, networking, and control can move with the robots rather than depending on fixed connectivity or centralized command systems.

Robot Talk Episode 157 – Generating new robot designs, with Josie Hughes news Mid

Robot Talk · Published 2026-05-22

#Manipulator

Claire speaks with EPFL’s Josie Hughes about using AI to generate new designs for robotic manipulators, drawing on her work in bio-inspired robotics and the CREATE Lab she founded in 2021. The item points to AI as a design partner for exploring manipulator morphologies, rather than just controlling fixed robot hardware, but the provided text does not give technical details or results.

“LG CNS set to benefit in earnest from the AI and robot transformation; strong growth momentum expected in the second half” news Mid

백승일 · Published 2026-05-22

#Other

Mirae Asset Securities expects LG CNS to benefit from rising demand for AI transformation and smart engineering tied to robotics and automation. The report projects revenue rising from about 6.01 trillion won this year to 7.35 trillion won in 2027, with operating profit increasing from 513 billion won in 2024 to 710 billion won in 2027, making cloud, AI, and automation-related work a central growth driver.

XYZ's barista robot ‘Baris’ surpasses 1 million cumulative food and beverage preparations news Mid

최지호 · Published 2026-05-22

#Other

XYZ’s AI barista robot Baris has passed 1 million cumulative drink and food preparations, automating the retail flow from ordering through preparation to pickup. The company frames the milestone as evidence from real commercial deployments rather than a lab demo, with operating data from domestic and overseas sites feeding cloud-based refinement of the system.

Flytrex opens drone manufacturing facility in Dallas news Mid

Sam Francis · Published 2026-05-22

#RL#Other

Flytrex has opened a drone manufacturing and maintenance facility in Pilot Point, Texas, to support its expansion of autonomous food delivery across Dallas-Fort Worth. The site can assemble thousands of drones per year and is tied to Flytrex’s plan to operate 60 delivery locations in the metro area by mid-2027, making the announcement less about a single facility than about scaling the physical infrastructure behind suburban drone delivery.

‘Incheon Robot Flagship Regional Hub’ opens at Cheongna Robot Tower news Mid

최지호 · Published 2026-05-21

#Other

Incheon has opened a logistics-robot demonstration hub at Cheongna Robot Tower, giving local robotics companies a testbed for manufacturing and logistics robots. The 1.9 billion won project, backed by national and city funding after Incheon was selected in a Ministry of Trade, Industry and Energy call, is meant to help firms validate heterogeneous multi-robot systems, collect field data, and move closer to commercialization.

Aetina Demonstrates Edge AI Platforms for Robotics, Vision AI and Enterprise Automation at COMPUTEX 2026 news Mid

Published 2026-05-21

#Other

Aetina is showing NVIDIA-based edge AI platforms at COMPUTEX 2026, with live demos spanning robotic automation, Vision AI, lightweight VLM deployment, and agentic AI workflows. The emphasis is on running real-time AI processing close to cameras, robots, and enterprise systems rather than in the cloud, which matters for latency-sensitive automation and on-site perception workloads.

Korea Institute of Robotics and Technology Convergence (KIRO) opens 'AI-Specialized Joint Training Center' news Mid

박경일 · Published 2026-05-21

#RL#Other

KIRO opened an AI-specialized joint training center at the Robot Vocational Innovation Center in Gumi, backed by Korea’s Ministry of Employment and Labor as part of an industrial transition support program. Over the next three years it will use 1.5 billion won in national funding to train 360 AI and robotics specialists per year for more than 40 partner companies in the Daegu-Gyeongbuk region, with industry-centered PBL training as a core element.

Robot.com Launches R-ads, Redefining Out-of-Home Advertising With Its Autonomous Robot Media Network news Mid

Published 2026-05-21

#Other

Robot.com launched R-ads, an autonomous robot-based out-of-home advertising network, with early agreements including the Ad Council. The pitch is that mobile physical ads can combine street-level visibility with digital-style measurement, making campaigns easier to track while putting branded messages in places people are more likely to notice.

Chinese VR headset maker DPVR unveils data collection solution for robots news Mid

유효정 · Published 2026-05-21

#RL#Other

DPVR, a major Chinese VR headset maker, has introduced RoboPilot, an embodied-robotics solution aimed at data collection and model development for robots. The system uses DPVR’s VR strengths in optical display and spatial positioning to support remote robot control, human-motion data capture, and intelligent model training and validation, marking the company’s formal move from VR hardware into robotics infrastructure.

BRILS signs MOU with Inha University to develop talent in advanced fields news Low

최지호 · Published 2026-05-27

#Other

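BRILS has signed an industry-academia cooperation MOU with Inha University, built on a work-study dual program, to develop practice-ready talent in robotics and AI. The partnership links BRILS's robot modularization platform technology with education and on-site training, aiming to jointly develop a workforce that can be put to work immediately in advanced industrial settings.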

Festo introduces two-finger pneumatic gripper for cobot applications news Low

Sam Francis · Published 2026-05-27

#Other

Festo introduced the HPPH two-finger pneumatic parallel gripper for cobots, aimed at reducing the weight, wiring, and footprint that come from external valves, sensors, and routing on compact robot arms. By integrating control, sensing, and collaborative safety into the gripper itself, it makes pneumatic gripping easier to mount and deploy in payload-constrained collaborative applications.

US-based GE Vernova signs agreement to acquire Robotech Automation news Low

이재구 · Published 2026-05-26

#Other

GE Vernova has agreed to acquire Robotech Automation, a Quebec-based systems integrator with about 35 employees that designs, engineers, and integrates robotic and automation systems through a manufacturing partner network. The deal signals GE Vernova’s push to build more in-house capability around robotics and automation, likely to support manufacturing and energy-sector operations where reliable integration matters as much as the robot hardware itself.

AI warfare is already here news Low

Hayden Field · Published 2026-05-26

#Other

Lethal autonomous weapons have moved from UN hypotheticals to active battlefields, with AI-enabled drones and targeting systems already shaping conflicts such as Ukraine and Gaza. The piece argues that diplomacy and weapons-control forums are lagging behind a fast-moving military reality, where autonomy is being deployed before there is meaningful international agreement on limits, accountability, or human control.

Code & repos (232)

GitHub repositories and papers that release code or a project page. The actionable stuff.

Jiaaqiliu/Awesome-VLA-Robotics github SURVEY

Jiaaqiliu · ★ 473 · Created 2025-04-21 · Active 2026-03-23

#VLA#Manipulator#MobileManipulator#Humanoid#Survey

Jiaaqiliu/Awesome-VLA-Robotics is a curated survey-style resource for Vision-Language-Action models in robotics, organizing papers, models, datasets, benchmarks, and related tools across manipulation, navigation, mobile manipulation, HRI, planning, humanoids, and other embodied-AI settings. It is useful as a field map for tracking how VLA systems extend VLMs with robot action generation, including common building blocks such as vision encoders, LLM-based language understanding, action decoders or policies, and modality-alignment mechanisms.

YanjieZe/awesome-humanoid-robot-learning github SURVEY

YanjieZe · ★ 2343 · Python · Created 2024-10-30 · Active 2026-05-24

#Humanoid#Survey

A Paper List for Humanoid Robot Learning.

jonyzhang2023/awesome-embodied-vla-va-vln github SURVEY

jonyzhang2023 · ★ 3151 · Created 2025-01-16 · Active 2026-05-25

#VLA#Other#Survey

jonyzhang2023/awesome-embodied-vla-va-vln is a curated reading list for embodied AI work around vision-language-action models, vision-language navigation, and related multimodal learning. It is useful as a research map rather than a runnable system, helping track papers, models, datasets, and benchmarks across robot learning and embodied navigation.

starVLA/starVLA github High

starVLA · ★ 2567 · Python · Created 2025-10-09 · Active 2026-05-22

#VLA#Other

StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing

NVlabs/ProtoMotions github High

NVlabs · ★ 1670 · Python · Created 2024-09-24 · Active 2026-05-19

#VLA#Humanoid

GPU-accelerated ProtoMotions3 trains physically simulated humanoids on motion corpora, claiming AMASS-scale skill learning in 12 hours on 4 A100s. Distinctive pieces are PyRoki one-command retargeting, IsaacGym/Newton/MuJoCo sim-to-sim testing, and ONNX sim-to-real deployment for Unitree G1.

huggingface/lerobot github High

huggingface · ★ 24337 · Python · Created 2024-01-26 · Active 2026-05-25

#Other

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

OpenHelix-Team/VLA-Adapter github High

OpenHelix-Team · ★ 2164 · Python · Created 2025-09-20 · Active 2026-03-19

#RL#VLA#Bimanual

VLA-Adapter provides code, checkpoints, and training/evaluation scripts for adapting vision-language-action models on LIBERO and CALVIN, with configs spanning 10GB consumer GPUs to 80GB accelerators. It also adds real-world ALOHA/Cobot Magic deployment support.

isaac-sim/IsaacLab github High

isaac-sim · ★ 7253 · Python · Created 2022-11-16 · Active 2026-05-25

#Other

Unified framework for robot learning built on NVIDIA Isaac Sim

Denghaoyuan123/Awesome-RL-VLA github SURVEY

Denghaoyuan123 · ★ 711 · Created 2025-11-13 · Active 2026-05-18

#RL#VLA#Manipulator#Survey

Curated RL-VLA reading list for robotic manipulation, organizing papers by offline, online, offline+online, and test-time RL regimes. It adds comparison tables for action type, reward sparsity, MF/MB status, sim/real validation, base VLA, and policy class, but no new algorithm.

unitreerobotics/unitree_sim_isaaclab github High

unitreerobotics · ★ 471 · Python · Created 2025-06-24 · Active 2026-03-30

#Manipulator#Humanoid

Unitree’s Isaac Lab simulator runs G1 and H1-2 humanoid manipulation tasks such as pick-place, block stacking, and whole-body object moving, with configurations for grippers, Dex3, and Inspire hands. It mirrors the real robots’ DDS communication topics, making it useful for testing control code, collecting or replaying teleoperation data with xr_teleoperate, and validating policies in simulation before moving to hardware.

leggedrobotics/pace-sim2real github High

leggedrobotics · ★ 490 · Python · Created 2025-09-05 · Active 2026-05-22

#Other

PACE is a sim-to-real pipeline for legged robots that estimates actuator and joint dynamics using only standard joint encoder data, avoiding extra sensing or specialized identification hardware. The repo is aimed at making legged-robot simulation models match real hardware more closely, so policies trained in simulation transfer more reliably to physical robots.

NVlabs/GR00T-VisualSim2Real github High

NVlabs · ★ 250 · Python · Created 2026-04-07 · Active 2026-04-20

#VLA#Manipulator#Humanoid

GR00T-VisualSim2Real contains the application code for two projects: VIRAL (Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation) and DoorMan (Opening the Sim-to-Real Door for Humanoid Pixel-to-Action Policy Transfer).

google-deepmind/mujoco_playground github High

google-deepmind · ★ 1953 · Python · Created 2024-12-03 · Active 2026-05-19

#RL#Manipulator#Humanoid

GPU-accelerated MuJoCo MJX environments for robot RL and sim-to-real let researchers train on classic control, quadruped/biped locomotion, and dexterous or non-prehensile manipulation tasks. Distinctive support includes JAX MJX plus MuJoCo Warp backends and vision via MJWarp Batch Renderer.

rohanpsingh/LearningHumanoidWalking github High

rohanpsingh · ★ 1142 · Python · Created 2022-07-13 · Active 2026-05-03

#RL#Humanoid

Train and evaluate deep-RL humanoid locomotion policies in MuJoCo, with H1/JVRC walking, footstep-tracking, terrain, and cartpole test environments. Its value is released code behind three humanoid-walking papers: current-feedback/back-EMF randomization, planned-footstep walking, and compliant/uneven-terrain robustness.

jonyzhang2023/awesome-humanoid-learning github SURVEY

jonyzhang2023 · ★ 921 · Created 2024-01-16 · Active 2026-03-16

#Manipulator#Humanoid#Survey

Curated index of humanoid and bipedal robot learning resources covering locomotion, manipulation, whole-body control, physics-based animation, robot models, papers, news, and related lists. Its value is aggregation and model metadata, not a new algorithm or benchmark.

mithi/robotics-coursework github High

mithi · ★ 4654 · Created 2017-06-24 · Active 2026-05-11

#Other

🤖 Places where you can learn robotics (and stuff like that) online 🤖

leggedrobotics/rsl_rl github High

leggedrobotics · ★ 2624 · Python · Created 2021-10-18 · Active 2026-05-19

#Other

A fast and simple implementation of learning algorithms for robotics.

Genesis-Embodied-AI/genesis-world github High

Genesis-Embodied-AI · ★ 28844 · Python · Created 2023-10-31 · Active 2026-05-25

#Other

A generative world for general-purpose robotics & embodied AI learning.

unitreerobotics/unitree_rl_lab github High

unitreerobotics · ★ 1030 · Python · Created 2025-06-05 · Active 2026-05-18

#RL#Other

IsaacLab-based RL environments let researchers train policies for Unitree Go2, H1, and G1-29dof robots, with standalone tasks and robot assets via USD or URDF. The repo also documents MuJoCo sim2sim validation and direct sim2real deployment, indicating integration rather than a new RL method.

EXPO-FT: Sample-Efficient Reinforcement Learning Finetuning for Vision-Language-Action Models arxiv High NEW

Perry Dong, Kuo-Han Hung, Tian Gao, Dorsa Sadigh, Chelsea Finn · Submitted 2026-05-25

#RL#VLA#Manipulator

EXPO-FT fine-tunes pretrained vision-language-action policies with reinforcement learning in a way that preserves their useful priors while making online robot learning stable and sample-efficient. On challenging manipulation tasks like routing string lights and plugging them in, striking a pool ball into a pocket, and inserting a flower into a wine bottle, it reaches 30/30 successes on every evaluated task using an average of 19.1 minutes of robot data, outperforming both RL-from-scratch and prior VLA fine-tuning baselines.

sou350121/VLA-Handbook github High

sou350121 · ★ 240 · HTML · Created 2025-11-23 · Active 2026-05-25

#VLA#Other

VLA-Handbook is a Chinese, practice-oriented learning and interview guide for algorithm engineers moving into Vision-Language-Action robotics. It narrows in on robotics-specific issues rather than general CV/NLP preparation, making it useful as a focused entry point for people who need to understand the concepts, workflows, and interview expectations around VLA systems.

DravenALG/awesome-vla-wam github SURVEY

DravenALG · ★ 445 · Created 2026-02-06 · Active 2026-05-18

#VLA#Other#Survey

Curated index of Vision-Language-Action and World Action Model robotics research, organized by VLA variants, WAM sources, policies, datasets, benchmarks, engines, and hardware. Its value is taxonomy and literature tracking; it does not introduce a model, dataset, or benchmark.

hzxie/DynamicVLA github High

hzxie · ★ 253 · Python · Created 2026-01-26 · Active 2026-05-03

#VLA#Manipulator

DynamicVLA provides training, inference, and Isaac Lab evaluation code for a vision-language-action policy aimed at dynamic object manipulation on the DOM benchmark. It bundles DOM data, 3D assets, synthetic generation, LeRobot conversion, and a pretrained checkpoint; no new model mechanism is detailed here.

ARISE-Initiative/robosuite github High

ARISE-Initiative · ★ 2425 · Python · Created 2018-10-25 · Active 2026-05-09

#Manipulator

robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

Farama-Foundation/Gymnasium-Robotics github High

Farama-Foundation · ★ 909 · Python · Created 2021-10-25 · Active 2026-05-20

#RL#Other

A collection of robotics simulation environments for reinforcement learning

mujocolab/mjlab github High

mujocolab · ★ 2382 · Python · Created 2025-06-10 · Active 2026-05-25

#RL#Other

Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research

AgibotTech/ACoT-VLA github High

AgibotTech · ★ 175 · Python · Created 2026-02-27 · Active 2026-05-21

#VLA#Other

ACoT-VLA is the official CVPR 2026 implementation for Action Chain-of-Thought, a method for vision-language-action models that makes robot policies reason through intermediate action-oriented steps rather than mapping perception and language directly to actions. The repo likely provides the training and evaluation code needed to reproduce the paper’s experiments, making it useful for researchers studying interpretable or structured reasoning in embodied robot control.

2toinf/X-VLA github High

2toinf · ★ 656 · C++ · Created 2025-09-25 · Active 2026-05-06

#VLA#Manipulator

X-VLA provides pretrained and fine-tuned cross-embodiment VLA checkpoints using embodiment-specific soft prompts to steer one 0.9B Transformer policy across robot platforms. Released models cover LeRobot/server-client inference and benchmarks including LIBERO 98.1%, Google Robot 83.5% VM/76.4% VA, WidowX 95.8%, CALVIN 4.43, and RoboTwin2 70%.

FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies arxiv High NEW

Xintong Hu, Xuhong Huang, Jinyu Zhang, Yutong Yao, Yuchong Sun et al. · Submitted 2026-05-26

#VLA#Manipulator#Bimanual

FineVLA targets the mismatch between robot policies that need to follow execution-style instructions and datasets that usually only say the goal, by adding action-aligned language about details like which arm to use, approach direction, pose, color, and contact region. It builds a 47,159-trajectory human-verified dataset from 972,247 trajectories across 10 robot datasets, plus a 500-video benchmark and a robotics-specialized VLM annotator for scaling fine-grained labels. Policies trained with fine-grained plus raw goal instructions performed best, reaching 86.8%/82.5% in RoboTwin and 62.7/100 on real dual-arm manipulation, with especially large gains on steerable factors that goal-only language leaves unspecified.

fan-ziqi/rl_sar github High

fan-ziqi · ★ 1314 · C++ · Created 2024-03-06 · Active 2026-05-27

#RL#Humanoid

rl_sar collects simulation-to-real reinforcement learning workflows for deploying locomotion policies across quadruped, wheeled, and humanoid robots. It is aimed at validating robot RL algorithms in simulation and carrying them through to physical hardware, making it useful as a practical bridge between training experiments and real-world deployment.

iLearn-Lab/NeurIPS25-CogVLA github High

iLearn-Lab · ★ 182 · Python · Created 2025-08-18 · Active 2026-05-27

#VLA#Other

CogVLA is a NeurIPS 2025 vision-language-action model that aligns robot control with cognition by using instruction-driven routing and sparsification, so different parts of the model are selectively activated depending on the task command. The repo appears to package the method for researchers interested in efficient VLA policies that can condition robot behavior on language while avoiding dense, one-size-fits-all computation.

ustcwhy/BitVLA github High

ustcwhy · ★ 152 · Python · Created 2025-06-09 · Active 2026-03-02

#VLA#Manipulator

BitVLA provides 1.58-bit/1-bit vision-language-action checkpoints and evaluation code for robotics manipulation, including VQA and LIBERO workflows. The 3.0B model reports 1.4GB memory use and 96.0 avg LIBERO success after VL/VLA pre-training, near OpenVLA-OFT’s 97.1 at 15.4GB.

Farama-Foundation/Metaworld github High

Farama-Foundation · ★ 1821 · Python · Created 2019-09-09 · Active 2026-05-17

#RL#Manipulator

Meta-World is an open source benchmark for developing and evaluating multi-task and meta reinforcement learning algorithms for continuous control robotic manipulation environments, with various benchmarks to evaluate different aspects of reinforcement learning algorithms.

OpenBMB/DeepThinkVLA github High

OpenBMB · ★ 523 · Python · Created 2025-10-13 · Active 2026-04-16

#RL#VLA#Other

DeepThinkVLA provides code, data/checkpoints, and LIBERO evals for a 2.9B pi0-FAST-derived hybrid VLA decoder that writes CoT reasoning before parallel action chunks. It reports 97.0% average LIBERO success, +15.5 points over naive autoregressive CoT, with Masked-CoT inference at 0.175x pi0-FAST autoregressive latency.

NVIDIA/warp github High

NVIDIA · ★ 6689 · Python · Created 2022-03-18 · Active 2026-05-26

#Other

NVIDIA Warp is a Python framework for writing GPU-accelerated simulation code aimed at robotics, machine learning, and physically based modeling. It is useful when researchers need high-performance differentiable or parallel simulation kernels without dropping fully into low-level CUDA, especially for workflows that mix simulation with learning systems.

vla-safe/SAFE github High

vla-safe · ★ 71 · Python · Created 2025-06-13 · Active 2026-05-21

#VLA#Other

SAFE is the official codebase for a NeurIPS 2025 project on multitask failure detection in vision-language-action robot models. It focuses on detecting when VLA policies are likely to fail across tasks, which is useful for evaluating and adding safety checks around embodied AI systems before or during deployment.

RobotControlStack/robot-control-stack github High

RobotControlStack · ★ 98 · Python · Created 2024-02-29 · Active 2026-05-22

#RL#VLA#Manipulator

RobotControlStack is a lightweight sim-to-real stack for training and deploying vision-language-action models and reinforcement-learning agents without depending on ROS. It provides native MuJoCo/Gymnasium wrappers with synchronous execution across common robot platforms including Franka, UR5e, xArm, and SO101, making it useful for researchers who want a leaner control and simulation pipeline for VLA or RL experiments.

ginwind/VLA-JEPA github High

ginwind · ★ 245 · Python · Created 2026-02-10 · Active 2026-05-02

#RL#VLA#Other

VLA-JEPA provides training/evaluation code for augmenting a Qwen3-VL-2B VLA policy with a V-JEPA2 latent world/video model, built on starVLA. It supports LeRobot v2.1 robot data plus human-video training and includes LIBERO, LIBERO-Plus, and SimplerEnv eval scripts; no results are given.

isaac-sim/Sim-to-Real-SO-101-Workshop github High

isaac-sim · ★ 38 · Python · Created 2026-03-17 · Active 2026-05-13

#Other

Workshop materials for training an SO-101 robot from sim to real with NVIDIA Isaac. The repository contains the assets and code that accompany the workshop's learning content, and its README walks through setting up the environment and checking that everything is installed correctly.

shihao1895/MemoryVLA github High

shihao1895 · ★ 244 · Python · Created 2025-08-24 · Active 2026-04-27

#RL#VLA#Manipulator

MemoryVLA adds a hippocampal-like perceptual-cognitive memory module to VLA policies, targeting long-horizon manipulation from third-person RGB and language only. The repo releases OpenVLA-based MemoryVLA and Dexbotic-based MemoryVLA+, with checkpoints/logs; reported averages include 97.1 on LIBERO for MemoryVLA+ mix and 84.4 on Bridge.

unitreerobotics/unitree_rl_mjlab github High

unitreerobotics · ★ 375 · C++ · Created 2026-01-27 · Active 2026-04-13

#RL#Other

Unitree RL Mjlab is a reinforcement learning project built on mjlab, using MuJoCo as its physics simulation backend. It currently supports the Unitree Go2, A2, As2, G1, R1, H1-2, and H2.

IRMVLab/awesome-robot-learning-from-human-videos github SURVEY

IRMVLab · ★ 110 · Created 2026-04-14 · Active 2026-05-22

#VLA#Other#Survey

Paper list for robot learning from human videos (LfHV)

MilkClouds/awesome-vla-study github SURVEY

MilkClouds · ★ 250 · Created 2026-02-13 · Active 2026-03-21

#RL#VLA#Other#Survey

Curated VLA reading syllabus organizes papers in a 14-week order from diffusion/flow-matching foundations through RT-1/RT-2, Octo/OpenVLA, current VLAs, data scaling, efficient inference, RL fine-tuning, reasoning, and world models. It is a study guide, not a new model, dataset, or benchmark.

PKU-Alignment/VLA-Arena github High

PKU-Alignment · ★ 165 · Python · Created 2025-09-29 · Active 2026-03-14

#VLA#Other

VLA-Arena is an open-source framework for benchmarking and systematically evaluating Vision-Language-Action (VLA) models. It provides a full toolchain covering scene modeling, demonstration collection, model training, and evaluation.

keon/awesome-physical-ai github SURVEY

keon · ★ 253 · Created 2026-01-12 · Active 2026-03-30

#RL#VLA#Other#Survey

Awesome Physical AI is a curated list of academic papers and resources on Physical AI, focusing on Vision-Language-Action (VLA) models, world models, embodied AI, and robotic foundation models.

PKU-Alignment/SafeVLA github High

PKU-Alignment · ★ 139 · Python · Created 2025-02-27 · Active 2026-03-31

#VLA#Other

SafeVLA is a GitHub repo for safety-aligning vision-language-action models using constrained learning. It is tied to a NeurIPS 2025 Spotlight paper, giving researchers a source entry point to inspect the method.

Tsunami-kun/awesome-humanoid-manipulation github SURVEY

Tsunami-kun · ★ 131 · Created 2024-10-19 · Active 2026-03-18

#Manipulator#Bimanual#Humanoid#Survey

GitHub awesome list collecting papers and resources on humanoid, dexterous, bimanual, in-hand, and humanlike manipulation. Useful for literature discovery; no code, benchmark, or robot results are specified.

proroklab/VectorizedMultiAgentSimulator github High

proroklab · ★ 569 · Python · Created 2022-05-12 · Active 2026-05-19

#RL#Other

VMAS is a PyTorch-based, vectorized differentiable 2D physics simulator built for efficient multi-agent reinforcement learning experiments. It provides challenging multi-robot scenarios for benchmarking and a modular interface for adding new ones, making it useful when researchers need fast batched simulation rather than one environment rollout at a time.

iit-DLSLab/basic-locomotion-isaaclab github High

iit-DLSLab · ★ 85 · Python · Created 2025-06-20 · Active 2026-05-24

#Other

DLSLab’s `basic-locomotion-isaaclab` adds IsaacLab-based locomotion environments and training utilities for multiple quadruped robots. It is aimed at getting basic quadruped policies through the full path from simulation training to sim-to-sim checks and eventual sim-to-real transfer, making it useful as a practical starting point for robot locomotion experiments rather than a standalone algorithmic contribution.

isaac-sim/IsaacLab-Arena github High

isaac-sim · ★ 410 · Python · Created 2025-08-15 · Active 2026-05-27

#Other

IsaacLab-Arena extends NVIDIA Isaac Lab with a composable setup for building robotics simulation environments from interchangeable robots, objects, and scenes. It is aimed at quickly prototyping robot learning tasks and evaluating policies across different embodiments and environment configurations, making it useful for scalable simulation studies rather than one-off task definitions.

yliu-cs/MMaDA-VLA github High

yliu-cs · ★ 48 · Python · Created 2025-05-23 · Active 2026-05-14

#VLA#Other

GitHub repo for MMaDA-VLA, an arXiv 2026 large diffusion vision-language-action model. It claims unified multimodal instruction and generation; no benchmarks or robot results are given in the snippet.

OpenHelix-Team/LLaVA-VLA github High

OpenHelix-Team · ★ 187 · Python · Created 2025-06-16 · Active 2026-03-12

#VLA#Other

LLaVA-VLA is a simple yet powerful vision-language-action model built on the popular VLM LLaVA. The repo aims to open-source a VLA with a simple structure, strong performance, and easy extensibility, as a baseline for beginners and senior researchers alike. The authors say they will continue to maintain it.

showlab/Awesome-Robotics-Diffusion github SURVEY

showlab · ★ 333 · Created 2024-12-23 · Active 2026-05-18

#Manipulator#Survey

Awesome Robot Diffusion is a curated list of recent robot learning papers that use diffusion models for manipulation, navigation, planning, and related tasks. Each paper appears in only one category: methods that could fit several are placed in the most relevant one.

LoveJu1y/LaRA-VLA github High

LoveJu1y · ★ 46 · Python · Created 2026-01-31 · Active 2026-05-18

#VLA#Other

GitHub repo for LaRA-VLA, an ICML 2026 project on latent reasoning for vision-language-action models. It claims latent thinking and prediction for VLAs; the snippet gives no benchmarks or robot results.

abmoRobotics/RLRoverLab github High

abmoRobotics · ★ 118 · Python · Created 2024-09-17 · Active 2026-05-22

#RL#Other

RLRoverLab provides rover and space-oriented reinforcement learning environments built on NVIDIA Isaac Sim and Isaac Lab. It is aimed at training and testing embodied RL policies for extraterrestrial robotics scenarios, giving researchers a simulation-ready starting point for rover control and related space robotics experiments.

open-gigaai/giga-brain-0 github High

open-gigaai · ★ 2518 · Python · Created 2025-09-26 · Active 2026-03-10

#RL#VLA#Other

GigaBrain-0: A World Model-Powered Vision-Language-Action Model

isaac-sim2real/sage github High

isaac-sim2real · ★ 89 · Python · Created 2025-09-05 · Active 2026-04-28

#Humanoid

GitHub framework for measuring sim-to-real gaps in robot joint motions across humanoid platforms. It pairs physics simulation, real hardware data collection, and statistical analysis for actionable motion-mismatch benchmarking.

OpenHelix-Team/ReconVLA github High

OpenHelix-Team · ★ 257 · Python · Created 2025-07-21 · Active 2026-04-01

#VLA#Other

ReconVLA code trains/evaluates a VLA policy coupling action-token prediction with denoising reconstruction of gaze-region image tokens to force implicit grounding on manipulated targets. It targets BridgeData V2, LIBERO, and CALVIN, claiming 100k+ trajectories/2M+ samples for generalization.

iit-DLSLab/sim2real-robot-identification github High

iit-DLSLab · ★ 31 · Python · Created 2025-07-05 · Active 2026-05-27

#Manipulator

Provides a joint-calibration routine for quadrupeds and manipulators aimed at reducing the mismatch between simulated and real robot kinematics. It is meant to support sim-to-real transfer from Isaac Lab and MuJoCo by identifying or correcting joint parameters before deploying policies or controllers on hardware.

MINT-SJTU/VLA-Pruner github High

MINT-SJTU · ★ 58 · Python · Created 2025-11-24 · Active 2026-05-26

#VLA#Other

VLA-Pruner is the official code for a token-pruning method aimed at making vision-language-action models cheaper at inference time. It prunes visual tokens at two levels while accounting for temporal structure, targeting the redundant video/observation tokens that slow down robot policy execution without changing the overall VLA setup.

Toni-SM/skrl github High

Toni-SM · ★ 1053 · Python · Created 2021-10-18 · Active 2026-05-11

#RL#Other

skrl is a modular reinforcement-learning library implemented in PyTorch, JAX, and NVIDIA Warp. It supports Gymnasium/Gym, NVIDIA Isaac Lab, MuJoCo Playground, and other environments, with code released on GitHub.

YuZhaoshu/Efficient-VLAs-Survey github SURVEY

YuZhaoshu · ★ 155 · Created 2025-09-12 · Active 2026-04-29

#VLA#Other#Survey

GitHub repo curating research for a survey on efficient vision-language-action models. It is maintained as a live list, but the blurb gives no specific code, benchmarks, or robot results.

leggedrobotics/robotic_world_model github High

leggedrobotics · ★ 629 · Python · Created 2025-11-24 · Active 2026-04-08

#RL#Other

Isaac Lab extension for training and evaluating RWM/RWM-U pipelines: learn neural dynamics alongside PPO, then train policies from imagined rollouts online or fully offline without a simulator. It targets legged-robot model-based RL with ensemble uncertainty, autoregressive rollout visualization, and model-free comparisons.

RoboVerseOrg/RoboVerse github High

RoboVerseOrg · ★ 1742 · Python · Created 2025-04-04 · Active 2026-05-18

#IL#Manipulator

GitHub repo for RoboVerse, a unified robot-learning simulation platform, synthetic dataset and benchmark. It claims 510.5k trajectories across 276 manipulation task categories and 5.5k assets, with >50M state transitions. Code, docs and data are linked; benchmarks cover imitation learning and RL.

behnamasadi/robotic_notes github High

behnamasadi · ★ 223 · C++ · Created 2022-11-01 · Active 2026-05-17

#Other

GitHub repo of robotics notes, snippets, and tutorials spanning Lie groups/algebra, robot configuration, IMUs, ROS2-Gazebo, state estimation, VIO/LIO, and deep-learning SLAM. Actionable as released tutorial/code snippets; no benchmark or real-robot validation is stated.

microsoft/VITRA github High

microsoft · ★ 384 · Python · Created 2025-10-26 · Active 2026-05-12

#VLA#Manipulator

VITRA is a GitHub repo for an ICRA 2026 vision-language-action pretraining method for robotic manipulation. It focuses on scaling robot learning from real-life human activity videos; the repo is the artifact to inspect for implementation details.

AoqunJin/Awesome-VLA-Post-Training github SURVEY

AoqunJin · ★ 178 · Created 2025-05-23 · Active 2026-04-30

#VLA#Other#Survey

A collection of vision-language-action model post-training methods.

VectorRobotics/vector-os-nano github High

VectorRobotics · ★ 140 · Python · Created 2026-03-19 · Active 2026-05-18

#Other

GitHub repo for Vector OS, a cross-embodiment robot OS from CMU RI for Unitree Go2 and SO-ARM101. It claims industrial-grade autonomous navigation, natural-language control, and sim-to-real transfer on specific robot platforms.

BAAI-Humanoid/MOSAIC github High

BAAI-Humanoid · ★ 120 · Python · Created 2026-02-10 · Active 2026-03-07

#Humanoid

MOSAIC trains humanoid teleoperation policies for whole-body motion tracking across multiple input interfaces, using rewards tuned for global motion consistency so behaviors stay stable over long horizons. It combines multi-source motion data from AMASS, OMOMO, optical and inertial MoCap, and GENMO-generated motions with rapid residual adaptation, adding interface-specific corrections without giving up a general tracker. The repo is mainly the Isaac Lab training pipeline for the teleoperation-oriented tracker, adaptor training, and multi-teacher residual distillation, with deployment handled separately through RobotBridge.

jellyho/TwinVLA github High

jellyho · ★ 10 · Python · Created 2026-03-01 · Active 2026-05-09

#VLA#Manipulator#Bimanual

TwinVLA targets data-efficient bimanual manipulation by composing two single-arm vision-language-action models into a coordinated two-arm policy. The interesting angle is that it tries to reuse and adapt single-arm VLA capability rather than requiring a fully separate bimanual model trained from scratch, aiming to make dual-arm robot learning practical with less task data.

showlab/ShowUI github High

showlab · ★ 1832 · Python · Created 2024-10-31 · Active 2026-04-24

#VLA#Other

ShowUI is a GitHub repo for a CVPR 2025 open-source, end-to-end vision-language-action model aimed at GUI agents and computer use. The actionable hook is released code for experimenting with GUI-control agents; the snippet gives no metrics.

dartsim/dart github High

dartsim · ★ 1086 · C++ · Created 2011-09-19 · Active 2026-05-26

#Other

DART is a C++20 physics engine aimed at robotics, animation, and machine learning research, with Python bindings for easier integration into experimental workflows. It is useful when you need simulation infrastructure that can support robot dynamics and control studies while still being accessible from both native C++ and Python.

DeepRoboticsLab/Lite3_rl_deploy github High

DeepRoboticsLab · ★ 122 · C++ · Created 2024-09-12 · Active 2026-04-25

#Other

Sim-to-sim and sim-to-real deployment for Lite3 Robot

dexmal/dexbotic github High

dexmal · ★ 1083 · Python · Created 2025-10-17 · Active 2026-05-18

#VLA#Other

Dexbotic: Open-Source Vision-Language-Action Toolbox

fracapuano/robot-learning-tutorial github High

fracapuano · ★ 530 · TeX · Created 2025-08-10 · Active 2026-04-09

#Other

GitHub repo for all source code accompanying Robot Learning: A Tutorial. Code is released and open to contributions; contributors may be featured in the next iteration.

pypose/pypose github High

pypose · ★ 1536 · Python · Created 2021-11-11 · Active 2026-05-05

#Other

A library for differentiable robotics on manifolds.

hanruihua/ir-sim github High

hanruihua · ★ 1083 · Python · Created 2022-05-30 · Active 2026-05-23

#Other

ir-sim is a lightweight Python robot simulator aimed at navigation, control, and learning experiments. It is likely useful for quickly prototyping mobile robot behaviors and testing algorithms in simulation without the overhead of a heavier robotics stack.

jaykorea/Isaac-RL-Two-wheel-Legged-Bot github High

jaykorea · ★ 304 · Python · Created 2023-08-31 · Active 2026-03-26

#RL#Humanoid

Two-wheeled legged robot for reinforcement learning in Isaac Lab.

FALCON-VLA/FALCON github High

FALCON-VLA · ★ 25 · Python · Created 2026-03-11 · Active 2026-05-24

#VLA#Manipulator

FALCON is a vision-language-action model for robotic manipulation that feeds rich 3D spatial tokens directly into the action head, giving the policy more explicit spatial structure when choosing actions. The repo presents the ICLR 2026 implementation and emphasizes robust spatial understanding with reported state-of-the-art performance across diverse manipulation tasks.

GWxuan/GesVLA github High NEW

GWxuan · ★ 16 · Python · Created 2026-05-11 · Active 2026-05-22

#VLA#Other

GesVLA is a gesture-aware vision-language-action model that embeds human gesture representations directly into a VLA policy, aiming to let robots interpret not just language and visual context but also embodied human cues. The repository appears to center on using gesture information as an additional control signal for robot action prediction, making it relevant for human-robot interaction settings where pointing, indicating, or demonstrating intent matters.

UARK-AICV/OBEYED_VLA github High

UARK-AICV · ★ 12 · Python · Created 2025-12-28 · Active 2026-05-08

#VLA#Manipulator

GitHub repo for OBEYED-VLA, a vision-language-action approach for robot manipulation in clutter. It claims clutter resistance via object-centric and geometry grounding; no metrics or test details are given.

VLA Foundry: A Unified Framework for Training Vision-Language-Action Models arxiv High

Jean Mercat, Sedrick Keh, Kushal Arora, Isabella Huang, Paarth Shah et al. · Submitted 2026-04-21

#VLA#Other

VLA Foundry is an open-source training framework for vision-language-action models. It unifies LLM, VLM, and VLA training in one codebase, targeting fragmented pretraining and action-training pipelines.

OASIS: Observation-Action Space Alignment via SE(3) Trajectory Prediction for Robotic Manipulation arxiv High NEW

Xinzhe Chen, Sihua Ren, Liqi Huang, Haowen Sun, Mingyang Li et al. · Submitted 2026-05-25

#VLA#Manipulator

OASIS trains a visuomotor policy whose intermediate representation is supervised to predict an SE(3) end-effector trajectory, so the model learns rigid-body action geometry directly instead of leaving it for the action decoder to infer from visual features alone. It combines vision-language and metric-depth features in a 3D-aware encoder, predicts camera-frame poses, and then conditions chunked action generation on those pose-supervised hidden states. In simulation and real robot experiments, this alignment improves success rates and out-of-distribution generalization over VLA and world-action-model baselines.

arpitg1304/forge github High

arpitg1304 · ★ 127 · Python · Created 2026-01-22 · Active 2026-05-26

#IL#VLA#Other

forge is a robotics dataset utility for converting between RLDS, LeRobot v2/v3, Zarr, HDF5, and Rosbag formats while also supporting inspection, visualization, and analysis. It is aimed at making data plumbing easier across common robot-learning workflows, especially OpenVLA, Octo, LeRobot, and Diffusion Policy pipelines, with HuggingFace Hub integration for sharing and reuse.

yubohann/Awesome-World-Model-Flow-RL-Multi-Agent-Robotic-Object-Centric github SURVEY NEW

yubohann · ★ 24 · Python · Created 2026-05-03 · Active 2026-05-14

#RL#Other#Survey

GitHub repo documenting an object-centric world-model and flow-policy RL approach for multi-agent robotics in IsaacLab and ROS2. It traces the engineering to a national top-three RoboCup China visual challenge entry, useful as an implementation reference.

vikashplus/robohive github High

vikashplus · ★ 625 · Python · Created 2018-07-06 · Active 2026-05-13

#Other

A unified framework for robot learning

ncbdrck/UniROS github High

ncbdrck · ★ 110 · Python · Created 2023-12-07 · Active 2026-05-23

#Other

UniROS provides a robotics reinforcement-learning workflow that can train policies in simulation and transfer them to real robots. It is useful as an end-to-end setup for experimenting with robot learning across sim and real-world environments, rather than treating those stages as separate pipelines.

Holiday-Robot/FlashSAC github High

Holiday-Robot · ★ 246 · Python · Created 2026-04-06 · Active 2026-04-09

#RL#Other

FlashSAC is a GitHub repo for fast, stable off-policy reinforcement learning in high-dimensional robot control. The snippet gives no benchmark numbers; actionable value is the released code.

isri-aist/RoboManipBaselines github High

isri-aist · ★ 370 · Python · Created 2024-03-23 · Active 2026-05-12

#Manipulator

RoboManipBaselines is a GitHub software framework for robotic manipulation baselines. It integrates multiple imitation-learning methods with benchmark environments, giving researchers code to run and compare methods in one setup.

opendr-eu/opendr github High

opendr-eu · ★ 725 · Python · Created 2020-09-08 · Active 2026-05-11

#Other

GitHub repo for OpenDR, a modular open toolkit for core robotic functions using deep learning. Actionable as released non-proprietary code; no benchmarks or real-robot results are stated.

DeepRoboticsLab/sdk_deploy github High

DeepRoboticsLab · ★ 23 · C++ · Created 2025-11-08 · Active 2026-05-22

#RL#Other

DeepRoboticsLab/sdk_deploy provides deployment support for taking robot policies from simulation into both sim-to-sim testing and real Deep Robotics hardware. It currently targets the M20 and Lite3 robots, making it useful as a bridge between trained controllers and practical evaluation on supported quadruped platforms.

clmoro/awesome-robotics-genai-reinforcement-learning-integration github SURVEY

clmoro · ★ 79 · Created 2024-08-17 · Active 2026-04-23

#RL#Other#Survey

GitHub curation of papers on combining reinforcement learning with generative AI for robotics. Actionable via categorized Excel summaries tracking frameworks, applications, and experiment evaluation metrics.

InternRobotics/Re3Sim github High

InternRobotics · ★ 154 · Jupyter Notebook · Created 2025-02-12 · Active 2026-03-16

#Manipulator

Re3Sim is an ICRA 2026 GitHub repo for a real-to-sim method targeting robotic manipulation. It claims high-fidelity simulation data generation via 3D-photorealistic real-to-sim, with project materials available in the repo.

mekion/the-bimo-project github High

mekion · ★ 138 · Python · Created 2025-12-22 · Active 2026-05-21

#Humanoid

Bimo is a fully 3D-printable bipedal robot platform with a Python API for controlling the hardware and experimenting with locomotion. The repo also includes an Isaac Lab environment with demonstrated sim-to-real transfer, making it useful as a low-cost, reproducible testbed for bipedal robotics research.

lupinjia/LeggedGym-Ex github High

lupinjia · ★ 293 · Python · Created 2024-12-21 · Active 2026-05-20

#RL#Other

Legged robot environments for reinforcement learning in multiple simulators.

RobustFieldAutonomyLab/Multi_Robot_Distributional_RL_Navigation github High

RobustFieldAutonomyLab · ★ 114 · Python · Created 2024-01-31 · Active 2026-02-19

#RL#Other

ICRA 2024 GitHub repo for decentralized multi-robot navigation of autonomous surface vehicles using distributional reinforcement learning. Actionable as a code release; no metrics are given in the source text.

sharpa-robotics/sharpa-rl-lab github High

sharpa-robotics · ★ 60 · Python · Created 2026-03-30 · Active 2026-04-03

#RL#Other

Sharpa reinforcement learning example in Isaac Lab

allenai/molmospaces github High

allenai · ★ 339 · Python · Created 2026-02-02 · Active 2026-05-25

#Other

An end-to-end open ecosystem for robot learning

emNavi/AirGym github High

emNavi · ★ 153 · Python · Created 2024-03-02 · Active 2026-05-17

#RL#Other

AirGym is a GitHub repo for high-performance drone deep reinforcement learning built on Isaac Gym. Actionable as released code; no benchmarks or real-robot results are given in the snippet.

aalmuzairee/squint github High

aalmuzairee · ★ 54 · Python · Created 2026-02-21 · Active 2026-03-04

#RL#Manipulator

PyTorch GitHub repo for Squint, a fast visual reinforcement learning approach for sim-to-real robotics. Targets SO-101 robot arm workflows with ManiSkill3, making it actionable for adapting visual RL experiments.

surgical-robotics-ai/isaac-sim-surgical-robotics-challenge github High

surgical-robotics-ai · ★ 15 · Python · Created 2024-08-28 · Active 2026-05-26

#Other

Isaac Sim implementation of the AMBF Surgical Robotics Challenge from Johns Hopkins LCSR Lab, bringing the challenge environment into NVIDIA’s robotics simulation stack. It is useful as a simulation and development setup for surgical robotics research, especially for testing perception, control, and learning pipelines against a shared challenge-style task environment.

MINT-SJTU/Evo-1 github High

MINT-SJTU · ★ 283 · Python · Created 2025-11-06 · Active 2026-05-12

#VLA#Other

Evo-1 is a GitHub project for a lightweight vision-language-action model focused on preserving semantic alignment. No metrics, benchmarks, or real-robot evidence are stated; actionable mainly as a repo to inspect.

GaTech-RL2/EgoVerse github High

GaTech-RL2 · ★ 395 · Python · Created 2025-02-11 · Active 2026-05-26

#Other

EgoVerse: Egocentric Data for Robot Learning from Around the World

zita-ch/bipedal-robot-learning-collection github SURVEY

zita-ch · ★ 457 · Created 2022-05-03 · Active 2026-04-06

#Humanoid#Survey

Collection of high-quality robot learning papers for bipedal robots.

zhaozijie2022/LocoLeggedWheel github High

zhaozijie2022 · ★ 124 · Python · Created 2026-03-06 · Active 2026-05-24

#Other

RL-based sim-to-real locomotion for legged-wheeled robots, built on NVIDIA Isaac Lab.

nakamotoo/dsrl_pi0 github High

nakamotoo · ★ 254 · Python · Created 2025-08-05 · Active 2026-04-27

#RL#Other

Official GitHub implementation of DSRL for steering pi0 diffusion policies with latent-space reinforcement learning. Code is released for the CoRL 2025 work, but the provided text gives no results, benchmarks, or robot-test details.

YanjieZe/Paper-List github SURVEY

YanjieZe · ★ 547 · Created 2022-09-20 · Active 2026-05-19

#Other#Survey

A paper list of the author's reading history, covering robotics, learning, and vision.

LightwheelAI/LW-BenchHub github High

LightwheelAI · ★ 152 · Python · Created 2025-07-26 · Active 2026-05-18

#RL#Other

GitHub repo for LW-BenchHub, a unified embodied-AI benchmark hub built on Isaac Lab-Arena. It provides consistent interfaces, realistic environments, multi-robot support, and large-scale evaluation.

TJU-Aerial-Robotics/YOPO-Rally github High

TJU-Aerial-Robotics · ★ 51 · Python · Created 2025-05-20 · Active 2026-05-09

#Other

A Sim-to-Real Single-Stage Planner for Off-Road Terrain

D-Robotics/hobot_stereonet github High

D-Robotics · ★ 55 · C++ · Created 2024-07-20 · Active 2026-05-15

#Other

GitHub repo for Hobot StereoNet, a deep-learning stereo depth estimator from D-Robotics. It turns stereo image pairs into depth maps in real time for robotics and 3D perception.

BlackOtters/SonicStar github High NEW

BlackOtters · ★ 23 · Python · Created 2026-05-22 · Active 2026-05-23

#VLA#Humanoid

SonicStar is an open-source Vision-Language-Action stack for the Unitree G1 that covers the full path from teleoperation data collection through SonicLatent training, simulation, and real-time whole-body policy deployment. It is useful as an end-to-end codebase for researchers working on humanoid robot control and embodied AI, especially if they need a practical pipeline for collecting demonstrations and turning them into deployable policies, though real-world deployment is still marked as TBD.

reiniscimurs/DRL-robot-navigation-IR-SIM github High

reiniscimurs · ★ 314 · Python · Created 2025-02-11 · Active 2026-03-30

#RL#Other

GitHub repo for mobile robot navigation in IR-SIM using SAC, TD3, PPO, and DDPG. The simulated robot learns to reach random goals while avoiding obstacles; code is released for running the training setup.

xiaoxiaoxh/reactive_diffusion_policy github High

xiaoxiaoxh · ★ 330 · Python · Created 2025-03-30 · Active 2026-04-12

#IL#Manipulator

GitHub repo for Reactive Diffusion Policy, an RSS 2025 slow-fast visual-tactile diffusion policy for contact-rich manipulation. It targets reactive control in contact tasks; no metrics are given in the snippet.

NeuracoreAI/neuracore github High

NeuracoreAI · ★ 269 · Python · Created 2025-01-07 · Active 2026-05-26

#Other

The Cloud Platform for Robot Learning

MuammerBay/isaac_so_arm101 github High

MuammerBay · ★ 242 · Python · Created 2025-04-18 · Active 2026-02-18

#Other

Isaac Lab external project for SO-ARM100/101 arm robot.

zitongbai/legged_lab github High

zitongbai · ★ 331 · Python · Created 2025-05-10 · Active 2026-05-13

#Other

Isaac Lab extension for legged robots.

ShengqianChen/DreamWaQ_Go2W github High

ShengqianChen · ★ 89 · Python · Created 2025-05-26 · Active 2026-05-12

#RL#Other

GitHub repo for RL locomotion on the Go2W legged-wheel robot using NVIDIA Isaac Gym. Code is available, but the snippet gives no benchmarks or real-robot results.

toxuandung/DRL_Navigation_Robot_ROS2_Foxy github High

toxuandung · ★ 41 · Python · Created 2023-05-30 · Active 2026-05-15

#Other

Deep reinforcement learning navigation robot for ROS2 Foxy.

AsterisCrack/BipedRobot github High

AsterisCrack · ★ 15 · Python · Created 2022-07-26 · Active 2026-05-17

#RL#Humanoid

GitHub repo for biped robot locomotion with deep RL in Isaac Lab and MuJoCo. Includes custom PPO, SAC, DDPG, D4PG, and MPO agents with MLP/LSTM/Transformer policies, motion imitation, and sim-to-real tooling.

haozhang04/leggedskill github High

haozhang04 · ★ 71 · Python · Created 2026-01-23 · Active 2026-05-23

#Humanoid

LeggedSkill provides a deployment framework for reinforcement-learning motion controllers across several legged robot morphologies, including bipeds, quadrupeds, wheeled-bipeds, and wheeled-quadrupeds. It is aimed at taking learned locomotion policies into real robot use, making it useful as an integration layer for testing and deploying RL-based motion control on heterogeneous platforms.

OpenHelix-Team/Spatial-Forcing github High

OpenHelix-Team · ★ 238 · Python · Created 2025-10-11 · Active 2026-05-26

#VLA#Other

Spatial-Forcing is the official implementation for an ICLR 2026 method that aligns implicit spatial representations inside vision-language-action models. It targets robot policies that need stronger spatial grounding, using representation alignment rather than only language or action supervision to improve how the model connects visual structure to control.

hanruihua/NeuPAN github High

hanruihua · ★ 977 · Python · Created 2024-02-06 · Active 2026-02-27

#Other

GitHub repo for NeuPAN, a TRO 2025 method for direct point robot navigation using end-to-end model-based learning. The blurb signals code access, but gives no metrics, benchmarks, or real-robot results.

CURT1S03/quadruped-drl-platform github High

CURT1S03 · ★ 42 · Python · Created 2026-04-17

#RL#Other

GitHub repo for training a Unitree Go2 quadruped with PPO-based deep reinforcement learning in NVIDIA Isaac Lab. It packages a full-stack DRL workflow as released code for Go2 training experiments.

BoosterRobotics/booster_train github High

BoosterRobotics · ★ 43 · Python · Created 2025-12-05 · Active 2026-04-02

#Other

GitHub repo with reinforcement-learning tasks for Booster robots built on Isaac Lab. Actionable code for training or adapting Booster policies; no benchmarks or real-robot validation are stated.

lasgroup/safe-learning github High

lasgroup · ★ 26 · Python · Created 2024-07-02 · Active 2026-05-19

#Other

lasgroup/safe-learning collects algorithms and experiment tooling for studying safe sim-to-real transfer in robotics. It is meant as a practical codebase for running and comparing safe learning methods, where the emphasis is on moving policies from simulation toward real robotic systems while managing safety constraints during learning and deployment.

ROBOTIS-GIT/robotis_lab github High

ROBOTIS-GIT · ★ 102 · Python · Created 2025-07-14 · Active 2026-04-29

#RL#Other

robotis_lab is a GitHub repo with RL and imitation-learning tutorials for ROBOTIS robots. It includes Sim2Real support for deploying learned policies on real hardware.

lachlanhurst/balance-robot-mujoco-rl github High

lachlanhurst · ★ 15 · Python · Created 2024-09-14 · Active 2026-05-12

#RL#Other

GitHub repo for training a self-balancing robot controller with reinforcement learning in MuJoCo using Stable-Baselines3 and PyTorch. Actionable as released code, but the description gives no metrics or real-robot validation.

gbionics/jaxsim github High

gbionics · ★ 196 · Python · Created 2022-01-31 · Active 2026-05-08

#Other

gbionics/jaxsim is a GitHub repo for differentiable physics and multibody dynamics. It targets control and robot-learning workflows; code is released, but no benchmarks or real-robot results are stated.

kyegomez/RT-X github High

kyegomez · ★ 241 · Python · Created 2023-10-04 · Active 2026-05-20

#Other

PyTorch repo implementing RT-1-X and RT-2-X from Open X-Embodiment: Robotic Learning Datasets and RT-X Models. Actionable as released model code; the snippet gives no benchmark numbers or robot-test details.

strands-labs/robots-sim github High

strands-labs · ★ 28 · Python · Created 2026-02-19 · Active 2026-05-26

#RL#VLA#Other

Simulated environments for robot agent evaluation and reinforcement learning.

ncbdrck/realros github High

ncbdrck · ★ 22 · Python · Created 2023-07-18 · Active 2026-05-23

#RL#Other

RealROS is a Python framework that connects reinforcement learning workflows directly to ROS so agents can train in real time on physical robots rather than only in simulation. It is organized as a modular toolkit for building real-world robotics environments, making it useful for researchers who want reusable RL infrastructure around ROS-controlled hardware.

GlimmerLab/Awesome-Embodied-AI-Robot github SURVEY

GlimmerLab · ★ 68 · Python · Created 2024-10-29 · Active 2026-05-11

#Other#Survey

GitHub resource list for embodied AI and robot learning, branded RoboLLM Hub. It curates recent papers, code, and tools for finding robot-AI work rather than offering a single benchmarked method.

martin-sedlacek/REALM github High

martin-sedlacek · ★ 55 · Python · Created 2025-12-22 · Active 2026-05-22

#Manipulator

REALM is an IEEE RA-L benchmark for testing how well robotic manipulation policies generalize, with real-to-sim validation built into the evaluation rather than relying only on simulated task variation. The repository appears to package the benchmark around the martin-sedlacek/REALM codebase, making it useful for researchers who want a concrete manipulation testbed where simulation results are anchored against real-world behavior.

darshmenon/pickplace-rl-mobile-manipulator github High

darshmenon · ★ 25 · C++ · Created 2025-08-31 · Active 2026-05-17

#RL#Manipulator#MobileManipulator

A UR3 arm mounted on a differential-drive mobile base is trained end-to-end for pick-and-place in Gazebo/ROS2 using Truncated Quantile Critics reinforcement learning. The repo is useful as a simulation and training setup for mobile manipulation policies that jointly coordinate base motion and arm control, rather than treating navigation and grasping as separate hand-engineered stages.

HorizonRobotics/RoboTransfer github High

HorizonRobotics · ★ 36 · Python · Created 2025-07-18 · Active 2026-04-17

#Other

Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

brysonjones/multitask_dit_policy github High

brysonjones · ★ 123 · Python · Created 2025-11-24 · Active 2026-03-29

#Manipulator#Humanoid

Open-source GitHub implementation of a Multitask Diffusion Transformer policy for robot manipulation. It targets multitask dexterous manipulation and is tied to TRI’s 2025 LBM study and Boston Dynamics Atlas humanoid demos.

HyperbolicCurve/Awesome-World-Action-Model github SURVEY

HyperbolicCurve · ★ 24 · Python · Created 2026-04-01 · Active 2026-05-24

#RL#VLA#Other#Survey

HyperbolicCurve/Awesome-World-Action-Model is a curated reading list for researchers tracking Vision-Language-Action models and World Action Models, collecting papers and related resources in one place. It is useful as a survey hub rather than a runnable system: the value is in mapping the fast-moving VLA/WAM literature and giving researchers a starting point for following methods that connect perception, language, action, and predictive world modeling.

RajatDandekar/sim-engine github High

RajatDandekar · ★ 10 · Python · Created 2026-03-27

#IL#VLA#Manipulator

MuJoCo-based SO-101 robot arm simulator aimed at training and testing manipulation policies such as ACT, Diffusion Policy, and SmolVLA. It is useful as a lightweight environment for experimenting with robot learning pipelines before moving to hardware, especially for comparing policy architectures on simulated arm-control tasks.

BDX-R/BDX-R-IsaacLab github High

BDX-R · ★ 34 · Python · Created 2025-08-02 · Active 2026-03-16

#RL#Other

Isaac Lab reinforcement learning for the BDX-R robot

TX-Leo/HumanEgo github High NEW

TX-Leo · ★ 94 · Python · Created 2026-05-23 · Active 2026-05-26

#Other

HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos

BDX-R/BDX-R-MjLab github High

BDX-R · ★ 64 · Python · Created 2025-09-29 · Active 2026-05-18

#RL#Other

mjlab reinforcement learning for the BDX-R robot

ncbdrck/multiros github High

ncbdrck · ★ 39 · Python · Created 2022-03-18 · Active 2026-05-24

#RL#Other

MultiROS is a ROS-based simulation environment aimed at running concurrent deep reinforcement learning workflows for robotics. It gives researchers a scalable setup for training and evaluating RL agents on complex robotic tasks, making it useful when experiments need parallelism and repeatable simulation rather than a single hand-built environment.

weqwoueu/Isaac-Legged-Locomotion github High

weqwoueu · ★ 18 · Python · Created 2026-03-03 · Active 2026-03-04

#RL#Humanoid

Isaac-Legged-Locomotion trains PPO gait policies from scratch in NVIDIA Omniverse/Isaac Lab for current legged platforms including Unitree G1/H1 humanoids, Unitree Go1, and ANYmal-C. It focuses on large-scale parallel tensor simulation with domain randomization and hardware-aware reward shaping, producing proprioception-only controllers that can keep dynamic balance on rough terrain and highly slippery ice without visual input.

A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model arxiv High

Kaidong Zhang, Jian Zhang, Rongtao Xu, Yu Sun, Shuoshuo Xue et al. · Submitted 2026-04-07 · Updated 2026-04-15

#RL#VLA#Manipulator

A1 targets the deployment bottleneck in vision-language-action robot policies by making the model adaptive and truncated, reducing the cost of both the large VLM backbone and the iterative action head used in many VLA systems. The emphasis is on a fully transparent open-source design that can support real-time open-world manipulation on commodity hardware rather than requiring billion-scale inference budgets.

acl21/diwa github High

acl21 · ★ 77 · Python · Created 2025-05-26 · Active 2026-04-24

#RL#IL#Other

DiWA: Diffusion Policy Adaptation with World Models

Towards Backdoor-Based Ownership Verification for Vision-Language-Action Models arxiv High NEW

Ming Sun, Rui Wang, Xingrui Yu, Lihua Jing, Hangyu Du et al. · Submitted 2026-05-09

#VLA#Other

Backdoor-based ownership verification approach for vision-language-action robot models. It targets shared or adapted VLAs, aiming to prove ownership for secure deployment and responsible open-source use; code, benchmarks, and robot tests are not specified.

felipemohr/IsaacLab-Quadruped-Tasks github High

felipemohr · ★ 53 · Python · Created 2024-07-02 · Active 2026-02-23

#Other

Quadruped Tasks extension based on Isaac Lab.

PointsCoder/OpenReal2Sim github High

PointsCoder · ★ 208 · Python · Created 2025-09-29 · Active 2026-03-31

#Other

A toolbox for real-to-sim reconstruction and robotic simulation

marmotlab/MARVEL github High

marmotlab · ★ 60 · Python · Created 2025-04-23 · Active 2026-04-28

#RL#Other

GitHub repo for MARVEL, an ICRA 2025 multi-agent RL approach to FOV-constrained multi-robot exploration in large-scale environments. Actionable as an implementation/code source for this exploration setup.

enactic/openarm_isaac_lab github High

enactic · ★ 93 · Python · Created 2025-07-23 · Active 2026-02-18

#Other

OpenArm Isaac Lab Simulation

Dexora: Open-source VLA for High-DoF Bimanual Dexterity arxiv High NEW

Zongzheng Zhang, Jingrui Pang, Zhuo Yang, Kun Li, Minwen Liao et al. · Submitted 2026-05-18

#VLA#Manipulator#Bimanual

Dexora is an open-source vision-language-action model for high-DoF bimanual dexterity. It targets a gap in current VLAs, which are largely limited to dual grippers or single-arm dexterous hands.

When Self-Belief Misleads: Active Label Acquisition for Reinforcement Learning with Verifiable Rewards arxiv High NEW

Li Wang, Xiaodong Lu, Xiaohan Wang, Yikun Ban, Jiajun Chai et al. · Submitted 2026-05-25

#RL#Other

RLAVR makes reinforcement learning with verifiable rewards less dependent on expensive ground-truth labels by actively querying a small set of high-value examples while using pseudo-labels for the rest. Its Corrective Advantage Gap metric analyzes which samples would most improve supervision, and CARE turns that oracle-style signal into a practical acquisition policy before querying. Across domains, model families, and scales, this mixed-label strategy stabilizes training that would otherwise collapse under unsupervised RLVR and improves performance under limited annotation budgets.

TapSampling: Inference-Time Sampling with a Task-Progress-Understanding Verifier for Robotic Manipulation arxiv High NEW

Sizhe Zhao, Shengping Zhang, Shuo Yang, Weiyu Zhao, Shuigen Wang et al. · Submitted 2026-05-25

#Manipulator

TapSampling improves robotic manipulation policies at inference time by drawing multiple candidate actions from a learned Action-VAE latent space instead of committing to a single generative sample. It then selects among those candidates with a verifier trained to predict task-progress outcomes from sequential robot data, giving the selection step a more semantic notion of whether an action advances the task. The method is policy-agnostic and improves several generalist policies in both simulation and real-world experiments without additional policy finetuning.
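
The sample-then-verify pattern described in this summary can be sketched generically as best-of-N selection. This is only an illustration under stated assumptions, not the paper's implementation: `select_action`, `toy_sampler`, and `toy_verifier` are hypothetical names, the sampler stands in for drawing from the Action-VAE latent space, and the verifier stands in for the learned task-progress model.

```python
import random
from typing import Callable, List

Action = List[float]


def select_action(
    sample_candidate: Callable[[], Action],
    verifier_score: Callable[[Action], float],
    num_candidates: int = 8,
) -> Action:
    """Draw several candidate actions and return the one the verifier scores highest."""
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    candidates = [sample_candidate() for _ in range(num_candidates)]
    return max(candidates, key=verifier_score)


# Toy stand-ins (hypothetical): a 1-D sampler and a verifier that prefers actions near 0.5.
rng = random.Random(0)


def toy_sampler() -> Action:
    return [rng.uniform(-1.0, 1.0)]


def toy_verifier(action: Action) -> float:
    return -abs(action[0] - 0.5)


best = select_action(toy_sampler, toy_verifier, num_candidates=16)
```

Because the selection step only needs a sampler and a scoring function, it leaves the underlying policy untouched, which is consistent with the summary's claim that no additional policy finetuning is required.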

Robust Koopman Control Barrier Filters for Safe Actor-Critic Reinforcement Learning arxiv High NEW

Dhruv S. Kushwaha, Zoleikha A. Biron · Submitted 2026-05-26

#RL#Other

Robust Koopman-CBF SAC adds a quadratic-program safety filter to actor-critic RL by learning a finite-dimensional Koopman predictor from data, building affine barrier constraints in the lifted space, and tightening them with a residual margin estimated from held-out rollouts. It achieves zero constraint violations on CartPole stabilization and tracking while matching or exceeding unconstrained SAC returns, but its mixed Safety Gymnasium locomotion results show where first-order velocity barriers and linear EDMD models start to break down.
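
To make the safety-filter idea concrete, here is a minimal sketch for the simplest case: one affine constraint `a · u >= b` on the action, tightened by a margin. With a single constraint, the quadratic program "stay as close as possible to the nominal action" has a closed-form solution, which is a projection onto a half-space. The function name `safety_filter` and the toy numbers are assumptions for illustration. The paper's constraints are built in a learned Koopman lifted space and its margin is estimated from held-out rollouts, neither of which is modeled here.

```python
from typing import List, Sequence


def safety_filter(
    u_nom: Sequence[float],
    a: Sequence[float],
    b: float,
    margin: float = 0.0,
) -> List[float]:
    """Solve min ||u - u_nom||^2 subject to a . u >= b + margin (single constraint)."""
    rhs = b + margin  # tightening the constraint shrinks the allowed set
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) - rhs
    if slack >= 0.0:
        return list(u_nom)  # nominal action already satisfies the constraint
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint vector a must be non-zero")
    scale = -slack / norm_sq
    # Move the nominal action the minimum distance along a onto the constraint boundary.
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]


# Toy example: the constraint -u[0] >= -1.0 means u[0] <= 1.0; a margin of 0.2 tightens it to u[0] <= 0.8.
safe_action = safety_filter([2.0, 0.0], a=[-1.0, 0.0], b=-1.0, margin=0.2)
```

With several simultaneous constraints there is no such closed form in general, and a QP solver would be needed instead.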

Mayankm96/isaac-spinning-up github High

Mayankm96 · ★ 134 · Python · Created 2025-12-06 · Active 2026-03-08

#Other

Educational Resource for Isaac Lab

Motphys/MotrixLab github High

Motphys · ★ 104 · Python · Created 2025-11-18 · Active 2026-05-25

#Other

A general-purpose machine learning architecture designed for robot training

sharpa-robotics/sharpa-tacmap github High

sharpa-robotics · ★ 23 · Python · Created 2026-03-26 · Active 2026-04-08

#Other

Sharpa tacmap tactile sensor in Isaac Lab.

yifan-hou/adaptive_compliance_policy github High

yifan-hou · ★ 124 · Python · Created 2024-09-15 · Active 2026-04-18

#Other

Official GitHub implementation of Adaptive Compliance Policy Learning for diffusion-guided robot control. Code is released, but the description gives no results, benchmarks, or real-robot details.

otr-ebla/mujoco-rl-nav github High

otr-ebla · ★ 10 · Python · Created 2025-05-16 · Active 2026-05-13

#Other

GitHub repo for reinforcement-learning-based human-aware mobile robot navigation in MuJoCo using laser-based perception. Actionable as released code for MuJoCo environments; no benchmarks or real-robot results are stated.

RUCKBReasoning/From_Pixels_to_Tokens github High

RUCKBReasoning · ★ 19 · Python · Created 2026-03-31 · Active 2026-05-21

#RL#VLA#Other

From Pixels to Tokens studies how latent action supervision can be used to train vision-language-action models, turning raw visual interaction traces into token-like action representations that can support robot control. The ICML 2026 framing suggests a systematic comparison of how these latent action signals affect VLA learning, making the repo most useful for researchers probing the bridge between pixel observations, discretized action abstractions, and policy performance.

Fediory/Grid-Sampler github High NEW

Fediory · ★ 49 · Python · Created 2026-05-06 · Active 2026-05-14

#VLA#Other

Official GitHub implementation for Grid-Sampler, a differentiable grid sample pruning method for generalizable vision-language-action models. Actionable as code for an ICML 2026 paper; no metrics or robot benchmark details are given in the snippet.

Czy213hd/Go2_ARX_mjlab github High NEW

Czy213hd · ★ 29 · Python · Created 2026-05-18

#RL#Other

MJLAB reinforcement learning for the Go2 quadruped robot with an ARX-L5 arm.

linchangyi1/LocoTouch github High

linchangyi1 · ★ 58 · Python · Created 2025-05-16 · Active 2026-05-15

#Other

LocoTouch is a GitHub repo for a CoRL 2025 IsaacLab project on perceptive learning for legged robots. It focuses on dynamic quadrupedal transport with tactile sensing; the snippet gives no metrics or real-robot claims.

GigaAI-research/ViVa github High

GigaAI-research · ★ 61 · Python · Created 2026-04-09 · Active 2026-04-15

#RL#Other

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

MarcDcls/mjlab_upkie github High

MarcDcls · ★ 30 · Python · Created 2025-11-05 · Active 2026-05-26

#RL#Humanoid

RL environments for the Upkie wheeled biped robot, packaged around training and evaluating reinforcement-learning controllers for balancing and locomotion. It is useful as a simulation and experimentation layer for researchers working on Upkie-specific control policies, though the provided description does not specify particular tasks, algorithms, or results.

LRMbbj/DIPOLE github High

LRMbbj · ★ 37 · Created 2025-12-15 · Active 2026-05-02

#IL#Other

Official GitHub implementation for the ICLR 2026 method Dichotomous Diffusion Policy Optimization. Actionable as released code; the provided description gives no benchmark results or robot-testing details.

DH-Ng/TacEx github High

DH-Ng · ★ 142 · Python · Created 2025-06-29 · Active 2026-04-07

#Other

Tactile Extension for Isaac Sim / Isaac Lab.

ManUtdMoon/ZPRL github High

ManUtdMoon · ★ 22 · Python · Created 2026-03-24 · Active 2026-05-20

#RL#Manipulator

ManUtdMoon/ZPRL implements Bottleneck Latent Reinforcement Learning for steering robot manipulation policies beyond simple action residuals. The code accompanies the paper on using a constrained latent bottleneck to adapt or guide manipulation behavior, giving researchers a concrete implementation to inspect or build on rather than just the paper description.

ci-group/ariel github High

ci-group · ★ 37 · Python · Created 2025-07-22 · Active 2026-05-14

#Other

ARIEL: Autonomous Robots through Integrated Evolution and Learning

IsaacZH/himloco_lab github High

IsaacZH · ★ 39 · Python · Created 2025-10-28 · Active 2026-03-18

#Other

Train, export, and deploy HimLoco policies in the Isaac Lab environment.

BeingBeyond/Being-H0 github High

BeingBeyond · ★ 41 · Python · Created 2026-01-19 · Active 2026-05-04

#VLA#Other

Being-H0 pretrains a vision-language-action model from large-scale human videos, aiming to transfer the structure of everyday human behavior into robot control. The interesting part is the supervision source: instead of relying only on robot demonstrations, it uses abundant human video data to learn action-relevant visual and language representations for downstream embodied tasks.

Jiarui-Xie/AMP_Running_baseline github High

Jiarui-Xie · ★ 15 · Python · Created 2026-04-01 · Active 2026-05-06

#Humanoid

Locomotion training code for the Unitree G1 robot aimed at the 2026 Beijing Yizhuang Half Marathon Robot Race, built on Isaac Lab for sim-to-hardware running. It uses Adversarial Motion Priors and motion imitation to train running behaviors that are intended to transfer onto real hardware, making it most useful as a baseline for researchers developing legged-locomotion policies for competitive humanoid running.

shuosha/Residual_Copilot github High

shuosha · ★ 21 · Python · Created 2026-02-17 · Active 2026-05-15

#Other

Residual_Copilot is the open-source implementation for Efficient and Reliable Teleoperation through Real-to-Sim-to-Real Shared Autonomy, a shared-autonomy approach that improves robot teleoperation by learning residual assistance through a real-to-sim-to-real pipeline. It is aimed at making remote robot control more efficient and reliable by using simulation-derived assistance while keeping the human operator in the loop.

yechen056/UR5e-DP-Family github High

yechen056 · ★ 22 · Python · Created 2026-04-18 · Active 2026-04-19

#IL#Manipulator

Practical deployment code for running the Diffusion Policy family on UR5e robot arms, aimed at moving imitation-learning policies from research implementations onto real hardware. It should be useful for researchers or engineers working with UR5e setups who want a concrete starting point for testing diffusion-based manipulation policies rather than rebuilding the robot integration stack themselves.

Cutting the Cord: System Architecture for Low-Cost, GPU-Accelerated Bimanual Mobile Manipulation arxiv High

Artemis Shaw, Chen Liu, Justin Costa, Rane Gray, Alina Skowronek et al. · Submitted 2026-03-10 · Updated 2026-03-21

#Manipulator#Bimanual#MobileManipulator

A bimanual mobile manipulator based on the open-source XLeRobot is redesigned to run with integrated onboard GPU compute, removing dependence on tethered external hardware. The system targets low-cost deployment, with the full platform coming in under $1,300, making mobile two-arm manipulation experiments more accessible for labs working with constrained budgets.

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases arxiv High NEW

Dongyoon Hahm, Dylan Hadfield-Menell, Kimin Lee · Submitted 2026-05-26

#RL#Other

Alignment tampering describes a failure mode in RLHF where the model being aligned can shape the preference data through its own outputs, so annotators may reward responses that are higher quality but also carry unwanted biases. Because pairwise preference labels do not say whether a response won due to quality, bias, persuasion, or some mixture, the reward model can learn the wrong signal and RL or best-of-N sampling can amplify it. The paper demonstrates this across keyword bias, sexist propaganda, brand promotion, and instrumental goal-seeking, and finds that existing robust RLHF mitigations do not remove the problem without hurting response quality.

Youqiang-Gui/SeedPolicy github High

Youqiang-Gui · ★ 31 · Python · Created 2026-01-30 · Active 2026-03-15

#IL#Manipulator

SeedPolicy is the official code release for a robot manipulation method that scales diffusion-policy control over longer horizons by letting the policy self-evolve rather than relying only on fixed demonstrations. The repo is mainly useful for reproducing and building on the SeedPolicy experiments from Horizon Scaling via Self-Evolving Diffusion Policy for Robot Manipulation.

MIND: Multi-Scale Intent Diffusion for Text-Driven Physics-Based Humanoid Control arxiv High NEW

Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang et al. · Submitted 2026-05-25

#Humanoid

MIND tackles text-driven physics-based humanoid control by using humanoid state trajectories as an intermediate notion of behavioral intent, rather than mapping language straight to low-level actions or tracking a separately generated kinematic motion. Its diffusion controller combines a holistic intent predictor for global motion structure with an immediate intent predictor for step-by-step refinement, with states encoded in a latent space to make the intent representation more semantically aligned with language. In experiments, the method produces more coherent, physically plausible, and text-aligned humanoid behaviors than existing two-stage and end-to-end imitation baselines.

Can LLMs Time Travel? Enhancing Temporal Consistency in Legal Agentic Search through Reinforcement Learning arxiv High NEW

Wei Fan, Yining Zhou, Mufan Zhang, Yanbing Weng, Yiran HU et al. · Submitted 2026-05-25

#RL#Other

LegalSearch-R1 tackles a failure mode in legal LLM agents where they answer with statutes or precedents from the wrong time period, often because search queries ignore the case’s temporal context. It trains a 7B agent with reinforcement learning over temporally indexed legal data, combining local statute RAG for exact article matching with web search for broader context. On a 13-task legal benchmark, it beats deep-research and legal-LLM baselines by 12.9% to 29.8% overall and improves temporal consistency by 57.7% to 80.3%.

Latent Representation Alignment for Offline Goal-Conditioned Reinforcement Learning arxiv High NEW

Hyungkyu Kang, Byeongchan Kim, Min-hwan Oh · Submitted 2026-05-25

#RL#Other

Latent-Aligned Value Learning (LAVL) tackles offline goal-conditioned RL by reducing erroneous generalization in goal-conditioned value functions, using latent-representation-based value generalization together with hierarchical planning. On OGBench, it achieves the best performance on 20 of 22 datasets, with especially strong gains on long-horizon and trajectory-stitching settings where prior offline GCRL methods degrade.

Reinforcement Learning from Denoising Feedback arxiv High NEW

Qi He, Huan Chen, Ya Guo, Huijia Zhu, Yi R. Fung et al. · Submitted 2026-05-25

#RL#Other

RLDF trains diffusion language models with reinforcement learning by using denoising feedback from rollouts and training to estimate policy loss more accurately. It optimizes toward a clipped clean-state estimate from intermediate noisy states, with weighted timestep sampling to manage the efficiency–accuracy tradeoff. Across LLaDA and Dream, it improves performance and generalization on multiple reasoning benchmarks, and the authors release the Drift training framework for dLLMs.

phamtrongthang123/drifting_policy github High

phamtrongthang123 · ★ 19 · Python · Created 2026-02-11 · Active 2026-03-25

#IL#Manipulator

An unofficial implementation of Generative Modeling via Drifting applies the method to the PushT manipulation benchmark on top of the Diffusion Policy codebase. It is mainly useful as a practical reproduction or starting point for studying drifting-based generative policies in a familiar robot imitation-learning setup.

jgillick/genesis-forge github High

jgillick · ★ 16 · Python · Created 2025-09-03 · Active 2026-04-28

#Other

Genesis Forge is a modular training setup for building robot learning environments on top of Genesis, with design cues from Isaac Lab and Gymnasium. It is meant to make Genesis-based robot training more structured and reusable, likely by giving researchers familiar environment abstractions for defining tasks, running simulations, and iterating on policies.

elle-miller/multimodal_rl github High

elle-miller · ★ 14 · Python · Created 2025-06-03 · Active 2026-05-27

#RL#Other

An RL library for training multimodal robotic agents in Isaac Lab.

noxrick91/WobbleGo github High

noxrick91 · ★ 21 · Python · Created 2026-02-02 · Active 2026-03-04

#RL#Manipulator

WobbleGo is a beginner-oriented Isaac Lab project built around a flywheel inverted pendulum, a compact control problem where a reaction wheel is used to balance an unstable body. It looks useful as a hands-on entry point for learning robot simulation, reinforcement learning/control workflows, and Isaac Lab project structure without the complexity of a full legged or manipulator system.

CarloRomeo427/ARC_RL github High

CarloRomeo427 · ★ 10 · Python · Created 2026-04-16 · Active 2026-05-21

#RL#Other

ARC_RL is a reinforcement learning playground themed around ARC Raiders-style robots. The provided description is sparse, so the safest read is that it is meant as an experimental sandbox for training or testing RL agents in robot-inspired scenarios rather than a polished benchmark or deployed robotics system.

osrbot/guguji_isaaclab github High

osrbot · ★ 14 · Python · Created 2026-04-20 · Active 2026-05-07

#Other

External extension based on Isaac Lab for guguji.

Xiawenlong-bug/ISS-Policy github High

Xiawenlong-bug · ★ 13 · Python · Created 2025-12-18 · Active 2026-03-05

#IL#Manipulator

ISS-Policy is the official code release for “ISS Policy: Scalable Diffusion Policy with Implicit Scene Supervision,” a robot manipulation policy built around diffusion-based action generation. The available description does not give benchmark results or implementation details, but the title suggests the method uses implicit scene-level supervision to make diffusion policies scale better across robotic scenes.

ACT: Automated CPS Testing for Open-Source Robotic Platforms arxiv High

Aditya A. Krishnan, Donghoon Kim, Hokeun Kim · Submitted 2026-04-13

#Other

ACT targets the testing gap in open-source cyber-physical robotics software, where independently developed modules can interact in ways that leave serious platform-level errors hidden. It automates CPS testing for open-source robotic platforms, aiming to expose failures that ordinary module-level or contributor-driven testing may miss.

ChanghongHeya/Embodied-AI-Simulator-Notes github SURVEY

ChanghongHeya · ★ 66 · Created 2025-10-22 · Active 2026-04-06

#Other#Survey

Chinese-language study notes on embodied-AI simulators such as MuJoCo, Gazebo, and Isaac Lab.

Dynamic Neural Koopman Distillation for Real-Time Robot Control Using Diffusion Models arxiv High NEW

Lei Zheng, Peiqi Yu, Zengqi Peng, Changliu Liu, Armin Lederer · Submitted 2026-05-24

#Manipulator

Dynamic Neural Koopman Distillation compresses a multistep diffusion policy into a single forward pass for high-rate robot control, using a Factorized Dynamic Koopman layer with state-dependent modal gains to mimic the denoising dynamics while preserving multimodal trajectory generation. On D4RL MuJoCo locomotion tasks it outperforms existing one-step distillation baselines, and on a physical Kinova arm it brings inference down to millisecond latency while maintaining smooth closed-loop execution and comparable task accuracy.

NVIDIA/omniperf github High

NVIDIA · ★ 32 · Python · Created 2026-03-06 · Active 2026-05-14

#Other

Tracking Isaac Lab performance across GPUs and simulation backends.

thanhndv212/figaroh-plus github High

thanhndv212 · ★ 38 · Python · Created 2024-02-16 · Active 2026-04-21

#Manipulator#Humanoid

A system identification framework for robots.

maniparena/maniparena-sim github High

maniparena · ★ 16 · Python · Created 2026-03-15 · Active 2026-05-14

#RL#Manipulator#Bimanual

ManipArena-Sim provides the Isaac Lab-based simulation stack behind ManipArena, a real-robot benchmark for bimanual manipulation. It supports collecting data, replaying demonstrations or runs, and evaluating policies for a bimanual robot, making it useful for developing and testing manipulation methods before or alongside real-robot experiments.

Chen-Suyi/SIRA_Pytorch github High

Chen-Suyi · ★ 26 · Python · Created 2023-08-06 · Active 2026-02-24

#Other

SIRA_Pytorch is a PyTorch implementation of SIRA-PCR, an ICCV 2023 method for sim-to-real adaptation in 3D point cloud registration. It targets the gap between synthetic and real point clouds, making it useful for robotics or perception pipelines that need registration models trained in simulation to transfer more reliably to real sensor data.

ky-ji/SAG github High NEW

ky-ji · ★ 11 · Python · Created 2026-05-13

#IL#Other

Sparse ActionGen targets the runtime bottleneck in diffusion-policy robot control by pruning action generation in real time, so the policy can spend computation only where it is useful rather than denoising a full action sequence at every step. The repository appears to accompany the ICML 2026 paper and is likely most useful for researchers trying to deploy diffusion policies under tighter latency constraints or compare pruning-based acceleration against standard diffusion policy inference.

ManiDreams: An Open-Source Library for Robust Object Manipulation via Uncertainty-aware Task-specific Intuitive Physics arxiv High

Gaotian Wang, Kejia Ren, Andrew S. Morgan, Kaiyu Hang · Submitted 2026-03-18 · Updated 2026-03-24

#RL#Manipulator

ManiDreams is an open-source library for robotic object manipulation that treats uncertainty in real-world dynamics as a first-class part of planning, rather than just trying to make a world model more accurate. It centers on task-specific intuitive physics, aiming to make manipulation policies more robust when objects, contacts, and outcomes are hard to predict exactly.

NJU-R-L-Group-Embodied-Lab/lavira-code github High

NJU-R-L-Group-Embodied-Lab · ★ 26 · Python · Created 2026-03-03 · Active 2026-05-14

#Other

LaViRA targets zero-shot vision-and-language navigation in continuous environments by translating language and visual context into executable robot actions. The repository is the code release for an ICRA 2026 project, useful for researchers looking at embodied navigation methods that bridge high-level instructions and low-level robot control without task-specific training examples.

Msornerrrr/in-hand-rotation-mjlab github High

Msornerrrr · ★ 30 · Python · Created 2026-02-20 · Active 2026-02-21

#RL#Other

Sim-to-real RL for in-hand cube rotation with the LEAP Hand, built on Mjlab.

Rin-Li/diffusion_motion github High

Rin-Li · ★ 10 · Python · Created 2025-06-05 · Active 2026-04-25

#IL#Other

Diffusion policy for path planning

Chen-Wendi/ImplicitRDP github High

Chen-Wendi · ★ 34 · Python · Created 2026-02-22 · Active 2026-02-23

#Manipulator

ImplicitRDP is the official implementation of an end-to-end robot manipulation policy that uses diffusion over visual and force inputs, aiming to couple perception with contact-rich control. Its distinctive angle is structural slow-fast learning, suggesting a policy architecture that separates slower visual/structural reasoning from faster force-reactive behavior for more responsive manipulation.

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems arxiv High NEW

Binghao Huang, Yunzhu Li · Submitted 2026-04-30

#Manipulator

FlexiTac is an open-source piezoresistive tactile sensor platform aimed at making touch sensing cheap and scalable for robotic end-effectors. The available description is sparse, but the emphasis is on a practical hardware solution that can be reproduced and expanded across different gripper or manipulator designs rather than a single custom sensor.

Ruka-v2: Tendon Driven Open-Source Dexterous Hand with Wrist and Abduction for Robot Learning arxiv High

Xinqi Lucas Liu, Ruoxi Hu, Alejandro Ojeda Olarte, Zhuoran Chen, Kenny Ma et al. · Submitted 2026-03-27 · Updated 2026-03-30

#Manipulator

Ruka-v2 is an open-source, tendon-driven dexterous hand aimed at making capable robot-learning hardware more accessible. It builds on the original Ruka hand, which offered 11 degrees of freedom with two joints per finger and three at the thumb, and was designed to be built for under $1,300.

MEVIUS2: Practical Open-Source Quadruped Robot with Sheet Metal Welding and Multimodal Perception arxiv High

Kento Kawaharazuka, Keita Yoneda, Shintaro Inoue, Temma Suzuki, Jun Oda et al. · Submitted 2026-03-23

#Other

MEVIUS2 is an open-source quadruped designed to be practical to build and modify, using sheet-metal welding rather than more specialized fabrication. It pairs a rough-terrain locomotion platform with multimodal perception, aiming to give researchers a reproducible robot they can assemble themselves and adapt for embodied AI experiments.

MSSergeev/so101-lab github High

MSSergeev · ★ 16 · Python · Created 2026-03-17 · Active 2026-03-22

#Manipulator

SO-ARM101 manipulation tasks with Isaac Lab

Prior Policy Guided Dual-Agent Coordinated Manipulation Planning of Spacecraft-Manipulator System arxiv High NEW

Yuhui Hu, Dong Zhou, Kaihong Ouyang, Zhongliang Yu, Jianfeng Lv et al. · Submitted 2026-05-25

#RL#Manipulator

DACMP coordinates a 6-DoF space manipulator and its spacecraft base as two agents, planning motions that reach the end-effector target while keeping the base attitude stable despite the strong manipulator-base coupling. Its prior-policy-guided deep RL setup uses Timestep-level Expert Switching Guidance to improve convergence and task success, and the experiments report better success rates and control precision than baseline DRL methods, including under constraints, disturbances, and perception uncertainty.

rai-opensource/lottery_tickets github High

rai-opensource · ★ 17 · Python · Created 2025-12-02 · Active 2026-05-04

#Other

rai-opensource/lottery_tickets implements the Lottery Ticket Hypothesis for pretrained robot diffusion and flow policies, aiming to find sparse subnetworks within large pretrained policy models that retain or improve performance after pruning and fine-tuning. The interesting angle is applying lottery-ticket-style model compression to modern generative robot policies, where reducing policy size or isolating effective subnetworks could make deployment and adaptation more practical without training from scratch.
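
As a rough illustration of the pruning step behind lottery-ticket-style compression, the sketch below builds a binary mask that zeroes the smallest-magnitude weights and keeps the rest for later fine-tuning. Magnitude-based pruning is an assumption here, since the summary does not say which criterion the repo uses, and `magnitude_mask` and `apply_mask` are hypothetical helper names operating on a flat weight list rather than a real policy network.

```python
from typing import List


def magnitude_mask(weights: List[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that prunes the given fraction of smallest-magnitude weights."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    num_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def apply_mask(weights: List[float], mask: List[int]) -> List[float]:
    """Zero out pruned weights; surviving weights form the candidate sparse subnetwork."""
    return [w * m for w, m in zip(weights, mask)]


pretrained = [0.5, -0.05, 0.2, -0.9]  # toy stand-in for pretrained policy weights
mask = magnitude_mask(pretrained, sparsity=0.5)
sparse_weights = apply_mask(pretrained, mask)
```

During fine-tuning, the same mask would be reapplied after every update so that pruned weights stay at zero.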

Grow-Prune-Freeze Networks: Adaptive & Continual Learning Technique for Olfactory Navigation arxiv High NEW

Kordel K. France, Ovidiu Daescu · Submitted 2026-05-24

#RL#Other

Grow-Prune-Freeze networks let an olfactory navigation agent adapt its policy online by adding, removing, and freezing early network layers as the environment’s complexity changes. The paper grounds this mechanism in non-linear random matrix theory, extending Pennington and Worah’s single-hidden-layer analysis to continual n-layer models, and reports 94% success with Expected SARSA on turbulent plume navigation, a partially observable and non-stationary robotics task.
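
For readers unfamiliar with Expected SARSA, the sketch below shows the standard tabular update under an epsilon-greedy policy: the bootstrap target averages the next state's action values under the policy's probabilities instead of using a single sampled next action. It is a textbook-style illustration only. The paper applies the algorithm with growing and pruning networks, which is not modeled here, and the table layout and hyperparameter values are assumptions.

```python
from typing import List


def epsilon_greedy_probs(q_values: List[float], epsilon: float) -> List[float]:
    """Action probabilities of an epsilon-greedy policy over one state's Q-values."""
    n = len(q_values)
    best = max(range(n), key=lambda i: q_values[i])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def expected_sarsa_update(
    q: List[List[float]],
    state: int,
    action: int,
    reward: float,
    next_state: int,
    alpha: float = 0.1,
    gamma: float = 0.99,
    epsilon: float = 0.1,
    terminal: bool = False,
) -> float:
    """Update q[state][action] in place and return the TD error."""
    if terminal:
        expected_next = 0.0
    else:
        probs = epsilon_greedy_probs(q[next_state], epsilon)
        # Expectation of the next state's value under the current policy.
        expected_next = sum(p * v for p, v in zip(probs, q[next_state]))
    td_error = reward + gamma * expected_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


q_table = [[0.0, 0.0], [1.0, 3.0]]  # two states, two actions (toy values)
td = expected_sarsa_update(q_table, state=0, action=1, reward=1.0, next_state=1,
                           alpha=0.5, gamma=0.5, epsilon=0.2)
```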

MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo arxiv High NEW

Yufei Jia, Junzhe Wu · Submitted 2026-05-24

#Other

MuJoCoUni extends MuJoCo with persistent batched runtime primitives aimed at online robot learning, where many stateful environments need to run in parallel without giving up upstream MuJoCo behavior for contacts, constraints, sensors, and dynamics. Its BatchEnvPool executor manages per-environment model copies, per-thread data workers, and an internal thread pool to support short stepping, sparse resets, lifecycle domain randomization, batched sensor forwarding, Jacobians, and height-field queries from the Python binding layer rather than modifying MuJoCo’s core solver.

OpenRC: An Open-Source Robotic Colonoscopy Framework for Multimodal Data Acquisition and Autonomy Research arxiv High

Siddhartha Kapuria, Mohammad Rafiee Javazm, Naruhiko Ikoma, Joga Ivatury, Mohammad Ali Nasseri et al. · Submitted 2026-04-04

#Other

OpenRC is an open-source robotic colonoscopy platform aimed at capturing synchronized multimodal data from the full procedure loop: operator inputs, scope motion, and visual feedback. It is built to make colonoscopy a more reproducible robotics research problem, giving researchers a shared framework for studying teleoperation dynamics, autonomy algorithms, and perception-control coupling in endoscopic navigation.

SubTGraph: Large-Scale Subterranean Environment Synthesis with Controllable Topological Variability for Robotic Autonomy Validation arxiv High NEW

F. Labra Caso, A. Saradagi, S. Fredriksson, S. Nordström, A. Koval et al. · Submitted 2026-05-20

#Other

SubTGraph procedurally generates large, multi-level subterranean simulation worlds by turning user-specified structural constraints into a cost matrix that guides Dijkstra-based assembly of DARPA World Generator topometric tiles. It can vary topology, dimensionality, and textures to create mines, caves, and lava tubes, with an open-source release plus 150 generated worlds for statistical autonomy testing. The authors demonstrate its use on structural semantic segmentation against topometric ground truth, multi-agent path planning trend analysis, and LIO SLAM stress tests that expose failure cases in difficult underground sections.

An Open Source Computer Vision and Machine Learning Framework for Affordable Life Science Robotic Automation arxiv High

Zachary Logan, Andrew Dudash, Daniel Negrón · Submitted 2026-03-20

#Other

An open-source robotics stack combines computer vision with machine-learning-based inverse kinematics to make life-science lab automation cheaper and more accessible. It targets concrete benchtop tasks such as colony picking and liquid handling, suggesting a route to repurpose affordable robotic hardware for workflows that usually require more specialized automation systems.

Introducing M: A Modular, Modifiable Social Robot arxiv High

Victor Nikhil Antony, Zhili Gong, Yoonjae Kim, Chien-Ming Huang · Submitted 2026-03-19

#Other

M is an open-source, low-cost social robot platform meant to make social robotics studies easier to reproduce, customize, and run outside the lab. Its emphasis is on reducing the practical friction around building and modifying robot hardware, giving researchers a more accessible platform for deployments in real-world social settings.

Scaling, Benchmarking, and Reasoning of Vision-Language Agents for Mobile GUI Navigation arxiv High NEW

Heng Qu, Yike Liu, Renren Jin, Wenzong Zhang, Pengzhi Gao et al. · Submitted 2026-05-26

#RL #Other

HyperTrack provides a large-scale testbed for mobile GUI navigation, with over 16,000 real-world tasks spanning more than 650 Chinese apps, while GUIEvalKit standardizes offline benchmarking for VLM agents. Using this setup, the authors show that scaling data helps both supervised and reinforcement finetuning, but reinforcement-based finetuning is consistently stronger, especially out of domain. They also use GUIEvalKit to probe how interaction history and reasoning affect task completion, making the study useful both as a benchmark release and as evidence for where VLM GUI agents improve with scale.

OSMa-Bench++: Toward Open-Ended Benchmarking of Semantic Mapping for Manipulation with Prompt-Generated Synthetic Scenes arxiv High NEW

Regina Kurkova, Maxim Popov, Sergey Kolyubin · Submitted 2026-05-26

#RL #Manipulator

OSMa-Bench++ turns prompt-generated indoor scenes into OSMa-Bench-compatible simulation environments for evaluating semantic maps used in robotic manipulation. It uses the known scene-generation prompt as an extra semantic specification, adding prompt-grounded VQA while enabling targeted stress tests around clutter, small objects, occlusion, lighting variation, and other manipulation-relevant corner cases.

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents arxiv High NEW

Bowen Wang, Dunjie Lu, Junli Wang, Tianyi Bai, Shixuan Liu et al. · Submitted 2026-05-25

#RL #Other

CUA-Gym scales RLVR data for computer-use agents by having generator and discriminator agents co-create task instructions, initial and golden environment states, and executable reward functions, with iterative execution plus filtering by LLM voting and agent rollouts. The resulting dataset has 32,112 verified training tuples across 110 mock web environments in CUA-Gym-Hub, and GSPO-trained 3B and 17B agents reach 62.1% and 72.6% on OSWorld-Verified while also transferring to WebArena.

An Open-Source Robotics Research Platform for Autonomous Laparoscopic Surgery arxiv High

Ariel Rodriguez, Lorenzo Mazza, Martin Lelis, Rayan Younis, Sebastian Bodenstedt et al. · Submitted 2026-03-09

#Other

This open-source robotics platform targets research on autonomous laparoscopic surgery, with an emphasis on the precision, safety, and kinematic constraints that make minimally invasive procedures hard to automate. The work appears to focus less on a single surgical policy result and more on giving researchers a reliable hardware/software basis for developing and evaluating autonomy in robot-assisted surgery.

AcroRL: Learning Aggressive Quadrotor Inversion using Bidirectional Thrust arxiv High NEW

Gabriel Rodriguez, Henri Sayag, Abhishek Rathod, John Stecklein, Siddharth Saha et al. · Submitted 2026-05-23

#RL #Other

AcroRL uses reinforcement learning to modulate a constant reference trajectory so bidirectional-thrust quadrotors can execute compact, position-constrained flips between nominal and inverted flight without relying on hand-tuned thrust posture schedules. Separate policies handle nominal-to-inverted and inverted-to-nominal transitions, and in JAX simulation they cut position RMSE by 32% and settling time by 57% versus the strongest optimization baseline. Hardware tests show inversions across multiple yaw configurations with under 0.35 m position RMSE, plus continued compatibility with conventional trajectory generation through circular flight in both upright and inverted regimes.

OCELOT: Odometry and Contact Estimation for Legged Robots arxiv High NEW

Emre Girgin, Cagri Kilic · Submitted 2026-05-21

#Other

OCELOT is a proprioception-only leg odometry pipeline for legged robots, built around an error-state EKF that corrects body motion using feet judged to be in stationary stance from IMU, joint encoder, and force sensor data. Its contact module fuses a debounced force-based GMM/FSM detector with a kinematic GLRT on estimated foot velocity, using the combined scores both to accept true stationary contacts and to downweight or reject slipping feet. Tested on 29 indoor and outdoor sequences totaling 2.4 km across concrete, grass, pebble, rock, and other terrains, it improves odometry robustness in slip-prone settings and is released as open-source ROS2 code.

When Search Becomes Memory: Turning Robot Design Trials into Transferable Skills arxiv High NEW

Yunfei Wang, Xiaohao Xu, Yang Li, Xiaonan Huang · Submitted 2026-05-25

#RL #Other

Auto-Robotist turns evolutionary robot design from a memoryless search loop into an inspectable skill-building process: it distills simulator-tested morphology trials into a natural-language library of structural archetypes, evidence-backed design rules, failure cases, and supporting examples. During search, the agent retrieves these skills to guide LLM edits of elite robot bodies while keeping GA mutation for exploration, then updates the library through add, diagnose, and merge operations. On seven EvoGym tasks, this improves cold-start 5x5 design and transfers to 10x10 spaces, where reference-conditioned transfer beats a GA baseline on every task.

Learning High-Frequency Continuous Action Chunks in Latent Space arxiv High NEW

Kunyun Wang, Yuhang Zheng, Yupeng Zheng, Jieru Zhao, Wenchao Ding · Submitted 2026-05-24

#Other

This method moves 60 Hz robot action chunking into a VAE latent space, where chunks can be generated with better temporal smoothness and spatial consistency than direct action-space prediction. Its Reuse-then-Refine strategy reuses the previous chunk and refines it for the next one, improving continuity during asynchronous real-time inference. On three real-world contact-rich manipulation tasks, the resulting policy runs more continuously with fewer pauses and jerky motions.

Elevator-LIO: Robust LiDAR-Inertial Odometry for Multi-Floor Navigation under Elevator-Induced Non-Inertial Motion arxiv High NEW

Yifan Zhang, Yudong Huang, Yuchong Zhang, Changze Li, Haoran Liu et al. · Submitted 2026-05-23

#RL #Other

Elevator-LIO keeps LiDAR-inertial odometry alive through elevator rides by separating the robot’s motion relative to the elevator from the elevator’s own non-inertial motion, then switching its Kalman-filter updates by mode. It detects elevator entry/exit from LiDAR range statistics and state estimates, adds zero-velocity and zero-acceleration updates when the elevator stops to control vertical drift, and uses adaptive voxel downsampling as scene scale changes. Across 20 real-world sequences with 79 elevator rides, including pedestrians, mirrors, large spaces, and long vertical travel, it maintained continuous localization in all runs and ended within 1 cm height error on 17 sequences while staying competitive on Hilti 2022/2023 indoor benchmarks.
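
To make the zero-velocity idea concrete, the sketch below applies a textbook Kalman pseudo-measurement "vertical velocity = 0" to a two-element state of height and vertical velocity. The state layout, the function name, and all numbers are assumptions for illustration; the paper's filter carries a much larger state and its own mode-switching logic.

```python
def zero_velocity_update(state, cov, meas_var):
    """Kalman update with the pseudo-measurement vz = 0.

    state: [z, vz] (height, vertical velocity)
    cov:   2x2 covariance as nested lists
    meas_var: variance assigned to the zero-velocity pseudo-measurement
    """
    z, vz = state
    (p00, p01), (p10, p11) = cov
    # Measurement matrix H = [0, 1], so the innovation variance is p11 + R.
    s = p11 + meas_var
    k0, k1 = p01 / s, p11 / s          # Kalman gain K = P H^T / s
    innovation = 0.0 - vz              # measured 0 minus predicted vz
    new_state = [z + k0 * innovation, vz + k1 * innovation]
    # Covariance update P <- (I - K H) P
    new_cov = [
        [p00 - k0 * p10, p01 - k0 * p11],
        [p10 - k1 * p10, p11 - k1 * p11],
    ]
    return new_state, new_cov


state, cov = zero_velocity_update(
    [10.0, 0.2], [[1.0, 0.5], [0.5, 1.0]], meas_var=1.0
)
```

Because height and velocity are correlated in the covariance, the update pulls the velocity toward zero and also nudges the height estimate, which is how such updates limit vertical drift.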

SlicerRoboTMS: An Open-Source 3D Slicer Extension for Robot-Assisted Transcranial Magnetic Stimulation arxiv High NEW

Wenzhi Bai, Yituo Guo, Bhaskar Basu, Andrew Weightman, Zhenhong Li · Submitted 2026-04-28

#Other

SlicerRoboTMS is an open-source 3D Slicer extension for planning and guiding robot-assisted transcranial magnetic stimulation from patient imaging. It is aimed at making Robo-TMS workflows more accurate and reproducible than manual coil positioning by integrating image guidance and robotic assistance into a familiar medical-imaging platform.

The Unified Autonomy Stack: Toward a Blueprint for Generalizable Robot Autonomy arxiv High NEW

Mihir Dharmadhikari, Nikhil Khedekar, Mihir Kulkarni, Morten Nissov, Martin Jacquet et al. · Submitted 2026-05-12

#Other

The Unified Autonomy Stack is an open-source, system-level autonomy stack designed to run resiliently across both aerial and ground robots with different morphologies. From the available abstract text, its value is mainly as a proposed blueprint for generalizable robot autonomy rather than a single task-specific planner or controller.

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research arxiv Mid NEW

Dingbang Wu, Rui Hao, Haiyang Wang, Shuzhe Wu, Han Xiao et al. · Submitted 2026-05-25

#RL #Other

MobileGym is a browser-hosted mobile GUI simulation environment that represents app state as structured JSON, making tasks configurable, forkable, comparable, and judgeable without depending on proprietary app backends. Its deterministic state-based judges provide both evaluation verdicts and dense RL rewards, while the lightweight design supports hundreds of parallel instances on one server, enabling scalable online RL for everyday mobile-agent tasks. MobileGym-Bench covers 416 parameterized templates across 28 apps, and GRPO training on Qwen3-VL-4B-Instruct improves test performance by 12.8 points in simulation, with 95.1% of that gain retained on a 59-task real-device subset.

LECTOR: Joint Optimization of Scientific Reasoning Graphs and Introduction Generation arxiv Mid NEW

Jiabei Xiao, Yizhou Wang, Chen Tang, Pengze Li, Wanli Ouyang et al. · Submitted 2026-05-25

#RL #Other

LECTOR treats scientific introduction writing as a grounded reasoning-and-structuring problem rather than plain text generation: it builds a logic-reasoning graph from the paper body, then uses co-reinforcement learning to preserve that structure while producing the final introduction with citations. On a Nature Communications-based dataset, it improves graph quality by 26.7%, citation quality by 8.6%, and paper consistency by 3.3%, suggesting that explicit reasoning blueprints can reduce the drift and citation hallucination common in AI-assisted paper writing.

RLVR Datasets and Where to Find Them: Tracing Data Lineage for Better Training Data arxiv Mid NEW

Hsiu-Yuan Huang, Weijie Liu, Chenming Tang, Sanwoo Lee, Kai Yang et al. · Submitted 2026-05-26

#RL #Other

ATLAS traces RLVR training examples back to atomic sources, showing that 99.7% of 1.45M instances come from just 20 upstream sources and that many popular datasets are lightly modified variants with contamination risk. The authors use this lineage map to build DAPO++ via Source-level Counterfactual Attribution, scoring sources by their marginal utility against a shared base model; on Qwen3 models, DAPO++ improves held-out performance and the derived quality score predicts which RLVR datasets will train well.

REVERSE: Reinforcing Evidence Verification and Search for Agentic Image geo-localization arxiv Mid NEW

Yong Li, Furong Jia, Dacheng Yin, Kang Rong, Fengyun Rao et al. · Submitted 2026-05-26

#RL #Other

REVERSE trains an image geolocalization agent to behave more like a human analyst: choose informative image regions, formulate external search queries, and decide which retrieved evidence is actually geo-informative. It builds tool-grounded trajectories with region, search, and evidence labels, then uses process rewards plus a stable offline search cache to supervise multi-turn search-and-verification behavior. With a 4B model, it beats strong retrieval-augmented baselines and approaches much larger models on Im2GPS3k and YFCC4k.

Self-Improvement Imitation with Biologically Guided Search for Protein Design Under Oracle Budgets arxiv Mid NEW

Ashima Khanna, Dominik Grimm · Submitted 2026-05-26

#RL #Other

SILO tackles oracle-budgeted protein sequence optimization by learning from its own best mutation trajectories rather than estimating values, using a hierarchical edit policy that chooses mutation positions and residues separately. Candidate trajectories are generated with stochastic beam search, then filtered by a UCB proxy ensemble plus alanine-scan fitness scores to favor edits that preserve functionally important residues. Across eight reproduced protein fitness landscapes, it achieved the best maximum and top-100 mean fitness on all eight, with ablations showing that the biologically guided search components drive much of the gain.

Nori Bot: A Sub-$1,000 Floor-to-Counter Mobile Manipulator arxiv Mid NEW

Antonio Li, Sungjoon Park, Wen Ni Chew · Submitted 2026-05-15

#RL #Manipulator #MobileManipulator

Nori Bot is an open-source mobile manipulator designed to stay under $1,000 while addressing practical gaps in earlier low-cost platforms: it can work from floor to counter height, supports more than purely reactive control, and includes protection against stall-induced Feetech servo burn-out. The interesting engineering angle is that it targets the failure modes and workspace limits that make cheap robots hard to use for real manipulation, rather than just minimizing the bill of materials.

Geometric Workspace Analysis and Transmission-Aware Dynamics of a Serial Spherical Tool for Microsurgery arxiv Mid NEW

Anestis Mablekos-Alexiou, Lyndon da Cruz, Christos Bergeles · Submitted 2026-05-23

#RL #Other

A serial spherical microsurgery tool is analyzed with a geometric workspace formulation that lets designers choose rotation-axis orientations quickly without numerical optimization, plus a transmission-aware dynamics model for self-locking drives. The approach predicts reachable motion and torque needs for a specified workspace, and experiments on a vitreoretinal surgery robot validate the models while an open-source package supports friction identification and inverse-dynamics analysis.

FusionCore: A 23-State Unscented Kalman Filter for IMU, Wheel Encoder, GPS, and Visual SLAM Fusion in ROS 2 arxiv Mid NEW

Manan Kharwar · Submitted 2026-05-24

#Other

FusionCore is an open-source ROS 2 sensor-fusion package that produces 100 Hz odometry by combining IMU, wheel encoders, GPS, and Visual SLAM in a 23-state Unscented Kalman Filter. A distinctive part of the filter is its online estimate of wheel-encoder yaw-rate bias, learned from GPS heading cross-covariance and used to reduce drift during GPS dropouts, alongside explicit IMU bias states, ECEF-native GPS handling, chi-squared outlier gating, adaptive noise, and recovery from VSLAM map resets. On twelve 55–92 minute NCLT sequences, it beats robot_localization on 10 of 12 runs with 1.2x–22.2x lower ATE on the winning cases, while the robot_localization UKF numerically diverges on all twelve.
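
The chi-squared outlier gating mentioned above can be sketched as a Mahalanobis-distance test on a measurement innovation. The example assumes a 2-D measurement and the standard 95% chi-squared threshold for two degrees of freedom (about 5.991); the function names and values are hypothetical and not taken from the FusionCore package.

```python
# 95% quantile of the chi-squared distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def mahalanobis_sq_2d(innovation, innovation_cov):
    """Squared Mahalanobis distance nu^T S^-1 nu for a 2-D innovation."""
    (a, b), (c, d) = innovation_cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    x, y = innovation
    # S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (x * (d * x - b * y) + y * (-c * x + a * y)) / det


def passes_gate(innovation, innovation_cov, threshold=CHI2_95_2DOF):
    """Accept the measurement only if it is statistically plausible."""
    return mahalanobis_sq_2d(innovation, innovation_cov) <= threshold


S = [[4.0, 0.0], [0.0, 1.0]]
accepted = passes_gate((2.0, 1.0), S)    # distance^2 = 2.0, inside the gate
rejected = passes_gate((10.0, 0.0), S)   # distance^2 = 25.0, outside the gate
```

A filter would skip the update step for rejected measurements, which keeps a single bad GPS fix or SLAM jump from corrupting the state estimate.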

Papers only (379)

No code or project page detected. Lower priority — but kept here for completeness.

AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation arxiv High

Yusuke Takagi, Motonari Kambara, Daichi Yashima, Koki Seno, Kento Tokura et al. · Submitted 2026-03-16

#VLA #Manipulator #MobileManipulator

AnoleVLA is a lightweight vision-language-action model using deep state space models for language-guided mobile manipulation. It targets object manipulation from vision and natural-language instructions; no code, benchmark, or robot-test details are given here.

Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds arxiv High

Andrew Choi, Xinjie Wang, Zhizhong Su, Wei Xu · Submitted 2026-03-19 · Updated 2026-03-28

#RL #VLA #Other

Method for scaling RL fine-tuning of robot vision-language-action models via generative 3D simulation. The excerpt frames it against real-world VLA fine-tuning, but gives no numbers, benchmarks, code release, or robot-test details.

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation arxiv High

Ruisen Tu, Arth Shukla, Sohyun Yoo, Xuanlin Li, Junxi Li et al. · Submitted 2026-03-24

#VLA #Manipulator #MobileManipulator

SG-VLA is a spatially grounded VLA method for mobile manipulation in household scenes. It targets layout reasoning, fine geometry, and continuous high-dimensional actions; no numbers, code, or robot-test details are given here.

REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning arxiv High

Zhaoyuan Gu, Yipu Chen, Zimeng Chai, Alfred Cueva, Thong Nguyen et al. · Submitted 2026-03-14 · Updated 2026-03-17

#RL #IL #Manipulator #Humanoid

REFINE-DP fine-tunes diffusion policies with reinforcement learning for humanoid loco-manipulation. It targets coordinated motion planning and stable whole-body execution for complex, long-horizon tasks; code, benchmark, and real-robot details aren’t specified.

D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models arxiv High NEW

Yucheng Guo, Yongjian Guo, Zhong Guan, Wen Huang, Haoran Sun et al. · Submitted 2026-05-13 · Updated 2026-05-14

#RL #VLA #Other

D-VLA is a distributed asynchronous reinforcement-learning framework for vision-language-action models. It targets high-concurrency embodied-AI training, but the excerpt gives no benchmarks, robot tests, or code-release details.

ProgressVLA: Progress-Guided Diffusion Policy for Vision-Language Robotic Manipulation arxiv High

Hongyu Yan, Qiwei Li, Jiaolong Yang, Yadong Mu · Submitted 2026-03-29

#IL #VLA #Manipulator

ProgressVLA is a progress-guided diffusion-policy method for vision-language robotic manipulation. It targets VLA models’ weak progress awareness and heuristic task termination in long-horizon cascaded sub-goal tasks; code, benchmarks, and robot tests are not specified here.

AnchorVLA4D: an Anchor-Based Spatial-Temporal Vision-Language-Action Model for Robotic Manipulation arxiv High

Juan Zhu, Zhanying Shao, Xiaoqi Li, Ethan Morgan, Jiadong Xu et al. · Submitted 2026-03-13

#VLA #Manipulator

AnchorVLA4D is an anchor-based spatial-temporal VLA method for robotic manipulation. It uses visual anchors to improve spatial perception and maintain memory during manipulation; the excerpt gives no results, code, or test details.

Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching arxiv High NEW

Kejia Ren, Gaotian Wang, Andrew S. Morgan, Kaiyu Hang · Submitted 2026-05-10

#Manipulator

arXiv study of zero-shot sim-to-real robot learning for dexterous reactive catching. It targets physics-heavy manipulation under modeling errors and perception noise, using catching as a real-robot transfer testbed.

X-DiffVLA: X-Embodied Diffusion Action Heads for Vision-Language-Action Models arxiv High NEW

Boyu Li, Chaoyi Xu, Haoqi Yuan, Xinrun Xu, Börje F. Karlsson et al. · Submitted 2026-05-24

#VLA #Manipulator

X-DiffVLA targets cross-embodiment robot learning where platforms share a base but differ in end-effectors, using a unified diffusion action head instead of embodiment-specific fine-tuning. It combines Embodiment Forcing, a classifier-free guidance method that steers actions toward the right embodiment-specific functional structure, with Morphological Tree Diffusion to transfer behavior across grippers and dexterous hands. Across RoboCasa and Isaac Gym it improves performance by 15.3% and 12.5%, with real-world tests showing the approach remains robust outside simulation.

Robotic Strawberry Harvesting with Robust Vision and Deep Reinforcement Learning based Sim-to-Real Control arxiv High NEW

Al Bashir, Shao-Yang Chang, Partho Ghose, Prem Raj, Chen-Kang Huang et al. · Submitted 2026-05-22

#RL #Manipulator

HRAttnEdge-YOLO26-seg pairs a high-resolution, attention-augmented strawberry instance segmenter with a target-conditioned PPO controller trained in Isaac Lab and deployed through ROS on a UR10e, replacing much of the usual hand-tuned planner stack with closed-loop learned reaching. The vision model improves segmentation by about 10-14% on self-collected and public datasets, while greenhouse trials harvested 281 strawberries with 96.6% reaching success, 91.3% grasp-and-pull success, and 84.3% overall harvesting success.

Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-Language-Action Models arxiv High

Ruixing Jin, Zicheng Zhu, Ruixiang Ouyang, Sheng Xu, Bo Yue et al. · Submitted 2026-03-24

#VLA #Manipulator

Empirical arXiv study on sim-to-real generalization for dexterous manipulation with vision-language-action models. It examines simulation-generated data as a lower-cost substitute for real-world data; no code or robot benchmark details are given.

VLA-REPLICA: A Low-Cost, Reproducible Benchmark for Real-World Evaluation of Vision-Language-Action Models arxiv High NEW

Alex S. Huang, Jiahui Zhang, Shiqing Tang, Yu Xiang · Submitted 2026-05-20

#IL #VLA #Manipulator

VLA-REPLICA is a low-cost real-world benchmark for evaluating vision-language-action manipulation models using off-the-shelf hardware that labs can assemble independently. It provides a shared task suite, a small demonstration dataset for target-domain adaptation, and protocols for both in-distribution and out-of-distribution evaluation. Experiments with imitation learning and current VLA models show where policies succeed or fail, and matching results across separately built setups suggest the benchmark is reproducible outside a single lab.

Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models arxiv High

Zhilong Zhang, Haoxiang Ren, Yihao Sun, Yifei Sheng, Haonan Wang et al. · Submitted 2026-03-21

#RL #VLA #Other

An arXiv method for world model-based RL finetuning of vision-language-action robot policies. It targets the cost and safety limits of real-world RL; no results, benchmarks, robot tests, or code release are specified in the snippet.

VILAS: A VLA-Integrated Low-cost Architecture with Soft Grasping for Robotic Manipulation arxiv High NEW

Zijian An, Hadi Khezam, Bill Cai, Ran Yang, Shijie Geng et al. · Submitted 2026-05-03

#VLA #Manipulator

VILAS is a low-cost, modular robotic manipulation platform with soft grasping. It targets end-to-end vision-language-action policy learning and deployment on accessible hardware.

GesVLA: Gesture-Aware Vision-Language-Action Model Embedded Representations arxiv High NEW

Wenxuan Guo, Ziyuan Li, Meng Zhang, Yichen Liu, Yimeng Dong et al. · Submitted 2026-05-21

#VLA #Manipulator

GesVLA extends vision-language-action robot policies with pointing gestures as a parallel instruction channel, embedding gesture features directly into the model’s latent space so they influence both target reasoning and action generation. It uses a dual-VLM architecture and a synthetic gesture data pipeline that renders hand models onto real scene images, then trains in two stages for gesture perception and action prediction. In real robot block manipulation, product selection, and produce selection tasks, gestures improve target grounding and interaction efficiency, especially in cluttered scenes with multiple similar objects.

Mobile UMI: Cross-View Diffusion Policy with Decoupled Kinematics for Mobile Manipulation arxiv High NEW

Haoran Huang, Haonan Dong, Huixu Dong · Submitted 2026-05-20

#IL #Manipulator #MobileManipulator

Mobile UMI learns mobile manipulation from portable human demonstrations using separate chest and wrist cameras, then anchors their visual-inertial frames with a one-shot ChArUco calibration so hand motion can be expressed relative to the body and split into SE(3) manipulation and SE(2) base trajectories. At execution time, an asynchronous receding-horizon controller realigns each diffusion-policy action chunk to the robot’s current pose, discarding stale waypoints caused by inference latency. On four long-horizon household tasks it reached 83.8% average success over 100 trials per task, with ablations showing that chest-relative labels and online state matching account for the gains over ACT and Diffusion Policy.
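
One simple way to picture "discarding stale waypoints" is a time-based filter over an action chunk: waypoints scheduled before the moment inference finishes are dropped. This is a sketch under assumed conventions (fixed waypoint spacing, timestamps measured from the observation time); the paper's controller also realigns waypoints to the current pose, which is not shown, and `drop_stale_waypoints` is a hypothetical helper.

```python
def drop_stale_waypoints(chunk, obs_time, now, dt):
    """Keep only waypoints whose scheduled time is still in the future.

    chunk:    list of waypoints predicted from an observation taken at obs_time
    now:      current time, i.e. obs_time plus inference latency
    dt:       assumed spacing between consecutive waypoints, in seconds
    Waypoint i is assumed to be scheduled at obs_time + (i + 1) * dt.
    """
    fresh = []
    for i, waypoint in enumerate(chunk):
        scheduled = obs_time + (i + 1) * dt
        if scheduled > now:
            fresh.append((scheduled, waypoint))
    return fresh


# Five waypoints at 10 Hz; inference took 0.25 s, so the first two are stale.
chunk = ["wp0", "wp1", "wp2", "wp3", "wp4"]
fresh = drop_stale_waypoints(chunk, obs_time=0.0, now=0.25, dt=0.1)
remaining = [wp for _, wp in fresh]
```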

AtomVLA: Scalable Post-Training for Robotic Manipulation via Predictive Latent World Models arxiv High

Xiaoquan Sun, Zetian Xu, Chen Cao, Zonghe Liu, Yihan Sun et al. · Submitted 2026-03-09

#RL #VLA #Manipulator

AtomVLA is an arXiv method for post-training VLA robot policies with predictive latent world models. It claims improved instruction grounding for complex multi-step manipulation, but the snippet gives no numbers, benchmarks, code, or real-robot results.

ReconVLA: An Uncertainty-Guided and Failure-Aware Vision-Language-Action Framework for Robotic Control arxiv High

Lingling Chen, Zongyao Lyu, William J. Beksi · Submitted 2026-04-17

#VLA #Other

ReconVLA is an arXiv VLA framework for robotic control, using uncertainty guidance and failure awareness. It maps visual observations and language instructions to continuous action sequences; no code, benchmark, or robot-test details are provided.

HyperSim: A Holistic Sim-To-Real Framework For Robust Robotic Manipulation arxiv High NEW

Junyi Dong, Haotian Luo, Ziwei Xu, Shengwei Bian, Heng Zhang et al. · Submitted 2026-05-26

#Manipulator

HyperSim tackles robot manipulation sim-to-real transfer as a full pipeline: high-fidelity synthetic environment generation, adversarial trajectory generation to widen data coverage, and joint sim-real training to encourage domain-invariant representations. In 400 real-world executions across ACT and π₀ policies, the full system reached 80% and 95% sim-to-real success rates, and adversarially generated trajectories improved robustness under physical perturbations by 35 percentage points.

AnchorVLA: Anchored Diffusion for Efficient End-to-End Mobile Manipulation arxiv High

Jia Syuen Lim, Zhizhen Zhang, Peter Bohm, Brendan Tidd, Zi Huang et al. · Submitted 2026-04-02

#VLA #Manipulator #MobileManipulator

AnchorVLA is a method for end-to-end mobile manipulation using anchored diffusion. It targets cluttered scenes with multiple valid approach-and-grasp actions; the excerpt gives no code, benchmark, or real-robot details.

Cost-Matching Model Predictive Control for Efficient Reinforcement Learning in Humanoid Locomotion arxiv High

Wenqi Cai, Kyriakos G. Vamvoudakis, Sébastien Gros, Anthony Tzes · Submitted 2026-03-30

#RL #Humanoid

Cost-matching method for humanoid locomotion that combines MPC with reinforcement learning. It aims to make optimal locomotion learning more efficient by aligning RL costs with the MPC objective; the excerpt gives no code, robot tests, or benchmarks.

Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection arxiv High

Junhyeok Rui Cha, Woohyun Cha, Jaeyong Shin, Donghyeon Kim, Jaeheung Park · Submitted 2026-03-23 · Updated 2026-03-25

#Humanoid

Joint torque space perturbation injection is a sim-to-real training method for humanoid locomotion control policies. The excerpt claims it as an alternative to existing transfer methods, but gives no results, code, benchmark, or real-robot validation details.

PAPO-VLA: Planning-Aware Policy Optimization for Vision-Language-Action Models arxiv High NEW

Peizheng Guo, Jingyao Wang, Changwen Zheng, Wenwen Qiang · Submitted 2026-05-19

#VLA #Manipulator

Vision-Language-Action (VLA) models show promising ability in language-guided robotic tasks. However, making VLA policies reliable remains challenging, because a manipulation task is completed through closed-loop interaction, where each action affects subsequent execution.

Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers arxiv High

Snehal Jauhri, Vignesh Prasad, Georgia Chalvatzaki · Submitted 2026-04-14

#RL #Manipulator #MobileManipulator

Offline RL method for whole-body mobile manipulation of articulated objects, coordinating a robot base and arms. Targets doors, drawers, and cupboards using sub-optimal controllers; no numbers, code, benchmark, or real-robot details are given.

Reinforcing VLAs in Task-Agnostic World Models arxiv High NEW

Yucen Wang, Rui Yu, Fengming Zhang, Junjie Lu, Xinyao Qin et al. · Submitted 2026-05-12

#RL #VLA #Other

RL post-training approach for Vision-Language-Action models inside learned task-agnostic world models. Claimed to adapt VLAs to new tasks without costly real-world interaction; no numbers, code release, benchmarks, or robot tests are specified.

Risk-Aware Reinforcement Learning for Mobile Manipulation arxiv High

Michael Groom, James Wilson, Nick Hawes, Lars Kunze · Submitted 2026-03-04

#RL #Manipulator #MobileManipulator

arXiv paper on risk-aware reinforcement learning for mobile manipulation. It targets robots that assess action risks before acting in everyday environments; no code, benchmark, or real-robot details are given.

CycleRL: Sim-to-Real Deep Reinforcement Learning for Robust Autonomous Bicycle Control arxiv High

Gelu Liu, Teng Wang, Zhijie Wu, Junliang Wu, Songyuan Li et al. · Submitted 2026-03-16 · Updated 2026-05-03

#RL #Other

CycleRL is a sim-to-real deep reinforcement learning method for autonomous bicycle control. It targets robust handling of underactuated nonlinear dynamics, model mismatch, and real-world uncertainty. The excerpt gives no numbers, code release, benchmark, or test details.

Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts arxiv High NEW

Zhen Sun, Yongjian Guo, Haoran Sun, Luqiao Wang, Wei Lu et al. · Submitted 2026-05-21

#VLA#Other

Pre-VLA adds a fast runtime verifier in front of VLA policies and world-model rollouts, scoring candidate action chunks for both safety confidence and critic-derived advantage before they are executed or imagined. It uses modality-aware pooling plus a lightweight dual-branch head, then filters or resamples low-quality actions under a compute budget to avoid physical failures and wasted rendering. On LIBERO, it raises RynnVLA-002’s average closed-loop success rate from 30.79% to 37.62%, shortens executions, and verifies each action chunk in about 184 ms.
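
The filter-or-resample step described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not Pre-VLA's implementation: the names `select_chunk`, `propose`, and `verify`, the two thresholds, the call budget, and the fallback rule are all hypothetical, and the toy verifier stands in for the paper's learned dual-branch head.

```python
from typing import Callable, List, Optional, Tuple

Chunk = List[float]  # hypothetical: one action chunk as a flat list of floats


def select_chunk(
    propose: Callable[[], Chunk],
    verify: Callable[[Chunk], Tuple[float, float]],
    safety_min: float = 0.5,
    advantage_min: float = 0.0,
    max_calls: int = 4,
) -> Tuple[Optional[Chunk], int]:
    """Resample until a chunk passes both thresholds or the budget runs out.

    Returns (chunk, verifier_calls). If nothing passes, falls back to the
    candidate with the highest safety score (a hypothetical fallback rule).
    """
    best: Optional[Chunk] = None
    best_safety = float("-inf")
    for calls in range(1, max_calls + 1):
        chunk = propose()
        safety, advantage = verify(chunk)
        if safety >= safety_min and advantage >= advantage_min:
            return chunk, calls  # accepted: execute (or imagine) this chunk
        if safety > best_safety:
            best, best_safety = chunk, safety
    return best, max_calls


def toy_verify(chunk: Chunk) -> Tuple[float, float]:
    # Toy scores only: large actions count as unsafe, positive sums as useful.
    safety = 1.0 - max(abs(a) for a in chunk)
    advantage = sum(chunk)
    return safety, advantage


candidates = iter([[0.9, 0.8], [-0.2, -0.1], [0.1, 0.3]])
chunk, calls = select_chunk(lambda: next(candidates), toy_verify)
```

In this toy run the first candidate fails the safety threshold, the second fails the advantage threshold, and the third is accepted on the third verifier call. Capping `max_calls` is one simple way to express a compute budget; the paper may budget differently.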

RePO-VLA: Recovery-Driven Policy Optimization for Vision-Language-Action Models arxiv High NEW

Weijia Liufu, Xiaoyu Guo, Ruiyi Chen, Jingzhi Liu, Kaidong Zhang et al. · Submitted 2026-05-10

#VLA#Manipulator

RePO-VLA is a recovery-driven optimization method for vision-language-action robot policies. It targets execution drift in long-horizon, contact-rich manipulation by using failed rollouts as supervision; code or robot benchmarks are not specified.

ForgeVLA: Federated Vision-Language-Action Learning without Language Annotations arxiv High NEW

Yuhao Zhou, Yunpeng Zhu, Yang Zhou, Jindi Lyu, Jian Lan et al. · Submitted 2026-05-08

#VLA#Other

ForgeVLA is an arXiv method for federated vision-language-action learning without language annotations. It targets the annotated-data bottleneck in scaling general-purpose robotic VLA models. The excerpt gives no numbers, code release, real-robot tests, or benchmark details.

AT-VLA: Adaptive Tactile Injection for Enhanced Feedback Reaction in Vision-Language-Action Models arxiv High NEW

Xiaoqi Li, Muhe Cai, Jiadong Xu, Juan Zhu, Hongwei Fan et al. · Submitted 2026-05-08

#VLA#Manipulator

AT-VLA is an arXiv method for injecting adaptive tactile feedback into vision-language-action robot models. It targets better reactions in contact-rich manipulation, but the excerpt gives no metrics, benchmark, code, or real-robot release details.

E-VLA: Event-Augmented Vision-Language-Action Model for Dark and Blurred Scenes arxiv High

Jiajun Zhai, Hao Shi, Shangwei Guo, Kailun Yang, Kaiwei Wang · Submitted 2026-04-06

#VLA#Manipulator

E-VLA is an event-augmented vision-language-action method for robotic manipulation under degraded sensing. It targets low light, motion blur, and black clipping; no metrics, code, benchmark, or real-robot details are stated.

PhysiFlow: Physics-Aware Humanoid Whole-Body VLA via Multi-Brain Latent Flow Matching and Robust Tracking arxiv High

Weikai Qin, Sichen Wu, Ci Chen, Mengfan Liu, Linxi Feng et al. · Submitted 2026-03-05

#VLA#Humanoid

PhysiFlow is a physics-aware method for humanoid whole-body vision-language-action control. It targets semantically guided real-world tasks via multi-brain latent flow matching and robust tracking; code, benchmarks, and robot tests are not specified.

QuietWalk: Physics-Informed Reinforcement Learning for Ground Reaction Force-Aware Humanoid Locomotion Under Diverse Footwear arxiv High

Hanze Hu, Luying Feng, Silu Chen, Tianjiang Zheng, Dexin Jiang et al. · Submitted 2026-04-26

#RL#Humanoid

Physics-informed RL method for humanoid locomotion that accounts for ground reaction forces across diverse footwear. It targets reduced foot-ground impact transients, vibration, noise, and hardware wear; no code or real-robot validation is stated.

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance arxiv High

Yupeng Zheng, Xiang Li, Songen Gu, Yuhang Zheng, Shuai Tian et al. · Submitted 2026-04-22 · Updated 2026-04-24

#VLA#Manipulator

PokeVLA is an arXiv method for a pocket-sized vision-language-action model for robot manipulation. It targets efficiency, high-level knowledge, and spatial awareness via world-knowledge guidance; code, benchmarks, and real-robot results are not specified in the snippet.

Multi-Gait Learning for Humanoid Robots Using Reinforcement Learning with Selective Adversarial Motion Prior arxiv High

Yuanye Wu, Keyi Wang, Linqi Ye, Boyang Xing · Submitted 2026-04-21

#RL#Humanoid

Reinforcement-learning method for multi-gait humanoid locomotion using a Selective Adversarial Motion Prior. It targets diverse gaits in one framework while balancing stability with dynamic expressiveness. Code, real-robot tests, and benchmarks are not specified.

Closed-Loop Sim-to-Real Reinforcement Learning for Deformable Microfiber Shape Control arxiv High NEW

Alessandro Amici, Houari Bettahar, Veeti Jaakkola, Quan Zhou · Submitted 2026-05-20

#RL#Manipulator

A reinforcement-learning policy is trained only in a simplified frictionless simulator to regulate microfiber geometry, then deployed directly on a real dual-gripper micromanipulation setup using 40 Hz visual feedback to correct surface-contact effects that the simulator does not model. On silk microfibers, the same closed-loop policy reaches 270 ± 80 μm mean point-wise shape error over 24 initial configurations and stays below 1 mm final error across nine diameter/length combinations without retraining or tuning.

GaussianDream: A Feed-Forward 3D Gaussian World Model for Robotic Manipulation arxiv High NEW

Zijian Zhang, Yuqing Jiang, Qian Cheng, Si Liu, Ding Zhao et al. · Submitted 2026-05-20

#RL#VLA#Manipulator

GaussianDream adds a feed-forward 3D Gaussian world-model plug-in to VLA manipulation policies, using robot trajectories to train a compact spatio-temporal prefix that can reconstruct current scenes and predict short-horizon future Gaussian states. During inference it throws away the rendering and prediction heads and uses only that learned prefix to condition action generation, so closed-loop control gets geometry-aware training benefits without test-time rollouts or planning. It reports 98.4% average success on LIBERO, 52.6% on RoboCasa Human-50, and 50.0% in real-world robot evaluation.

RoVLA: Multi-Consistency Constraints for Robust Vision-Language-Action Models arxiv High NEW

Jingzhou Luo, Yifan Wen, Yongjie Bai, Xinshuai Song, Yang Liu et al. · Submitted 2026-05-19

#VLA#Manipulator

Vision-Language-Action (VLA) models have shown strong performance on embodied manipulation, yet they remain brittle under visual observation changes, paraphrased language instructions, and compounded perturbations.

Domain-Adaptive Communication-Rate Optimization for Sim-to-Real Humanoid-Robot Wireless XR Teleoperation arxiv High NEW

Caolu Xu, Zhiyong Chen, Meixia Tao, Li Song, Feng Yang et al. · Submitted 2026-05-19

#Humanoid

Wireless extended reality (XR) teleoperation provides embodied interaction capability for collecting humanoid robot demonstrations, but large-scale adoption is restricted by the overhead of high-frequency motion transmission.

AffordVLA: Injecting Affordance Representations into Vision-Language-Action Models via Implicit Feature Alignment arxiv High NEW

Weijie Kong, Zhian Su, Wei Yu, Huixu Dong · Submitted 2026-05-17

#VLA#Manipulator

AffordVLA is a method that aligns VLA hidden visual features with a zero-shot affordance teacher. It reports state-of-the-art results over baselines while preserving inference efficiency, with validation in simulation and real-world manipulation; no code release is noted.

DyGRO-VLA: Cross-Task Scaling of Vision-Language-Action Models via Dynamic Grouped Residual Optimization arxiv High NEW

Sixu Lin, Yunpeng Qing, Litao Liu, Ming Zhou, Ruixing Jin et al. · Submitted 2026-05-17

#RL#VLA#Other

DyGRO-VLA is an arXiv method for optimizing vision-language-action models with dynamic grouped residual optimization. It uses RL to move VLA training from trajectory imitation toward active learning in task environments; the excerpt states no code, benchmarks, or robot tests.

HALO: Closing Sim-to-Real Gap for Heavy-loaded Humanoid Agile Motion Skills via Differentiable Simulation arxiv High

Xingyi Wang, Chenyun Zhang, Weiji Xie, Chao Yu, Wei Song et al. · Submitted 2026-03-16

#Humanoid

HALO is a differentiable-simulation method for heavy-loaded humanoid agile motion skills. It targets sim-to-real mismatch from unknown carried payloads, aiming to keep RL-trained behaviors effective for real-world humanoid carrying tasks.

Feedback World Model Enables Precise Guidance of Diffusion Policy arxiv High NEW

Tuo An, Jindou Jia, Gen Li, Jingliang Li, Chuhao Zhou et al. · Submitted 2026-05-15

#RL#IL#Manipulator

Feedback world model turns a robot’s world model into a closed-loop predictor: after each action, it compares the predicted next state with the observed one and updates a lightweight latent feedback state online, without new data or parameter updates. The corrected predictions are then fed into diffusion policy guidance through an action-aware mechanism that emphasizes controllable state components and filters out irrelevant variation. Across LIBERO-Plus, Robomimic, and real-world manipulation, this reduced prediction error by up to 76.4% and improved out-of-distribution success rates by 30%.
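
The closed-loop idea in this summary (compare prediction with observation, then update a small feedback state while the model's weights stay frozen) can be sketched with a deliberately simple stand-in. The code below uses an exponential moving average of prediction residuals, which is a swapped-in technique and not the paper's latent feedback mechanism; the class name `FeedbackCorrector`, the `gain` parameter, and the toy dynamics are all hypothetical.

```python
from typing import List

Vector = List[float]


class FeedbackCorrector:
    """Keeps a running correction that is added to a frozen model's predictions."""

    def __init__(self, dim: int, gain: float = 0.5) -> None:
        self.feedback: Vector = [0.0] * dim  # feedback state, starts at zero
        self.gain = gain  # hypothetical step size in (0, 1]

    def correct(self, predicted: Vector) -> Vector:
        # Corrected prediction = frozen model output + current feedback state.
        return [p + f for p, f in zip(predicted, self.feedback)]

    def update(self, predicted: Vector, observed: Vector) -> None:
        # Move the feedback state toward the residual of the uncorrected prediction.
        for i, (p, o) in enumerate(zip(predicted, observed)):
            residual = o - p
            self.feedback[i] += self.gain * (residual - self.feedback[i])


def frozen_model(state: Vector) -> Vector:
    # Toy world model with a constant bias: it under-predicts each step by 0.3.
    return [s + 1.0 for s in state]


def true_step(state: Vector) -> Vector:
    # Toy ground-truth dynamics.
    return [s + 1.3 for s in state]


corrector = FeedbackCorrector(dim=1)
state: Vector = [0.0]
errors: List[float] = []
for _ in range(10):
    raw = frozen_model(state)
    corrected = corrector.correct(raw)
    observed = true_step(state)
    errors.append(abs(observed[0] - corrected[0]))
    corrector.update(raw, observed)  # no model parameters change here
    state = observed
```

In this toy setup the corrected prediction error shrinks at every step because the feedback state converges toward the model's constant bias. A real system would face state-dependent errors, which is presumably why the paper uses a learned latent state rather than a fixed-gain average.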

NavRL++: A System-Level Framework for Improving Sim-to-Real Transfer in Reinforcement Learning-Based Robot Navigation arxiv High NEW

Zhefan Xu, Hanyu Jin, Kenji Shimada · Submitted 2026-05-15

#RL#Other

NavRL++ is a system-level framework for reinforcement-learning-based robot navigation. It targets better sim-to-real transfer, but the excerpt gives no results, code, benchmark, or real-robot validation details.

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation arxiv High NEW

Shijie Lian, Bin Yu, Xiaopeng Lin, Zhaolong Shen, Laurence Tianruo Yang et al. · Submitted 2026-05-14

#IL#VLA#Manipulator

IntentVLA addresses action aliasing in robot imitation learning, where nearly identical visual-language inputs can legitimately lead to different short-horizon action chunks depending on a demonstrator’s immediate intent or task phase. It models those latent short-horizon intents explicitly, aiming to make vision-language-action policies less likely to average over multimodal demonstrations and more able to choose the right manipulation behavior for the current context.

AR-VLA: True Autoregressive Action Expert for Vision-Language-Action Models arxiv High

Yutong Hu, Jan-Nico Zaech, Nikolay Nikolov, Yuanqi Yao, Sombit Dey et al. · Submitted 2026-03-10 · Updated 2026-05-11

#VLA#Other

AR-VLA introduces a standalone autoregressive action expert for vision-language-action models, treating robot actions as a continuous causal sequence rather than a one-shot prediction. It conditions that action stream on refreshable vision-language prefixes, letting the model update its perception-language context while preserving autoregressive structure over actions.

LoopVLA: Learning Sufficiency in Recurrent Refinement for Vision-Language-Action Models arxiv High NEW

Boyang Shen, Kaixiang Yang, Hao Wang, Qiuyu Yu, Qiang Xie et al. · Submitted 2026-05-11

#VLA#Other

LoopVLA is a method for Vision-Language-Action models that challenges using only the deepest vision-language backbone features for action prediction. It learns when recurrent refinement is sufficient; the excerpt gives no benchmarks, robot tests, or code-release details.

NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models arxiv High

Ziyue Zhu, Shangyang Wu, Shuai Zhao, Zhiqiu Zhao, Shengjie Li et al. · Submitted 2026-03-10

#VLA#Manipulator

NS-VLA is an arXiv method for neuro-symbolic vision-language-action modeling in robotic manipulation. It aims to ground instructions in visual context and generate action sequences; no benchmark, real-robot result, or code-release detail is given.

VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models arxiv High

Zixuan Wang, Yuxin Chen, Yuqi Liu, Jinhui Ye, Pengguang Chen et al. · Submitted 2026-03-23 · Updated 2026-05-09

#VLA#Other

VP-VLA is a visual-prompting interface for vision-language-action models, using visual cues to condition robot action generation beyond language and observations. The excerpt gives no numbers, benchmark or robot results, or code-release status.

StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing arxiv High

StarVLA Community · Submitted 2026-04-06

#RL#VLA#Other

StarVLA is a modular, Lego-like codebase for developing Vision-Language-Action embodied-agent models. It targets perception, language understanding, and action integration using multimodal foundation models, including VLMs and world models.

ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation arxiv High NEW

Wei Li, Jizhihui Liu, Li Yixing, Junwen Tong, Rui Shao et al. · Submitted 2026-05-06

#VLA#Manipulator

ConsisVLA-4D is a VLA method for robotic manipulation aimed at more consistent 3D perception and 4D spatiotemporal reasoning. The excerpt frames current 2D-observation VLAs as sensor-heavy or weakly instruction-aligned, but gives no code, benchmark, or robot-test details.

MuGen: Multi-Skill Generative Locomotion Controller for Humanoid Robots arxiv High NEW

Yusen Feng, Xiang Wang, Heyuan Yao, Zixi Kang, Xinyu Huo et al. · Submitted 2026-05-23

#RL#Humanoid

MuGen learns a reusable latent motion space for humanoid locomotion by training VQ-VAEs with model-based reinforcement learning on hours of heterogeneous human performance data. A teacher-student distillation setup turns that generative representation into a deployable policy, letting the robot track unseen human motion sequences and reuse the learned skills for other locomotion tasks. The result is a humanoid controller that can execute a diverse set of expressive, human-like motions accurately rather than being limited to a fixed menu of gaits.

GazeVLA: Learning Human Intention for Robotic Manipulation arxiv High

Chengyang Li, Kaiyi Xiong, Yuan Xu, Lei Qian, Yizhou Wang et al. · Submitted 2026-04-24 · Updated 2026-04-30

#VLA#Manipulator

GazeVLA is a robotic manipulation method for learning human intention, apparently via gaze, to reduce reliance on large robot-demo datasets. The excerpt gives no metrics, benchmarks, code release, or real-robot evidence.

MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation arxiv High

Yang Liu, Pengxiang Ding, Tengyue Jiang, Xudong Wang, Wenxuan Song et al. · Submitted 2026-03-26 · Updated 2026-03-27

#VLA#Manipulator

MMaDA-VLA is a large diffusion vision-language-action model for robot manipulation from visual observations and natural-language instructions. It targets unified multimodal instruction and generation; no code, benchmark, or real-robot result is stated here.

RedVLA: Physical Red Teaming for Vision-Language-Action Models arxiv High

Yuhao Zhang, Borong Zhang, Jiaming Fan, Jiachen Shen, Yishuai Cai et al. · Submitted 2026-04-24

#VLA#Other

RedVLA is a physical red-teaming method for Vision-Language-Action models, aimed at surfacing safety risks before robot deployment. It targets failures that could cause unpredictable, irreversible physical harm.

OmniVLA-RL: A Vision-Language-Action Model with Spatial Understanding and Online RL arxiv High

Haoxiang Jie, Yaoyuan Yan, Xiangyu Wei, Kailin Wang, Hongjie Yan et al. · Submitted 2026-04-20 · Updated 2026-04-24

#RL#VLA#Other

OmniVLA-RL is a vision-language-action method for embodied control with spatial understanding and online RL. It targets spatial perception, multimodal fusion, and RL stability issues; the blurb gives no numbers, code, robot tests, or benchmarks.

TacVLA: Contact-Aware Tactile Fusion for Robust Vision-Language-Action Manipulation arxiv High

Kaidi Zhang, Heng Zhang, Zhengtong Xu, Zhiyuan Zhang, Md Rakibul Islam Prince et al. · Submitted 2026-03-13 · Updated 2026-03-24

#VLA#Manipulator

TacVLA is an arXiv method for contact-aware tactile fusion in VLA robot manipulation. It targets robustness under occlusion, fine manipulation, and physical contact; no numbers, benchmark, real-robot, or code details are given.

UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling arxiv High

Boyu Chen, Yi Chen, Lu Qiu, Jerry Bai, Yuying Ge et al. · Submitted 2026-04-21

#RL#Humanoid

UniT is an arXiv method for using egocentric human data in humanoid policy learning and world modeling. It targets cross-embodiment kinematic mismatch, but the excerpt gives no metrics, benchmarks, real-robot results, or code-release details.

ST-π: Structured SpatioTemporal VLA for Robotic Manipulation arxiv High

Chuanhao Ma, Hanyu Zhou, Shihan Peng, Yan Li, Tao Gu et al. · Submitted 2026-04-20

#VLA#Manipulator

ST-π is a structured spatiotemporal VLA method for robotic manipulation. It targets fine-grained manipulation where general VLA models still struggle; no benchmark, real-robot, or code-release detail is given.

VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation arxiv High

Xinglei Yu, Zhenyang Liu, Shufeng Nan, Simo Wu, Yanwei Fu · Submitted 2026-04-17

#IL#Manipulator

VADF is a vision-adaptive diffusion policy framework for robotic manipulation. It targets hard-negative class imbalance from uniform sampling and difficulty-blind training to speed convergence and reduce inference timeouts. Code, benchmarks, and robot tests are not stated.

KineVLA: Towards Kinematics-Aware Vision-Language-Action Models with Bi-Level Action Decomposition arxiv High

Gaoge Han, Zhengqing Gao, Ziwen Li, Jiaxin Huang, Shaoli Huang et al. · Submitted 2026-03-18

#VLA#Manipulator

KineVLA addresses a kinematics-rich vision-language-action task for robot manipulation, in which commands densely encode direction, trajectory, orientation, and relative displacement from start to finish. The excerpt gives no code, benchmark, or real-robot details.

HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing arxiv High

Konstantin Gubernatorov, Mikhail Sannikov, Ilya Mikhalchuk, Egor Kuznetsov, Makar Artemov et al. · Submitted 2026-03-16

#VLA#Manipulator

HapticVLA is a VLA method for contact-rich robot manipulation without inference-time tactile sensing. It targets dexterous, safer manipulation while avoiding dedicated tactile hardware, making deployment cheaper and more reproducible across platforms.

WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning arxiv High

Mintae Kim, Koushil Sreenath · Submitted 2026-04-10 · Updated 2026-04-15

#RL#Other

WOMBET targets offline-to-online reinforcement learning for robotics by focusing on how transferable experience should be generated, rather than assuming a fixed source dataset already exists. It uses a world model to produce experience for transfer from a source task to a target task, aiming to make robot RL more sample-efficient and robust when real-world data collection is costly or risky.

StarVLA-α: Reducing Complexity in Vision-Language-Action Systems arxiv High

Jinhui Ye, Ning Gao, Senqiao Yang, Jinliang Zheng, Zixuan Wang et al. · Submitted 2026-04-13

#VLA#Other

StarVLA-α is a minimalist VLA baseline for studying design choices without extra architectural tricks. A single generalist model stays competitive on LIBERO, SimplerEnv, RoboTwin and RoboCasa and beats π0.5 by 20% on real-world RoboChallenge; code is promised.

Real-Time Whole-Body Teleoperation of a Humanoid Robot Using IMU-Based Motion Capture with Sim2Sim and Sim2Real Validation arxiv High NEW

Hamza Ahmed Durrani, Suleman Khan · Submitted 2026-05-12

#Humanoid

IMU-based motion-capture method for real-time whole-body teleoperation of a humanoid robot. Claims stable, low-latency control under morphology mismatch, IMU noise, latency, and sim-to-real gaps, with Sim2Sim and Sim2Real validation.

ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models arxiv High

Nastaran Darabi, Amit Ranjan Trivedi · Submitted 2026-04-10

#VLA#Other

ProGAL-VLA is a method for aligning vision-language-action robot models using prospective reasoning. It targets language ignorance and visual shortcuts, but the excerpt gives no code, benchmark, or real-robot details.

UAV-Track VLA: Embodied Aerial Tracking via Vision-Language-Action Models arxiv High

Qiyao Zhang, Shuhua Zheng, Jianli Sun, Chengxiang Li, Xianke Wu et al. · Submitted 2026-04-02 · Updated 2026-04-10

#VLA#Other

UAV-Track VLA is a Vision-Language-Action method for embodied visual tracking on UAVs. It targets dynamic urban tasks with semantic language requirements and continuous action generation; code, benchmarks, or robot tests are not specified.

SteadyTray: Learning Object Balancing Tasks in Humanoid Tray Transport via Residual Reinforcement Learning arxiv High

Anlun Huang, Zhenyu Wu, Soofiyan Atar, Yuheng Zhi, Michael Yip · Submitted 2026-03-11

#RL#Humanoid

SteadyTray is a residual reinforcement learning method for object balancing during humanoid tray transport. It targets unsecured payload stability under locomotion-induced oscillations; no metrics, code release, or real-robot validation details are given.

GST-VLA: Structured Gaussian Spatial Tokens for 3D Depth-Aware Vision-Language-Action Models arxiv High

Md Selim Sarowar, Omer Tariq, Sungho Kim · Submitted 2026-03-10

#VLA#Other

GST-VLA replaces ordinary 2D visual patch tokens in vision-language-action models with structured Gaussian spatial tokens that carry explicit 3D depth-aware geometry. From the provided excerpt, the concrete mechanism beyond that token representation and the empirical results are not specified, but the direction is to give robotic policies visual tokens with intrinsic spatial structure rather than forcing geometry to be inferred from flat image patches.

VLA-InfoEntropy: A Training-Free Vision-Attention Information Entropy Approach for Vision-Language-Action Models Inference Acceleration and Success arxiv High

Chuhang Liu, Yayun He, Zuheng Kang, Xiaoyang Qu, Jianzong Wang · Submitted 2026-04-07

#VLA#Other

VLA-InfoEntropy is a training-free inference method for vision-language-action models using vision-attention information entropy. It claims faster inference and higher success, but no numbers, code, benchmarks, or robot tests are given here.

AnyCamVLA: Zero-Shot Camera Adaptation for Viewpoint Robust Vision-Language-Action Models arxiv High

Hyeongjun Heo, Seungyeon Woo, Sang Min Kim, Junho Kim, Junho Lee et al. · Submitted 2026-03-06

#VLA#Manipulator

AnyCamVLA is a zero-shot camera-adaptation method for robot vision-language-action manipulation models. It aims to make fine-tuned VLAs robust to camera viewpoint changes in unstructured environments without further environment-specific tuning.

Safe-Night VLA: Seeing the Unseen via Thermal-Perceptive Vision-Language-Action Models for Safety-Critical Manipulation arxiv High

Dian Yu, Qingchuan Zhou, Bingkun Huang, Majid Khadiv, Zewen Yang · Submitted 2026-03-05

#VLA#Manipulator

An arXiv method for VLA robot manipulation that adds thermal perception to RGB-centric models. It targets safety-critical/night scenarios by sensing heat cues conventional cameras miss; the excerpt gives no code, benchmark, or real-robot results.

Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture arxiv High NEW

Luis F. W. Batista, Stéphanie Aravecchia, Cédric Pradalier · Submitted 2026-05-04

#RL#Other

Autonomous surface vessels for floating-waste capture are treated as a full closed-loop problem, combining reinforcement-learning control with onboard perception under changing hydrodynamics, disturbances, and difficult water-surface sensing. The study focuses on sim-to-real transfer and robustness evaluation, testing whether policies trained in simulation can still steer an ASV effectively when perception errors and real-world environmental variation enter the loop.

Hybrid Framework for Robotic Manipulation: Integrating Reinforcement Learning and Large Language Models arxiv High

Md Saad, Sajjad Hussain, Mohd Suhaib · Submitted 2026-03-31

#RL#Manipulator

A hybrid robotic manipulation method combining reinforcement learning with large language models. It claims improved manipulation performance, but the provided text gives no metrics, benchmark, code release, or real-robot validation.

FocusVLA: Focused Visual Utilization for Vision-Language-Action Models arxiv High

Yichi Zhang, Weihao Yuan, Yizhuo Zhang, Xidong Zhang, Jia Wan · Submitted 2026-03-30

#VLA#Other

FocusVLA is an arXiv method for focused visual use in vision-language-action models. It targets better action generation via rich vision-language conditioning; no results, code, robot tests, or benchmarks are given.

Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines arxiv High

Ziyao Wang, Bingying Wang, Hanrong Zhang, Tingting Du, Tianyang Chen et al. · Submitted 2026-04-24

#VLA#Other

Vision-Language-Action models are framed here through the data systems that make embodied learning possible: datasets, benchmarks, and the “data engines” that collect, curate, evaluate, and scale robot experience. Rather than treating performance gains as only a modeling problem, the survey focuses on the infrastructure bottlenecks behind VLA robotics and organizes the field around how data is built and used.

X2-N: A Transformable Wheel-legged Humanoid Robot with Dual-mode Locomotion and Manipulation arxiv High

Yan Ning, Xingzhou Chen, Delong Li, Hao Zhang, Hanfu Gai et al. · Submitted 2026-04-23

#Manipulator#Humanoid

arXiv paper on X2-N, a transformable wheel-legged humanoid for dual-mode locomotion and manipulation. It claims wheeled efficiency plus legged versatility for rapid traversal across continuous and discrete terrain. Release or test details are not stated.

Jump-Start Reinforcement Learning with Vision-Language-Action Regularization arxiv High

Angelo Moroncelli, Roberto Zanetti, Marco Maccarini, Loris Roveda · Submitted 2026-04-15

#RL#VLA#Manipulator

Vision-language-action regularization method for jump-starting robotic manipulation RL. It targets long-horizon tasks with sparse or imperfect rewards by improving exploration and credit assignment; no code, robot tests, or benchmark results are stated.

VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning arxiv High

Chaoyang Wang, Wenrui Bao, Sicheng Gao, Bingxin Xu, Yu Tian et al. · Submitted 2026-03-15

#VLA#Other

VLA-Thinker is a method for vision-language-action models that adds image-aware reasoning instead of text-only chain-of-thought over static visuals. The text claims improved embodied-AI reasoning, but gives no metrics, benchmarks, real-robot tests, or code details.

RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation arxiv High

Feng Jiang, Yang Chen, Kyle Xu, Yuchen Liu, Haifeng Wang et al. · Submitted 2026-04-21 · Updated 2026-05-14

#RL#Manipulator

RoboWM-Bench is a benchmark for evaluating video world models on robotic manipulation. It targets whether realistic future-video prediction can provide useful, scalable supervision for robot learning.

STRONG-VLA: Decoupled Robustness Learning for Vision-Language-Action Models under Multimodal Perturbations arxiv High

Yuhan Xie, Yuping Yan, Yunqi Zhao, Handing Wang, Yaochu Jin · Submitted 2026-04-11 · Updated 2026-04-14

#VLA#Other

STRONG-VLA is a decoupled robustness-learning method for vision-language-action models. It targets joint visual corruption and linguistic noise; the excerpt gives no results, code, or benchmark details.

DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models arxiv High

Zihao Zheng, Hangyu Cao, Sicheng Tian, Jiayu Chen, Maoliang Li et al. · Submitted 2026-03-09 · Updated 2026-03-14

#VLA#Other

DyQ-VLA is a temporal-dynamic-aware quantization method for embodied vision-language-action models. It targets VLA inference overhead; the excerpt gives no speed, benchmark, robot-test, or code-release details.

PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models arxiv High NEW

Xinyu Guo, Bin Xie, Wei Chai, Xianchi Deng, Tiancai Wang et al. · Submitted 2026-05-11

#VLA#Manipulator

PriorVLA is an adaptation method for VLA robot manipulation models that keeps a frozen prior expert while training a downstream expert. It updates 25% as many parameters as full fine-tuning and reports 99.1% success on LIBERO, an 11-point gain over π0.5 on RoboTwin 2.0-Hard, and real-robot tests across 8 tasks and 2 embodiments.

FutureVLA: Joint Visuomotor Prediction for Vision-Language-Action Model arxiv High

Xiaoxu Xu, Hao Li, Jinhui Ye, Yilun Chen, Jia Zeng et al. · Submitted 2026-03-11

#VLA#Other

FutureVLA targets vision-language-action models with an explicit future-prediction component, treating robot control as a coupled problem of seeing environmental geometry and anticipating how motor execution will unfold within it. The idea is to improve embodied agents by jointly modeling visuomotor futures rather than separating perception from action planning, though the provided excerpt does not include experiments, benchmarks, or quantitative results.

SELF-VLA: A Skill Enhanced Agentic Vision-Language-Action Framework for Contact-Rich Disassembly arxiv High

Chang Liu, Sibo Tian, Xiao Liang, Minghui Zheng · Submitted 2026-03-10

#VLA#Other

SELF-VLA targets automated disassembly of end-of-life electronics, where robots must handle contact-rich steps rather than just follow a fixed decomposition into subtasks. From the available abstract snippet, it appears to frame disassembly as an agentic vision-language-action problem augmented with skills, but the provided text does not include the specific mechanism, experiments, or results.

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation arxiv High

Tingjun Dai, Mingfei Han, Tingwen Du, Zhiheng Liu, Zhihui Li et al. · Submitted 2026-03-10

#VLA#Manipulator

Progress-aware vision-language-action method for robotic manipulation that tracks progress through explicit milestones. It claims stronger robustness via status grounding, intermediate-state prediction, and stall recovery; no code or benchmark details are given.

Scaling Tasks, Not Samples: Mastering Humanoid Control through Multi-Task Model-Based Reinforcement Learning arxiv High

Shaohuai Liu, Weirui Ye, Yilun Du, Le Xie · Submitted 2026-03-02

#RL#Humanoid

The paper argues that humanoid control should be scaled through the number and diversity of interactive tasks rather than by relying mainly on larger models or offline datasets. It frames multi-task model-based reinforcement learning as a route to more general humanoid skill acquisition, where the robot learns from active interaction across tasks instead of passively imitating fixed data.

PCHC: Enabling Preference Conditioned Humanoid Control via Multi-Objective Reinforcement Learning arxiv High

Huanyu Li, Dewei Wang, Xinmiao Wang, Xinzhe Liu, Peng Liu et al. · Submitted 2026-03-25

#RL#Humanoid

PCHC is a multi-objective RL method for humanoid control that conditions one policy on preference vectors to trade off goals like speed and energy. It is validated on two humanoid tasks in simulation and real-world experiments; no code release is indicated.
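
Conditioning one policy on a preference vector is often illustrated with linear scalarization, sketched below. This is a common multi-objective RL baseline and an assumption here, not necessarily what PCHC does; the function names, the two-objective reward layout (speed, energy), and the choice to append the preference to the observation are all hypothetical.

```python
from typing import List, Sequence


def normalize_preference(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("preference weights must be non-negative and sum to > 0")
    return [w / total for w in weights]


def scalarize(rewards: Sequence[float], preference: Sequence[float]) -> float:
    """Weighted sum of per-objective rewards (linear scalarization)."""
    return sum(r * w for r, w in zip(rewards, preference))


def conditioned_observation(obs: Sequence[float], preference: Sequence[float]) -> List[float]:
    """Append the preference so a single policy can see which trade-off is requested."""
    return list(obs) + list(preference)


# Hypothetical per-step rewards: [speed term, energy term (negative = cost)].
step_rewards = [2.0, -1.0]

speed_first = normalize_preference([3.0, 1.0])   # favors speed
energy_first = normalize_preference([1.0, 3.0])  # favors low energy use

reward_speed_first = scalarize(step_rewards, speed_first)
reward_energy_first = scalarize(step_rewards, energy_first)
policy_input = conditioned_observation([0.1, 0.2], speed_first)
```

The same step yields a different training signal depending on the requested trade-off, which is what lets one policy cover a range of behaviors at deployment by changing only the preference input.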

Spatial Memory for Out-of-Vision Manipulation in Vision-Language-Action arxiv High NEW

Pengteng Li, Weiyu Guo, He Zhang, Tiefu Cai, Xiao He et al. · Submitted 2026-05-21

#VLA#Manipulator#Bimanual

SOMA gives vision-language-action models a persistent spatial memory built from multi-view scans with a movable head camera, so a robot can reason about objects that are no longer in its current view instead of repeatedly searching reactively. It builds and refines a global spatial-semantic representation over time, then retrieves instruction-relevant cues during manipulation. In five real-world out-of-vision manipulation tasks, including multi-step and dual-arm setups, it improves success rates and changes behavior qualitatively, with faster target localization, less viewpoint search, and near one-shot grasping under partial observability.

StableVLA: Towards Robust Vision-Language-Action Models without Extra Data arxiv High NEW

Yiyang Fu, Chubin Zhang, Shukai Gong, Yufan Deng, Kaiwei Sun et al. · Submitted 2026-05-18

#VLA#Other

StableVLA is a method for improving VLA robustness without extra training data. It targets unseen visual disturbances and imperfect visual conditions; no code, benchmark, or robot-test details are specified.

Can VLA Models Learn from Real-World Data Continually without Forgetting? arxiv High NEW

Jiarun Zhu, Yijun Hong, Xiaoquan Sun, Zetian Xu, Mingqi Yuan et al. · Submitted 2026-05-26

#VLA#Manipulator

Real-world continual learning for vision-language-action robot policies is tested on a new dataset of four sequential manipulation tasks, covering pick-and-place, contact-rich pressing, and deformable-object folding. The experiments show that VLA models badly forget previously learned behaviors when trained continually on heterogeneous real demonstrations. The authors then analyze experience replay to identify the implementation details that make it work.

OHP-RL: Online Human Preference as Guidance in Reinforcement Learning for Robot Manipulation arxiv High NEW

Yunyang Mo, Jian Li, Qiwei Wu, Yihang Kang, Renjing Xu · Submitted 2026-05-15

#RL#Manipulator

OHP-RL is a human-in-the-loop RL method for robot manipulation that treats interventions as online preferences, using a state-dependent gate to decide when they shape policy learning. It reports higher success, faster convergence, and less human effort on three real-world contact-rich Franka tasks.

AerialVLA: A Vision-Language-Action Model for UAV Navigation via Minimalist End-to-End Control arxiv High

Peng Xu, Zhengnan Deng, Jiayan Deng, Zonghua Gu, Shaohua Wan · Submitted 2026-03-15

#VLA#Other

AerialVLA is an end-to-end vision-language-action model for UAV navigation that maps visual observations and language instructions directly to continuous flight control, avoiding the dense oracle guidance and auxiliary object detectors used in more hierarchical VLN systems. The interesting angle is its minimalist control pipeline for dynamic 3D aerial environments, which aims to reduce semantic handoff gaps between perception, language grounding, and low-level action.

RotVLA: Rotational Latent Action for Vision-Language-Action Model arxiv High NEW

Qiwei Li, Xicheng Gong, Xinghang Li, Peiyan Li, Quanyun Zhou et al. · Submitted 2026-05-13

#VLA#Other

RotVLA is a VLA pretraining method built around rotational latent actions for unifying heterogeneous robot action data. The excerpt claims that shared action spaces in the style of latent action models (LAMs) help across embodiments, but gives no benchmarks, robot results, or code release.

ST-VLA: Enabling 4D-Aware Spatiotemporal Understanding for General Robot Manipulation arxiv High

You Wu, Zixuan Chen, Cunxu Ou, Wenxuan Wang, Wenbo Huang et al. · Submitted 2026-03-14

#VLA#Manipulator

ST-VLA targets general robot manipulation by giving a vision-language-action model explicit 4D spatiotemporal awareness, so it can reason jointly about object semantics, scene geometry, and how actions unfold over longer horizons. From the provided text, the emphasis is on open-world manipulation settings where understanding both spatial structure and temporal dynamics is necessary for reliable action planning.

Altered Thoughts, Altered Actions: Probing Chain-of-Thought Vulnerabilities in VLA Robotic Manipulation arxiv High

Tuan Duong Trinh, Naveed Akhtar, Basim Azam · Submitted 2026-03-13

#VLA#Manipulator

arXiv paper probing adversarial vulnerabilities in VLA robot manipulation models that generate chain-of-thought (CoT) text plans before actions. It targets the internal reasoning-to-action text channel; no results, benchmarks, code, or real-robot details are given.

SaPaVe: Towards Active Perception and Manipulation in Vision-Language-Action Models for Robotics arxiv High

Mengzhen Liu, Enshen Zhou, Cheng Chi, Yi Han, Shanyu Rong et al. · Submitted 2026-03-12

#VLA#Manipulator

SaPaVe trains a vision-language-action robot model to combine semantic active perception with manipulation, so the robot can choose informative viewpoints in complex scenes and then execute actions robustly across changes in camera pose. The interesting part is the end-to-end formulation: instead of treating perception gathering and control as separate modules, it jointly learns both in a data-efficient way to improve viewpoint-invariant execution.

Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning arxiv High

Jiaheng Hu, Jay Shim, Chen Tang, Yoonchang Sung, Bo Liu et al. · Submitted 2026-03-12

#RL#VLA#Other

The recipe is sequential fine-tuning with LoRA and on-policy RL for continual reinforcement learning in VLA models. Tested on 3 pretrained VLAs and 5 lifelong RL benchmarks, it shows little or no forgetting and often beats more complex CRL methods; code is released.
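For readers unfamiliar with LoRA, the sketch below shows the standard low-rank update it relies on: the pretrained weight W stays frozen and only two small matrices A and B are trained, giving an effective weight W + (alpha / r) · B · A. This is a generic pure-Python illustration with toy 2×2 values, not the paper's code; the helper names `matmul` and `lora_weight` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_weight(w, a, b, alpha):
    """Effective weight W + (alpha / r) * B @ A, where r is the LoRA rank."""
    rank = len(a)              # A has shape (r, d_in)
    scale = alpha / rank
    delta = matmul(b, a)       # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


frozen_w = [[1.0, 0.0], [0.0, 1.0]]   # toy pretrained weight, never updated
lora_a = [[1.0, 2.0]]                 # trainable, shape (r=1, d_in=2)
lora_b_init = [[0.0], [0.0]]          # B starts at zero, so the update starts at zero
lora_b_trained = [[1.0], [0.0]]       # hypothetical values after some training

w_at_init = lora_weight(frozen_w, lora_a, lora_b_init, alpha=1.0)
w_after = lora_weight(frozen_w, lora_a, lora_b_trained, alpha=1.0)
```

Because the base weights are untouched, the adapter is the only part that changes during fine-tuning, with r · (d_in + d_out) trainable parameters instead of d_in · d_out.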

V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation arxiv High

Yaru Liu, Ao-bo Wang, Nanyang Ye · Submitted 2026-04-10

#VLA#Manipulator

V-CAGE targets the data bottleneck in scaling Vision-Language-Action models by generating robotic manipulation data with a vision-closed-loop, agentic pipeline. From the limited text, its focus is on producing datasets that are not just language-consistent, but also physically feasible for manipulation, which is the hard part when synthetic VLA data needs to transfer to real robot behavior.

STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation arxiv High NEW

Yuxuan Tian, Yurun Jin, Bin Yu, Yukun Shi, Hao Wu et al. · Submitted 2026-04-29 · Updated 2026-05-01

#RL#VLA#Manipulator

STARRY is a spatial-temporal, action-centric world-modeling method for robotic manipulation. It targets tighter coupling between future interaction prediction and VLA action execution for precise geometric timing; no code, benchmark, or robot-test details are provided here.

VLA-OPD: Bridging Offline SFT and Online RL for Vision-Language-Action Models via On-Policy Distillation arxiv High

Zhide Zhong, Haodong Yan, Junfeng Li, Junjie He, Tianran Zhang et al. · Submitted 2026-03-27

#RL#VLA#Manipulator

VLA-OPD is a post-training method for VLA robot policies that connects offline SFT with online RL using on-policy distillation. The snippet claims improved deployment reliability for manipulation, but gives no metrics, benchmarks, real-robot results, or code status.

Persistent Robot World Models: Stabilizing Multi-Step Rollouts via Reinforcement Learning arxiv High

Jai Bardhan, Patrik Drozdik, Josef Sivic, Vladimir Petrik · Submitted 2026-03-26

#RL#Manipulator

A robot world-model method for action-conditioned video prediction in manipulation scenes. It uses reinforcement learning to stabilize multi-step rollouts for tasks that are hard to simulate with traditional physics engines.

DualCoT-VLA: Visual-Linguistic Chain of Thought via Parallel Reasoning for Vision-Language-Action Models arxiv High

Zhide Zhong, Junfeng Li, Junjie He, Haodong Yan, Xin Gong et al. · Submitted 2026-03-23

#VLA#Manipulator

DualCoT-VLA is a method for VLA models that adds parallel visual-linguistic chain-of-thought reasoning. It targets multi-step robot tasks needing logical planning and precise, spatially aware manipulation; release/testing details are not stated.

ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors arxiv High

Zifan Xu, Ran Gong, Maria Vittoria Minniti, Ahmet Salih Gundogdu, Eric Rosen et al. · Submitted 2026-03-16 · Updated 2026-04-20

#RL#IL#Other

ExpertGen targets the data bottleneck in robot behavior cloning by learning expert policies in simulation from imperfect behavior priors, rather than relying on large amounts of expensive real-world teleoperation data. The interesting angle is the sim-to-real scaling strategy: it uses weaker prior behaviors as a starting point, then aims to turn them into higher-quality expert policies that can support more robust and generalizable robot learning.

Beyond Action Residuals: Real-World Robot Policy Steering via Bottleneck Latent Reinforcement Learning arxiv High NEW

Dongjie Yu, Kun Lei, Zhennan Jiang, Jia Pan, Huazhe Xu · Submitted 2026-05-19

#RL#Manipulator

Z-Perturbation Reinforcement Learning (ZPRL) adapts pretrained robot imitation policies by freezing the base policy and learning RL residuals in a compact variational bottleneck latent, rather than perturbing actions directly or updating policy weights. The latent is extracted from observation embeddings during offline training and then decoded to condition the frozen action generator during online finetuning, giving exploration a more task-aligned structure. On eight simulation tasks and four real-world manipulation tasks, ZPRL improves sample efficiency and final performance over post-training baselines, including a 33.7% average real-world success-rate gain over the imitation policies with smoother behavior than action residuals.

RoboFlow4D: A Lightweight Flow World Model Toward Real-Time Flow-Guided Robotic Manipulation arxiv High NEW

Sixu Lin, Junliang Chen, Huaiyuan Xu, Zhuohao Li, Guangming Wang et al. · Submitted 2026-05-17

#RL#Manipulator

RoboFlow4D is a lightweight flow world model for robotic manipulation in 3D environments. It targets real-time, flow-guided planning and action; the provided text does not specify code, robot tests, benchmarks, or numbers.

DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation arxiv High

Ziyu Shan, Yuheng Zhou, Gaoyuan Wu, Ziheng Ji, Zhenyu Wu et al. · Submitted 2026-04-16

#Manipulator#MobileManipulator

DockAnywhere is a method for data-efficient visuomotor policy learning in mobile manipulation. It targets the navigate-to-dock-then-manipulate setup by generating novel demonstrations; code, benchmarks, and real-robot results are not specified here.

ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation arxiv High

Xuerui Wang, Guangyu Ren, Tianhong Dai, Bintao Hu, Shuangyao Huang et al. · Submitted 2026-03-02 · Updated 2026-04-13

#RL#Manipulator

ACDC is an arXiv method for goal-conditioned RL in robotic manipulation. It combines adaptive curriculum planning with dynamic contrastive control; the excerpt gives no code, benchmark, or real-robot evidence.

SABER: A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation arxiv High NEW

Narsimha Menga, Parikshit Sakurikar, Amirreza Rouhi, Satya Sai Reddy, Anirudh Govil et al. · Submitted 2026-05-10

#VLA#Manipulator

SABER focuses on scaling the action data needed to adapt vision-language-action robot models to real deployment settings, motivated by the fact that general-purpose robot foundation models struggle when dropped into specialized domains like retail manipulation. The dataset is organized around real-world embodied actions rather than just broad visual-language coverage, aiming to make VLA adaptation more practical for complex unseen tasks where domain-specific behavior matters.

MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation arxiv High

Yutong Shen, Hangxu Liu, Penghui Liu, Jiashuo Luo, Yongkang Zhang et al. · Submitted 2026-03-09

#RL#Manipulator#Humanoid

MetaWorld-X is a hierarchical world-modeling method using VLM-orchestrated experts for humanoid loco-manipulation. It targets stable, compositional whole-body control; no code, benchmarks, numbers, or real-robot results are stated here.

Robust Quadruped Locomotion via Evolutionary Reinforcement Learning arxiv High

Brian McAteer, Karl Mason · Submitted 2026-04-08

#RL#Other

arXiv paper on an evolutionary reinforcement-learning method for robust quadruped locomotion. It targets sim-to-real brittleness when environments change; no results, code, benchmark, or real-robot tests are specified.

SeedPolicy: Horizon Scaling via Self-Evolving Diffusion Policy for Robot Manipulation arxiv High

Youqiang Gui, Yuxuan Zhou, Shen Cheng, Xinyang Yuan, Haoqiang Fan et al. · Submitted 2026-03-05 · Updated 2026-05-08

#IL#Manipulator

SeedPolicy is an arXiv method for improving Diffusion Policy in robot manipulation by scaling to longer observation horizons. It targets degradation from naively stacking observations; the excerpt gives no numbers, code, benchmark, or real-robot details.

BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation arxiv High NEW

Zhaohui Du, Zhe Wang, Hongmei Fei, Xiwen Cao, Ting Xiao et al. · Submitted 2026-05-08

#VLA#Manipulator

BioProVLA-Agent targets wet-lab automation by combining protocol-driven execution with vision-enhanced VLA embodied agents and closed-loop reasoning for biological manipulation tasks. From the provided text, the concrete emphasis is on making lab robots more reliable and affordable in environments where repetitive procedures, reproducibility, and messy real-world execution constraints matter.

Sword: Style-Robust World Models as Simulators via Dynamic Latent Bootstrapping for VLA Policy Post-Training arxiv High NEW

Jiaxuan Gao, Yongjian Guo, Zhong Guan, Wen Huang, Wanlun Ma et al. · Submitted 2026-05-08

#RL#VLA#Other

Sword is an arXiv method for using style-robust world models as simulators during VLA policy post-training. It uses dynamic latent bootstrapping; no results, code, benchmarks, or robot-test details are given in the excerpt.

DexSim2Real: Foundation Model-Guided Sim-to-Real Transfer for Generalizable Dexterous Manipulation arxiv High NEW

Zijian Zeng, Fei Ding, Huiming Yang, Xianwei Li, Yuhao Liao · Submitted 2026-05-03

#VLA#Manipulator

DexSim2Real is an arXiv method for foundation model-guided sim-to-real transfer in dexterous manipulation. It claims broader generalization across manipulation tasks by reducing hand-designed domain randomization and task-specific adaptation; release or benchmark details aren’t specified.

Self-adapting Robotic Agents through Online Continual Reinforcement Learning with World Model Feedback arxiv High

Fabian Domberg, Georg Schildbach · Submitted 2026-03-04

#RL#Other

An arXiv method for self-adapting robotic agents using online continual RL with world-model feedback. It aims to update controllers during deployment for unforeseen changes; no code, benchmark, or real-robot evidence is specified.

LiteVLA-Edge: Quantized On-Device Multimodal Control for Embedded Robotics arxiv High

Justin Williams, Kishor Datta Gupta, Roy George, Mrinmoy Sarkar · Submitted 2026-03-03

#VLA#Other

LiteVLA-Edge is a quantized on-device vision-language-action method for embedded robot control. It targets multimodal perception, language conditioning, and action generation under compute and latency limits, but the excerpt gives no code, benchmark, or real-robot evidence.
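The summary does not say which quantization scheme LiteVLA-Edge uses. As a generic illustration of what quantizing weights for an embedded target involves, the sketch below applies symmetric per-tensor int8 quantization: one scale maps floats to integers in [-127, 127], and dequantization multiplies back. The function names and the weight values are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: returns (int8 codes, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0  # avoid dividing by zero
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float weights."""
    return [c * scale for c in codes]


toy_weights = [0.5, -1.27, 0.003, 1.0]        # hypothetical float weights
codes, scale = quantize_int8(toy_weights)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the per-weight error by half a step.
max_error = max(abs(w - r) for w, r in zip(toy_weights, restored))
```

Each weight then occupies one byte instead of four, at the cost of an error of at most `scale / 2` per weight for values inside the representable range.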

D-SPEAR: Dual-Stream Prioritized Experience Adaptive Replay for Stable Reinforcement Learning in Robotic Manipulation arxiv High

Yu Zhang, Karl Mason · Submitted 2026-03-28 · Updated 2026-04-01

#RL#Manipulator

D-SPEAR is a reinforcement-learning replay method for robotic manipulation. It targets instability in contact-rich, long-horizon tasks using dual-stream prioritized adaptive experience replay; no metrics, code release, benchmarks, or real-robot tests are specified.

Deep Reinforcement Learning for Robotic Manipulation under Distribution Shift with Bounded Extremum Seeking arxiv High

Shaifalee Saxena, Rafael Fierro, Alexander Scheinker · Submitted 2026-04-01

#RL#Manipulator

A reinforcement-learning method for robotic manipulation under distribution shift using bounded extremum seeking. It targets policy degradation when test conditions differ from training, but no results, benchmarks, code, or real-robot validation are specified here.

Afford-VLA: Action-Aligned Visual Planning via Internalized Affordance arxiv High NEW

Runze Wang, Yuqian Fu, Yu Li, Tao Lin, Tianwen Qian et al. · Submitted 2026-05-22

#RL#VLA#Manipulator

Afford-VLA improves vision-language-action robot manipulation by making visual planning a task-conditioned affordance pathway inside the model rather than relying on external masks, symbolic plans, or broad geometric cues. It uses learnable <AFF> tokens to identify local interaction regions, decodes affordance masks from multimodal features, and feeds compact affordance embeddings directly into action generation so the visual plan is optimized with the downstream control policy. On LIBERO, LIBERO-Plus, SimplerEnv, and real-world manipulation tests, this action-aligned affordance interface consistently improves performance over prior VLA approaches.

StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation arxiv High

Yiran Shi, Dongqi Guo, Tianchen Zhao, Feng Gao, Liangzhi Shi et al. · Submitted 2026-03-30

#VLA#Other

StreamingVLA is an arXiv method for efficient VLA robot control using action flow matching and adaptive early observation. It targets edge deployment, but the excerpt gives no numbers, code release, benchmarks, or real-robot results.

Active Stereo-Camera Outperforms Multi-Sensor Setup in ACT Imitation Learning for Humanoid Manipulation arxiv High

Robin Kühn, Moritz Schappler, Thomas Seel, Dennis Bank · Submitted 2026-03-30

#IL#Manipulator#Humanoid

This arXiv paper studies sensor choice for training humanoid manipulation with Action Chunking Transformer (ACT) imitation learning. From the provided text, the only concrete claim is that an active stereo-camera setup outperforms a multi-sensor configuration; the work is motivated by the difficulty of teaching humanoid robots new industrial tasks.

DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA arxiv High

Yi Chen, Yuying Ge, Hui Zhou, Mingyu Ding, Yixiao Ge et al. · Submitted 2026-03-31 · Updated 2026-04-28

#RL#VLA#Other

DIAL is an end-to-end VLA method that decouples intent from low-level action using latent world modeling. The excerpt claims current VLAs underuse pretrained VLMs as encoders, but gives no results, benchmarks, robot tests, or code-release details.

$M^2$-VLA: Boosting Vision-Language Models for Generalizable Manipulation via Layer Mixture and Meta-Skills arxiv High

Siyao Xiao, Yuhong Zhang, Zhifang Liu, Zihan Gao, Jingye Zhang et al. · Submitted 2026-04-27

#VLA#Manipulator

$M^2$-VLA targets a common failure mode in Vision-Language-Action training: end-to-end fine-tuning can make a strong VLM better at robot manipulation while eroding the generalization it started with. It introduces layer mixture and meta-skills as a way to adapt VLMs for manipulation more selectively, aiming to preserve broad visual-language knowledge while improving transfer across robot tasks.

Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-rich Manipulation arxiv High

Teng Xue, Alberto Rigo, Bingjian Huang, Jiayi Shen, Zhengtong Xu et al. · Submitted 2026-04-26

#IL#Manipulator

Tube Diffusion Policy targets contact-rich manipulation by learning a reactive policy from visual and tactile feedback, so a robot can keep adapting as contact conditions shift or disturbances occur. The idea is aimed at tasks where vision alone is brittle because the crucial state is in the contact interaction itself, using tactile signals alongside visual observations to guide manipulation continuously.

3D-Mix for VLA: A Plug-and-Play Module for Integrating VGGT-based 3D Information into Vision-Language-Action Models arxiv High

Bin Yu, Shijie Lian, Xiaopeng Lin, Zhaolong Shen, Yuliang Wei et al. · Submitted 2026-03-25

#VLA#Other

3D-Mix is a plug-and-play method for injecting VGGT-derived 3D information into VLA robot-control models. It targets weak 3D perception in MLLM-backed VLAs; no code, robot tests, or benchmark results are specified in the excerpt.

From Abstraction to Instantiation: Learning Behavioral Representation for Vision-Language-Action Model arxiv High NEW

Bing Hu, Zaijing Li, Rui Shao, Junda Chen, April Hua Liu et al. · Submitted 2026-05-21

#VLA#Other

BehaviorVLA trains vision-language-action policies around a long-horizon behavioral representation rather than fragmented action latents, using a causal Mamba-based visuomotor encoder to compress trajectory context and a phase-conditioned decoder to align that behavior with current execution progress. On RoboTwin 2.0, LIBERO, and CALVIN it reports 58% success, 98% success, and 4.36 average length respectively, and in sim-to-real transfer matches OpenVLA-OFT with only half the demonstration data.

EvoScene-VLA: Evolving Scene Beliefs Inside the Action Decoder for Chunked Robot Control arxiv High NEW

Chushan Zhang, Ruihan Lu, Jinguang Tong, Xuesong Li, Yikai Wang et al. · Submitted 2026-05-21

#VLA#Other

EvoScene-VLA gives chunked robot VLA policies a persistent scene belief that is updated by the action decoder itself, so each control call starts from geometry informed by both the latest image and the object/contact changes implied by the previous action chunk. The decoder emits the next action chunk plus a compact scene update, while training uses future scene-token targets and frozen depth/3D teachers to shape the latent scene slots before discarding those auxiliaries at deployment. On 31 RoboTwin tasks it improves average success from 87.2% to 89.1% in fixed settings and 86.1% to 88.5% under randomization, and also beats baselines on a Galaxea R1-Lite real robot.

Multi-Robot Learning-Informed Task Planning Under Uncertainty arxiv High

Abhish Khanal, Abhishek Paudel, Hung Pham, Gregory J. Stein · Submitted 2026-03-20

#Other

arXiv paper on multi-robot task planning when task-relevant object locations are unknown. Targets minimum-time completion of complex tasks under uncertainty; no code, benchmark, or robot-test details are given.

RoHIL: Robust Human-in-the-Loop Robotic Reinforcement Learning Against Illumination Variations arxiv High NEW

Shuoqin Zhang, Yixin Xiong, Xiru Gao, Kai Liu, Ke Wang et al. · Submitted 2026-05-19

#RL#Other

Human-in-the-loop reinforcement learning systems achieve near-perfect success on the workstation where they are trained, but collapse when the same robot is moved to a workstation a few meters away, because new lamp positions and window light shift the visual input distribution.

ADMM-Based Distributed MPC with Control Barrier Functions for Safe Multi-Robot Quadrupedal Locomotion arxiv High

Yicheng Zeng, Ruturaj S. Sambhus, Basit Muhammad Imran, Jeeseop Kim, Vittorio Pastore et al. · Submitted 2026-03-19

#Other

An ADMM-based distributed MPC method with control barrier function (CBF) constraints for safe trajectory planning in multi-quadruped locomotion. The excerpt does not specify results, code release, benchmarks, or real-robot validation.

PRIME: Physically-consistent Robotic Inertial and Motion Estimation for Legged and Humanoid Robots arxiv High NEW

Jiarong Kang, Kunzhao Ren, Tao Pang, Xiaobin Xiong · Submitted 2026-05-17

#Humanoid

PRIME is a motion and inertial estimation method for legged and humanoid robots. It claims physically consistent estimates by reasoning about intermittent contact dynamics; no numbers, code, tests, or benchmarks are stated.

Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models arxiv High NEW

Jianwei Tai · Submitted 2026-05-25

#VLA#Other

Vision-language-action policies can look highly capable on clean robot inputs while having very little adversarial margin: OpenVLA-7B falls from over 95% LIBERO success to under 5% under a 16/255 PGD attack. The paper formalizes this as an information-theoretic budget, upper-bounding capability plus robustness by task entropy and adversarial channel capacity, with a much tighter encoder-specific version that shrinks OpenVLA’s relevant budget from about 5,000 to 31 nats. Across 252 Gaussian-VLA settings and 48 OpenVLA-7B/LIBERO/PGD cases, the bound has zero violations, suggesting encoder-specific slack could be a useful common yardstick for comparing robustness defenses.
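A "16/255 PGD attack" conventionally means projected gradient descent under an L-infinity budget of 16 intensity levels on pixels scaled to [0, 1]. The sketch below runs that procedure against a toy linear scorer, not OpenVLA, to show the step-then-project loop. The step size, iteration count, weights, and function names are assumptions chosen for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pgd_linf(x0, loss_grad, eps=16 / 255, alpha=2 / 255, steps=20):
    """Ascend the attacker's loss while staying within eps of x0 and inside [0, 1]."""
    x = list(x0)
    for _ in range(steps):
        grad = loss_grad(x)
        # Signed-gradient step, as in standard L-infinity PGD.
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, grad)]
        # Project back into the eps-ball around x0 and the valid pixel range.
        x = [min(max(xi, oi - eps, 0.0), oi + eps, 1.0) for xi, oi in zip(x, x0)]
    return x


TOY_W = [0.5, -1.0, 0.25]  # hypothetical linear "policy score" weights


def score(x):
    return sum(w * xi for w, xi in zip(TOY_W, x))


def loss_grad(x):
    # The attacker's loss is -score(x), so its gradient is -TOY_W everywhere.
    return [-w for w in TOY_W]


x_clean = [0.5, 0.5, 0.5]
x_adv = pgd_linf(x_clean, loss_grad)
score_drop = score(x_clean) - score(x_adv)
```

For a linear scorer every pixel ends up pushed to the edge of the budget, so the score drops by `eps` times the sum of absolute weights. A deep policy needs the gradient recomputed at each step, which is what `loss_grad(x)` stands in for here.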

Switch: Learning Agile Skills Switching for Humanoid Robots arxiv High

Yuen-Fui Lau, Qihan Zhao, Yinhuai Wang, Runyi Yu, Hok Wai Tsui et al. · Submitted 2026-04-16

#Humanoid

Switch is a deep-RL whole-body control method for humanoid robots aimed at switching among agile locomotion skills. The excerpt claims progress on challenging real-world locomotion, but gives no numbers, benchmarks, code, or robot-testing details.

From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation arxiv High

Yibin Liu, Yaxing Lyu, Daqi Gao, Zhixuan Liang, Weiliang Tang et al. · Submitted 2026-03-16

#RL#Manipulator

PRIMO R1 is a 7B video-MLLM/RL method for robotic manipulation process monitoring. It cuts progress-estimation MAE by 50% and scores 67.0% on RoboFail, 6 points over OpenAI o1. Model weights, PRIMO datasets, and benchmark are on Hugging Face.

SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization arxiv High

Jiashun Li, Xiaoyu Shi, Hong Xie, Mingsheng Shang, Yun Lu · Submitted 2026-03-14

#RL#VLA#Other

SmoothVLA is an RL fine-tuning method for VLA robot policies, adding an intrinsic jerk-based reward to encourage physically smooth motion. On LIBERO, it reports 13.8% better smoothness than standard RL and stronger generalization than SFT; no code release is mentioned.
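The summary does not give the exact form of SmoothVLA's reward. As a rough illustration of what a jerk-based smoothness term can look like, the sketch below estimates jerk (the third time derivative of position) with finite differences and returns its negated mean square as a reward. The function names, the weight, and the sample trajectories are hypothetical.

```python
def mean_squared_jerk(positions, dt):
    """Mean squared third finite difference of a 1-D position sequence."""
    if len(positions) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    jerks = [
        (positions[i + 3] - 3 * positions[i + 2]
         + 3 * positions[i + 1] - positions[i]) / dt ** 3
        for i in range(len(positions) - 3)
    ]
    return sum(j * j for j in jerks) / len(jerks)


def smoothness_reward(positions, dt, weight=1e-3):
    """Negative jerk penalty: smoother trajectories score closer to zero."""
    return -weight * mean_squared_jerk(positions, dt)


# Constant-acceleration motion has zero jerk; a zig-zag does not.
smooth_traj = [0.5 * t * t for t in range(5)]
zigzag_traj = [0.0, 1.0, 0.0, 1.0, 0.0]

r_smooth = smoothness_reward(smooth_traj, dt=1.0)
r_zigzag = smoothness_reward(zigzag_traj, dt=1.0)
```

In an RL fine-tuning loop a term like this would be added to the task reward, so the policy is pushed toward motions a real actuator can follow without abrupt changes in acceleration.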

Simulator Adaptation for Sim-to-Real Learning of Legged Locomotion via Proprioceptive Distribution Matching arxiv High

Jeremy Dao, Alan Fern · Submitted 2026-04-13

#Other

Simulator Adaptation targets the sim-to-real gap in legged locomotion by adjusting the simulator to better reproduce the proprioceptive signals seen on hardware, rather than only making the policy more robust to mismatch. The paper frames dynamics discrepancies as a distribution-matching problem over robot sensor histories, aiming to make simulated training rollouts look more like real robot behavior so learned locomotion policies transfer with less performance loss.

ReMem-VLA: Empowering Vision-Language-Action Model with Memory via Dual-Level Recurrent Queries arxiv High

Hang Li, Fengyi Shen, Dong Chen, Liudi Yang, Xudong Wang et al. · Submitted 2026-03-13

#VLA#Other

ReMem-VLA is a VLA control method that adds memory via dual-level recurrent queries. It targets closed-loop robot tasks where Markov-style models fail from missing history; code, benchmarks, and real-robot tests are not specified.

Beyond Dense Futures: World Models as Structured Planners for Robotic Manipulation arxiv High

Minghao Jin, Mozheng Liao, Mingfei Han, Zhihui Li, Xiaojun Chang · Submitted 2026-03-13

#RL#VLA#Manipulator

A robotic manipulation method recasts world-model VLAs as structured planners instead of dense future-image predictors. It targets visual redundancy and long-horizon plan drift; no code, benchmark, or real-robot results are stated.

Sim-to-reality adaptation for Deep Reinforcement Learning applied to an underwater docking application arxiv High

Alaaeddine Chaarani, Narcis Palomeras, Pere Ridao · Submitted 2026-03-12

#RL#Other

An arXiv study on sim-to-real adaptation for deep RL in autonomous underwater docking. It argues DRL can handle unpredictable underwater conditions, but the excerpt gives no metrics, code, robot-test, or benchmark details.

Cybo-Waiter: A Physical Agentic Framework for Humanoid Whole-Body Locomotion-Manipulation arxiv High

Peng Ren, Haoyang Ge, Chuan Qi, Cong Huang, Hong Li et al. · Submitted 2026-03-11

#Manipulator#Humanoid

Cybo-Waiter targets humanoid robots that must carry out open-ended natural-language service requests in real human spaces, where they have to coordinate locomotion, manipulation, and decision-making over long horizons despite incomplete observations. It frames the robot as a physical agentic system for whole-body locomotion-manipulation, aiming to connect language-level intent with reliable embodied execution rather than treating navigation and object interaction as separate problems.

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning arxiv High

Jindi Lv, Hao Li, Jie Li, Yifei Nie, Fankun Kong et al. · Submitted 2026-04-09

#RL#VLA#Manipulator

ViVa is a video-generative value model for robot reinforcement learning, aimed at improving VLA manipulation under partial observability and delayed feedback. It uses value functions to assess task progress and guide policy improvement; no release or benchmark details are given.

Embedding Classical Balance Control Principles in Reinforcement Learning for Humanoid Recovery arxiv High

Nehar Poddar, Stephen McCrory, Luigi Penco, Geoffrey Clark, Hakki Erhan Svil et al. · Submitted 2026-03-09

#RL#Humanoid

A reinforcement-learning method for humanoid fall recovery that embeds classical balance-control principles. It targets falls and unrecoverable states in unstructured environments; no numbers, code, benchmark, or real-robot validation are specified.

Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models arxiv High NEW

Nilaksh, Saurav Jha, Artem Zholus, Sarath Chandar · Submitted 2026-05-07

#RL#Other

World model-based policy evaluation can roll out candidate robot actions inside action-conditioned video diffusion models, but in latent diffusion setups the choice of latent space strongly shapes whether those rollouts are useful. The paper examines whether robotic world models benefit more from latents optimized for pixel reconstruction or from semantically structured representations, framing latent design as a central bottleneck for using diffusion video models as practical proxies for real-world control testing.

DFM-VLA: Iterative Action Refinement for Robot Manipulation via Discrete Flow Matching arxiv High

Jiayi Chen, Wenxuan Song, Shuai Chen, Jingbo Wang, Zhijun Li et al. · Submitted 2026-03-27 · Updated 2026-04-07

#VLA#Manipulator

DFM-VLA is an arXiv method for robot manipulation that uses discrete flow matching to iteratively refine actions in tokenized VLA models. The snippet claims it targets limits of existing discrete action decoding; benchmarks, real-robot tests, and code release are not specified.

BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation arxiv High NEW

Chenhao Yu, Hongwu Wang, Youhao Hu, Jiachen Zhang, Yuanyuan Li et al. · Submitted 2026-05-05

#Manipulator#Humanoid

BifrostUMI is a method for using robot-free demonstrations to train humanoid whole-body visuomotor policies. It targets teleoperation’s hardware-access and efficiency bottlenecks; the excerpt does not specify code, benchmarks, or real-robot results.

Learning while Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies arxiv High NEW

Yi Wang, Xinchen Li, Pengwei Xie, Pu Yang, Buqing Nie et al. · Submitted 2026-05-01

#RL#VLA#Other

Fleet-scale reinforcement learning is used to keep improving generalist robot policies during real-world deployment, addressing the gap left by large offline pretraining alone. The idea is to learn from many deployed robots as they encounter failures and variations in the wild, so the policy can become more robust without relying solely on static datasets.

SOLE-R1: Video-Language Reasoning as the Sole Reward for On-Robot Reinforcement Learning arxiv High

Philip Schroeder, Thomas Weng, Karl Schmeckpeper, Eric Rosen, Stephen Hart et al. · Submitted 2026-03-30

#RL#Other

SOLE-R1 uses a vision-language model’s video-level reasoning as the only reward signal for on-robot reinforcement learning, replacing hand-engineered rewards or task-specific success detectors with language-conditioned assessment of rollout videos. The interesting bet is that a VLM can judge whether the robot is making progress from visual evidence alone, turning broad video-language understanding into a practical supervision source for physical robot policy learning.

Tac2Real: Reliable and GPU Visuotactile Simulation for Online Reinforcement Learning and Zero-Shot Real-World Deployment arxiv High

Ningyu Yan, Shuai Wang, Xing Shen, Hui Wang, Hanqing Wang et al. · Submitted 2026-03-30

#RL#Manipulator

Tac2Real targets contact-rich manipulation by making visuotactile simulation fast and reliable enough for online reinforcement learning, where tactile realism usually fights GPU-scale throughput. It focuses on policies trained with tactile feedback in simulation and deployed zero-shot to real robots, aiming to close the sim-to-real gap for tasks that depend on rich contact sensing.

ROSCell: A ROS2-Based Framework for Automated Formation and Orchestration of Multi-Robot Systems arxiv High

Jiangtao Shuai, Marvin Carl May, Sonja Schimmler, Manfred Hauswirth · Submitted 2026-03-24

#Other

ROSCell is a ROS2-based framework aimed at manufacturing cells where heterogeneous robots and devices need to be assembled, coordinated, and reconfigured quickly for high-mix, low-volume production. It focuses on automating the formation and orchestration of multi-robot systems in flexible matrix production settings, where the practical challenge is keeping interconnected equipment adaptable as tasks change.

Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot arxiv High

Yucheng Xin, Jiacheng Bao, Haoran Yang, Wenqiang Que, Dong Wang et al. · Submitted 2026-04-23

#RL#Humanoid

Imitation and reinforcement learning are used to train humanoid whole-body control for motions that depend on environmental dynamics rather than self-stabilizing balance alone. The focus is on “weightless” or non-self-stabilizing behaviors, where the robot must reproduce human-like motion patterns that cannot be maintained without interaction with the surrounding environment.

PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction arxiv High NEW

Shizhe Chen, Paul Pacaud, Cordelia Schmid · Submitted 2026-05-20

#VLA#Manipulator

PointACT is a 3D-aware vision-language-action policy for robot manipulation that feeds hierarchical point-cloud features directly into action decoding, so the action tokens can attend across both local geometric details and global scene structure. Its multi-scale point-action interaction with bottleneck window self-attention improves success rates on LIBERO and RLBench, including a 10% gain on RLBench-10Tasks over pretrained VLA baselines. The ablations suggest the gains come from tightly coupling pretrained 2D semantic representations with 3D geometry, especially when the language-vision backbone is frozen and the action expert is trained from scratch.

ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning arxiv High

Tuan Van Vo, Tan Q. Nguyen, Khang Nguyen, Nhat Xuan Tran, Duy H. M. Nguyen et al. · Submitted 2026-04-20

#VLA#Other

A teacher-guided fine-tuning method for Vision-Language-Action robot policies. It targets multimodal reasoning over visual observations and language instructions to produce robot actions; no benchmarks, code release, or real-robot results are specified.

AGILE: A Comprehensive Workflow for Humanoid Loco-Manipulation Learning arxiv High

Huihua Zhao, Rafael Cathomen, Lionel Gulich, Wei Liu, Efe Arda Ongan et al. · Submitted 2026-03-20

#RL#Manipulator#Humanoid

AGILE presents a workflow for learning humanoid loco-manipulation skills with reinforcement learning, aimed at making simulation-trained behaviors easier to transfer across different robot platforms. From the provided excerpt, the emphasis is on closing the gap between impressive simulated humanoid behaviors and practical reuse on new hardware, rather than on a single isolated controller or benchmark result.

CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation arxiv High NEW

Xinyuan Luo, Xingrui Chen, Xunjian Yin, Hongxuan Wu, Boxi Xia et al. · Submitted 2026-05-19

#Manipulator#Humanoid

CEER gives humanoids a task-space control interface built around root motion commands and end-effector pose targets, so higher-level planners can compose loco-manipulation skills without talking directly in joint space or retraining the whole-body policy. A teacher-student setup distills a general motion-tracking controller into a compliant low-level EE-root policy, which can plug into heterogeneous planners and task modules. In simulation and hardware, it reaches 3.3 cm end-effector tracking accuracy with lower jerk than baselines, handles contact-rich teleoperation stably, and succeeds on up to 70% of simulated room-scale single-object loco-manipulation tasks.

ParkingWorld: End-to-End Autonomous Parking Reinforcement Learning from Corrective Experience in 3DGS Simulation arxiv High NEW

Zhengcheng Yu, Changze Li, Haoran Liu, Tong Qin · Submitted 2026-05-24 · Updated 2026-05-26

#RL#IL#Other

ParkingWorld trains an end-to-end autonomous parking policy with CIL-SERL, a correction-in-the-loop reinforcement learning setup inside a photorealistic 3D Gaussian Splatting simulator reconstructed from real parking scenes. Its core mechanism is a multi-level replay buffer that separates and reconnects ordinary rollouts, human corrections, failed attempts, and rollback correction segments so training can focus on the cases where exploration breaks down. Evaluations in both simulation and on a physical vehicle report higher parking success, efficiency, and safety across diverse constrained parking scenarios.

ForceVLA2: Unleashing Hybrid Force-Position Control with Force Awareness for Contact-Rich Manipulation arxiv High

Yang Li, Zhaxizhuoma, Hongru Jiang, Junjie Xia, Hongquan Zhang et al. · Submitted 2026-03-16

#VLA#Manipulator

ForceVLA2 targets contact-rich robot manipulation by moving beyond pure position control to hybrid force-position control with explicit force awareness. From the provided text, the core idea is to let an embodied AI system sense and regulate interaction forces during manipulation, aiming to improve stability, precision, and robustness in real-world contact-heavy tasks.

HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations arxiv High

Xiaomeng Xu, Jisang Park, Han Zhang, Eric Cousineau, Aditya Bhat et al. · Submitted 2026-03-03 · Updated 2026-05-14

#Manipulator#MobileManipulator

HoMMI is a framework for learning whole-body mobile manipulation from robot-free human demonstrations, pairing a data-collection interface with policy learning so robots can acquire coordinated base, body, and arm behaviors from people acting naturally. The interesting part is the supervision source: instead of requiring teleoperation on the target robot, it tries to turn human demonstrations directly into training data for mobile manipulation policies.

Evo-Depth: A Lightweight Depth-Enhanced Vision-Language-Action Model arxiv High NEW

Tao Lin, Yuxin Du, Jiting Liu, Nuobei Zhu, Yunhe Li et al. · Submitted 2026-05-14

#VLA#Manipulator

Evo-Depth is a lightweight vision-language-action model for robotic manipulation that augments the usual visual and language inputs with depth information, aiming to improve spatial understanding without making the policy heavy. From the provided text alone, the concrete architecture, training setup, benchmarks, and quantitative results are not specified, so the defensible takeaway is that it targets more depth-aware action generation within the VLA paradigm.

Load-Aware Locomotion Control for Humanoid Robots in Industrial Transportation Tasks arxiv High

Lequn Fu, Yijun Zhong, Xiao Li, Yibin Liu, Zhiyuan Xu et al. · Submitted 2026-03-15

#Humanoid

A load-aware locomotion control method for humanoid robots carrying payloads in industrial transport tasks. It targets stable walking under changing loads and upper-body motion; code, benchmarks, or real-robot results are not stated.

One Token Per Frame: Reconsidering Visual Bandwidth in World Models for VLA Policy arxiv High NEW

Zuojin Tang, Shengchao Yuan, Xiaoxin Bai, Zhiyuan Jing, De Ma et al. · Submitted 2026-05-08 · Updated 2026-05-13

#RL#VLA#Other

arXiv study on world-model design for pretrained VLA policies, centered on one-token-per-frame visual bandwidth. It frames auxiliary world-module parameterization for long-horizon planning as unresolved; the excerpt gives no results, benchmarks, robot tests, or code.

Morphologically Equivariant Flow Matching for Bimanual Mobile Manipulation arxiv High NEW

Max Siebenborn, Daniel Ordoñez Apraez, Sophie Lueth, Giulio Turrisi, Massimiliano Pontil et al. · Submitted 2026-05-12

#IL#Manipulator#Bimanual#MobileManipulator

A method for imitation learning in bimanual mobile manipulation using morphologically equivariant flow matching. It exploits bilateral robot symmetry, but the excerpt gives no results, benchmarks, real-robot tests, or code release.

When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents arxiv High NEW

Xiaolin Zhou, Aojie Yuan, Zheng Luo, Zipeng Ling, Xixiao Pan et al. · Submitted 2026-05-12

#RL#Other

Tool-use language agents are tested under assumptions that often break in deployment: clean inputs, clear tool registries, and dependable APIs. This arXiv paper frames that gap as a sim-to-real problem for tool use, introducing a benchmark and a domain-randomized reinforcement learning recipe aimed at making agents more robust when simulated tool environments misrepresent real-world messiness.

Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain arxiv High NEW

Zhuangyu Han, Abhronil Sengupta · Submitted 2026-05-10

#RL#Other

An arXiv paper on neuromorphic RL for quadruped locomotion over uneven terrain. It targets onboard adaptation to terrain, payload shifts, actuator wear, and power limits, but gives no results, code release, robot tests, or benchmark details in the snippet.

SCDP: Learning Humanoid Locomotion from Partial Observations via Mixed-Observation Distillation arxiv High

Milo Carroll, Tianhu Peng, Lingfan Bao, Chengxu Zhou, Zhibin Li · Submitted 2026-03-10

#Humanoid

SCDP targets deployable humanoid locomotion policies that do not depend on privileged full-body state estimates at test time. It uses mixed-observation distillation to learn from offline data while bridging the gap between rich training observations and the partial, noisier observations available on real robots.

MO-Playground: Massively Parallelized Multi-Objective Reinforcement Learning for Robotics arxiv High

Neil Janwani, Ellen Novoseller, Vernon J. Lawhern, Maegan Tucker · Submitted 2026-03-10

#RL#Other

MO-Playground targets robotics settings where a robot must trade off conflicting objectives rather than optimize a single reward, using massively parallelized multi-objective reinforcement learning to learn families of Pareto-optimal policies. From the provided text, the concrete result details are not available, but the interesting angle is the emphasis on scaling MORL for robotics so researchers can explore objective trade-offs directly instead of retraining separate single-objective policies.
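To make "Pareto-optimal" concrete, here is a minimal sketch of non-dominated filtering over policy returns. It is a generic illustration and not MO-Playground's code; the function names and the example return vectors are hypothetical, and every objective is assumed to be maximized.

```python
def dominates(a, b):
    """True if return vector a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the return vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (speed, energy-efficiency) returns for five candidate policies.
returns = [(1, 5), (3, 3), (5, 1), (2, 2), (3, 1)]
print(pareto_front(returns))  # [(1, 5), (3, 3), (5, 1)]
```

Each surviving vector represents a different trade-off; none can be improved on one objective without losing on another.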

DexHiL: A Human-in-the-Loop Framework for Vision-Language-Action Model Post-Training in Dexterous Manipulation arxiv High

Yifan Han, Zhongxi Chen, Yuxuan Zhao, Congsheng Xu, Yanming Shao et al. · Submitted 2026-03-10

#VLA#Manipulator

DexHiL is an arXiv method for human-in-the-loop post-training of vision-language-action models for dexterous manipulation. It targets adapting VLA policies to specific, complex tasks; no numbers, code, benchmarks, or robot results are stated.

FAME: Force-Adaptive RL for Expanding the Manipulation Envelope of a Full-Scale Humanoid arxiv High

Niraj Pudasaini, Yutong Zhang, Jensen Lavering, Alessandro Roncone, Nikolaus Correll · Submitted 2026-03-09

#RL#Manipulator#Bimanual#Humanoid

FAME uses force-adaptive reinforcement learning to help a full-scale humanoid stay balanced while applying or resisting external forces through its hands during bimanual manipulation. The idea targets a practical bottleneck in humanoid manipulation: hand interaction forces travel through the whole body, so balance control directly limits how far, hard, and reliably the robot can manipulate objects.

SaiVLA-0: Cerebrum–Pons–Cerebellum Tripartite Architecture for Compute-Aware Vision-Language-Action arxiv High

Xiang Shi, Wenlong Huang, Menglin Zou, Xinhai Sun · Submitted 2026-03-09

#VLA#Other

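SaiVLA-0 revisits vision-language-action modeling through a neuroscience-inspired cerebrum, pons, and cerebellum triad aimed at compute-aware control. The excerpt gives no details on the mechanism, benchmarks, real-robot tests, or code release.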

CMoE: Contrastive Mixture of Experts for Motion Control and Terrain Adaptation of Humanoid Robots arxiv High

Shihao Ma, Hongjin Chen, Zijun Xu, Yi Zhao, Ke Wu et al. · Submitted 2026-03-03

#Humanoid

CMoE targets humanoid locomotion across complex terrains with abrupt transitions, using a contrastive mixture-of-experts approach to motion control and terrain adaptation. From the provided text, the concrete mechanism and results are not available, but the focus is on enabling humanoid robots to autonomously adjust their control strategy as terrain conditions change in real-world settings.

When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making arxiv High

Jun Liu, Pu Zhao, Zhenglun Kong, Xuan Shen, Peiyan Dong et al. · Submitted 2026-03-17 · Updated 2026-04-01

#RL#Other

Embodied robots using LLM-based agents face a tradeoff between spending compute on high-level reasoning and acting quickly enough in the environment. The paper studies when a robot should invoke more expensive reasoning and frames that choice as a reinforcement learning problem for resource-aware decision-making.

Advancing Multi-Robot Networks via MLLM-Driven Sensing, Communication, and Computation: A Comprehensive Survey arxiv High

Hyun Jong Yang, Howon Lee, Kyuhong Shim, Jeongho Kwak, Hyunsoo Kim et al. · Submitted 2026-03-31

#Humanoid

A survey of MLLM-driven multi-robot networks, centered on how humanoid robots could use multimodal perception, language-based reasoning, communication, and distributed computation to coordinate in settings such as warehouses, factories, and rescue operations. It frames MLLMs as a shared intelligence layer for sensing and collaboration across robot teams, with attention to the system-level pieces needed to make that practical rather than treating each robot as an isolated agent.

ThermoAct: Thermal-Aware Vision-Language-Action Models for Robotic Perception and Decision-Making arxiv High

Young-Chae Son, Dae-Kwan Ko, Yoon-Ji Choi, Soo-Chul Lim · Submitted 2026-03-26 · Updated 2026-03-30

#VLA#Other

ThermoAct extends vision-language-action robot models with thermal awareness, aiming to use heat signatures alongside visual input for perception and decision-making in human-robot collaboration. From the available description, it is positioned around safer task execution when RGB vision alone is insufficient, such as reasoning about temperature-sensitive objects or human presence through non-visual cues.

Realtime-VLA V2: Learning to Run VLAs Fast, Smooth, and Accurate arxiv High

Chen Yang, Yucheng Hu, Yunchao Ma, Yunhuan Yang, Jing Tan et al. · Submitted 2026-03-27

#VLA#Other

Realtime-VLA V2 targets the deployment gap between fast GPU inference for vision-language-action models and actually running those models on real robots. It focuses on making VLA execution fast, smooth, and accurate in the physical control loop, where latency and motion quality matter as much as neural compute speed.

Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning arxiv High

Aditya Narendra, Mukhammadrizo Maribjonov, Dmitry Makarov, Dmitry Yudin, Aleksandr Panov · Submitted 2026-03-25

#RL#Manipulator

KG-M3PO is a model-based reinforcement learning approach for partially observable robotic manipulation that ties together perception, symbolic knowledge graphs, and policy learning in a massively multi-task setting. The interesting piece is the explicit use of structured task knowledge to guide manipulation policies across many tasks, rather than treating perception and control as a purely end-to-end learning problem.

Closed-Loop Verbal Reinforcement Learning for Task-Level Robotic Planning arxiv High

Dmitrii Plotnikov, Iaroslav Kolomiets, Dmitrii Maliukov, Dmitrij Kosenkov, Daniia Zinniatullina et al. · Submitted 2026-03-23

#RL#Other

Closed-Loop Verbal Reinforcement Learning treats task-level mobile robot planning as an iterative language-guided policy improvement problem, where the robot executes plans, observes uncertainty or failure in the physical environment, and uses verbal feedback to refine its future decisions. The interesting part is the closed-loop design: planning remains interpretable in natural language while still adapting from real execution outcomes rather than relying only on offline prompts or fixed symbolic rules.

Dreaming the Unseen: World Model-regularized Diffusion Policy for Out-of-Distribution Robustness arxiv High

Ziou Hu, Xiangtong Yao, Yuan Meng, Zhenshan Bing, Alois Knoll · Submitted 2026-03-22

#RL#IL#Other

Diffusion policies can be brittle when the scene shifts in ways not seen during training, so Dreaming the Unseen regularizes the policy with a learned world model that imagines disturbed or corrupted situations before acting. The setup targets visuomotor control under severe OOD changes such as unexpected object displacements and visual corruptions, aiming to turn the diffusion policy’s generative strength into more robust behavior rather than just accurate imitation in-distribution.

Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum arxiv High NEW

Zhizhao Liang, Yi-Lin Wei, Xuhang Chen, Mu Lin, Yi-Xiang He et al. · Submitted 2026-05-20

#Manipulator#Humanoid

Active Spatial Brain and Generalizable Action Cerebellum split humanoid whole-body manipulation into active 3D scene understanding, task planning, and executable loco-manipulation action generation. The system uses multi-agent large models to handle spatial relations and produce robot actions without task-specific real-robot training data, and the authors benchmark it on spatial manipulation tasks covering both perception/understanding and real-robot execution across diverse environments.

Multi-Robot Coordination for Planning under Context Uncertainty arxiv High

Pulkit Rustagi, Kyle Hollins Wray, Sandhya Saisubramanian · Submitted 2026-03-14 · Updated 2026-03-19

#Other

Real-world robot teams often have to plan when the right objective depends on context that is not fully known in advance, such as which goals or tradeoffs matter in the current situation. This arXiv work appears to focus on multi-robot coordination under that kind of context uncertainty, where robots must reason jointly about plans while accounting for ambiguity in the operating priorities.

PRIOR: Perceptive Learning for Humanoid Locomotion with Reference Gait Priors arxiv High

Chenxi Han, Shilu He, Yi Cheng, Linqi Ye, Houde Liu · Submitted 2026-03-19

#Humanoid

PRIOR targets perceptive humanoid locomotion on complex terrain by using reference gait priors to help policies retain natural-looking movement while adapting to terrain perception. The provided abstract fragment frames it as an alternative to heavier pipelines that rely on multi-stage training, adversarial losses, or extensive real-world calibration, but it does not include enough detail to report the exact training mechanism or results.

Towards Shared Embodied Intelligence in Humanoid Robots through Optimization Development and Testing of the Human Aware ergoCub Robot arxiv High NEW

Carlotta Sartore, Mohamed Elobaid, Lorenzo Rapetti, Giulio Romualdi, Stefano Dafarra et al. · Submitted 2026-05-26

#Humanoid

ergoCub is a humanoid robot design built around shared embodied intelligence: its body morphology, hardware parameters, and control policies are optimized with explicit models of human body mechanics, motion, and ergonomic cost. The architecture treats human-robot interaction as a function of robot hardware configuration and embeds human-aware models into the robot’s physical intelligence, aiming to make physical collaboration safer and less burdensome for people in industrial and assistive settings.

NeuroMesh: A Unified Neural Inference Framework for Decentralized Multi-Robot Collaboration arxiv High

Yang Zhou, Yash Shetye, Long Quang, Devon Super, Jesse Milzman et al. · Submitted 2026-04-16

#Other

NeuroMesh presents a unified neural inference framework meant to let learned multi-robot models run across decentralized teams of heterogeneous robots. It targets the practical deployment gap where different onboard hardware, limited communication, and fragmented execution stacks make collaborative robot learning systems difficult to move from model design to real deployments.

Acting on the Unseen: Communication-Free Collaborative Filtering for Decentralized Multi-Robot Task Allocation arxiv High NEW

Alexander Apartsin, Yigal Meshulam, Yehudit Aperstein · Submitted 2026-05-25

#RL#Other

SwarmCF tackles Zero-Knowledge multi-robot task allocation, where robots have no task model, no communication or coordinator, and only a noisy partial broadcast of teammates’ outcomes, by having each robot run online low-rank collaborative filtering over that shared outcome stream. The low-rank structure lets robots infer how they would perform on unseen robot-task pairs and onboard new tasks with Θ(d) rather than Θ(n) per-robot sample complexity, while structure-free learners stay stuck at the prior-mean error floor. Experiments show better masking robustness and anytime performance than other low-rank methods, with unseen-pair skill improving as team size grows and about 80% of a centralized full-communication ceiling recovered even under capacity-1 contention and a robotics-grounded sensing setup.

Safety-Critical Whole-Body Control for Humanoid Robots via Input-to-State Safe Control Barrier Functions arxiv High NEW

Kwanwoo Lee, Sanghyuk Park, Gyeongjae Park, Myeong-Ju Kim, Jaeheung Park · Submitted 2026-05-25

#Humanoid

The method adds an input-to-state safe control barrier function (ISSf-CBF) safety filter between the kinematic and dynamic whole-body controllers (KinWBC and DynWBC), so nominal joint references from KinWBC are minimally adjusted before DynWBC tracks them subject to contact and full-body feasibility. The filter enforces kinematic constraints such as joint limits, self-collision, obstacle avoidance, and workspace boundaries while accounting for bounded disturbances from model error, tracking error, or external perturbations. Simulations and real-robot tests on locomotion, teleoperation, and single-leg balancing show improved safety margins and real-time enforcement of multiple constraints under model mismatch.
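The idea of "minimally adjusting" a nominal reference with a barrier-function filter can be sketched for a single joint with position limits. This is a generic textbook-style illustration, not the paper's filter: the function name, the gain `alpha`, and the `margin` parameter (a crude stand-in for a bounded-disturbance term) are all assumptions.

```python
def filter_joint_velocity(q, v_nominal, q_min, q_max, alpha=5.0, margin=0.0):
    """Clip a nominal joint velocity so one joint respects its position limits.

    Illustrative barrier functions: h_upper = q_max - q and h_lower = q - q_min.
    Requiring dh/dt >= -alpha * h for both yields a velocity interval;
    `margin` (hypothetical) tightens that interval to leave room for disturbances.
    """
    upper = alpha * (q_max - q) - margin
    lower = -alpha * (q - q_min) + margin
    if lower > upper:
        raise ValueError("margin too large: no velocity satisfies both limits")
    # In one dimension, the minimally adjusted command is the nearest point
    # to v_nominal inside the interval [lower, upper].
    return min(max(v_nominal, lower), upper)


# Near the upper limit, a fast nominal command is slowed down.
print(filter_joint_velocity(q=0.9, v_nominal=2.0, q_min=-1.0, q_max=1.0))
# Far from both limits, the nominal command passes through unchanged.
print(filter_joint_velocity(q=0.0, v_nominal=0.3, q_min=-1.0, q_max=1.0))
```

With several coupled constraints, the clipping step would typically become a small quadratic program solved at each control cycle.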

Scalable Trajectory Generation for Whole-Body Mobile Manipulation arxiv High

Yida Niu, Xinhai Chang, Xin Liu, Ziyuan Jiao, Yixin Zhu · Submitted 2026-04-14

#Manipulator#MobileManipulator

This work addresses the coordination problem of moving a mobile manipulator's base and arm together in unstructured settings. From the provided text, it appears focused on generating whole-body trajectories that account for both locomotion and manipulation motion.

BlockVLA: Accelerating Autoregressive VLA via Block Diffusion Finetuning arxiv High NEW

Ruiheng Wang, Shuanghao Bai, Haoran Zhang, Badong Chen, Xiangyu Xu · Submitted 2026-05-13

#VLA#Other

BlockVLA targets the latency and compounding-error issues in autoregressive vision-language-action models for robotics by replacing strictly token-by-token action decoding with block diffusion finetuning. The approach keeps the reasoning strengths of AR VLAs while generating chunks of actions more efficiently, aiming to make long-horizon robot execution faster and less brittle.

JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning arxiv High NEW

Jing Yu Lim, Rushi Shah, Zarif Ikram, Samson Yu, Haozhe Ma et al. · Submitted 2026-05-13

#RL#Other

JEDI is a joint-embedding diffusion world model for online model-based RL. It targets the tradeoff between strong but expensive pixel diffusion and faster latent diffusion that underperforms; no code, robot tests, benchmarks, or numbers are stated here.

DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies arxiv High NEW

Xianzhe Fan, Yuxiang Lu, Shenyuan Gao, Xiaoyang Wu, Ruihua Han et al. · Submitted 2026-05-12

#VLA#Manipulator

DreamAvoid targets the brittle moments in vision-language-action manipulation policies where a small low-level action error can cascade into failure. It uses test-time “dreaming” around these critical phases to anticipate and avoid bad trajectories, aiming to make fine-grained robot manipulation more robust without retraining the underlying VLA policy.

SEVO: Semantic-Enhanced Virtual Observation for Robust VLA Manipulation via Active Illumination and Data-Centric Collection arxiv High NEW

Tianchonghui Fang, Yuan Zhuang, Fei Miao · Submitted 2026-05-11

#VLA#Manipulator

SEVO targets the brittleness of low-cost VLA and imitation-learning manipulation policies when they leave their original training setup. It uses semantic-enhanced virtual observations with active illumination and a data-centric collection strategy, aiming to make robot manipulation policies less tied to narrow visual conditions and more robust under deployment shifts.

HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System arxiv High

Tianshuo Yang, Guanyu Chen, Yutian Chen, Zhixuan Liang, Yitian Liu et al. · Submitted 2026-04-15 · Updated 2026-05-10

#VLA#Manipulator

HiVLA is a hierarchical embodied manipulation system centered on visual grounding. It targets the loss of VLM reasoning that can follow narrow-control fine-tuning in end-to-end VLA models; no code, robot tests, or benchmarks are stated here.

Micro-Swarm Locomotion Optimization in Dynamic Flow using Multi-Objective Multi-Agent Reinforcement Learning arxiv High NEW

Josef Berman, Oren Gal · Submitted 2026-05-24

#RL#Other

A hybrid CFD and multi-objective multi-agent RL system couples an incompressible Navier-Stokes solver with decentralized PPO to train 16 magnetically actuated micro-robots to move upstream through a pulsatile arterial flow while balancing progress, energy use, and smoothness. PCGrad turns out to be essential: without it, the energy and smoothness objectives collapse, while the trained policy reaches progress rewards of 6.5-7.0, energy efficiency around 0.63-0.65, and smoothness near 0.97-0.99. The learned swarm behavior is physically interpretable, shifting from a two-layer hydrodynamic throttling formation to a flow-reversal ratchet strategy and then individualized final approaches near the target.
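PCGrad (gradient surgery) resolves conflicts between objectives by projecting each objective's gradient away from any other gradient it points against. A minimal sketch on plain Python lists follows; the function names and toy gradients are hypothetical, and this is not the paper's training code.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcgrad(grads, rng=None):
    """Combine per-objective gradients with PCGrad-style projection.

    For each gradient g_i, visit the other gradients in random order; whenever
    g_i conflicts with g_j (negative dot product), remove from g_i its
    component along g_j. The projected gradients are then summed.
    """
    rng = rng or random.Random(0)
    projected = []
    for i, g in enumerate(grads):
        g_i = list(g)
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = grads[j]
            overlap = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            if overlap < 0 and norm_sq > 0:
                g_i = [x - overlap / norm_sq * y for x, y in zip(g_i, g_j)]
        projected.append(g_i)
    return [sum(col) for col in zip(*projected)]


# Two toy objective gradients that conflict (their dot product is -1).
progress_grad = [1.0, 0.0]
energy_grad = [-1.0, 1.0]
print(pcgrad([progress_grad, energy_grad]))  # [0.5, 1.5]
```

In this toy case the combined update has a non-negative dot product with both original gradients, so neither objective is pushed backwards by the step.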

EMMa: End-Effector Stability-Oriented Mobile Manipulation for Tracked Rescue Robots arxiv High

Yifei Wang, Hao Zhang, Jidong Huang, Shuohang Fang, Haoyao Chen · Submitted 2026-04-09

#Manipulator#MobileManipulator

EMMa targets tracked rescue robots with mobile manipulators, focusing on keeping the end effector stable while the base moves through difficult mission settings. From the provided text, it appears to combine motion reachability and safety with task-dependent manipulation stability, so the robot can operate autonomously without treating arm control as separate from locomotion.

CORAL: Scalable Multi-Task Robot Learning via LoRA Experts arxiv High

Yuankai Luo, Woping Chen, Tong Liang, Zhenguo Li · Submitted 2026-03-10

#VLA#Other

CORAL targets multi-task interference in vision-language-action robot models by using LoRA experts, suggesting a scalable way to specialize parts of a shared policy without fully duplicating the model for each task. From the provided abstract fragment, the concrete mechanism and results are not specified beyond framing the problem, so the takeaway is that it addresses real-world VLA deployment where many robot skills must coexist without degrading one another.

VLA-GSE: Boosting Parameter-Efficient Fine-Tuning in VLA with Generalized and Specialized Experts arxiv High NEW

Yuhua Jiang, Junjie Lu, Xinyao Qin, Xiaoyu Chen, Kaixin Wang et al. · Submitted 2026-05-07 · Updated 2026-05-08

#VLA#Other

VLA-GSE targets parameter-efficient adaptation of vision-language-action models for robotic control, aiming to avoid the overfitting and catastrophic forgetting that can come with full fine-tuning on limited robot data. It does this by splitting adaptation into generalized and specialized expert components, so the model can retain broad vision-language priors while learning control-specific behavior more selectively.

asRoBallet: Closing the Sim2Real Gap via Friction-Aware Reinforcement Learning for Underactuated Spherical Dynamics arxiv High

Fang Wan, Guangyi Huang, Tianyu Wu, Zishang Zhang, Bangchao Huang et al. · Submitted 2026-04-27 · Updated 2026-05-07

#RL#Humanoid

asRoBallet is a friction-aware end-to-end RL locomotion policy for a humanoid ballbot. The authors claim the first deployment of such a policy on humanoid ballbot hardware, making it real-robot tested rather than simulation-only.

TriRelVLA: Triadic Relational Structure for Generalizable Embodied Manipulation arxiv High NEW

Hanyu Zhou, Chuanhao Ma, Gim Hee Lee · Submitted 2026-05-07

#VLA#Manipulator

TriRelVLA is a VLA manipulation method built around triadic relational structure for generalization to unseen scenes and objects. It targets entangled object appearance, background, and layout representations; no code, benchmarks, or real-robot tests are stated.

HSC-VLA: Hierarchical Scene-Clearing for Robust Bimanual Manipulation in Dense Clutter arxiv High

Zhen Liu, Xinyu Ning, Zhe Hu, XinXin Xie, Yitong Liu et al. · Submitted 2026-03-08

#VLA#Manipulator#Bimanual

HSC-VLA targets a failure mode in vision-language-action policies where dense, irrelevant clutter overwhelms visual grounding during long-horizon bimanual manipulation. It uses hierarchical scene clearing to remove or manage distracting objects before executing the instructed task, aiming to make manipulation in crowded scenes more robust.

Constraint-Enhanced Reinforcement Learning Based on Dynamic Decoupled Spherical Radial Squashing arxiv High NEW

Qijun Liao, Zhaoxin Yu, Jue Yang · Submitted 2026-05-05

#RL#Other

This method addresses actuator rate limits directly during policy learning, focusing on the hard per-step bounds on how fast robot joints can change in the real world. It appears to use dynamic decoupled spherical radial squashing to keep actions within those rate constraints, so learned policies stay compatible with physical robot hardware without relying on post-hoc clipping or unsafe commands.

Decoupling Task and Behavior: A Two-Stage Reward Curriculum in Reinforcement Learning for Robotics arxiv High

Kilian Freitag, Knut Åkesson, Morteza Haghir Chehreghani · Submitted 2026-03-05

#RL#Other

Deep reinforcement learning for robot control is framed here around a two-stage reward curriculum that separates learning task success from shaping the desired behavior. Based on the available text, the work targets the common bottleneck of hand-designing reward functions, but no specific robot tasks, benchmarks, or results are provided.

Tendon Force Modeling for Sim2Real Transfer of Reinforcement Learning Policies for Tendon-Driven Robots arxiv High

Valentin Yuryev, Josie Hughes · Submitted 2026-03-04

#RL#Other

Tendon-driven robots can keep actuators off the moving structure and preserve compliance, but their nonlinear force transmission makes learned controllers hard to move from simulation to hardware. The paper focuses on modeling tendon forces for reinforcement-learning sim-to-real transfer, aiming to make simulated policies account for the actuation behavior that usually breaks deployment on complex tendon systems.

SkillVLA: Tackling Combinatorial Diversity in Dual-Arm Manipulation via Skill Reuse arxiv High

Xuanran Zhai, Zekai Huang, Longyan Wu, Qianyou Zhao, Qiaojun Yu et al. · Submitted 2026-03-04

#VLA #Manipulator #Bimanual

SkillVLA targets the combinatorial explosion in dual-arm manipulation, where many object, hand, and subtask pairings can appear even within seemingly similar bimanual tasks. It frames dual-arm behavior around reusable skills rather than treating each combination as a separate vision-language-action mapping, aiming to make generalization to unseen manipulation setups more systematic.

Sentinel-VLA: A Metacognitive VLA Model with Active Status Monitoring for Dynamic Reasoning and Error Recovery arxiv High NEW

Wenhao Li, Xiu Su, Yichao Cao, Hongyan Xu, Xiaobo Xia et al. · Submitted 2026-05-02

#VLA #Manipulator

Sentinel-VLA targets a common weakness of vision-language-action robot policies: they can use broad visual and language priors to act, but often do not track whether execution is going wrong or recover from errors. It adds a metacognitive layer for active status monitoring, dynamic reasoning, and self-correction during embodied manipulation, aiming to make VLA systems more robust in changing or failure-prone settings.

Where-to-Learn: Analytical Policy Gradient Directed Exploration for On-Policy Robotic Reinforcement Learning arxiv High

Leixin Chang, Xinchen Yao, Ben Liu, Liangjing Yang, Hua Chen · Submitted 2026-03-28 · Updated 2026-04-01

#RL #Other

Analytical policy-gradient exploration method for on-policy robotic RL. It aims to steer agents toward better trajectories for more efficient policy learning; the excerpt gives no code, benchmark, or real-robot details.

MiniVLA-Nav v1: A Multi-Scene Simulation Dataset for Language-Conditioned Robot Navigation arxiv High NEW

Ali Al-Bustami, Jaerock Kwon · Submitted 2026-05-01

#VLA #Other

MiniVLA-Nav v1 is a simulated benchmark dataset for language-conditioned object approach navigation, where an NVIDIA Nova Carter differential-drive robot follows short natural-language instructions to reach named objects and stop within 1 m. It spans four photorealistic Isaac Sim scenes, including office, hospital, full warehouse, and multi-shelf warehouse layouts, giving researchers a controlled way to test VLA-style navigation across varied indoor and logistics environments.

Direct Dynamic Retargeting for Humanoid Imitation Learning from Videos arxiv High NEW

Constant Roux, Ludovic De Matteïs, Armand Jordana, Valentin Guillet, Nicolas Mansard et al. · Submitted 2026-05-22

#RL #IL #Humanoid

Direct Dynamic Retargeting teaches humanoids from monocular human videos by skipping the usual geometric or indirect kinematic retargeting step, which the authors argue biases the motion before dynamics are considered. DDR instead optimizes task-space trajectories directly in a physics simulator using sampling-based MPC, handling contact sequences and drift while producing dynamically feasible references. In experiments, those references track demonstrations better than prior pipelines and help RL agents train faster and execute agile balancing behaviors more reliably.

ANCHOR: A Physically Grounded Closed-Loop Framework for Robust Home-Service Mobile Manipulation arxiv High NEW

Jinhao Jiang, Shengyu Fang, Sibo Zuo, Yujie Tang, Yirui Li · Submitted 2026-04-28

#Manipulator #MobileManipulator

ANCHOR targets long-horizon home-service mobile manipulation in real domestic settings, where robots must follow open-set object references while recovering from disturbances and execution failures. It frames the problem as physically grounded closed-loop control, aiming to keep perception, manipulation, and navigation tied to ongoing feedback rather than brittle one-shot plans.

Agent-Driven Autonomous Reinforcement Learning Research: Iterative Policy Improvement for Quadruped Locomotion arxiv High

Nimesh Khandelwal, Shakti S. Gupta · Submitted 2026-03-28

#RL #Other

ArXiv case study of agent-driven RL research for quadruped locomotion, focused on iterative policy improvement. It documents an agent-assisted workflow, not a fully self-starting system; no code, benchmark, or robot-test details are given.

Learning Versatile Humanoid Manipulation with Touch Dreaming arxiv High

Yaru Niu, Zhenlong Fang, Binghong Chen, Shuai Zhou, Revanth Krishna Senthilkumaran et al. · Submitted 2026-04-14 · Updated 2026-04-27

#Manipulator #Humanoid

A method for learning dexterous, contact-rich humanoid loco-manipulation via Touch Dreaming. It targets whole-body stability, end-effector dexterity, and contact-aware interaction; the excerpt gives no code, benchmark, or real-robot result.

Long-Horizon Manipulation via Trace-Conditioned VLA Planning arxiv High

Isabella Liu, An-Chieh Cheng, Rui Yan, Geng Chen, Ri-Zhao Qiu et al. · Submitted 2026-04-23

#VLA #Manipulator

An arXiv method for long-horizon robot manipulation using trace-conditioned VLA planning. It targets multi-step, progress-dependent tasks where compounding errors break standard VLA policies; no metrics, benchmarks, robot tests, or code status are given.

ExpressMM: Expressive Mobile Manipulation Behaviors in Human-Robot Interactions arxiv High

Souren Pashangpour, Haitong Wang, Matthew Lisondra, Goldie Nejat · Submitted 2026-04-07 · Updated 2026-04-23

#Manipulator #MobileManipulator

ExpressMM focuses on mobile manipulators that need to make their intentions legible to nearby people while carrying out tasks in human-centered spaces. It frames expressive robot behavior as part of the manipulation problem itself, so the robot’s motion can communicate what it is about to do rather than merely optimizing for task completion.

How VLAs (Really) Work In Open-World Environments arxiv High

Amir Rasouli, Yangzheng Wu, Zhiyuan Li, Rui Heng Yang, Xuan Zhao et al. · Submitted 2026-04-23

#VLA #Manipulator

Vision-language-action models are being pushed beyond short manipulation skills into long-horizon household chores, with BEHAVIOR1K used as a stress test for how they operate in open-world settings. The work focuses on unpacking how these VLAs actually handle complex environments rather than just reporting task success, making it useful for understanding where current robot-AI systems generalize and where their behavior may be brittle.

Temporal Difference Calibration in Sequential Tasks: Application to Vision-Language-Action Models arxiv High

Shelly Francis-Meretzki, Mirco Mutti, Yaniv Romano, Aviv Tamar · Submitted 2026-04-22

#VLA #Other

Temporal Difference Calibration targets uncertainty estimates for vision-language-action robotics models during sequential execution, including cases where only partial trajectories are available. It frames calibration around how confidence should evolve over time in a task, aiming to make VLA models’ uncertainty more reliable for step-by-step robotic decision-making.

Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems arxiv High

Wenjian Hao, Yuxuan Fang, Zehui Lu, Shaoshuai Mou · Submitted 2026-04-21

#RL #Other

Model-based RL method using linear Koopman dynamics to control nonlinear robotic systems. It targets optimal closed-loop control, but the excerpt gives no benchmarks, real-robot results, or code release details.

Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning arxiv High NEW

Ismail Geles, Leonard Bauersfeld, Markus Wulfmeier, Davide Scaramuzza · Submitted 2026-05-21

#RL #Other

Multi-agent reinforcement learning is used to train high-speed quadrotor racers that anticipate other drones rather than treating them as environmental noise, using league-based self-play across varying numbers of opponents. In real-world multiplayer racing above 22 m/s, the agents beat a champion-level human pilot while cutting collision rates by 50% relative to single-agent baselines, including in difficult interactions such as overtaking and aerodynamic downwash. Training against diverse artificial agents also transfers zero-shot to safer interaction with human pilots.

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation arxiv High

Marcelino Julio Fernando, Miguel Altamirano Cabrera, Jeffrin Sam, Yara Mahmoud, Konstantin Gubernatorov et al. · Submitted 2026-04-21

#Manipulator #Bimanual #MobileManipulator

GenerativeMPC targets bimanual mobile manipulation by coupling high-level semantic guidance from a VLM-RAG system with whole-body model predictive control and virtual impedance for compliant contact. The interesting piece is the split between contextual reasoning and physically grounded control: the system can use vision-language retrieval to inform what the robot should do while MPC handles coordinated, safe motion across the mobile base and two arms.

Assessing VLM-Driven Semantic-Affordance Inference for Non-Humanoid Robot Morphologies arxiv High

Jess Jones, Raul Santos-Rodriguez, Sabine Hauert · Submitted 2026-04-21

#Humanoid

Vision-language models are tested for semantic-affordance inference on robots whose bodies do not resemble humans, asking whether models trained around human-object interaction can still judge what a differently shaped robot can do with objects. The setup targets a practical gap in VLM-based robotics: affordances depend on morphology, so a model’s human-centric assumptions may break when the agent has non-humanoid limbs, sensors, or effectors.

Reinforcement Learning Enabled Adaptive Multi-Task Control for Bipedal Soccer Robots arxiv High

Yulai Zhang, Yinrong Zhang, Ting Wu, Linqi Ye · Submitted 2026-04-21

#RL #Humanoid

Reinforcement learning is used to train an adaptive controller for bipedal soccer robots that must handle tightly coupled behaviors such as stable walking, game movement, and fall recovery in dynamic contact-rich play. The focus is on reducing brittle handoffs between control states, especially transitions between upright locomotion and recovery after a fall, so the robot can maintain task performance under disturbances.

Auction-Consensus Algorithm with Learned Bidding Scheme for Multi-Robot Systems arxiv High NEW

Jose Rodriguez, Constantine Tarawneh, Sven Koenig, Wenjie Dong, Qi Lu · Submitted 2026-05-21

#RL #Other

CBBA’s hand-designed greedy bids are replaced with a neural bidding policy trained by PPO, while keeping the usual decentralized auction and consensus steps for execution under limited communication. The agents learn bids from partial local observations, with reward shaping based on mixed-integer-programming solutions, and the paper compares Neural Additive, LSTM, and Set Transformer architectures. Across different swarm sizes, the learned bidders produce better task allocations than classical CBBA while preserving decentralized execution.

Mind the Gaps: Multi-Robot Feedback-Driven Ergodic Coverage in Unknown Environments arxiv High NEW

Thales Costa Silva, Nora Ayanian · Submitted 2026-05-20

#Other

Mind the Gaps tackles multi-robot adaptive coverage when the team does not know the environment’s information distribution in advance. It augments ergodic trajectory optimization with an online-updated parametric environmental model, using real-time feedback to build the target spatial information distribution and steer robots toward regions that currently look most informative. In simulation, this feedback-driven strategy improves coverage efficiency and resource allocation for static or slowly changing environments.

Q-SpiRL: Quantum Spiking Reinforcement Learning for Adaptive Robot Navigation arxiv High NEW

Mohamed Khair Altrabulsi, Nouhaila Innan, Alberto Marchisio, Muhammad Kashif, Muhammad Shafique · Submitted 2026-05-20

#RL #Other

Q-SpiRL studies robot navigation with a hybrid quantum spiking RL policy, comparing tabular Q-learning, classical MLP/SNN agents, and quantum-enhanced MLP/SNN variants in 20x20, 30x30, and 40x40 grid worlds with static and dynamic obstacles. Its QSNN combines spike-based temporal processing with a variational quantum feature transform, and achieved the best overall balance of reaching the goal, keeping paths efficient, and reducing turns, including up to 99% success in the hardest setting. The authors also ran the hybrid policy on IBM quantum hardware, showing that the approach can execute under real-device quantum conditions.

PTLD: Sim-to-real Privileged Tactile Latent Distillation for Dexterous Manipulation arxiv High

Rosy Chen, Mustafa Mukadam, Michael Kaess, Tingfan Wu, Francois R Hogan et al. · Submitted 2026-03-04 · Updated 2026-04-20

#Manipulator

PTLD targets sim-to-real tactile dexterous manipulation by distilling privileged tactile latent information learned in simulation into a deployable control policy. From the provided text, the concrete motivation is automating complex household manipulation tasks where touch-rich control remains difficult, but no specific robot platform, tasks, or performance results are given.

Scalable Multi-robot Motion Planning via Hierarchical Subproblem Expansion and Workspace Decomposition Refinement arxiv High NEW

Isaac Ngui, Courtney McBeth, James D. Motes, Marco Morales, Nancy M. Amato · Submitted 2026-05-19

#Other

Hierarchical Subproblem Expansion speeds up multi-robot motion planning by coordinating robots through discrete search over a workspace decomposition instead of immediately planning in the full joint configuration space. When conflicts or coupling remain, it refines the workspace representation so the planner can keep more robots in smaller decoupled subproblems, yielding planning-time improvements of up to an order of magnitude.

SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework arxiv High NEW

Tianshu Wu, Xiangqi Kong, Yue Chen, Qize Yu, Hang Ye et al. · Submitted 2026-05-19

#Manipulator #Humanoid

SUGAR turns unstructured human videos into deployable humanoid loco-manipulation skills by extracting human-object trajectories and contact labels, refining the noisy video-derived priors into physically feasible behaviors, and distilling them into a hierarchical policy that no longer needs reference-motion conditioning at test time. On six simulated and real-hardware tasks, it beats reference-tracking baselines, improves as more video data is added, and transfers zero-shot to a humanoid with closed-loop recovery and stable long-horizon execution under perturbations.

Sampling-Based Safe Reinforcement Learning arxiv High NEW

Luca Vignola, Bruce D. Lee, Manish Prajapat, Manuel Wendl, Melanie Zeilinger et al. · Submitted 2026-05-19

#RL #Other

Sampling-Based Safe Reinforcement Learning keeps an RL agent safe during learning by planning against a finite set of sampled dynamics models, turning a hard worst-case safety problem under model uncertainty into a tractable constraint-enforcement scheme for continuous control. Its exploration rule limits epistemic uncertainty instead of adding explicit exploration bonuses, and the authors prove high-probability safety plus finite-time near-optimality guarantees. In simulations and real robot experiments, SBSRL supports efficient safe exploration and can be implemented with deep ensembles for higher-dimensional control.

A Rapid Deployment Pipeline for Autonomous Humanoid Grasping Based on Foundation Models arxiv High

Yifei Yan, Yankai Liao, Linqi Ye · Submitted 2026-04-19

#VLA #Manipulator #Humanoid

A deployment pipeline lets a humanoid robot grasp new objects without the usual one-to-two-day cycle of data collection, manual annotation, 3D model acquisition, and task-specific training. It uses foundation models to compress that setup process, aiming to make autonomous humanoid manipulation practical for rapidly changing object sets rather than carefully pre-modeled environments.

Graph Neural Planning and Predictive Control for Multi-Robot Communication-Constrained Unlabeled Motion Planning arxiv High NEW

Manohari Goarin, Yang Zhou, Giuseppe Loianno · Submitted 2026-05-19

#Other

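Graph neural planning and predictive control are applied to multi-robot unlabeled motion planning under communication constraints: concurrently assigning robots to goals and generating safe trajectories, a problem central to many collaborative tasks. The excerpt gives no method details, benchmarks, or results.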

Web-Gewu: A Browser-Based Interactive Playground for Robot Reinforcement Learning arxiv High

Kaixuan Chen, Linqi Ye · Submitted 2026-04-18

#RL #Other

Web-Gewu is a browser-based interactive playground for robot reinforcement learning aimed at lowering the compute and setup burden in robotics education. From the provided text, it focuses on making embodied-intelligence experiments easier to access without heavy local configuration, but no specific algorithms, tasks, or results are given.

Seeing Together: Multi-Robot Cooperative Egocentric Spatial Reasoning with Multimodal Large Language Models arxiv High NEW

Kunyu Peng, Zhikun Zhou, Kailun Yang, Di Wen, Ruiping Liu et al. · Submitted 2026-05-18

#Other

arXiv paper on multi-robot egocentric spatial reasoning with multimodal LLMs. It targets cooperative reasoning across multiple embodied viewpoints; the excerpt gives no benchmark, code, or real-robot details.

Offload or Overload: A Platform Measurement Study of Mobile Robotic Manipulation Workloads arxiv High

Sara Pohland, Xenofon Foukas, Ganesh Ananthanarayanan, Andrey Kolobov, Sanjeev Mehrotra et al. · Submitted 2026-03-18

#Manipulator

Mobile robotic manipulation workloads are examined through a platform-level measurement study focused on the compute pressure created by foundation-model-based robot perception, planning, and control. The title frames the central tradeoff as whether these workloads should be offloaded from the robot or kept onboard, highlighting how physical AI gains can turn into system overload when deployed on mobile robotic platforms.

Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy arxiv High NEW

Yuhang Wan, Weixian Lin, Letian Qian, Yiqi Zou, Weiwei Wu et al. · Submitted 2026-05-26

#RL #Other

Motor overheating is treated as part of the quadruped control problem by embedding a whole-body thermal model into RL training, then adding a residual policy that modifies a pre-trained locomotion policy based on motor temperatures. The residual controller preserves normal terrain-traversal behavior when the robot is cool but shifts actions to avoid thermal limits as temperatures rise. In real Unitree A1 tests with a 3 kg payload, it kept the robot walking stably across multiple terrains for over 13 minutes, compared with overheating after about 5 minutes under the nominal policy alone.
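
The "residual on top of a pre-trained policy" idea can be sketched in a few lines. This is only a minimal illustration under stated assumptions: in the paper the residual is learned and conditioned on motor temperatures, whereas here an explicit linear gate stands in for that learned behavior. The function names and the 60 °C / 80 °C thresholds are hypothetical.

```python
def thermal_gate(temp_c, safe_c=60.0, limit_c=80.0):
    """Return 0 while the motor is cool, ramping linearly to 1 at the limit.

    safe_c and limit_c are made-up thresholds for illustration.
    """
    frac = (temp_c - safe_c) / (limit_c - safe_c)
    return min(1.0, max(0.0, frac))

def combined_action(base_action, residual_action, motor_temps_c):
    """Add a per-motor residual, scaled by how close each motor is to its limit."""
    return [
        b + thermal_gate(t) * r
        for b, r, t in zip(base_action, residual_action, motor_temps_c)
    ]
```

With cool motors the gate is zero and the nominal action passes through unchanged, which mirrors the summary's point that normal terrain-traversal behavior is preserved until temperatures rise.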

Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion arxiv High

Tomoya Kamimura, Haruka Washiyama, Akihito Sano · Submitted 2026-04-16

#RL #Humanoid

Model-based deep RL method for biped robot locomotion that uses passive body dynamics as part of the control strategy. The paper claims it can generate walking and running gaits, but the excerpt gives no code, robot-test, or benchmark details.

Offline Semantic Guidance for Efficient Vision-Language-Action Policy Distillation arxiv High NEW

Jin Shi, Brady Zhang, Yishun Lu · Submitted 2026-05-15

#VLA #Other

VLA-AD distills billion-parameter vision-language-action robot policies into much smaller closed-loop controllers by using a VLM offline during training to add semantic supervision, such as task-phase anchors and multi-frame direction cues, alongside 7-DoF action imitation. On LIBERO, it compresses OpenVLA-7B into a 158M-parameter student with a 44x size reduction, only a 0.27% average relative performance gap, and 12.5 Hz inference on an RTX 4090, while similar distillation from a pi_0.5-4B teacher also transfers well. The semantic training signals appear to make the student more robust to noisy teacher actions, especially high-frequency gripper errors, without requiring the teacher or VLM at deployment.
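
The headline numbers can be sanity-checked with simple arithmetic. The sketch below assumes the teacher has the nominal 7 billion parameters implied by the "7B" name (the exact count is not given in the summary) and converts the quoted 12.5 Hz rate into a per-step time budget.

```python
# Nominal sizes: "7B" is read as 7e9 parameters (an assumption), 158M as 158e6.
teacher_params = 7e9
student_params = 158e6

# Compression ratio between teacher and student.
ratio = teacher_params / student_params

# Time available per control step at the quoted 12.5 Hz inference rate.
control_period_ms = 1000.0 / 12.5

print(f"size reduction: about {ratio:.1f}x")
print(f"per-step budget: {control_period_ms:.0f} ms")
```

Under that assumption the ratio comes out at roughly 44x, consistent with the summary, and 12.5 Hz corresponds to an 80 ms budget per action.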

Beyond Collision Avoidance: Multi-Robot Yielding and Spatial Affordance in Emergency Evacuations arxiv High NEW

Ning Zhou, Edmund R. Hunt, Nikolai W. F. Bode · Submitted 2026-05-15

#Other

Mobile service robots in crowded evacuations need to do more than avoid collisions: they must yield in ways that preserve usable space for pedestrians in tight, high-stress settings. This arXiv work frames emergency evacuation as a multi-robot problem involving yielding behavior and spatial affordance, focusing on passively safe navigation when robots share confined routes with people.

Health-Conditioned Vision-Language-Action Models for Malfunction-Aware Robot Control arxiv High NEW

Hüseyin Arslan, Özgür Erkent · Submitted 2026-05-15

#VLA #Other

An arXiv paper on health-conditioned VLA robot control, aimed at adapting actions to the robot’s physical malfunctions. The provided text gives no metrics, code, benchmark, or real-robot validation.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control arxiv High

Donghu Kim, Youngdo Lee, Minho Park, Kinam Kim, I Made Aswin Nahendra et al. · Submitted 2026-04-06 · Updated 2026-05-15

#RL #Other

FlashSAC targets high-dimensional robot control with an off-policy reinforcement learning approach designed to be faster and more stable when demonstrations are not available. From the provided text, the concrete mechanism and results are not specified, so the safest reading is that it is a Soft Actor-Critic-style method focused on improving training efficiency and robustness for robot control problems.

Propagating Unsafe Actions in LLM Controlled Multi-Robot Collaboration via Single Robot Compromise arxiv High NEW

Zhen Huang, Zhihuang Liu, Weishang Wu, Zhiping Cai · Submitted 2026-05-15

#Other

LLM-controlled multi-robot teams are examined through a security lens: compromising a single robot can let unsafe actions propagate through the collaboration process. The work focuses on how high-level language-model planning and inter-robot coordination create a pathway for one corrupted agent to influence the behavior of otherwise uncompromised robots.

Residual Reinforcement Learning for Robot Teleoperation under Stochastic Delays arxiv High NEW

Kaize Deng, Zewen Yang · Submitted 2026-05-14

#RL #Other

Residual reinforcement learning is applied to robot teleoperation with stochastic communication delays, targeting the signal discontinuities that destabilize delayed remote control loops. The setup suggests a learned residual policy augments an existing controller rather than replacing it, aiming to preserve baseline teleoperation behavior while compensating for delay-induced performance loss.

Hand-in-the-Loop: Improving Dexterous VLA via Seamless Interventional Correction arxiv High NEW

Zhuohang Li, Liqun Huang, Wei Xu, Zhengming Zhu, Nie Lin et al. · Submitted 2026-05-14

#VLA #Manipulator

Hand-in-the-Loop targets the compounding-error problem in dexterous vision-language-action policies by letting a human intervene with seamless corrections during manipulation, rather than treating policy execution as a fixed open-loop rollout. The interesting piece is the corrective feedback loop: small hand-guided interventions can keep contact-rich, high-dimensional manipulation trajectories from drifting, giving the VLA model a practical way to recover over long horizons.

A Prototyping Framework for Distributed Control of Multi-Robot Systems arxiv High NEW

Junaid Ahmed Memon, Allan Andre Do Nascimento, Kostas Margellos, Antonis Papachristodoulou · Submitted 2026-05-14

#Other

A prototyping framework is described for testing distributed optimization algorithms on multi-robot systems, with the goal of making it easier to move from theoretical controller designs to practical experiments. The emphasis is on distributed control, where robots coordinate through local computation and communication rather than a centralized supervisor, giving researchers a testbed for evaluating how such algorithms behave in realistic deployments.

AnySlot: Goal-Conditioned Vision-Language-Action Policies for Zero-Shot Slot-Level Placement arxiv High

Zhaofeng Hu, Sifan Zhou, Qinbo Zhang, Rongtao Xu, Qi Su et al. · Submitted 2026-04-12 · Updated 2026-04-14

#VLA #Other

AnySlot targets compositional placement commands where a robot must put an object into a specific slot-level location, a case that monolithic VLA policies often struggle to ground precisely. It frames placement as goal-conditioned vision-language-action policy learning, aiming to make slot-level placements work zero-shot from language and visual context rather than requiring task-specific retraining.

Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs arxiv High NEW

Jiahui Niu, Kefan Gu, Yucheng Zhao, Shengwen Liang, Tiancai Wang et al. · Submitted 2026-05-13

#VLA #Other

Realtime-VLA FLASH targets the inference bottleneck in diffusion-based vision-language-action models, where generating actions through full denoising chains is too slow for real-time robot control. It frames dVLA inference as a speculative process, aiming to preserve the strengths of diffusion policies while cutting latency enough for embodied deployment.

FrameSkip: Learning from Fewer but More Informative Frames in VLA Training arxiv High NEW

Bin Yu, Shijie Lian, Xiaopeng Lin, Zhaolong Shen, Yuliang Wei et al. · Submitted 2026-05-13

#VLA #Other

FrameSkip targets a common inefficiency in VLA training: treating every frame in dense teleoperated robot demonstrations as equally useful supervision. It learns from fewer but more informative frames, aiming to reduce redundant training signal while preserving the parts of trajectories that best teach the vision-language-action policy.

Guide, Think, Act: Interactive Embodied Reasoning in Vision-Language-Action Models arxiv High NEW

Yiran Ling, Qing Lian, Jinghang Li, Qing Jiang, Tianming Zhang et al. · Submitted 2026-05-13

#VLA #Other

GTA-VLA is an interactive vision-language-action framework for robot control that lets a user steer embodied reasoning with explicit visual cues rather than relying only on a language instruction or passive scene understanding. The idea is to make robot policies spatially guideable: the model can use user-provided cues to guide where it attends, reasons, and acts in the environment.

WM-DAgger: Enabling Efficient Data Aggregation for Imitation Learning with World Models arxiv High

Anlan Yu, Zaishu Chen, Peili Song, Zhiqing Hong, Haotian Wang et al. · Submitted 2026-04-13

#RL #IL #Other

WM-DAgger targets the compounding-error problem in robotic imitation learning, where small mistakes push a policy into out-of-distribution states that were never covered by demonstrations. It uses world models to make data aggregation more efficient, aiming to expose and correct those failure-prone states without relying only on costly real-world rollouts.

ALAM: Algebraically Consistent Latent Action Model for Vision-Language-Action Models arxiv High NEW

Zuojin Tang, Haoyun Liu, Xinyuan Chang, Changjie Wu, Dongjie Huo et al. · Submitted 2026-05-11 · Updated 2026-05-13

#VLA #Other

ALAM targets the action-label bottleneck in vision-language-action training by learning latent actions from action-free videos, using algebraic consistency constraints so those latents behave like coherent robot action transformations rather than arbitrary video features. The interesting part is that it tries to turn abundant passive video into structured supervision for VLA models, making world-change evidence useful even when no robot control commands were recorded.

Towards Long-horizon Embodied Agents with Tool-Aligned Vision-Language-Action Models arxiv High NEW

Zixing Lei, Changxing Liu, Yichen Xiong, Minhao Xiong, Yuanzhuo Ding et al. · Submitted 2026-05-13

#VLA #Other

Tool-Aligned Vision-Language-Action models aim to make embodied agents handle long-horizon robot tasks by pairing VLA execution with tools that relieve the model from doing all closed-loop planning and low-level physical operation on its own. The idea is interesting because it treats long-horizon embodiment less as a single monolithic policy problem and more as a coordination problem between vision-language-action control and specialized tool use.

Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation arxiv High

Xue Qin, Simin Luan, John See, Cong Yang, Zhijun Li · Submitted 2026-04-13

#Other

Federated Single-Agent Robotics argues that fleet-level coordination does not require splitting each robot into multiple internal agents. Instead, it frames each robot as a single decision-making unit within a federated coordination setup, aiming to preserve simpler intra-robot control while still addressing the systems problems that emerge when many embodied robots operate together.

3D RL-DWA: A Hybrid Reinforcement Learning and Dynamic Window Approach for Goal-Directed Local Navigation in Multi-DoF Robots arxiv High NEW

Chiara Castellani, Enrico Turco, Domenico Prattichizzo · Submitted 2026-05-12

#RL #Other

3D RL-DWA combines reinforcement learning with the Dynamic Window Approach to handle adaptive local navigation for high-DoF robots moving through 3D space. The method is aimed at goal-directed motion where a robot must choose feasible local actions under its kinematic constraints while adapting beyond a hand-tuned planner.
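
For readers unfamiliar with the Dynamic Window Approach, the core step is to restrict candidate velocities to those reachable within one control interval, then score them. The sketch below shows that step for a single velocity axis; it is the classical construction only, and how the paper combines it with RL is not described in the summary. The `score` callback is a hypothetical stand-in for whatever objective (hand-tuned or learned) ranks the candidates.

```python
def dynamic_window(v, v_min, v_max, a_max, dt):
    """Velocities reachable in one control step, clipped to the hard limits."""
    low = max(v_min, v - a_max * dt)
    high = min(v_max, v + a_max * dt)
    return low, high

def pick_velocity(v, v_min, v_max, a_max, dt, score, samples=5):
    """Sample the window evenly (samples >= 2) and return the best-scoring candidate."""
    low, high = dynamic_window(v, v_min, v_max, a_max, dt)
    step = (high - low) / (samples - 1)
    candidates = [low + i * step for i in range(samples)]
    return max(candidates, key=score)
```

For example, `pick_velocity(0.5, 0.0, 1.0, 2.0, 0.1, score=lambda c: -abs(c - 0.6))` searches the window around the current speed and returns the sampled candidate closest to the preferred 0.6.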

GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization arxiv High NEW

Xiaosong Jia, Bowen Yang, Zuhao Ge, Xian Nie, Yuchen Zhou et al. · Submitted 2026-05-12

#VLA #Other

GuidedVLA introduces a plug-and-play way to specialize action attention in vision-language-action models so the robot can focus on task-relevant factors instead of relying entirely on end-to-end supervision to discover them implicitly. From the provided text, the concrete result details are not given, but the mechanism is aimed at making action decoding more explicitly guided and adaptable within VLM-based robot policies.

Learning Action Manifold with Multi-view Latent Priors for Robotic Manipulation arxiv High NEW

Junjin Xiao, Dongyang Li, Yandan Yang, Shuang Zeng, Tong Lin et al. · Submitted 2026-05-12

#VLA #Manipulator

Learning Action Manifold with Multi-view Latent Priors targets the spatial perception bottlenecks that make vision-language-action policies brittle in robotic manipulation. It appears to learn a structured action space using latent priors from multiple views, aiming to give the policy a better geometric handle on how visual observations map to feasible robot actions.

Mapping Embodied Affective Touch Strategies on a Humanoid Robot arxiv High NEW

Qiaoqiao Ren, Omar Eldardeer, Francesca Cocchella, Rea Francesco, Alessandra Sciutti et al. · Submitted 2026-05-12

#Humanoid

Affective touch for humanoid robots is framed as an embodied design problem, where the same emotional intent can change meaning depending on where the robot touches, what its body can physically do, and how much agency or social role people attribute to it. The work maps touch strategies across these constraints, giving HRI researchers a way to reason about robot touch as situated behavior rather than a generic emotional signal.

Unified Noise Steering for Efficient Human-Guided VLA Adaptation arxiv High NEW

Junjie Lu, Xinyao Qin, Yuhua Jiang, Kaixin Wang, Chuheng Zhang et al. · Submitted 2026-05-11

#VLA #Other

Unified Noise Steering is an arXiv method for adapting diffusion-based vision-language-action robot policies with human guidance. It targets efficient real-world distribution adaptation; no numbers, code release, or benchmark details are given.

Data-Asymmetric Latent Imagination and Reranking for 3D Robotic Imitation Learning arxiv High NEW

Lianghao Luo, Xizhou Bu, Ruyan Liu, Qingqiu Huang, Chufeng Tang et al. · Submitted 2026-05-11

#IL #Other

Data-Asymmetric Latent Imagination and Reranking targets 3D robotic imitation learning when demonstrations include exploratory, suboptimal, or failed behavior rather than clean expert rollouts. From the provided text, the central setup is to make learning robust to imperfect real-world trajectory data, but the excerpt does not include the specific mechanism, benchmarks, or results needed to summarize its empirical findings.

Explicit Stair Geometry Conditioning for Robust Humanoid Locomotion arxiv High NEW

Jianguo Zhang, Wentai Xu, Shusheng Ye, Yuxiang He, Weimin Qi et al. · Submitted 2026-05-11

#Humanoid

Explicit Stair Geometry Conditioning targets robust humanoid stair climbing by conditioning locomotion control on stair geometry rather than relying only on generic terrain handling. The setup is motivated by the failure modes that make stairs hard in practice: discontinuous contacts, variation in step height, and imperfect perception of real stair structure.

Above and Below: Heterogeneous Multi-robot SLAM Across Surface and Underwater Domains arxiv High NEW

John McConnell, Armon Shariati, Paul Szenher, Yaxuan Li · Submitted 2026-05-10

#Other

Multi-robot SLAM is framed here for mixed surface-and-underwater teams, where robots need a shared map and consistent localization despite operating in very different sensing and communication conditions. The paper appears to target coordinated cross-domain mapping, with surface and underwater robots maintaining enough common spatial understanding to localize themselves and teammates for joint missions.

Efficient Multi-Robot Motion Planning with Precomputed Translation-Invariant Edge Bundles arxiv High NEW

Himanshu Gupta, Paul Motter, Aritra Chakrabarty, Rishabh Sodani, Srikrishna Bangalore Raghu et al. · Submitted 2026-05-10

#Other

KiTE-Extend is a planner-agnostic action selection mechanism for sampling-based kinodynamic multi-robot motion planning. It uses precomputed translation-invariant edge bundles to help generate collision-free, kinodynamically feasible trajectories for interacting robots more efficiently.

Minimizing Worst-Case Weighted Latency for Multi-Robot Persistent Monitoring: Theory and RL-Based Solutions arxiv High NEW

Weizhen Wang, Ziheng Wang, Jianping He, Xinping Guan, Xiaoming Duan · Submitted 2026-05-10

#Other

Multi-robot persistent monitoring is cast as an infinite-horizon graph problem where robots must repeatedly visit prioritized nodes while accounting for travel costs, and performance is judged by the worst weighted delay any node experiences between visits. The work develops theory for minimizing that worst-case weighted latency and pairs it with reinforcement-learning-based trajectory policies, targeting settings where hand-designed patrol schedules struggle as priorities, graph structure, and robot coordination interact.
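
As a rough illustration of the objective described above, the sketch below evaluates the worst weighted delay between visits for a single robot that repeats a fixed cyclic route. The function name, the route and travel-time encoding, and the example numbers are assumptions made for illustration; this is not the paper's formulation or code, and it does not cover the multi-robot or learned-policy parts.

```python
def worst_weighted_latency(route, travel_time, weight):
    """Worst weighted gap between visits when one robot repeats `route` forever.

    route: node ids in visiting order; the robot returns to route[0] after the last one.
    travel_time: dict mapping (u, v) to the time needed to move from u to v.
    weight: dict mapping each monitored node to a positive priority weight.
    """
    n = len(route)
    # arrival[i] is the time the robot reaches route[i] within one cycle.
    arrival = [0.0]
    for i in range(n):
        u, v = route[i], route[(i + 1) % n]
        arrival.append(arrival[-1] + travel_time[(u, v)])
    period = arrival.pop()  # time needed to come back to route[0]

    visits = {}
    for node, t in zip(route, arrival):
        visits.setdefault(node, []).append(t)

    worst = 0.0
    for node, w in weight.items():
        if node not in visits:
            return float("inf")  # a node that is never visited waits forever
        ts = visits[node]
        gaps = [b - a for a, b in zip(ts, ts[1:])]
        gaps.append(period - ts[-1] + ts[0])  # wrap around into the next cycle
        worst = max(worst, w * max(gaps))
    return worst


# Hypothetical example: node "a" has triple priority, so the route visits it twice per cycle.
route = ["a", "b", "a", "c"]
travel_time = {("a", "b"): 1.0, ("b", "a"): 1.0, ("a", "c"): 2.0, ("c", "a"): 2.0}
weight = {"a": 3.0, "b": 1.0, "c": 1.0}
print(worst_weighted_latency(route, travel_time, weight))
```

Visiting a high-priority node more often shortens its longest gap but lengthens the cycle for every other node, which is the tension a patrol schedule has to balance.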

SKG-VLA: Scene Knowledge Graph Priors for Structured Scene Semantics and Multimodal Reasoning for Decision Making arxiv High NEW

Zeyu Li, Lei Li · Submitted 2026-05-10

#VLA#Other

SKG-VLA appears to use scene knowledge graph priors to give a vision-language-action system more structured semantic understanding of a scene, so its decisions can draw on object relations and multimodal reasoning rather than only raw visual-language features. The provided text is sparse and oddly centered on complaint-handling evidence rather than robot experiments, so the safest read is that the work targets decision making over heterogeneous inputs and emphasizes structured representations for more interpretable reasoning.

Preserving Foundational Capabilities in Flow-Matching VLAs through Conservative SFT arxiv High NEW

Tianyi Zhang, Shaopeng Zhai, Haoran Zhang, Fuxian Huang, Qi Zhang · Submitted 2026-05-09

#VLA#Other

Conservative SFT targets a failure mode in flow-matching Vision-Language-Action models where ordinary fine-tuning overwrites many pretrained parameters and erodes the model’s original visual-language and action capabilities. The idea is to make supervised fine-tuning more restrained, preserving foundational behavior while adapting the VLA to downstream robot tasks.

Decentralized Heterogeneous Multi-Robot Collaborative Exploration for Indoor and Outdoor 3D Environments arxiv High

Yuxiang Li, Kun Chen, Jiancheng Wang, Shihao Fang, Haoyao Chen et al. · Submitted 2026-04-26 · Updated 2026-05-09

#Other

Decentralized heterogeneous multi-robot collaborative exploration targets teams of different robots working together to map and explore complex 3D indoor and outdoor spaces without relying on a central coordinator. The work is centered on making collaboration exploit each robot’s differing mobility and sensing strengths, addressing the coordination bottleneck that often limits heterogeneous robot teams in real environments.

HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation arxiv High

Shuanghao Bai, Meng Li, Xinyuan Lv, Jiawei Wang, Xinhua Wang et al. · Submitted 2026-04-09

#Manipulator#Humanoid

HEX is a method for cross-embodiment whole-body manipulation using humanoid-aligned experts. It targets coordinated high-DoF humanoid control, but code, real-robot results, and benchmarks are not specified here.

ATAAT: Adaptive Threat-Aware Adversarial Tuning Framework against Backdoor Attacks on Vision-Language-Action Models arxiv High NEW

Kewei Chen, Yayu Long, Shuai Li, Mingsheng Shang · Submitted 2026-05-09

#VLA#Other

ATAAT targets backdoor vulnerabilities in vision-language-action models by focusing on attacks that enter through the visual pathway. It frames the problem as adaptive, threat-aware adversarial tuning, aiming to harden VLA policies against triggered visual inputs that could corrupt downstream robot actions.

LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts arxiv High NEW

Seungeun Rho, Shamel Fahmi, Jeonghwan Kim, Arianna Ilvonen, Sehoon Ha et al. · Submitted 2026-05-06 · Updated 2026-05-08

#RL#Other

LineRides tackles agile bicycle-robot stunts by guiding reinforcement learning with task lines rather than hand-designed dense rewards or reference motions. The idea is aimed at maneuvers where demonstrations are hard to obtain, letting a new or unusual bicycle platform learn extreme behaviors from geometric guidance instead of imitation data.

Multifingered force-aware control for humanoid robots arxiv High

Pasquale Marra, Gabriele M. Caddeo, Ugo Pattacini, Lorenzo Natale · Submitted 2026-03-09

#Manipulator#Humanoid

The paper addresses force-aware control for humanoid robots with multi-fingered hands, focusing on how contact forces should be distributed across fingers during manipulation. The available text is sparse, but the emphasis is on coordinating multifinger contacts so a humanoid hand can manage forces explicitly rather than treating grasping as purely kinematic control.

Towards Human-Like Manipulation through RL-Augmented Teleoperation and Mixture-of-Dexterous-Experts VLA arxiv High

Tutian Tang, Xingyu Ji, Wanli Xing, Ce Hao, Wenqiang Xu et al. · Submitted 2026-03-09

#VLA#Manipulator

The method combines RL-augmented teleoperation with a mixture-of-dexterous-experts VLA for higher-DoF robotic manipulation. It targets human-like dexterity beyond simple pick-and-place; no metrics, code, or benchmark details are stated.

Synergizing Efficiency and Reliability for Continuous Mobile Manipulation arxiv High

Chengkai Wu, Ruilin Wang, Yixin Zeng, Jiayuan Wang, Mingjie Zhang et al. · Submitted 2026-04-07

#Manipulator#MobileManipulator

Synergizing Efficiency and Reliability for Continuous Mobile Manipulation studies how a robot can keep moving through successive manipulation tasks by combining anticipatory planning with immediate feedback, rather than stopping to replan between actions. The focus is on the efficiency-reliability tradeoff in continuous mobile manipulation, aiming to make robot behavior closer to the way humans fluidly adapt while carrying out chained physical tasks.

Stability of Control Lyapunov Function Guided Reinforcement Learning arxiv High NEW

Zachary Olkin, William D. Compton, Aaron D. Ames · Submitted 2026-05-03 · Updated 2026-05-06

#RL#Humanoid

Control Lyapunov Function guided reinforcement learning is used to bring explicit stability structure into humanoid locomotion policies, where standard RL often works well empirically but lacks guarantees. The approach sits at the intersection of control-guided learning and policy optimization, aiming to train controllers whose behavior can be analyzed through CLF-style stability arguments rather than treated as a black box.

When Life Gives You BC, Make Q-functions: Extracting Q-values from Behavior Cloning for On-Robot Reinforcement Learning arxiv High NEW

Lakshita Dodeja, Ondrej Biza, Shivam Vats, Stephen Hart, Stefanie Tellex et al. · Submitted 2026-05-06

#RL#IL#Other

Behavior cloning is used as a starting point for on-robot reinforcement learning by extracting Q-values from a BC policy, giving the robot a value signal it can use to improve after demonstrations are exhausted. The interesting move is to turn a normally static imitation-learning policy into something that can guide online exploration and refinement, addressing BC’s usual lack of self-directed improvement.

Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning arxiv High NEW

Alper Kamil Bozkurt, Xiaoan Xu, Shangtong Zhang, Miroslav Pajic, Yuichi Motai · Submitted 2026-05-06

#RL#Other

Policies are trained offline from previously collected data, then moved into a constrained online fine-tuning phase where interaction budget becomes the central resource. The work focuses on choosing which offline policy to deploy and adapt under that budget, aiming to make offline-to-online reinforcement learning more efficient when real robot or environment interactions are limited.

From Pixels to Tokens: A Systematic Study of Latent Action Supervision for Vision-Language-Action Models arxiv High NEW

Yihan Lin, Haoyang Li, Yang Li, Haitao Shen, Yihan Zhao et al. · Submitted 2026-05-06

#VLA#Other

Latent action supervision is examined as a way to give vision-language-action models a shared intermediate action space across heterogeneous robot datasets. The study focuses on comparing fragmented design choices for representing and supervising these latent actions, clarifying how pixel-level observations can be converted into token-like action structure for more consistent VLA training.

SigLoMa: Learning Open-World Quadrupedal Loco-Manipulation from Ego-Centric Vision arxiv High NEW

Shiyi Chen, Haiyi Liu, Mingye Yang, Jiaqi Zhang, Debing Zhang · Submitted 2026-05-05

#RL#Manipulator

SigLoMa targets open-world quadrupedal loco-manipulation from ego-centric vision, aiming to avoid the sample inefficiency and sim-to-real brittleness that often come with exteroceptive reinforcement learning. From the provided excerpt, the concrete mechanism and results are not specified, but the focus is a vision-driven system for legged robots that must move and manipulate in less constrained environments.

Can Explicit Physical Feasibility Benefit VLA Learning? An Empirical Study arxiv High

Yubai Wei, Chen Wu, Hashem Haghbayan · Submitted 2026-04-20 · Updated 2026-05-05

#VLA#Other

An empirical study examines whether making physical feasibility explicit can improve Vision-Language-Action models, which usually learn robot actions directly from multimodal demonstrations. The premise is that adding constraints or signals about what actions are physically possible may help imitation-trained VLAs avoid implausible behavior, but the provided text does not include the specific method, benchmarks, or results.

Towards Low-Gravity Planetary Exploration using Reinforcement Learning for Walking, Jumping, and In-flight Attitude Control arxiv High NEW

Jørgen Anker Olsen, Kostas Alexis · Submitted 2026-05-23

#RL#Other

Reinforcement-learning policies are trained for a Mars-oriented quadruped to walk, jump vertically, jump forward, and reorient itself in flight, letting it clear obstacles larger than its body and land safely under low gravity. On the Olympus robot, the attitude controller transfers to hardware for single-axis reorientation, reaching 90 degrees in 2.6 seconds, while simulation shows 3.1 m vertical jumps and 3.9 m forward jumps in Martian gravity.

Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation arxiv High NEW

Chenyu Hui, Xiaodi Huang, Siyu Xu, Yunke Wang, Shan You et al. · Submitted 2026-05-04

#VLA#Other

Simulated robot videos are used as cheap VLA training data, but first transferred toward real-world visual realism to reduce the domain gap that usually hurts deployment. The approach targets efficient video-level augmentation rather than simply collecting more real footage, aiming to preserve the action-relevant structure of simulation while making the observations look more like real robot data.

AoI-Aware Multi-Robot Sensing and Transport on Connected Graphs arxiv High NEW

John Tadrous · Submitted 2026-05-04

#Other

AoI-Aware Multi-Robot Sensing and Transport on Connected Graphs studies teams of mobile robots that both collect measurements from distributed processes and relay them back to a base over a graph. The setup treats age of information from the moment sensing begins, so the objective accounts for stochastic parallel sensing times as well as hop-by-hop transport delays, making the scheduling problem closer to real monitoring pipelines than models that only age data after collection.

RoboLight: A Dataset with Linearly Composable Illumination for Robotic Manipulation arxiv High

Shutong Jin, Jin Yang, Muhammad Zahid, Florian T. Pokorny · Submitted 2026-03-04

#RL#Manipulator

RoboLight is a real-world robotic manipulation dataset of synchronized episodes captured while lighting conditions are systematically varied. Its focus on linearly composable illumination is meant to support analysis and training of manipulation policies that can separate task behavior from lighting changes, a practical gap in robot learning data collected under more fixed visual conditions.

Anticipation-VLA: Solving Long-Horizon Embodied Tasks via Anticipation-based Subgoal Generation arxiv High NEW

Zhilong Zhang, Wenyu Luo, Haonan Wang, Yifei Sheng, Yidi Wang et al. · Submitted 2026-05-03

#VLA#Other

Anticipation-VLA targets the long-horizon weakness of vision-language-action robots by generating intermediate subgoals from anticipated future states, rather than relying only on the current image and instruction at each step. The idea is to give the policy a more structured path through extended embodied tasks, reducing the compounding errors that typically derail VLA models over many actions.

VLA-ATTC: Adaptive Test-Time Compute for VLA Models with Relative Action Critic Model arxiv High NEW

Wenhao Li, Xiu Su, Dan Niu, Yichao Cao, Hongyan Xu et al. · Submitted 2026-05-02

#VLA#Manipulator

VLA-ATTC adds a deliberation step to Vision-Language-Action models by using a Relative Action Critic Model at test time to score and compare candidate actions before execution. The aim is to move beyond the usual single-pass, reflex-like policy behavior in embodied manipulation, letting the model spend extra compute on harder decisions where action choice matters most.

Stereo Multistage Spatial Attention for Real-Time Mobile Manipulation Under Visual Scale Variation and Disturbances arxiv High NEW

Xianbo Cai, Hideyuki Ichiwara, Hyogo Hiruma, Masaki Yoshikawa, Hiroshi Ito et al. · Submitted 2026-05-01

#Manipulator#MobileManipulator

Stereo Multistage Spatial Attention targets the scale instability that mobile manipulators see when onboard stereo cameras move through unstructured environments and the same object changes apparent size. It appears to use staged spatial attention over stereo visual input to keep vision-based motion generation responsive in real time under viewpoint shifts and disturbances, aiming at more reliable mobile manipulation without relying on fixed external perception.

CReF: Cross-modal and Recurrent Fusion for Depth-conditioned Humanoid Locomotion arxiv High

Yuan Hao, Ruiqi Yu, Shixin Luo, Guoteng Zhang, Jun Wu et al. · Submitted 2026-03-31 · Updated 2026-04-01

#Humanoid

CReF targets depth-conditioned humanoid locomotion on geometrically complex terrain by moving away from explicit robot-centric 2.5D terrain maps and auxiliary geometry-supervision signals. It uses cross-modal and recurrent fusion to connect depth perception directly with control, aiming to make exteroceptive humanoid traversal more robust without forcing the policy through hand-designed geometric abstractions.

Energy-Efficient Multi-Robot Coverage Path Planning of Non-Convex Regions of Interests arxiv High

Sourav Raxit, Jose Fuentes, Paulo Padrao, Abdullah Al Redwan Newaz, Md Tamjidul Hoque et al. · Submitted 2026-04-24 · Updated 2026-04-30

#Other

Energy-efficient multi-robot coverage path planning is tackled for large non-convex regions that include obstacles and no-fly zones. The approach targets the limitations of existing minimum-energy coverage planners that rely on meta-heuristic boustrophedon workspace decomposition, aiming to coordinate multiple robots while accounting for the geometry and constraints of realistic regions of interest.

Do World Action Models Generalize Better than VLAs? A Robustness Study arxiv High

Zhanguang Zhang, Zhiyuan Li, Behnam Rahmati, Rui Heng Yang, Yintao Ma et al. · Submitted 2026-03-23 · Updated 2026-04-30

#VLA#Other

The study compares world action models, which predict how a robot’s environment will change under candidate actions, against vision-language-action models on robustness in real-world robot action planning. From the provided text, the central question is whether explicitly modeling action-conditioned future dynamics yields better generalization than directly mapping perception and language to actions.

FASTER: Rethinking Real-Time Flow VLAs arxiv High

Yuxiang Lu, Zhe Liu, Xianzhe Fan, Zhenya Yang, Jinghua Hou et al. · Submitted 2026-03-19 · Updated 2026-04-29

#VLA#Other

FASTER targets the reaction-time bottleneck in real-time Vision-Language-Action models, where asynchronous inference can keep robot motions smooth but still lag when the environment changes. It reframes flow-based VLA execution around faster adaptation to new observations, making latency itself the central design problem rather than treating responsiveness as a side effect of trajectory generation.

DC-Ada: Reward-Only Decentralized Sensor Adaptation for Heterogeneous Multi-Robot Teams arxiv High

Saad Alqithami · Submitted 2026-04-05 · Updated 2026-04-29

#Other

DC-Ada tackles sensor mismatch in heterogeneous multi-robot teams, where robots may have different modalities, ranges, fields of view, or degradation patterns. It frames adaptation as a reward-only decentralized process, aiming to let each robot adjust its sensing behavior from task feedback rather than relying on shared sensor models or centralized supervision.

ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation arxiv High

Yu Sun, Meng Cao, Ping Yang, Rongtao Xu, Yunxiao Yan et al. · Submitted 2026-03-30

#RL#VLA#Manipulator

ManipArena is a benchmark and protocol for real-world evaluation of reasoning-oriented generalist robot manipulation. It targets VLA models and world models, aiming to make deployment progress measurable with reliable real-world tests.

Anisotropic Diffusion-Driven Ergodic Coverage in Multi-Robot Systems arxiv High NEW

Thales C. Silva, Anoop Kiran, Nora Ayanian · Submitted 2026-05-22

#Other

Anisotropic Diffusion-Driven Ergodic Coverage turns the mismatch between a desired spatial distribution and multi-robot trajectories into a potential field shaped by anisotropic diffusion rather than uniform heat-flow smoothing. By using the gradient of a Perona-Malik diffusion solution to steer agents, it preserves sensitivity to directional structure in the target density while still subsuming earlier heat-equation and radial-basis formulations. The simulations show how this generalized diffusion view can drive ergodic coverage across different multi-robot search scenarios.
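
To make the contrast with uniform heat-flow smoothing concrete, here is a minimal sketch of one explicit Perona-Malik update on a 1D field. The field `u` stands in for the coverage mismatch, and the conductance function, `kappa`, `dt`, and the zero-flux boundaries are illustrative assumptions; the paper's actual discretization and the agent steering step are not shown here.

```python
def perona_malik_step(u, kappa=1.0, dt=0.2):
    """One explicit 1D Perona-Malik step with zero-flux boundaries.

    Conductance g(d) = 1 / (1 + (d / kappa)^2) shrinks where the local
    difference d is large, so sharp structure is smoothed less than flat areas.
    With g fixed at 1 this reduces to ordinary heat-equation smoothing.
    """
    def g(d):
        return 1.0 / (1.0 + (d / kappa) ** 2)

    n = len(u)
    # flux[i] is the flow across the face between cell i and cell i + 1.
    flux = [g(abs(u[i + 1] - u[i])) * (u[i + 1] - u[i]) for i in range(n - 1)]

    out = []
    for i in range(n):
        right = flux[i] if i < n - 1 else 0.0   # no flow through the right wall
        left = flux[i - 1] if i > 0 else 0.0    # no flow through the left wall
        out.append(u[i] + dt * (right - left))
    return out


# Hypothetical field with one sharp step between cells 2 and 3.
field = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
print(perona_malik_step(field))
```

Because the update is written in flux form, the total of the field is conserved from step to step, while the large jump in the example barely moves compared with what a uniform diffusion step would do.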

FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models arxiv High

Xinyuan An, Tao Luo, Gengyun Peng, Yaobing Wang, Kui Ren et al. · Submitted 2026-03-30

#VLA#Other

FlowHijack targets flow-matching vision-language-action policies such as π₀ by exploiting their continuous action-generation dynamics rather than just poisoning static input-output behavior. The attack is framed as a dynamics-aware backdoor for robotics policies that can preserve normal performance while causing triggered action trajectories to deviate in controlled ways, highlighting a security risk specific to smooth flow-based VLA control.

Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment arxiv High

Kaijun Zhou, Qiwei Chen, Da Peng, Zhiyang Li, Xijun Li et al. · Submitted 2026-04-27

#VLA#Other

This study characterizes Vision-Language-Action model inference across XPUs for on-robot deployment. It targets real-time latency, cost, and energy constraints, making the hardware/acceleration tradeoffs actionable for robot deployment.

KERV: Kinematic-Rectified Speculative Decoding for Embodied VLA Models arxiv High

Zihao Zheng, Zhihao Mao, Maoliang Li, Jiayu Chen, Xinhao Sun et al. · Submitted 2026-03-02 · Updated 2026-04-27

#VLA#Other

KERV applies speculative decoding to Vision-Language-Action robot controllers, aiming to speed up token-based action generation without breaking the kinematic consistency needed for embodied control. It uses kinematic rectification to make the draft actions proposed during decoding more physically plausible before they are verified by the larger VLA model, addressing a failure mode where generic speedups can produce invalid or inefficient robot motion.

AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation arxiv High

Kai Yang, Zedong Chu, Yingnan Guo, Zhengbo Wang, Shichao Xie et al. · Submitted 2026-04-27

#VLA#Other

AsyncShield targets the deployment gap between large VLA navigation policies and real robots by adding a plug-and-play edge adapter around cloud-hosted models. From the title and abstract snippet, the interesting mechanism is its handling of asynchronous cloud inference, letting a robot keep navigating safely or smoothly despite latency rather than requiring the full VLA model to run onboard.

Semantically Structured Mixture-of-Experts for Compositional Robotic Manipulation arxiv High NEW

Chengyu Deng, Guanqi Chen, Yizhou Chen, Zejia Liu, Zhiwen Ruan et al. · Submitted 2026-05-22

#IL#Manipulator

SMoDP makes diffusion policies more efficient for multi-task manipulation by routing action chunks through a mixture of experts according to semantic skill phases rather than low-level latent or noise statistics. A lightweight inference-time skill predictor, trained from VLM-generated offline annotations, assigns observations to experts, while dual contrastive alignment ties visual observations to language-defined skills and keeps functionally similar behaviors routed consistently. On multi-task benchmarks, it beats diffusion and MoE baselines with better parameter efficiency and shows compositional transfer to new tasks via parameter-efficient fine-tuning.

Betting for Sim-to-Real Performance Evaluation arxiv High

Zaid Mahboob, Yujia Chen, Bowen Weng · Submitted 2026-04-27

#Other

Betting for Sim-to-Real Performance Evaluation studies how to estimate a robot policy’s real-world performance when physical trials are scarce, using a betting-based statistical procedure to make efficient use of limited hardware experiments. The interesting angle is that it treats sim-to-real evaluation as a constrained inference problem rather than just running more rollouts, aiming to provide reliable performance estimates under severe limits on real-world testing.

Multi-Robot Motions in Milliseconds: Vector-Accelerated Primitives for Sampling-Based Planning arxiv High

James D. Motes, Marco Morales, Nancy M. Amato · Submitted 2026-04-27

#Other

VAMP is extended to multi-robot motion planning by adding SIMD-accelerated primitives for multi-robot motion validation and first-conflict detection. The interesting part is that the speedup comes from vectorizing core sampling-based planning operations inside the multi-robot setting, aiming to make collision and conflict checks fast enough for millisecond-scale planning.
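
For readers unfamiliar with first-conflict detection, the sketch below is a plain scalar reference version: it scans time-aligned paths and reports the earliest step at which two robots overlap. Robots are modeled as discs, and the function name, path format, and radii are assumptions for illustration. It does not use VAMP or any SIMD instructions; the work summarized above is about vectorizing this kind of check, which this sketch does not attempt.

```python
import math


def first_conflict(paths, radii):
    """Return (step, i, j) for the earliest step where robots i and j overlap, else None.

    paths: one list of (x, y) positions per robot, sampled at the same time steps.
    radii: one disc radius per robot.
    """
    steps = min(len(p) for p in paths)
    for t in range(steps):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                (xi, yi), (xj, yj) = paths[i][t], paths[j][t]
                # Two discs overlap when their centers are closer than the sum of radii.
                if math.hypot(xi - xj, yi - yj) < radii[i] + radii[j]:
                    return (t, i, j)
    return None


# Hypothetical example: two robots driving toward each other along the x axis.
paths = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(4.0, 0.0), (3.0, 0.0), (2.0, 0.0)],
]
print(first_conflict(paths, radii=[0.6, 0.6]))
```

Returning the earliest conflict rather than a yes/no answer is what lets a planner know which robots to replan and from which point in time.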

Cooperative Informative Sensing for Monitoring Dynamic Indoor Environments via Multi-Agent Reinforcement Learning arxiv High

Kanghoon Lee, Matthew M. Sato, Jinnyeong Yang, Seungro Lee, Sujin Lee et al. · Submitted 2026-04-25

#RL#Other

Cooperative Informative Sensing treats indoor human-activity monitoring as a multi-robot coordination problem, where agents learn sensing policies for tracking dynamic environments rather than passively covering fixed locations. The title suggests a multi-agent reinforcement learning approach aimed at deciding where robots should sense cooperatively so they gather more informative observations for facility management, safety assessment, and space-use analysis.

π, But Make It Fly: Physics-Guided Transfer of VLA Models to Aerial Manipulation arxiv High

Johnathan Tucker, Denis Liu, Aiden Swann, Allen Ren, Javier Yu et al. · Submitted 2026-03-26

#VLA#Manipulator

π₀ is adapted from fixed-base manipulation to aerial manipulation by using physics guidance to bridge the embodiment gap between grounded arms and flying robots. The idea is to preserve the generalization strengths of vision-language-action models while making their actions compatible with the dynamics and constraints of a robot that has to fly while manipulating objects.

Breaking Lock-In: Preserving Steerability under Low-Data VLA Post-Training arxiv High

Suning Huang, Jiaqi Shao, Ke Wang, Qianzhong Chen, Jiankai Sun et al. · Submitted 2026-04-25

#VLA#Other

Low-data post-training can make a generalist vision-language-action policy “lock in” to the few demonstrated behaviors and stop following new language instructions. Breaking Lock-In targets that failure mode by preserving steerability during small-dataset adaptation, so the policy can absorb new robot behaviors without losing the instruction responsiveness that made the base VLA useful.

RecoverFormer: End-to-End Contact-Aware Recovery for Humanoid Robots arxiv High

Zihui Liu · Submitted 2026-04-24

#Humanoid

RecoverFormer targets the problem of getting humanoid robots back on their feet after unexpected pushes or failures in messy environments, using an end-to-end control policy that is explicitly aware of contacts. From the available abstract text, the concrete mechanism and results are not specified, but the focus is on making recovery behavior part of the learned controller rather than a separately engineered fallback routine.

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model arxiv High

Yaxuan Li, Zhongyi Zhou, Yefei Chen, Yaokai Xue, Yichen Zhu · Submitted 2026-04-24

#RL#Other

dWorldEval targets the bottleneck of evaluating robot policies at large scale by using a discrete diffusion world model as a proxy environment, rather than running every policy across thousands of real or simulated task settings. The idea is to make broad policy evaluation feasible across many environments and tasks, turning what is normally an expensive robotics benchmarking problem into a scalable model-based evaluation pipeline.

Task-Driven Co-Design of Heterogeneous Multi-Robot Systems arxiv High

Maximilian Stralz, Meshal Alharbi, Yujun Huang, Gioele Zardini · Submitted 2026-04-23

#Other

Task-Driven Co-Design of Heterogeneous Multi-Robot Systems tackles the coupled problem of choosing robot designs, fleet makeup, and plans together rather than optimizing each layer separately. It frames heterogeneous multi-robot design around the downstream task, so the interesting part is the cross-domain reasoning: morphology, composition, and execution are treated as interdependent decisions in one design loop.

Path Planning and Reinforcement Learning-Driven Control of On-Orbit Free-Flying Multi-Arm Robots arxiv High

Álvaro Belmonte-Baeza, José Luis Ramón, Leonard Felicetti, Miguel Cazorla, Jorge Pomares · Submitted 2026-03-24

#RL#Manipulator

Hybrid trajectory optimization and reinforcement learning are combined to plan and control free-flying multi-arm robots for on-orbit servicing. The trajectory optimizer handles motion planning, while the learned controller is used to execute or regulate those motions under the dynamics of a spacecraft-mounted multi-arm system, targeting the coupled planning-control problem that makes orbital manipulation difficult.

From Noise to Intent: Anchoring Generative VLA Policies with Residual Bridges arxiv High

Yiming Zhong, Yaoyu He, Zemin Yang, Pengfei Tian, Yifan Huang et al. · Submitted 2026-04-23

#VLA#Other

Residual Bridges aims to connect semantic vision-language-action reasoning with fine-grained robot control by anchoring generative VLA policies against residual corrections rather than letting noisy generation directly drive actions. The framing targets the scale mismatch between intent-level cognition and low-level motor execution, using the residual bridge as a mechanism for translating high-level goals into physically grounded behavior.

Navigating the Clutter: Waypoint-Based Bi-Level Planning for Multi-Robot Systems arxiv High

Jiabao Ji, Yongchao Chen, Yang Zhang, Ramana Rao Kompella, Chuchu Fan et al. · Submitted 2026-04-22

#Other

Waypoint-Based Bi-Level Planning tackles multi-robot navigation in cluttered spaces by decomposing the problem into waypoint selection and lower-level motion planning, aiming to handle robot-robot collisions, obstacle avoidance, and infeasible motions more explicitly. From the provided text, the central focus is coordinating multiple robots under realistic physical constraints in dense environments; no benchmark results or quantitative gains are given.

Mask World Model: Predicting What Matters for Robust Robot Policy Learning arxiv High

Yunfan Lou, Xiaowei Chi, Xiaojie Zhang, Zezhong Qian, Chengxuan Li et al. · Submitted 2026-04-21 · Updated 2026-04-22

#RL#Other

Mask World Model targets robot policy learning by predicting masked, task-relevant parts of future observations rather than trying to model every pixel in a video. The idea is to make video-pretrained world models more robust and useful for control by focusing their predictive capacity on what matters for action, but the provided text does not include benchmark results or quantitative gains.

Abstraction for Offline Goal-Conditioned Reinforcement Learning arxiv High NEW

Clarisse Wibault, Alexander Goldie, Antonio Villares, Maike Osborne, Jakob Foerster · Submitted 2026-05-21

#RL#Other

Goal-conditioned offline RL is framed around hierarchical policies that do more than shorten horizons: they provide a form of abstraction that lets behavior learned in one part of the state-goal space transfer to structurally similar contexts elsewhere. The method introduces relativised options and separate representations at different hierarchy levels, then gives two simple algorithms for learning these options and abstracting away from the absolute frame of reference. Experiments show that these inductive biases improve offline GCRL performance by reusing experience across symmetric or redundant state-goal relationships.

Learning to Evolve: Multi-modal Interactive Fields for Robust Humanoid Navigation in Dynamic Environments arxiv High NEW

Peifeng Jiang, Hong Liu, Jin Jin, Wenshuai Wang, Xia Li · Submitted 2026-05-21

#Manipulator#Humanoid

Multi-modal Interactive Field (MIF) gives a humanoid robot a more reliable scene memory for navigation and manipulation by combining uncertainty-aware 3D Gaussian Splatting, local discrepancy-triggered memory updates, and task-driven geometry reconstruction for interaction safety. On a Unitree-G1 in a dynamic office, it distinguishes transient gait-induced perceptual artifacts from persistent scene changes, raising relocation success from 12% with static scene-graph memory to 94% while cutting semantic memory footprint by 91.4%.

SynAgent: Generalizable Cooperative Humanoid Manipulation via Solo-to-Cooperative Agent Synergy arxiv High

Wei Yao, Haohan Ma, Hongwen Zhang, Yunlian Sun, Liangjun Xing et al. · Submitted 2026-04-20

#Manipulator#Humanoid

SynAgent targets cooperative humanoid manipulation, where two embodied agents need to coordinate controllable object handling despite scarce multi-agent training data. The idea is to transfer or compose solo manipulation competence into cooperative behavior, aiming for better generalization across objects while handling the coordination burden that makes humanoid teamwork difficult.

DriveMA: Rethinking Language Interfaces in Driving VLAs with One-Step Meta-Actions arxiv High NEW

Weicheng Zheng, Yixin Huang, Qiao Sun, Derun Li, Hang Zhao · Submitted 2026-05-20

#RL#VLA#Other

DriveMA replaces long natural-language reasoning chains in driving VLAs with concise one-step meta-actions that can be derived automatically from expert trajectories, giving the model semantic decision grounding without costly annotation or high-latency generation. It trains with action-centric supervision plus turn-level reinforcement learning that jointly rewards meta-action correctness, trajectory quality, and consistency between the two. With only a 2B model it reaches an RFS of 8.060 on the Waymo End-to-End Driving Challenge, and the 4B version raises that to 8.079, with ablations showing a better efficiency–predictability tradeoff than verbose reasoning or finer action sequences.

OFlow: Injecting Object-Aware Temporal Flow Matching for Robust Robotic Manipulation arxiv High

Kuanning Wang, Ke Fan, Chenhao Qiu, Zeyu Shangguan, Yuqian Fu et al. · Submitted 2026-04-20

#VLA#Manipulator

OFlow targets robotic manipulation by adding object-aware temporal flow matching to VLA-style models, aiming to model how task-relevant objects move through cluttered scenes rather than only predicting actions from static visual-language context. From the provided text, the concrete results are not given, so the substantive takeaway is the mechanism: tying temporal scene evolution to object recognition for more robust manipulation.

AnchorRefine: Synergy-Manipulation Based on Trajectory Anchor and Residual Refinement for Vision-Language-Action Models arxiv High

Tingzheng Jia, Kan Guo, Lanping Qian, Yongli Hu, Daxin Tian et al. · Submitted 2026-04-20

#VLA#Manipulator

AnchorRefine separates manipulation into coarse trajectory anchors and residual refinements, aiming to give vision-language-action policies both a globally organized motion plan and fine-grained local correction. The arXiv blurb is sparse, but the interesting idea is the explicit split between where the trajectory should go and how each step should be adjusted for precision-critical execution, rather than forcing all actions through one unified prediction space.

Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking arxiv High

Zewei Zhang, Kehan Wen, Michael Xu, Junzhe He, Chenhao Li et al. · Submitted 2026-04-19

#Humanoid

Whole-body humanoid locomotion is framed as a combination of motion generation and motion tracking, targeting the hard parts of controlling an unstable, high-dimensional body while reacting in real time to terrain from onboard perception. The work appears to focus on making humanoid walking more adaptive and physically coordinated across the full body, rather than treating locomotion as a narrow leg-control problem.

CaptchaMind: Training CAPTCHA Solvers via Reinforcement Learning with Explicit Reasoning Supervision arxiv High NEW

Pengcheng Wang, Haoxiang Liu, Yang Dai, Xiangxiang Zeng, Guanhua Chen et al. · Submitted 2026-05-19

#RL#Other

CaptchaMind trains CAPTCHA-solving agents with reinforcement learning plus explicit supervision over intermediate reasoning steps, using CaptchaBench, a 16,000-sample synthetic benchmark spanning eight CAPTCHA task types with region and process annotations. The benchmark shows existing solvers break down on fine-grained visual detail and region-level comparisons, while CaptchaMind reaches 82.9% average success across the eight tasks and 71.0% on real-world CAPTCHAs without relying on closed-source APIs.

Sparse Autoencoders Reveal Interpretable and Steerable Features in VLA Models arxiv High

Aiden Swann, Lachlain McGranahan, Hugo Buurmeijer, Monroe Kennedy, Mac Schwager · Submitted 2026-03-19

#VLA#Manipulator

Sparse autoencoders are used to probe Vision-Language-Action robot policies, exposing internal features that correspond to interpretable concepts in manipulation behavior rather than leaving the model as a black box. The interesting part is that these features are not just descriptive: they can apparently be steered, suggesting a route to diagnose and potentially control why fine-tuned VLA models succeed in familiar settings but break on new objects, scenes, or instructions.

RoboForge: Physically Optimized Text-guided Whole-Body Locomotion for Humanoids arxiv High

Xichen Yuan, Zhe Li, Bofan Lyu, Kuangji Zuo, Yanshuo Lu et al. · Submitted 2026-03-18 · Updated 2026-03-19

#Humanoid

RoboForge targets the gap between text-generated human motion and motions a humanoid can actually execute, optimizing whole-body locomotion under physical constraints rather than just producing plausible animation. The available description is sparse, but the interesting piece is the text-guided pipeline’s focus on making generated behaviors robot-realizable for humanoids, where balance, contacts, and actuation limits usually break direct motion transfer.

LongBench: Evaluating Robotic Manipulation Policies on Real-World Long-Horizon Tasks arxiv High

Xueyao Chen, Jingkai Jia, Tong Yang, Yibo Fu, Wei Li et al. · Submitted 2026-04-18

#Manipulator

LongBench evaluates robotic manipulation policies on real-world long-horizon tasks, targeting the failure modes that emerge when policies are rolled out over extended sequences rather than short isolated skills. It is meant to make degradation over time easier to diagnose, addressing a gap in existing benchmarks that often show whether a robot failed but not much about why long-horizon execution breaks down.

Efficient and Versatile Quadrupedal Skating: Optimal Co-design via Reinforcement Learning and Bayesian Optimization arxiv High

Hanwen Wang, Zhenlong Fang, Josiah Hanna, Xiaobin Xiong · Submitted 2026-03-19

#RL#Other

A hardware-control co-design method tunes both a quadruped’s passive-wheel skating hardware and its learned controller, using reinforcement learning together with Bayesian optimization. The setup targets efficient high-speed locomotion by reducing leg inertia with passive wheels, aiming to make roller-skating quadrupeds both energy-efficient and adaptable rather than tied to a fixed gait or manually chosen design.

Angle-based Localization and Rigidity Maintenance Control for Multi-Robot Networks arxiv High

J. Francisco Presenza, Leonardo J. Colombo, Juan I. Giribet, Ignacio Mas · Submitted 2026-04-13 · Updated 2026-04-17

#Other

This work studies angle-only methods for localizing robots in a multi-robot network while keeping the formation rigid enough for stable coordination. The available description is very sparse, but the focus appears to be on using inter-robot bearing/angle information both to estimate relative positions and to drive a control law that preserves network rigidity.

ProbeFlow: Training-Free Adaptive Flow Matching for Vision-Language-Action Models arxiv High

Zhou Fang, Jiaqi Wang, Yi Zhou, Qiongfeng Shi · Submitted 2026-03-18

#VLA#Manipulator

ProbeFlow targets the inference bottleneck in flow-matching action heads for vision-language-action robot policies, where multi-step ODE solving can make control too slow for responsive manipulation. It is a training-free adaptive inference method, suggesting it speeds up VLA control without retraining the underlying model, but the provided text does not include benchmark results or latency numbers.
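
To make the bottleneck concrete, here is a minimal sketch of fixed-step Euler sampling for a flow-matching action head. It is not ProbeFlow's method, which the summary does not describe. The names `euler_sample`, `toy_velocity`, and `TARGET` are hypothetical, and the velocity field is a hand-written stand-in for a learned network.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 x0: Vector, num_steps: int) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)  # in a real policy, this is one network forward pass
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical stand-in for a learned velocity field: a straight-line flow
# toward a fixed target action, v(x, t) = (target - x) / (1 - t).
TARGET = [0.5, -0.2]


def toy_velocity(x: Vector, t: float) -> Vector:
    return [(a - xi) / (1.0 - t) for a, xi in zip(TARGET, x)]


# Count how many times the "network" is evaluated for one action.
calls = {"n": 0}


def counted_velocity(x: Vector, t: float) -> Vector:
    calls["n"] += 1
    return toy_velocity(x, t)


action = euler_sample(counted_velocity, [0.0, 0.0], num_steps=10)
```

Each Euler step costs one evaluation of the velocity model, so per-action latency grows linearly with `num_steps`. That is the cost an adaptive inference scheme would try to cut.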

SQARL: A Size-Agnostic Reinforcement Learning approach for Circuit Allocation in Distributed Quantum Architectures arxiv High NEW

Víctor Carballo, Júlia López-Closa, Mario Martin · Submitted 2026-05-26

#RL#Other

SQARL tackles qubit allocation for distributed quantum computers, where circuits must be split across smaller quantum cores while avoiding costly inter-core communication. It uses a transformer-based reinforcement learning policy that is size-agnostic, so it can handle different numbers of qubits and cores without retraining. The approach beats prior RL allocation methods and closes much of the gap to the Hungarian Qubit Allocation (HQA) heuristic; in some cases it undercuts HQA, reducing allocation cost by 33% on Cuccaro Adder circuits and by 25% on average for random circuits.

Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints arxiv High

Sadık Bera Yüksel, Ali Tevfik Buyukkocak, Derya Aksaray · Submitted 2026-03-17

#RL#Other

Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints addresses safe RL for robotics by combining learned control with a shield that enforces temporal-logic constraints that can change over time. The framing is aimed at real robot deployment, where policies must keep adapting to task objectives while respecting operational rules rather than treating safety as a fixed, offline specification.

Safe and Energy-Aware Multi-Robot Density Control via PDE-Constrained Optimization for Long-Duration Autonomy arxiv High

Longchen Niu, Andrew Nasif, Gennaro Notomista · Submitted 2026-04-16

#Other

Safe and energy-aware density control for robot swarms is formulated as a PDE-constrained optimization problem, with stochastic robot motion represented by the Fokker-Planck equation rather than by individual trajectories. The interesting part is that safety constraints and long-duration energy sustainability are enforced directly at the population-density level, giving a way to reason about large multi-robot teams without tracking every robot separately.

SLowRL: Safe Low-Rank Adaptation Reinforcement Learning for Locomotion arxiv High

Elham Daneshmand, Shafeef Omar, Glen Berseth, Majid Khadiv, Hsiu-Chin Lin · Submitted 2026-03-17

#RL#Other

SLowRL targets the risky last mile of sim-to-real locomotion by fine-tuning a deployed policy through safe low-rank adaptation rather than full, unconstrained hardware training. It is designed to reduce mechanical risk and sample inefficiency when adapting legged robot controllers on real hardware, focusing updates into a constrained low-rank space so the robot can improve while staying closer to the original simulated policy.
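
As a rough illustration of the low-rank idea only, the sketch below applies a generic LoRA-style update `W + scale * (B @ A)` to a single weight matrix. The safety mechanism is not described in the summary and is not modeled here. The names `matmul` and `lora_weight`, and the layer sizes, are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lora_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A); the pretrained W itself is never modified."""
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


random.seed(0)
d_out, d_in, rank = 8, 8, 2  # hypothetical layer size and adapter rank

# Frozen weights of the simulation-trained policy layer.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors; B starts at zero so fine-tuning begins
# exactly at the original policy.
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]

W_adapted = lora_weight(W, B, A)

trainable = rank * (d_in + d_out)  # parameters updated on hardware
full = d_in * d_out                # parameters in the full matrix
```

With `B` initialised to zero, the adapted layer matches the original one before any hardware update. Only `rank * (d_in + d_out)` values are trainable, compared with `d_in * d_out` for the full matrix.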

Abstract Sim2Real through Approximate Information States arxiv High

Yunfu Deng, Yuhao Li, Josiah P. Hanna · Submitted 2026-04-16

#RL#Other

Reinforcement learning for robotics often depends on having a fast, accurate simulator, and Abstract Sim2Real through Approximate Information States appears to target that sim-to-real bottleneck by using approximate information states as an abstraction between simulated training and real-world deployment. From the provided text alone, the concrete mechanism and results are not available, so the safest reading is that it studies how to transfer learned robot policies when exact simulator fidelity is unavailable.

Controlling Fish Schools via Reinforcement Learning of Virtual Fish Movement arxiv High

Yusuke Nishii, Hiroaki Kawashima · Submitted 2026-03-17

#RL#Other

Virtual fish displayed on a screen are trained with reinforcement learning to steer real fish schools, avoiding the durability and motion-limit problems of physical robotic agents. The approach treats the virtual fish’s movement policy as the control interface, using learned on-screen behavior to influence collective animal motion in a more flexible experimental setup.

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning arxiv High NEW

Guochao Jiang, Jingyi Song, Guofeng Quan, Chuzhan Hao, Guohua Liu et al. · Submitted 2026-05-25

#RL#Other

DVAO adapts GRPO-style LLM reinforcement learning to multi-reward settings by combining per-objective advantages with weights derived from empirical reward variance inside each rollout group. The weighting scheme emphasizes objectives with cleaner learning signal, dampens noisy ones, and is shown to keep advantage magnitudes bounded while adding cross-objective regularization. On mathematical reasoning and tool-use benchmarks with Qwen3 and Qwen2.5, it improves Pareto tradeoffs across objectives and trains more stably than reward- or advantage-combination baselines.

Towards Generalizable Robotic Manipulation in Dynamic Environments arxiv High

Heng Fang, Shangru Li, Shuhan Wang, Xuanyang Xi, Dingkang Liang et al. · Submitted 2026-03-16 · Updated 2026-04-15

#VLA#Manipulator

Vision-Language-Action models are framed here around the gap between static tabletop manipulation and dynamic settings where targets move during execution. The paper appears to focus on making robotic manipulation policies generalize under those changing conditions, but the provided text does not include the proposed mechanism, benchmarks, or results.

ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation arxiv High

Jingjing Qian, Zeyuan He, Chen Shi, Lei Xiao, Li Jiang · Submitted 2026-04-15

#Manipulator#MobileManipulator

ESCAPE targets long-horizon mobile manipulation in indoor environments by combining episodic spatial memory with an adaptive execution policy, aiming to keep navigation and object interaction coherent over extended tasks. It is designed to address failures like catastrophic forgetting, spatial inconsistency, and overly rigid action execution, which become more pronounced as embodied AI agents operate across longer sequences.

Capability-Aware Heterogeneous Control Barrier Functions for Decentralized Multi-Robot Safe Navigation arxiv High

Joonkyung Kim, Yanze Zhang, Wenhao Luo, Yiwei Lyu · Submitted 2026-04-14

#Other

Capability-Aware Heterogeneous Control Barrier Functions targets decentralized safe navigation in multi-robot teams where agents may have different motion or control capabilities. It appears to focus on enforcing collision-avoidance and safety constraints through control barrier functions that account for those heterogeneous capabilities, aiming to preserve task efficiency rather than treating all robots as identical.
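
For readers new to control barrier functions, here is a minimal textbook-style safety filter for one robot and one circular obstacle. It assumes single-integrator dynamics (velocity is commanded directly) and does not model the paper's capability-aware, heterogeneous formulation. The function name `cbf_filter` and the gain `alpha` are hypothetical.

```python
def cbf_filter(pos, u_nom, obstacle, radius, alpha=1.0):
    """Minimally change u_nom so that dh/dt + alpha * h >= 0.

    h(p) = |p - obstacle|^2 - radius^2 is positive outside the obstacle.
    Assumes p_dot = u and that pos does not coincide with the obstacle centre.
    """
    dx = [p - o for p, o in zip(pos, obstacle)]
    h = sum(d * d for d in dx) - radius ** 2
    grad = [2.0 * d for d in dx]  # gradient of h with respect to position
    slack = sum(g * u for g, u in zip(grad, u_nom)) + alpha * h
    if slack >= 0.0:
        return list(u_nom)  # nominal command already satisfies the constraint
    # Closed-form solution of the one-constraint QP: project onto the boundary.
    norm_sq = sum(g * g for g in grad)
    return [u - slack * g / norm_sq for u, g in zip(u_nom, grad)]


# Robot at (2, 0) driving fast toward a unit-radius obstacle at the origin.
safe_u = cbf_filter([2.0, 0.0], [-5.0, 0.0], [0.0, 0.0], 1.0)
# Robot at the same position moving away from the obstacle.
free_u = cbf_filter([2.0, 0.0], [1.0, 0.0], [0.0, 0.0], 1.0)
```

The filter leaves the nominal command unchanged whenever it already satisfies the barrier condition. Otherwise it applies the smallest correction along the gradient of `h`, which is why such filters tend to preserve task efficiency.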

Tree Learning: A Multi-Skill Continual Learning Framework for Humanoid Robots arxiv High

Yifei Yan, Linqi Ye · Submitted 2026-04-14

#Humanoid

Tree Learning targets continual multi-skill training for humanoid robots, where new reinforcement-learned behaviors need to be added without erasing earlier ones. From the provided abstract fragment, the concrete focus is on structuring skill expansion so a humanoid can move beyond isolated single-task policies while controlling catastrophic forgetting, but the mechanism and results are not specified here.

Data-Driven Physics Embedded Dynamics with Predictive Control and Reinforcement Learning for Quadrupeds arxiv High

Prakrut Kotecha, Aditya Shirwatkar, Shishir Kolathaya · Submitted 2026-03-15

#RL#Other

Quadrupedal locomotion is framed around combining model predictive control with reinforcement learning, using MPC for structured planning while RL supplies terrain-adaptive behavior and more flexible motion skills. The interesting angle is the attempt to embed learned, data-driven dynamics into a physics-aware control loop, aiming to keep the robustness and foresight of predictive control while extending what quadrupeds can handle in complex environments.

Multi-Robot Box Transport over Different Surfaces with Decentralized Role-based Proportional Control arxiv High NEW

Aditya Bhatt, Himavarshini Yarragangu, Urvish Shah, Venkata Sai Yaswanth Mohan Thota, Souma Chowdhury · Submitted 2026-05-26

#RL#Manipulator

R2P2 coordinates multiple robots to push rectangular boxes across flat, uphill, and downhill surfaces by assigning local roles such as push, support, or prevent, then using rule-based or proportional velocity control depending on whether the box needs translation or rotation. Because each robot acts asynchronously from its own observation of itself and the box, the method avoids centralized consensus or tight synchronization while still adapting to changes in friction, slope, and box mass. In IsaacSim tests with six robots it outperformed a virtual-leader-follower baseline in success rate, and in a physical experiment it ran onboard on four TurtleBots that moved a 1.2 kg box.

Dynamic Multi-Robot Task Allocation under Uncertainty and Communication Constraints: A Game-Theoretic Approach arxiv High

Maria G. Mendoza, Pan-Yang Su, Bryce L. Ferguson, S. Shankar Sastry · Submitted 2026-04-13

#Other

A game-theoretic formulation handles online multi-robot task allocation when tasks have deadlines, completion is uncertain, and robots only have partial sensing and communication from distributed hubs. It focuses on coordinating assignments over a finite horizon despite incomplete information, aiming to make task choices robust to both execution uncertainty and communication limits.

DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models arxiv High

Siyuan Xu, Tianshi Wang, Fengling Li, Lei Zhu, Heng Tao Shen · Submitted 2026-04-13

#VLA#Other

DA-PTQ targets the practical bottleneck of running vision-language-action models on robots by applying post-training quantization with explicit attention to distribution drift. From the available text, the concrete setup and results are not specified, but the focus is efficient VLA deployment under tight memory and compute budgets without retraining.
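
As background on post-training quantization, the sketch below shows plain symmetric per-tensor int8 quantization. The drift-aware part of DA-PTQ is not described in the summary and is not modeled. The function names and the example weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [qi * scale for qi in q]


weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical slice of a weight tensor
q, scale = quantize_int8(weights)
recon = dequantize(q, scale)

# Per-weight rounding error, bounded by half a quantization step.
errors = [abs(w - r) for w, r in zip(weights, recon)]
```

No retraining is involved: the scale is computed from the trained weights, and each reconstructed weight is off by at most about `scale / 2`. How such small errors accumulate through a policy is outside the scope of this sketch.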

Building Explicit World Model for Zero-Shot Open-World Object Manipulation arxiv High

Xiaotong Li, Gang Chen, Javier Alonso-Mora · Submitted 2026-03-14

#RL#VLA#Manipulator

Building Explicit World Model for Zero-Shot Open-World Object Manipulation targets robotic object manipulation in settings where the robot must handle unfamiliar objects without task-specific demonstrations. It argues for giving the system an explicit world model rather than relying only on vision-language-action policies trained from large robot demonstration corpora, aiming to improve generalization when collecting coverage for every object and action is impractical.

Robust Sim-to-Real Cloth Untangling through Reduced-Resolution Observations via Adaptive Force-Difference Quantization arxiv High

Yoshihisa Tsurumine, Yuki Kadokawa, Kohei Hayashi, Christian Diehm, Takamitsu Matsubara · Submitted 2026-03-14

#Other

Robotic cloth untangling is framed as a sim-to-real control problem where a policy learns to adapt pulling actions as fabric contacts and tension change, avoiding the cost and wear of large-scale real-world training. The method, Adaptive Force-Difference Quantization, uses reduced-resolution observations to make transfer more robust while still preserving the force-change cues needed to decide how to disentangle the cloth.

RIO: Flexible Real-Time Robot I/O for Cross-Embodiment Robot Learning arxiv High NEW

Pablo Ortega-Kral, Eliot Xing, Arthur Bucker, Vernon Luk, Junseo Kim et al. · Submitted 2026-05-12

#VLA#Other

RIO targets cross-embodiment robot learning by standardizing the real-time input/output layer needed to run robot policies across different hardware platforms. The framing suggests a practical systems contribution: making Vision-Language-Action models easier to deploy, compare, and reuse beyond the specific robot embodiment they were trained or demonstrated on.

PRoID: Predicted Rate of Information Delivery in Multi-Robot Exploration and Relaying arxiv High

Seungchan Kim, Seungjae Baek, Micah Corah, Graeme Best, Brady Moon et al. · Submitted 2026-04-12

#Other

PRoID tackles multi-robot exploration where robots must both discover an unknown environment and get the collected information back to a fixed base before time runs out. It frames coordination around a predicted rate of information delivery, so robots can reason about exploration value together with the relay constraints needed to actually return data during the mission.

CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models arxiv High NEW

Wenxuan Song, Han Zhao, Fuhao Li, Ziyang Zhou, Xi Wang et al. · Submitted 2026-05-11

#VLA#Other

CapVector targets the gap between pretrained vision-language-action models and the limited gains often seen from standard supervised finetuning. It learns transferable “capability vectors” directly in parameter space, aiming to inject reusable task-relevant behaviors into VLA models while reducing the adaptation burden compared with full SFT.

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models arxiv High NEW

Hao Wang, Xiaobao Wei, Jingyang He, Chengyu Bai, Chun-Kai Fan et al. · Submitted 2026-05-11

#VLA#Manipulator

VEGA targets a common weakness in vision-language-action models: their image encoders are usually trained on 2D data, so they can miss the 3D spatial structure needed for precise manipulation. It aligns visual encoder representations with explicit grounding for spatial awareness, aiming to give robotic policies a better geometric sense of where objects and actions are in the scene.

Device-Conditioned Neural Architecture Search for Efficient Robotic Manipulation arxiv High

Yiming Wu, Huan Wang, Zhenghao Chen, Ge Yuan, Dong Xu · Submitted 2026-04-11

#Manipulator

Device-Conditioned Neural Architecture Search targets the mismatch between increasingly heavy visuomotor policies and the very different compute limits of real robot hardware. From the provided text, the work appears to search for manipulation policy architectures conditioned on deployment-device constraints, aiming to make robotic manipulation models more efficient on heterogeneous platforms.

Learning to Assist: Physics-Grounded Human-Human Control via Multi-Agent Reinforcement Learning arxiv High

Yuto Shibata, Kashu Yamazaki, Lalit Jayanti, Yoshimitsu Aoki, Mariko Isogawa et al. · Submitted 2026-03-11 · Updated 2026-04-10

#RL#Humanoid

Learning to Assist targets humanoid robots for service and caregiving by framing assistance as a physics-grounded human-human control problem trained with multi-agent reinforcement learning. From the provided text, the concrete mechanism and results are not available beyond the goal of learning assistive humanoid behavior for daily-support scenarios.

Sim-to-Real Transfer for Muscle-Actuated Robots via Generalized Actuator Networks arxiv High

Jan Schneider, Mridul Mahajan, Le Chen, Simon Guist, Bernhard Schölkopf et al. · Submitted 2026-04-10

#Other

Generalized Actuator Networks target sim-to-real transfer for robots driven by tendon-coupled soft muscles, where nonlinear force production, friction, and hysteresis make standard dynamics models brittle. The approach appears to learn a reusable actuator model that can stand in for hard-to-model muscle-tendon behavior, making it easier to train controllers in simulation and deploy them on physical muscle-actuated robots.

Hierarchical Task Model Predictive Control for Sequential Mobile Manipulation Tasks arxiv High

Xintong Du, Siqi Zhou, Angela P. Schoellig · Submitted 2026-03-10

#Manipulator#MobileManipulator

Hierarchical Task Model Predictive Control targets sequential mobile manipulation, where a robot has to coordinate base motion, arm actions, and task ordering over multiple steps. It appears to connect language-derived task plans with a model-predictive control hierarchy, aiming to make high-level instructions executable by a mobile manipulator in everyday, multi-stage settings.

Perceptive Hierarchical-Task MPC for Sequential Mobile Manipulation in Unstructured Semi-Static Environments arxiv High

Xintong Du, Jingxing Qian, Siqi Zhou, Angela P. Schoellig · Submitted 2026-03-10

#Manipulator#MobileManipulator

Perceptive Hierarchical-Task MPC targets long-horizon mobile manipulation where the robot has to keep executing sequential actions while noticing semi-static changes in cluttered, unstructured spaces. The approach appears to couple hierarchical task planning with model predictive control and perception feedback, so execution is shaped not just by motion feasibility but by ongoing awareness of how the environment has changed.

Cross-Hand Latent Representation for Vision-Language-Action Models arxiv High

Guangqi Jiang, Yutong Liang, Jianglong Ye, Jia-Yang Huang, Changwei Jing et al. · Submitted 2026-03-10

#VLA#Manipulator

Cross-Hand Latent Representation for Vision-Language-Action Models focuses on giving VLA policies a shared latent space for dexterous hand coordination, motivated by the gap between current robot manipulation and the fluid cross-hand skill humans use in everyday tasks. From the provided text, the concrete mechanism and results are not available, so the safe takeaway is that the paper targets representation learning for coordinated hand use in vision-language-action robot control.

Incremental Residual Reinforcement Learning Toward Real-World Learning for Social Navigation arxiv High

Haruto Nagahisa, Kohei Matsumoto, Yuki Tomita, Yuki Hyodo, Ryo Kurazume · Submitted 2026-04-09

#RL#Other

Incremental Residual Reinforcement Learning targets social navigation for mobile robots by improving a learned policy through residual updates rather than replacing the existing controller outright. The aim is to make deep RL more practical for real-world navigation around people, where robots need to adapt safely and incrementally instead of relying only on policies trained offline or in simulation.

MoMaStage: Skill-State Graph Guided Planning and Closed-Loop Execution for Long-Horizon Indoor Mobile Manipulation arxiv High

Chenxu Li, Zixuan Chen, Yetao Li, Jiapeng Xu, Hongyu Ding et al. · Submitted 2026-03-09

#Manipulator#MobileManipulator

MoMaStage targets long-horizon indoor mobile manipulation, where robots must turn natural-language instructions into extended sequences of navigation and manipulation actions without compounding small mistakes. It uses a skill-state graph to guide planning and closed-loop execution, aiming to make mobile manipulators more robust across diverse indoor environments where open-loop plans often break down.

Exploiting Aggregate Programming in a Multi-Robot Service Prototype arxiv High

Giorgio Audrito, Andrea Basso, Daniele Bortoluzzi, Ferruccio Damiani, Giordano Scarso et al. · Submitted 2026-04-08

#Other

Aggregate programming is applied to a prototype multi-robot service system, framing robot coordination as collective behavior specified over the group rather than as separate per-robot control logic. From the available text, the work is positioned around service-oriented deployments in domains such as healthcare, exploration, and rescue, where scalable coordination among multiple robots is the central concern.

Train-Small Deploy-Large: Leveraging Diffusion-Based Multi-Robot Planning arxiv High

Siddharth Singh, Soumee Guha, Qing Chang, Scott Acton · Submitted 2026-04-08

#Other

Train-Small Deploy-Large tackles multi-robot path planning models that break when deployed with more robots than they saw in training. It uses diffusion-based planning to learn on smaller robot teams while retaining the ability to coordinate larger teams at test time, targeting the common scaling gap where fixed-size learned planners generalize downward but not upward.

Cross-Domain Energy-Guided Diffusion Generation for Off-Dynamics Reinforcement Learning arxiv High NEW

Yu Yang, Yihong Guo, Anqi Liu, Pan Xu · Submitted 2026-05-24

#RL#Other

CEDGE tackles off-dynamics offline RL by generating whole trajectories rather than stitching together model-predicted transitions, which helps avoid long-horizon error accumulation when source and target dynamics differ. It trains a diffusion model on source-domain trajectories, then steers samples toward the target domain with energy terms for return, domain mismatch, and behavior, so the same generator can adapt to new target dynamics without retraining. On the ODRL benchmark, these guided trajectories improve both diffusion-based planning under dynamics shifts and downstream policy learning when used as synthetic target data.

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation arxiv High NEW

Yixin Zhu, Zixiong Wang, Jian Yang, Jin Xie, Jingyi Yu et al. · Submitted 2026-05-07

#Manipulator

Robot manipulation policies are evaluated in simulation, but many benchmarks still look too synthetic to predict real-world behavior well. This benchmark targets that gap by focusing on visually realistic simulation as a testbed, making it easier to study how much appearance fidelity affects manipulation performance and sim-to-real reliability.

AtomicVLA: Unlocking the Potential of Atomic Skill Learning in Robots arxiv High

Likui Zhang, Tao Tang, Zhihao Zhan, Xiuwei Chen, Zisheng Chen et al. · Submitted 2026-03-08

#VLA#Manipulator

AtomicVLA is a robot VLA method for learning atomic skills in long-horizon manipulation. It targets continual skill acquisition and generalization beyond single actions; no code, benchmarks, or real-robot results are specified in the excerpt.

Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance arxiv High

Wenxuan Song, Jiayi Chen, Shuai Chen, Jingbo Wang, Pengxiang Ding et al. · Submitted 2026-03-26 · Updated 2026-04-07

#VLA#Other

Fast-dVLA targets the bottleneck in discrete diffusion vision-language-action models, aiming to make them fast enough for real-time robot control. From the provided text, the concrete mechanism and results are not specified, but the paper frames the problem as improving adaptation beyond standard supervised finetuning while reducing the cost of deploying pretrained VLA models.

Perceptive Variable-Timing Footstep Planning for Humanoid Locomotion on Disconnected Footholds arxiv High

Zhaoyang Xiang, Upama Pant, Ayonga Hereid · Submitted 2026-03-08

#Humanoid

The paper tackles humanoid walking when safe footholds are separated into stepping-stone-like regions because obstacles, clutter, or slippery terrain make parts of the ground unusable. It focuses on perceptive footstep planning with variable timing, so the robot can reason not just about where to step on disconnected admissible footholds, but also when to take those steps during locomotion.

Provable imitation learning for control of instability in partially-observed Vlasov--Poisson equations arxiv High NEW

Xiaofan Xia, Qin Li, Wenlong Mou · Submitted 2026-05-06

#IL#VLA#Other

This work addresses stabilization of partially observed Vlasov-Poisson plasma dynamics, a control problem tied to suppressing instabilities in nuclear fusion settings. The framing suggests a provable imitation-learning approach that learns control policies from expert behavior while retaining guarantees for an infinite-dimensional kinetic PDE system under limited observations.

OmniDP: Beyond-FOV Large-Workspace Humanoid Manipulation with Omnidirectional 3D Perception arxiv High

Pei Qu, Zheng Li, Yufei Jia, Ziyun Liu, Liang Zhu et al. · Submitted 2026-03-05 · Updated 2026-03-06

#Manipulator#Humanoid

OmniDP targets humanoid manipulation beyond the robot’s normal field of view by using omnidirectional 3D perception to expand the effective workspace. The arXiv snippet only states the motivation and title-level idea, so the concrete mechanism and results are not available here beyond large-workspace manipulation under perceptual constraints.

Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration arxiv High

Ninghao Zhang, Bin Zhu, Shijie Zhou, Jingjing Chen · Submitted 2026-03-06

#VLA #Manipulator

Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration tackles the problem that robot vision-language-action policies can become unreliable when instructions fall outside their training distribution. The approach recalibrates attention to improve linguistic grounding without additional training, aiming to make manipulation behavior track the user’s language more faithfully under out-of-distribution (OOD) instructions.

arxiv

Moving Through Clutter: Scaling Data Collection and Benchmarking for 3D Scene-Aware Humanoid Locomotion via Virtual Reality arxiv High

Beichen Wang, Yuanjie Lu, Linji Wang, Liuchuan Yu, Xuesu Xiao · Submitted 2026-03-06

#Humanoid

Humanoid locomotion has become capable of flashy dynamic skills, but this arXiv work targets the harder problem of moving through cluttered 3D scenes rather than open, flat spaces. It focuses on scaling data collection and benchmarking for scene-aware humanoid navigation using virtual reality, suggesting a pipeline where humans can help generate realistic traversal behavior and evaluate robots in obstacle-rich environments.

arxiv

HarvestFlex: Strawberry Harvesting via Vision-Language-Action Policy Adaptation in the Wild arxiv High

Ziyang Zhao, Shuheng Wang, Zhonghua Miao, Ya Xiong · Submitted 2026-03-06

#VLA #Manipulator

HarvestFlex studies how vision-language-action policies can be adapted for real greenhouse tabletop strawberry harvesting, where the robot has to handle long action sequences, cluttered plants, occlusions, and shiny fruit surfaces. It focuses on transferring VLA behavior from more general settings into an in-the-wild agricultural manipulation task, making strawberry picking a concrete test case for whether these policies can cope with messy, visually difficult environments.

arxiv

Dynamic Whole-Body Dancing with Humanoid Robots -- A Model-Based Control Approach arxiv High

Shibowen Zhang, Jiayang Wu, Guannan Liu, Helin Zhu, Junjie Liu et al. · Submitted 2026-04-05

#Humanoid

An integrated model-based control pipeline generates and executes dynamic whole-body dance motions on humanoid robots. From the provided text, the concrete takeaway is that it targets coordinated, full-body rhythmic motion rather than isolated limb gestures, using a model-based formulation to make the dances physically executable on the robot.

arxiv

VA-FastNavi-MARL: Real-Time Robot Control with Multimedia-Driven Meta-Reinforcement Learning arxiv High

Yang Zhang, Shengxi Jing, Fengxiang Wang, Yuan Feng, Hong Wang · Submitted 2026-04-05

#RL #Other

VA-FastNavi-MARL aligns asynchronous audio and visual commands into a shared latent representation so a robot controller can respond to heterogeneous multimedia instructions in real time. The interesting part is the combination of multimedia fusion with meta-reinforcement learning, aimed at making navigation/control policies adapt quickly as human commands arrive through different modalities.

Understanding the Impact of Geometric Foundation Models on Vision-Language-Action Models arxiv High NEW

Yurou Yang, Muyuan Lin, Roberto Martin-Martin, Martin Labrie, Shreekant Gayaka et al. · Submitted 2026-05-23

#RL #VLA #Other

GR00T-N1.5 is used as a testbed to measure how much geometry a modern vision-language-action model already encodes, with VGGT serving as the reference geometric foundation model (GFM). The study quantifies a VLA–GFM “geometric gap” via linear probing, then compares three controlled architectures for injecting VGGT-derived geometry into the VLA. It also separates architectural effects from practical choices like training data, camera count, and reconstruction quality, giving a clearer picture of when geometric VLAs actually improve robot behavior.

arxiv

Build on Priors: Vision--Language--Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation arxiv High

Pierrick Lorang, Johannes Huemer, Timothy Duggan, Kai Goebel, Patrik Zips et al. · Submitted 2026-04-04

#IL #Manipulator

Build on Priors tackles data-efficient learning for real-world, long-horizon robot manipulation by combining vision-language guidance with neuro-symbolic imitation learning. From the available text, the core aim is to let robots acquire complex manipulation behaviors from only a handful of demonstrations, using prior structure rather than relying on large-scale trial data.

arxiv

Diffusion Policy through Conditional Proximal Policy Optimization arxiv High

Ben Liu, Shunpeng Yang, Hua Chen · Submitted 2026-03-05

#RL #IL #Other

Diffusion Policy through Conditional Proximal Policy Optimization appears to connect diffusion-policy style action generation with PPO-style reinforcement learning, targeting decision-making settings such as robotics where policies must produce structured continuous actions. From the provided text alone, the concrete mechanism, benchmarks, and results are not available, so the safest reading is that it explores a reinforcement-learning formulation for training conditional diffusion policies rather than just imitating data.

Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies arxiv High

Siddharth Srikanth, Freddie Liang, Ya-Chuan Hsu, Varun Bhatt, Shihan Zhao et al. · Submitted 2026-03-12 · Updated 2026-04-03

#VLA #Other

The paper presents a red-teaming method for VLA robot policies based on quality-diversity prompt generation. It aims to surface robustness failures in vision-language robot tasks; no metrics, code, benchmarks, or robot tests are given in the provided text.

arxiv

Swimming Under Constraints: A Safe Reinforcement Learning Framework for Quadrupedal Bio-Inspired Propulsion arxiv High

Xinyu Cui, Fei Han, Hang Xu, Yongcheng Zeng, Luoyang Sun et al. · Submitted 2026-03-04

#RL #Other

Safe reinforcement learning is used to train quadruped-inspired swimming gaits that remain stable despite destabilizing lift fluctuations and full 6-DoF fluid coupling. The setup targets bio-inspired aquatic robots where high-thrust propulsion can easily trade off against control safety, framing swimming as a constrained policy-learning problem rather than just maximizing speed or thrust.

arxiv

Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA arxiv High

Zihua Wang, Zhitao Lin, Ruibo Li, Yu Zhang, Xu Yang et al. · Submitted 2026-04-03

#VLA #Manipulator

Open-Loop Planning, Closed-Loop Verification targets the inference cost of Vision-Language-Action models by separating fast speculative action planning from closed-loop verification. The provided text only states the problem framing, so the concrete mechanism, benchmarks, and results are not available beyond the idea of using verification to make VLA manipulation control cheaper.

arxiv

A Rapid Instrument Exchange System for Humanoid Robots in Minimally Invasive Surgery arxiv High

Bingcong Zhang, Yihang Lyv, Lianbo Ma, Yushi He, Pengfei Wei et al. · Submitted 2026-04-03

#Humanoid

Humanoid robots are being explored as assistants for minimally invasive surgery, where rapid and reliable instrument exchange is a practical bottleneck. The paper appears to focus on an exchange system that lets a humanoid robot swap surgical tools efficiently, aiming to make robot-assisted MIS workflows smoother and less dependent on manual handoffs.

arxiv

Model-Based Reinforcement Learning for Control under Time-Varying Dynamics arxiv High

Klemens Iten, Bruce Lee, Chenhao Li, Lenart Treven, Andreas Krause et al. · Submitted 2026-04-02

#RL #Other

Model-based reinforcement learning is applied to control problems where the system dynamics change over time, such as from wear, drift, or shifting operating conditions. The framing targets a common gap in learning-based controllers, which often assume stationary dynamics even though many real robots and engineered systems gradually or abruptly move away from the model they were trained on.

arxiv

3-D Relative Localization for Multi-Robot Systems with Angle and Self-Displacement Measurements arxiv High

Chenyang Liang, Liangming Chen, Baoyi Cui, Jie Mei · Submitted 2026-04-02

#Other

3-D Relative Localization for Multi-Robot Systems with Angle and Self-Displacement Measurements studies how a team of robots can estimate their relative positions using only local inter-robot angle observations and each robot’s own displacement measurements. The focus is on making relative localization work under measurement noise, where reconstructing consistent 3D geometry from limited local sensing is especially difficult.

Emergent Dexterity via Diverse Resets and Large-Scale Reinforcement Learning arxiv High

Patrick Yin, Tyler Westenbroek, Zhengyu Zhang, Joshua Tran, Ignacio Dagnino et al. · Submitted 2026-03-16 · Updated 2026-04-02

#RL #Other

The paper presents a method for robot dexterity that uses diverse resets and large-scale reinforcement learning in parallel physics simulation. It targets brittle, task-specific sim-to-real pipelines; the provided text does not specify code or real-robot validation.

IsaacIPC: Coupling High-Fidelity Simulation and Realistic Rendering for Contact-Rich Robotic Systems arxiv High NEW

Qixin Liang, Zhongqing Han · Submitted 2026-05-23

#Manipulator

IsaacIPC couples GPU-accelerated incremental potential contact simulation with IsaacSim/Lab so contact-rich robotic systems can be simulated with deformable mechanics while still rendered realistically in real time. It maps deformation from simulation meshes to visual meshes for data collection and policy evaluation, and adds a geometric mortar contact potential (GMCP) for tactile surfaces to better capture contact-pressure distributions. The system is shown on rigid-deformable setups including a quadruped, dexterous hand, and UMI gripper, with GMCP evaluated on contact benchmarks.

arxiv

Sim-to-Real Fruit Detection Using Synthetic Data: Quantitative Evaluation and Embedded Deployment with Isaac Sim arxiv High

Martina Hutter-Mironovova · Submitted 2026-03-30

#Other

Synthetic fruit imagery from Isaac Sim is used to train object detectors for sim-to-real transfer when real training data is scarce, with evaluation focused on how well the synthetic-data models carry over to real fruit detection. The study also follows the pipeline through embedded deployment, so the result is not just a dataset-generation exercise but a quantitative look at detection performance under practical compute constraints.

arxiv

Reducing Mental Workload through On-Demand Human Assistance for Physical Action Failures in LLM-based Multi-Robot Coordination arxiv High

Shoichi Hasegawa, Akira Taniguchi, Lotfi El Hafi, Gustavo Alfonso Garcia Ricardez, Tadahiro Taniguchi · Submitted 2026-03-30

#Other

LLM-based multi-robot systems can turn natural-language instructions into high-level action plans, but they still struggle when physical execution fails and robots need help recovering. This work studies an on-demand human-assistance mechanism for those action failures, aiming to coordinate intervention only when needed so the robots can continue while reducing the operator’s mental workload.

Libra-VLA: Achieving Learning Equilibrium via Asynchronous Coarse-to-Fine Dual-System arxiv High

Yifei Wei, Linqing Zhong, Yi Liu, Yuxiang Lu, Xindong He et al. · Submitted 2026-04-27

#VLA #Manipulator

Libra-VLA targets generalist robotic manipulation by linking vision-language understanding to executable actions, so robots can interpret semantic instructions and turn them into physical behavior. The available excerpt only states the broad VLA setting and the paper’s asynchronous coarse-to-fine dual-system framing, without enough detail to summarize the specific mechanism or results.

arxiv

CREST: Constraint-Release Execution for Multi-Robot Warehouse Shelf Rearrangement arxiv High

Jiaqi Tan, Yudong Luo, Sophia Huang, Yifan Yang, Hang Ma · Submitted 2026-03-27

#Other

CREST tackles double-deck multi-agent pickup and delivery for warehouse shelf rearrangement, building on MAPF-DECOMP’s split between planning collision-free shelf trajectories and assigning robots to execute them. Its constraint-release execution mechanism appears aimed at making those preplanned shelf movements easier to carry out in dense multi-robot settings, where rigidly enforcing all constraints during execution can bottleneck rearrangement.

arxiv

Line-of-Sight-Constrained Multi-Robot Mapless Navigation via Polygonal Visible Regions arxiv High

Ruofei Bai, Shenghai Yuan, Xinhang Xu, Xingyu Ji, Xiaowei Li et al. · Submitted 2026-03-27

#Other

Line-of-Sight-Constrained Multi-Robot Mapless Navigation via Polygonal Visible Regions studies how teams of robots can navigate around unknown obstacles while preserving direct line-of-sight connectivity for communication and coordination. The approach centers on polygonal visible regions, using local geometric visibility constraints rather than a prebuilt map to keep robots connected as they move through partially unknown environments.

arxiv

Move-Then-Operate: Behavioral Phasing for Human-Like Robotic Manipulation arxiv High

Haoming Xu, Lei Lei, Jie Gu, Chu Tang, Jingmin Chen et al. · Submitted 2026-04-26

#VLA #Manipulator

Move-Then-Operate is a vision-language-action approach that splits manipulation into a coarse movement phase followed by a contact-sensitive operation phase. The idea is to make robot behavior more human-like by separating relocation from the precise interactions where force, contact, and timing matter most.

arxiv

SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation arxiv High

Zhuoran Li, Zhiyang Li, Kaijun Zhou, Jinyu Gu · Submitted 2026-03-25 · Updated 2026-03-27

#VLA #Other

SOMA targets a common failure mode in vision-language-action robot policies: they often break under perceptual noise or out-of-distribution environment changes because they lack persistent memory and a way to diagnose why an action failed. It augments VLA control with strategic orchestration, long-term memory, causal failure attribution, and in-context interventions, aiming to let the robot adapt its behavior dynamically rather than relying on a fixed policy response.