 DONG Zicheng,HU Weishi,SHAO Hui,et al.Probability Prediction Reinforcement Learning for Variable Impedance Force Tracking Control of Robotic Arms in Unstructured Environments[J].Journal of Huaqiao University(Natural Science),2024,45(4):461-470.[doi:10.11830/ISSN.1000-5013.202403035]

Probability Prediction Reinforcement Learning for Variable Impedance Force Tracking Control of Robotic Arms in Unstructured Environments

Journal of Huaqiao University (Natural Science) [ISSN:1000-5013/CN:35-1079/N]

Volume:
45
Issue:
2024, No. 4
Pages:
461-470
Publication Date:
2024-07-20

Article Info

Title:
Probability Prediction Reinforcement Learning for Variable Impedance Force Tracking Control of Robotic Arms in Unstructured Environments
Article Number:
1000-5013(2024)04-0461-10
Author(s):
DONG Zicheng1 HU Weishi2 SHAO Hui1 GUO Lin1
1. College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China; 2. Department of Laboratory and Device Management, Huaqiao University, Xiamen 361021, China
Keywords:
variable impedance control; robotic arm force tracking; reinforcement learning; unstructured environment; probability prediction model
CLC Number:
TP273
DOI:
10.11830/ISSN.1000-5013.202403035
Document Code:
A
Abstract:
To address the force tracking problem of impedance control for a robotic arm whose end-effector moves in real time in an unstructured environment, the damping coefficient is dynamically adjusted to cope with the uncertainty of the contact environment. To ensure an efficient search of the impedance strategy, a probabilistic prediction model (PPM) is constructed from the state-action sequences generated by the interaction between the robotic arm and the contact environment. During learning, the robotic arm needs only a small number of interactions with the unstructured contact environment to obtain the optimal variable impedance strategy, which makes it feasible to train the policy directly on a real robotic arm. Simulation results show that, in several unstructured environments, the proposed method clearly outperforms traditional impedance control and adaptive variable impedance control in both dynamic and steady-state force tracking performance.
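The abstract describes the control scheme only at a high level. As a reading aid, the Python sketch below assumes the force-based impedance form m*x_dd + b(t)*x_d = f_d - f_e that is common in the force-tracking literature cited below (refs [7]-[9]), with the contact environment reduced to a linear spring f_e = k_e*(x - x_env). The damping rule b_rule() is a hypothetical placeholder for the variable impedance policy learned in the paper; all numerical values are illustrative only.

# Minimal sketch (not the paper's implementation): variable-damping impedance
# force tracking on a 1-DOF contact model.
import numpy as np

def b_rule(e_f, b0=200.0, gain=20.0, b_min=50.0, b_max=1000.0):
    # Hypothetical damping schedule: raise damping while the force error is
    # large to suppress overshoot, relax it toward b0 near the target force.
    return float(np.clip(b0 + gain * abs(e_f), b_min, b_max))

def simulate(f_d=10.0, k_e=5000.0, x_env=0.0, m=1.0, T=2.0, dt=0.001):
    # 1-DOF force-based impedance control: m*x_dd + b(t)*x_d = f_d - f_e,
    # with the contact modelled as a linear spring f_e = k_e * (x - x_env).
    steps = int(T / dt)
    x = x_env + 0.5 * f_d / k_e      # start at half the target penetration depth
    x_dot = 0.0
    log = np.zeros((steps, 2))       # columns: time, contact force
    for i in range(steps):
        f_e = k_e * max(x - x_env, 0.0)   # unilateral contact force
        e_f = f_d - f_e                   # force tracking error
        b = b_rule(e_f)                   # variable damping coefficient
        x_ddot = (e_f - b * x_dot) / m    # impedance dynamics
        x_dot += x_ddot * dt              # semi-implicit Euler integration
        x += x_dot * dt
        log[i] = (i * dt, f_e)
    return log

log = simulate()
print("steady-state contact force: %.2f N (desired 10 N)" % log[-1, 1])

The paper replaces such a hand-written damping schedule with a learned, state-dependent variable impedance strategy; the Gaussian-process-based model learning of refs [18]-[20] indicates how a probabilistic prediction model can make that policy search data-efficient enough for training with few real interactions.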

References:

[1] PETERNEL L,TSAGARAKIS N,CALDWELL D,et al.Robot adaptation to human physical fatigue in human-robot co-manipulation[J].Autonomous Robots,2018,42(5):1011-1021.DOI:10.1007/s10514-017-9678.
[2] NI Tao,LI Rui,MIAO Haifeng,et al.Real-time compensation of the end position of a ship-borne manipulator[J].Journal of Jilin University(Engineering and Technology Edition),2020,50(6):2028-2035.DOI:10.13229/j.cnki.jdxbgxb20190662.
[3] REN Qinyuan,ZHU Wenxin,ZHAO Feng,et al.Learning-based force control of a surgical robot for tool-soft tissue interaction[J].IEEE Robotics and Automation Letters,2021,6(4):6345-6352.DOI:10.1109/LRA.2021.3093018.
[4] LI Y,GOWRISHANKAR G,NATHANAEL J,et al.Force, impedance, and trajectory learning for contact tooling and haptic identification[J].IEEE Transactions on Robotics,2018,34(5):1-13.DOI:10.1109/TRO.2018.2830405.
[5] LIU Shengsui,LI Lina,XIONG Xiaoyan,et al.Research on an adaptive control method for robots based on Kalman filtering[J].Journal of Mechanical & Electrical Engineering,2023,40(6):936-944.DOI:10.3969/j.issn.1001-4551.2023.06.017.
[6] LI Zhen,ZHAO Huan,WANG Hui,et al.Research on steady-state adaptive force tracking for contact in robotic grinding and polishing[J].Journal of Mechanical Engineering,2022,58(9):200-209.DOI:10.3901/JME.2022.09.200.
[7] ROVEDA L,IANNACCI N,VICENTINI F,et al.Optimal impedance force-tracking control design with impact formulation for interaction tasks[J].IEEE Robotics and Automation Letters,2016,1(1):130-136.DOI:10.1109/LRA.2015.2508061.
[8] JUNG S,HSIA T C,BONITZ R G.Force tracking impedance control of robot manipulators under unknown environment[J].IEEE Transactions on Control Systems Technology,2004,12(3):474-483.DOI:10.1109/TCST.2004.824320.
[9] DUAN Jinjun,GAN Yajui,CHEN Ming,et al.Adaptive variable impedance control for dynamic contact force tracking in uncertain environment[J].Robotics and Autonomous Systems,2018,102:54-65.DOI:10.1016/j.robot.2018.01.009.
[10] CAO Hongli,CHEN Xiaoan,HE Ye,et al.Dynamic adaptive hybrid impedance control for dynamic contact force tracking in uncertain environments[J].IEEE Access,2019,7:83162-83174.DOI:10.1109/ACCESS.2019.2924696.
[11] HAMEDANI M H,SADEGHIAN H,ZEKRI M,et al.Intelligent impedance control using wavelet neural network for dynamic contact force tracking in unknown varying environments[J].Control Engineering Practice,2021,113:104840.DOI:10.1016/J.CONENGPRAC.2021.104840.
[12] ANDRYCHOWICZ O M,BAKER B,CHOCIEJ M,et al.Learning dexterous in-hand manipulation[J].The International Journal of Robotics Research,2020,39(1):3-20.DOI:10.1177/0278364919887447.
[13] LI Yunfei,KONG Tao,LI Lei,et al.Learning design and construction with varying-sized materials via prioritized memory resets[C]//International Conference on Robotics and Automation.Philadelphia:IEEE Press,2022:7469-7476.DOI:10.1109/ICRA46639.2022.9811624.
[14] BUCHLI J,STULP F,THEODOROU E,et al.Learning variable impedance control[J].The International Journal of Robotics Research,2011,30(7):820-833.DOI:10.1177/0278364911402527.
[15] LI Chao,ZHANG Zhi,XIA Guihua,et al.Efficient force control learning system for industrial robots based on variable impedance control[J].Sensors,2018,18(8):2539.DOI:10.3390/s18082539.
[16] WU Min,HE Yanhao,LIU S.Adaptive impedance control based on reinforcement learning in a human-robot collaboration task with human reference estimation[J].International Journal of Mechanics and Control,2020,21(1):21-32.DOI:10.1007/978-3-030-19648-6_12.
[17] DU Zhijiang,WANG Wei,YAN Zhiyuan,et al.Variable admittance control based on fuzzy reinforcement learning for minimally invasive surgery manipulator[J].Sensors,2017,17(4):844.DOI:10.3390/s17040844.
[18] DEISENROTH M P,FOX D,RASMUSSEN C E.Gaussian processes for data-efficient learning in robotics and control[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,37(2):408-423.DOI:10.1109/TPAMI.2013.218.
[19] RASMUSSEN C E,WILLIAMS C K I.Gaussian processes for machine learning[M].Cambridge:MIT Press,2005.
[20] DEISENROTH M P.Efficient reinforcement learning using Gaussian process[D].Karlsruhe:Karlsruhe Institute of Technology,2010.DOI:10.5445/KSP/1000019799.

Memo

Received: 2024-03-23
Corresponding author: SHAO Hui (b. 1973), female, associate professor, Ph.D., mainly engaged in research on robot motion planning and control. E-mail: shaohuihu11@163.com.
Funding: Natural Science Foundation of Fujian Province (2021J01291); Graduate Education and Teaching Reform Research Project of Huaqiao University (22YJG006)
Last Update: 2024-07-20