Fault-Tolerant Optimal Attitude Tracking Control of Quadrotor Subject to State and Input Constraints Using Safe Reinforcement Learning
Aerospace Mechanics
Article 9, Volume 20, Issue 1 - Serial Number 75, Farvardin 1403 (March-April 2024), Pages 141-161 | Full text (1.35 MB)
Article type: Dynamics, Vibrations, and Control
Authors
Sajad Roshanravan 1; Saeed Shamaghdari* 2
1 Ph.D. Student, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
2 Corresponding author: Associate Professor, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
Received: 15 Mehr 1402 (7 October 2023); Revised: 07 Aban 1402 (29 October 2023); Accepted: 11 Azar 1402 (2 December 2023)
Abstract
This paper presents a method for designing an optimal attitude tracking control system for a quadrotor UAV subject to component and actuator faults. The proposed integrated fault-tolerant control method is based on safe reinforcement learning and can guarantee the input and state constraints without requiring prior knowledge of the vehicle dynamics. To this end, the proposed optimal method is built on a dual neural network structure consisting of identifier and critic networks. In the update law for the identifier network weights, a variable forgetting factor is used together with experience replay, which increases the convergence speed, improves robustness to measurement noise, and reduces the estimation error. In this method, the constrained fault-tolerant optimal attitude tracking problem is made equivalent to an unconstrained optimal stabilization problem for an augmented system, in which the control input and state constraints are guaranteed by choosing a suitable cost function on the input signal and suitable control barrier functions on the states, respectively. Moreover, fault detection requires no bank of models or filters and is performed simply by comparing the residual of the Hamilton-Jacobi-Bellman equation with a predetermined threshold. Uniform ultimate boundedness of the weights of both networks, and consequently the convergence of the control law to the optimal solution, is proved using Lyapunov's theorem, and simulation results demonstrate the correctness of its performance.
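As a reading aid only, the display below sketches the standard way this family of safe reinforcement learning designs encodes the two constraint types in the cost; the symbols (augmented state z, input bound lambda, weighting matrix R, barrier term B(x)) are illustrative assumptions, not the paper's exact notation.

% Hedged sketch of a constraint-encoding value function (assumed notation):
% the non-quadratic input penalty enforces |u| <= lambda, and the control
% barrier term B(x) grows unbounded near the boundary of the admissible
% state set, so a finite cost implies the state constraint is respected.
\[
  V\big(z(0)\big) = \int_{0}^{\infty}
    \Big( Q(z) + B(x)
    + 2\int_{0}^{u} \lambda \tanh^{-1}\!\big(v/\lambda\big)^{\top} R \, \mathrm{d}v \Big)\, \mathrm{d}t .
\]

With an input penalty of this form the minimizing control takes a lambda*tanh(.) shape and therefore never leaves its bound, while a finite value is only achievable if the barrier term stays finite, i.e. if the state remains inside the admissible set; this is what allows the constrained tracking problem to be recast as an unconstrained optimal stabilization problem for the augmented system.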
Keywords
Quadrotor attitude control; Component and actuator faults; Fault-tolerant optimal control; Fault detection; Safe reinforcement learning
Article Title [English]
Fault-Tolerant Optimal Attitude Tracking Control of Quadrotor Subject to State and Input Constraints Using Safe Reinforcement Learning
Authors [English]
Sajad Roshanravan 1; Saeed Shamaghdari 2
1 Ph.D. Student, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
2 Corresponding author: Associate Professor, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
Abstract [English]
In this article, a method for designing a fault-tolerant optimal attitude tracking control (FTOATC) system for a quadrotor UAV subject to component and actuator faults is presented. The proposed fault-tolerant method is based on safe reinforcement learning (SRL) and is capable of enforcing input and state constraints without the need for prior knowledge of the quadrotor dynamics. To this end, the proposed optimal method is presented with a dual neural network (NN) structure consisting of identifier-critic neural networks. In the identifier NN update law, in addition to a variable forgetting factor dependent on measurement noise, the experience replay method is used, which increases convergence speed, improves robustness to measurement noise, and reduces estimation error. In this method, solving the constrained FTOATC problem is made equivalent to solving an unconstrained optimal stabilization problem for an augmented system, where the control input and state constraints are guaranteed by selecting suitable cost functions on the input signal and appropriate control barrier functions (CBFs) on the states, respectively. Furthermore, fault detection is performed without the need for any bank of models or filters, simply by comparing the residual value of the Hamilton-Jacobi-Bellman (HJB) equation with a predetermined threshold. The uniform ultimate boundedness (UUB) of the identifier and critic NN weight errors, and consequently the convergence of the control input to a neighborhood of the optimal solution, are proved by Lyapunov theory, and the performance of the method is validated through simulation results.
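To make the detection rule above concrete, the snippet below is a minimal Python sketch of threshold-based monitoring of the HJB residual; the function names, window length, and threshold value are illustrative assumptions rather than the paper's implementation.

import numpy as np

# Minimal sketch (assumed names and values, not the paper's implementation):
# once the critic has converged for the healthy system, the HJB residual
# r(x, u) + dV/dx^T (f(x) + g(x) u) stays near zero, so a persistent
# excursion above a threshold is treated as an indication of a fault.
def hjb_residual(reward, value_gradient, drift, input_matrix, u):
    """HJB residual evaluated at one sample of the closed-loop trajectory."""
    return reward + value_gradient @ (drift + input_matrix @ u)

def fault_detected(residual_history, threshold=0.05, window=10):
    """Flag a fault only if |residual| exceeds the threshold over a full
    window of samples, filtering out transient learning and noise spikes."""
    recent = np.abs(np.asarray(residual_history[-window:], dtype=float))
    return recent.size == window and bool(np.all(recent > threshold))

Because the test uses only the learned value function and measured signals, it requires no bank of fault models or observers, which is the property the abstract emphasizes.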
Keywords [English]
Quadrotor attitude control; Component and actuator faults; Fault-tolerant optimal control; Fault detection; Safe reinforcement learning
Statistics: Article views: 5,043; Full-text downloads: 193