Editor’s Note: The paper on which this article is based was originally presented at the 9th Annual International Conference on Information Technology (InCIT), held in Phuket, Thailand in November 2025. It is reprinted here with the gracious permission of the IEEE. Copyright 2025, IEEE.
In the 21st century, artificial intelligence (AI) technology has been driving a paradigm shift in defense, particularly by significantly enhancing operational capabilities in the air force domain. AI-powered combat systems support a wide range of functions such as target detection, tactical decision-making, situational awareness, and autonomous flight, thereby reducing the cognitive burden on human pilots and improving real-time responsiveness and survivability. However, the increasing autonomy of AI systems introduces complex challenges, including potential violations of international law, decision-making errors, and ambiguous attribution of ethical responsibility. This has led to a growing need to redefine the role of human fighter pilots within AI-integrated operations. To address these issues, this study proposes a quantitative ethical decision-making model that mathematically integrates national military ethics principles and international legal norms, while incorporating dynamic battlefield variables. The proposed model aims to contribute to defense policy development and combat training systems by offering a structured and operationally applicable ethical evaluation framework.
Related Work
This article proposes a foundational framework for mathematically modeling ethical decision-making in AI-enabled combat systems. To support this framework, the paper reviews prior research across three key domains: 1) the development of AI technologies integrated into fighter aircraft; 2) national military ethics standards; and 3) approaches to quantifying ethical judgment. Based on this analysis, the article positions its distinct contribution within the current literature on military AI ethics.
Evolution of AI-Based Fighter Aircraft Systems
AI fighter jet technology is advancing in diverse ways depending on national strategies. The United States is pursuing manned–unmanned teaming and next-generation combat platforms through initiatives such as the Air Combat Evolution (ACE), Skyborg, and Next Generation Air Dominance (NGAD) programs. China is integrating AI pilot systems into the J-20 and enhancing the autonomous combat capabilities of AI-powered drones. Europe is focusing on AI-assisted human operations and cloud‑based battlefield analysis technologies under programs like the Future Combat Air System (FCAS) and Tempest, with a strong emphasis on ethical compliance and operational safety. These developments are shifting the role of human pilots from operators to strategic decision-makers or supervisors, thereby highlighting the need for quantitative models that support ethical design and accountability frameworks.
National Standards for Military AI Ethics
As the military application of AI technologies expands, countries are establishing military AI ethics standards based on their strategic objectives and philosophical principles. The United States emphasizes responsibility grounded in practical utility, the European Union promotes legislated human-centric principles, China prioritizes state-centered strategic ethics, and South Korea remains in the early stages of institutional development.
| Country | Key Policies | Core Ethical Values |
| --- | --- | --- |
| United States | AI Ethical Principles, Implementing Responsible AI | Responsibility, Reliability, Traceability, Auditability |
| European Union | AI Ethics Guidelines, AI Act | Human-centricity, Explainability, Transparency |
| South Korea | – | Human dignity, Transparency, Explainability, Safety |
| China | New Generation AI Ethics Guidelines | Autonomy, State Security, Social Order |

Table 1: National policies on military AI ethics
Trends in Mathematical Modeling of Ethical AI
Mathematical modeling approaches for evaluating ethical decision-making in AI remain limited. While theoretical models such as TOPSIS, Bayesian networks, and Markov Decision Processes (MDPs) offer frameworks for quantifying ethical judgments, they fall short in reflecting the dynamic variables of actual battlefield environments, resulting in insufficient reliability and consistency. In South Korea, the Korea Institute for Defense Analyses (KIDA) has employed the Analytic Hierarchy Process (AHP) to prioritize military AI ethical principles. However, AHP also faces limitations in accounting for the operational variability and situational dynamics of combat scenarios. Therefore, there is a growing need for research that quantitatively analyzes the sensitivity of ethical principles to combat environments and the interactions among different ethical standards.
Need for Mathematical Modeling in Military AI Ethics
Qualitative ethical judgments in the application of AI to weapon systems can lead to severe consequences, such as civilian casualties or violations of international law. Ensuring the ethical integrity of AI-enabled military systems requires the ability to quantitatively evaluate human involvement, the balance between AI autonomy and human control, and compliance with the laws of armed conflict. Given the potential conflicts among national ethical standards and the diversity of tactical environments, a mathematical model that enables consistent ethical evaluation under varying conditions is essential. Such a model must be capable of partial quantification of ethical compliance in accordance with international norms, while resolving contradictions among ethical principles.
Design of the Mathematical Model
To quantitatively represent the ethical decision-making process of AI fighter pilots, this study proposes a multidimensional ethical function model (Ef) that integrates national ethical standards with the laws of armed conflict. The model is designed to enable consistent ethical judgments across diverse operational scenarios by balancing differences in national values with legal obligations.
Modeling National Ethical Standards
In designing an AI ethics model for fighter pilots, incorporating national military AI ethical principles is essential. This study derives representative ethical criteria from the United States, European Union, South Korea, and China, and formulates them into mathematical functions suitable for analysis. Each function reflects the respective country’s ethical priorities and is incorporated into the overall model to ensure broad applicability and fidelity to international ethical diversity.
United States
The ethical judgment function representing the United States incorporates the core principles outlined in the U.S. Department of Defense’s AI Ethical Principles: Responsibility (R), Fairness (F), Traceability (T), Reliability (Rc), and Governability (G). These elements are weighted to reflect their relative importance in ethical evaluation, and the function is mathematically defined as:
E_US = α1·R + α2·F + α3·T + α4·Rc + α5·G (1)

where α1 through α5 are the weights assigned to each principle.
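For concreteness, the weighted-sum structure of (1) can be sketched in a few lines of Python; the same pattern carries over to the EU, South Korean, and Chinese functions in (2) through (4) below. The weights and principle scores here are illustrative assumptions, not parameters from this study, and each principle score is assumed to be normalized to [0, 1].

```python
# Minimal sketch of the weighted-sum national ethics function in Eq. (1).
# Weights and scores are illustrative placeholders, not values from the
# study; each principle score is assumed to be normalized to [0, 1].

def ethics_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of normalized principle scores."""
    return sum(weights[p] * scores[p] for p in weights)

# Hypothetical equal weights for the five U.S. principles R, F, T, Rc, G.
us_weights = {"R": 0.2, "F": 0.2, "T": 0.2, "Rc": 0.2, "G": 0.2}
us_scores = {"R": 0.90, "F": 0.80, "T": 0.70, "Rc": 0.85, "G": 0.75}
print(ethics_score(us_scores, us_weights))  # 0.80
```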
European Union
The European Union (EU), through its Ethics Guidelines for Trustworthy AI, identifies five core ethical principles: respect for human autonomy, prevention of harm, fairness, explicability, and transparency. In this study, a mathematical ethical decision model is defined to incorporate these principles.
E_EU = β1·HA + β2·PH + β3·F + β4·Ex + β5·Tr (2)

where HA, PH, F, Ex, and Tr denote respect for human autonomy, prevention of harm, fairness, explicability, and transparency, respectively, and β1 through β5 are their weights.
Korea
The ethical framework for defense AI in South Korea is currently under development. A study by the Korea Institute for Defense Analyses (KIDA) proposed five core ethical principles: human dignity, controllability, safety, responsibility, and explicability. In this study, a mathematical analysis model is constructed based on these principles.
E_KR = γ1·HD + γ2·C + γ3·S + γ4·R + γ5·Ex (3)

where HD, C, S, R, and Ex denote human dignity, controllability, safety, responsibility, and explicability, respectively, and γ1 through γ5 are their weights.
China
While the ethical principles for military AI in China have not been publicly disclosed, the Chinese government outlined core national values—national security, social stability, and centralized control—in its Ethical Norms for the New Generation Artificial Intelligence, published in 2021. This study assumes that the defense domain aligns with these national policies and accordingly defines a mathematical model for the ethical evaluation of military AI in China.
E_CN = η1·NS + η2·SS + η3·CC (4)

where NS, SS, and CC denote national security, social stability, and centralized control, respectively, and η1 through η3 are their weights.
National Ethics Integration Model
Ethical evaluation in AI systems can vary significantly depending on each nation’s operational doctrines, ethical philosophy, and political or strategic objectives. To reflect the relative influence of national ethical standards, this study introduces a set of weighting factors and defines an integrated model, National Ethics, that incorporates the core ethical principles identified by each country.
NationalEthics = ω_US·E_US + ω_EU·E_EU + ω_KR·E_KR + ω_CN·E_CN (5)

where the weighting factors ω_US, ω_EU, ω_KR, and ω_CN reflect the relative influence of each national standard.
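Continuing the sketch, the integration in (5) is a weighted combination of the four national scores. The ω values below are hypothetical; in practice they would be set by doctrine, alliance context, or policy analysis.

```python
# Sketch of the national ethics integration model in Eq. (5), reusing
# ethics_score() from the previous snippet. The ω weights are assumed.

def national_ethics(country_scores: dict[str, float],
                    omega: dict[str, float]) -> float:
    """Weighted combination of the per-country scores E_US, E_EU, E_KR, E_CN."""
    return sum(omega[c] * country_scores[c] for c in omega)

omega = {"US": 0.3, "EU": 0.3, "KR": 0.2, "CN": 0.2}             # hypothetical ω
country_scores = {"US": 0.80, "EU": 0.76, "KR": 0.82, "CN": 0.65}
print(national_ethics(country_scores, omega))  # 0.762
```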
Mathematical Modeling of LOAC
The application of the Law of Armed Conflict (LOAC) is an essential element in the ethical decision-making process of a combat aircraft pilot. In particular, when an AI system must autonomously determine whether to employ force, it is critical to quantitatively evaluate compliance with the core principles of LOAC. To this end, the following function incorporates key LOAC principles—Military Necessity (MN), Proportionality (P), Distinction (D), Unnecessary Suffering (US), and Honor (H)—as variables for assessment.
LOAC = λ1·MN + λ2·P + λ3·D + λ4·US + λ5·H (6)

where λ1 through λ5 are the weights assigned to each LOAC principle.
Tactical Situation Modeling
Tactical situations are not static environments but are instead characterized by real-time changes within dynamic and uncertain battlefield conditions. As a result, ethical decision-making cannot rely solely on static standards; it is directly influenced by various operational variables. To reflect this, four key tactical factors are defined as variables: Situational Awareness (SA), Time Criticality (TC), Survivability (S), and Coalition Compatibility (Co). Each factor is assigned a corresponding weight φ1, φ2, φ3, φ4 to represent its relative importance. Based on these variables, the battlefield-context ethical evaluation function is defined as follows:
TS = φ1·SA + φ2·TC + φ3·S + φ4·Co (7)
Ethical Function Modeling
In this study, the ethical decision-making function for AI-based fighter pilots is formulated by incorporating two primary components. The first component, Ef1, represents the degree to which the AI system complies with nation-specific ethical standards. The second component, Ef2, reflects the ethical evaluation based on international laws of armed conflict. To account for the dynamic characteristics of real-world battlefield conditions, both components are modeled as functions of the tactical situation variable TS.
National Ethics Function Model
The nation-specific ethics function is defined as the product of the baseline national ethical evaluation and a tactical situation adjustment coefficient. It is expressed as:
Ef1 = NationalEthics × TS (8)
This equation quantitatively expresses how AI ethical judgment is modulated based on both nation-specific strategic values and real-time battlefield dynamics.
LOAC-Based Ethics Function Model
The ethics function based on the core principles of the Law of Armed Conflict is defined as the product of the aggregated LOAC principle score and a tactical situation adjustment coefficient, ensuring legal legitimacy under international law. The function is formulated as:
Ef2 = LOAC × TS (9)
Integrated Ethical Function Model
The integrated ethical function is constructed by combining the nation-specific ethics function Ef1 and the international law-based ethics function Ef2 using a weighting factor δ. It is formulated as:
EthicsTotal = δ·Ef1 + (1 − δ)·Ef2, 0 ≤ δ ≤ 1 (10)
This integrated model enhances both the consistency and flexibility of ethical judgment by accounting for a balanced consideration of strategic national values and legal principles, rather than relying on a single ethical standard.
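The full chain from (6) to (10) can be sketched as follows. Only the functional structure follows the model above; all numeric weights and inputs are illustrative assumptions, continuing the values from the earlier snippets.

```python
# End-to-end sketch of Eqs. (6)-(10): LOAC score, tactical-situation
# coefficient TS, and the δ-weighted integrated function. All numeric
# weights and inputs are illustrative assumptions.

def weighted_sum(scores, weights):
    return sum(w * s for w, s in zip(weights, scores))

def tactical_situation(sa, tc, s, co, phi=(0.25, 0.25, 0.25, 0.25)):
    """Eq. (7): TS = φ1·SA + φ2·TC + φ3·S + φ4·Co (equal φ assumed here)."""
    return weighted_sum((sa, tc, s, co), phi)

def ethics_total(national, loac, ts, delta=0.5):
    """Eqs. (8)-(10): Ef1 = NationalEthics·TS, Ef2 = LOAC·TS, blended by δ."""
    ef1 = national * ts                       # Eq. (8)
    ef2 = loac * ts                           # Eq. (9)
    return delta * ef1 + (1 - delta) * ef2    # Eq. (10)

loac = weighted_sum((0.90, 0.80, 0.85, 0.70, 0.75),   # MN, P, D, US, H
                    (0.2, 0.2, 0.2, 0.2, 0.2))        # assumed λ weights
ts = tactical_situation(sa=0.6, tc=0.7, s=0.5, co=0.55)
print(ethics_total(national=0.762, loac=loac, ts=ts, delta=0.5))
```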
| ID | Operational Situation | ID | Operational Situation |
| --- | --- | --- | --- |
| S1 | Emergency air support request | S20 | Electronic warfare decision-making |
| S2 | Joint policing operation with allied forces | S21 | Approaching unidentified object in restricted zone |
| S3 | Limited airstrike in a civilian-populated area | S22 | Determining return route after mechanical failure |
| S4 | A-to-G mission in cooperation with autonomous UAVs | S23 | Emergency return request from friendly aircraft |
| S5 | Intelligence acquisition during ceasefire negotiations | S24 | Response after detection of chemical or biological weapon |
| S6 | Air mission to support personnel rescue | S25 | Decision-making under communication disruption |
| S7 | Escort mission for aircraft carrying nuclear warheads | S26 | Assessment of civilian infrastructure threat |
| S8 | Surprise enemy attack situation | S27 | Recovery after AI system malfunction |
| S9 | Night stealth infiltration operation | S28 | Multilateral operation and alliance rules conflict |
| S10 | Medical evacuation under enemy fire | S29 | Response to friendly fire incident |
| S11 | Preemptive strike on enemy command center | S30 | Simultaneous attack and airspace conflict |
| S12 | Response to high-altitude unmanned infiltration | S31 | Escorting civilian aircraft under threat |
| S13 | Maritime attack support mission | S32 | AI weapon system shutdown decision |
| S14 | Emergency supply transport to front line | S33 | Target prioritization during coalition strike |
| S15 | Protection of civilian aircraft | S34 | Determination to protect civilian facilities |
| S16 | Radar-evading infiltration and strike mission | S35 | Engagement with unidentifiable hostile drones |
| S17 | Retaliatory strike after civilian facility attack | S36 | Ethical judgment in hostage situation |
| S18 | Request for pilot override during AI mission | S37 | Deployment of untested AI-based weapon system |
| S19 | Close-range dogfight engagement | S38 | Decision on recognition of enemy surrender gesture |

Table 2: Air operation scenarios
Model Application and Analysis
Simulation Configuration
This simulation is based on the 38 operational scenarios listed in Table 2, which a combat aircraft pilot may encounter. For both the Ef1 and Ef2 models, 100 unique combinations of parameter values were generated.
Additionally, 1,000 sets of battlefield situation variables were sampled from a normal distribution with mean µ = 0.5 and standard deviation σ = 0.1. The national weighting factor δ was discretized into seven levels. These settings were used to conduct a sensitivity analysis of the proposed ethical evaluation functions across all scenario conditions.
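Under this configuration, the sampling scheme might be set up as sketched below. The paper does not specify how the 100 parameter combinations are drawn, so the Dirichlet sampler and the equal φ weights are assumptions.

```python
import numpy as np

# Sketch of the simulation configuration: 100 weight combinations per
# model, 1,000 tactical-variable draws from N(0.5, 0.1), and δ on a
# seven-level grid. The Dirichlet draw for weight combinations is an
# assumption; the paper leaves the sampling scheme open.

rng = np.random.default_rng(42)

param_sets = rng.dirichlet(np.ones(5), size=100)              # assumed weight sampler
tactical = rng.normal(0.5, 0.1, size=(1_000, 4)).clip(0, 1)   # SA, TC, S, Co
deltas = np.linspace(0.0, 1.0, 7)                             # δ ∈ {0, 1/6, ..., 1}

# For each scenario, Ef1 and Ef2 would be evaluated over all combinations
# and the mean µ and standard deviation σ of the scores recorded.
ts = tactical @ np.full(4, 0.25)   # Eq. (7) with equal φ weights (assumed)
print(round(ts.mean(), 3), round(ts.std(), 3))   # ≈ 0.5, ≈ 0.05
```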
Results of the Nation-Specific Ethics Function
Simulation results of the nation-specific ethics evaluation function Ef1 across all operational scenarios show the following patterns. Scenarios S14 and S22 exhibited high mean values and low standard deviations in the simulation results. This indicates that the corresponding ethical evaluations were both stable and consistent, particularly under weight combinations where Situational Awareness and Time Criticality were emphasized. These results suggest that the scenarios are ethically appropriate in terms of operational efficiency and perception-based decision-making. In contrast, scenarios S31 and S38, where the weights for Survivability and Coalition Compatibility were relatively low, produced lower ethical scores. This implies that these variables were underrepresented in the evaluation process and that ethical considerations related to battlefield stability and coalition cooperation were insufficiently incorporated.
Results of the LOAC-Based Ethical Function
The outputs of the LOAC-based ethical function Ef2 were assessed across all scenarios, and the stability and sensitivity of the ethical decision structure were analyzed using the mean and standard deviation of the ethical suitability scores derived from each scenario. Ef2 is designed to quantitatively reflect core principles of international humanitarian law, including Military Necessity, Proportionality, Distinction, Prohibition of Unnecessary Suffering, and Honor. Simulation results show that in scenarios with clearly defined ethical standards—such as S11 and S24—the evaluation outcomes remained consistent and stable. This indicates that the proposed function maintains its ethical direction and judgment criteria reliably, even under varying battlefield conditions, thereby demonstrating its potential utility as a trustworthy ethical decision-making tool in real-world combat environments.
Analysis of Unstable Scenarios
Analysis reveals scenarios in which the standard deviation of the ethical evaluation results was relatively high, suggesting that the model may exhibit instability or heightened sensitivity to fluctuations in certain conditions. In the case of Ef1, scenarios S29, S30, and S31 showed particularly high standard deviations, indicating inconsistency in the ethical decision-making outcomes.
| Group | Scenarios | Analysis / Policy Implications |
| --- | --- | --- |
| Stable Decision Group (#0) | S3, S9, S24, S27, S31, S32 | High alignment across national ethical standards enables stable judgment. Suitable for autonomous AI decision-making; human-in-the-loop involvement can be minimized. |
| Borderline Group (#1) | S2, S5, S6, S7, S8, S12, S13, S14, S15, S16, S17, S18, S19, S20, S23, S26, S28 | Ethical judgment varies depending on the weight of specific national ethical standards (e.g., influence of the Chinese model). Strategic interpretational gaps may arise; coordination is needed in multinational or alliance operations to prevent ethical conflicts. |
| Unstable Decision Group (#2) | S1, S4, S10, S11, S21, S22, S25, S34, S35, S36, S37 | High sensitivity to changes in tactical situation (TS) variables. Requires structural design for adaptive human intervention, with safeguards such as judgment deferral or ethical failsafes under high uncertainty. |

Table 3: Scenario-based grouping of AI ethical evaluations
This instability appears to stem either from a low weighting of tactical variables such as Situational Awareness, Time Criticality, Survivability, and Coalition Compatibility, or from conflicts between those variables, which hinder the model’s ability to maintain coherent ethical judgments. In particular, scenario S29 presents a situation where both Time Criticality and Coalition Compatibility are simultaneously emphasized under different ethical principles. This creates a conflict in prioritization, making it difficult for the AI system to determine which standard should take precedence. As a result, Ef1 demonstrated high sensitivity, with even small variations in the input conditions leading to significant changes in the ethical evaluation outcome. In the case of Ef2, scenarios S35, S37, and S19 also exhibited notable variability in ethical evaluation outcomes.
These scenarios represent situations in which priority among LOAC principles—such as Distinction, Unnecessary Suffering, and Proportionality—is ambiguous or context-dependent. As a result, the model demonstrated heightened sensitivity in the presence of legal standard conflicts. These findings suggest that while the designed model provides stable and reliable judgments in most scenarios, interpretational conflicts among ethical principles or imbalanced influence among tactical variables can lead to instability in decision outcomes. Therefore, when considering real-world deployment, it is essential to incorporate supplementary algorithms, decision deferral mechanisms, or adaptive weight adjustments to address such high-sensitivity scenarios.
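One way to realize such a deferral mechanism is sketched below: when the dispersion of the ethical score across sampled conditions exceeds a threshold, the decision is escalated to a human operator. The threshold values and decision labels are hypothetical, not calibrated outputs of this study.

```python
import statistics

# Hypothetical sketch of a judgment-deferral safeguard: high dispersion
# in the sampled ethical scores triggers escalation to a human operator.
# Thresholds (sigma_max, e_min) are illustrative, not calibrated values.

def decide_or_defer(scores: list[float],
                    sigma_max: float = 0.08, e_min: float = 0.5) -> str:
    mu = statistics.mean(scores)
    sigma = statistics.stdev(scores)
    if sigma > sigma_max:
        return "DEFER_TO_HUMAN"   # unstable judgment -> human-in-the-loop
    return "ENGAGE" if mu >= e_min else "ABORT"

print(decide_or_defer([0.62, 0.58, 0.61, 0.60]))  # stable -> ENGAGE
print(decide_or_defer([0.30, 0.75, 0.52, 0.68]))  # dispersed -> DEFER_TO_HUMAN
```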
Distribution of Ethical Sensitivity Across Scenarios
A total of 38 combat scenarios were classified into three distinct groups, as shown in Table 3, based on the statistical characteristics of the ethical evaluation value (Ef) and the structural similarity of decision functions. Each group is distinguished by sensitivity indicators such as the relative weights of the α, β, and γ coefficients, as well as the mean (µ) and standard deviation (σ) of ethical outcomes.
First, the Stable Decision Group (#0) shows similar structures in both Ef1 and Ef2, with balanced α and β weights and consistently stable values for both µ and σ. These scenarios are well suited for fully automated AI-based ethical decision-making. Second, the Borderline Group (#1) exhibits relatively high average values, but with a heavier influence of β, along with greater variance.
This suggests heightened sensitivity to situational factors and the possibility of varying interpretations of ethical standards depending on context. Third, the Unstable Decision Group (#2) is characterized by dominant α, suppressed β and γ values, and high standard deviations. These indicate the presence of ethical judgment conflicts and potential trade-offs between tactical variables. Such scenarios are high-risk and require mandatory human intervention in the ethical decision-making loop. This classification quantitatively demonstrates how the ethical evaluation structure of an AI combat system can vary depending on the operational scenario. It also serves as a framework for determining appropriate levels of automation, human involvement, and ethical risk for each scenario group. Moving forward, this categorization provides practical guidance for policy development, training system design, and the tailored application of ethical models according to scenario-specific characteristics.
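Once per-scenario statistics are available, the grouping itself can be reproduced mechanically. A sketch using k-means is shown below; the algorithm choice and the placeholder feature values are assumptions, since the paper does not name its clustering method.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sketch of the scenario grouping: cluster per-scenario feature vectors
# [mean(Ef), std(Ef), α, β, γ] into three groups. k-means and the random
# placeholder features are assumptions for illustration only.

features = np.random.default_rng(0).random((38, 5))   # placeholder statistics
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for group in range(3):
    members = np.where(labels == group)[0] + 1        # scenario IDs S1..S38
    print(f"Group #{group}: " + ", ".join(f"S{i}" for i in members))
```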
Results of the Integrated Ethical Model
The integrated model Ethics Total, which combines the nation-specific ethics function (Ef1) and the LOAC-based ethics function (Ef2) according to a weighting parameter δ, was simulated to assess its overall performance. Compared to the individual models (Ef1 and Ef2), the integrated model demonstrated greater stability and consistency in scenario-based ethical judgments, as evaluated through quantitative metrics. The integrated model yielded lower standard deviations across most scenarios relative to the standalone models, indicating enhanced robustness of ethical outputs. In particular, when the δ value approached 0.5, the balance between national ethics and LOAC principles was optimally achieved, minimizing output variability.
However, the model still exhibited relatively high variance in certain scenarios. These cases suggest residual sensitivity due to conflicts or imbalances among tactical variables, LOAC factors, and national ethical standards. Such instability was most pronounced when multiple contextual variables conflicted simultaneously or when ambiguity in the interpretation of ethical principles was present, reducing the consistency of judgment outcomes.
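The δ effect reported above can be illustrated with a simple sweep: when Ef1 and Ef2 vary independently, the convex combination in (10) damps the variance of either component alone, so the standard deviation of the integrated score is smallest at an intermediate δ. The spreads assumed for Ef1 and Ef2 below are placeholders, not simulation outputs.

```python
import numpy as np

# Illustrative δ sweep for Eq. (10): the standard deviation of the
# integrated score is computed at each of the seven δ levels. The
# normal spreads assumed for Ef1 and Ef2 are placeholders.

rng = np.random.default_rng(1)
ef1 = rng.normal(0.60, 0.10, 10_000)   # assumed Ef1 samples
ef2 = rng.normal(0.55, 0.12, 10_000)   # assumed Ef2 samples
for delta in np.linspace(0.0, 1.0, 7):
    total = delta * ef1 + (1 - delta) * ef2
    print(f"δ = {delta:.2f}: σ = {total.std():.4f}")   # smallest σ at intermediate δ
```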
Conclusion and Future Research Directions
This study proposed a mathematical model that integrates nation-specific ethical standards, the Law of Armed Conflict, and tactical situation variables to quantify the ethical decision-making process of AI-based combat aircraft pilots. By applying the model across a wide range of combat scenarios, we analyzed the sensitivity and stability of AI ethical judgments under varying conditions.
The proposed framework overcomes the limitations of conventional declarative ethical standards by offering a quantitative, context-aware evaluation structure applicable to real-world operational environments, thereby laying the foundation for practical deployment and policy formulation of AI combat systems. Simulation results revealed that the ethical function responded differently depending on national values and operational contexts, and that stability and sensitivity varied across scenarios.
In particular, scenario clustering analysis identified both stable scenario types suitable for automated ethical judgment and high-risk types requiring human oversight. These findings underscore the necessity of context-specific, adaptive ethical system design for AI weapon systems.
For future research, several directions are proposed. First, expanding the range of tactical scenarios and incorporating multinational ethical standards will improve the accuracy and adaptability of integrated ethical models across various national and allied military forces. Second, the development of real-time decision-making algorithms, grounded in the proposed ethical function models, is essential for practical implementation in AI combat systems and training environments. Third, addressing ethical conflicts requires the design of dynamic priority adjustment mechanisms, potentially through rule-based approaches or reinforcement learning techniques. Lastly, in high-risk scenarios that necessitate human oversight, the integration of human-in-the-loop interfaces and decision-support systems will be critical to ensuring safe and effective human-AI collaboration.
This research holds both academic significance and strategic value in that it mathematically establishes ethical robustness for AI weapon systems and demonstrates its applicability in real-world military contexts. We anticipate that future efforts will extend this work toward practical implementation and global consensus-building in the domain of military AI ethics.
Acknowledgement
This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MND) (RS-2022-II220601, Military Specialized AI Curriculum Establishment and Operation (Military AI Development and Management Program)).
References
- J. Jae-Gyu, “A Study on Defense Artificial Intelligence Ethics,” Defense Policy Research, vol. 39, no. 1, Korea Institute for Defense Analyses, 2023, pp. 213–240.
- H. Won-Jung, “Military Use of Artificial Intelligence from the Perspective of the Nature of War,” Korean Journal of Military Studies, vol. 76, no. 3, Hwarangdae Institute, 2020, pp. 31–59.
- L. Sang-Hyung, “Is Ethical Artificial Intelligence Possible? – Moral and Legal Responsibility of AI,” Law and Policy Studies, vol. 16, no. 4, 2016, pp. 283–303.
- K. Yong-Sam, “Vision and Operational Environment for the Army’s Drone-Bot Combat System,” Defense & Technology, no. 477, 2018, pp. 50–59.
- K. Seung-Rae, “Legal Issues and Prospects in the Era of the 4th Industrial Revolution and AI,” Law Research, vol. 19, no. 2, 2018, pp. 1–30.
- International Committee of the Red Cross (ICRC), “The Ethical Challenges of AI in Military Decision Support Systems,” ICRC Law & Policy Blog, Sep. 2024.
- C. Batallas, “When AI Meets the Laws of War,” IE Insights, Oct. 2024. [Online].
- Z. Stanley-Lockman, “Responsible and Ethical Military AI,” Center for Security and Emerging Technology (CSET), Aug. 2021. [Online]
- M. Anneken, N. Burkart, F. Jeschke, A. Kuwertz-Wolf, A. Mueller, A. Schumann, and M. Teutsch, “Ethical Considerations for the Military Use of Artificial Intelligence in Visual Reconnaissance,” Feb. 2025. [Online].
- D. Trusilo and D. Danks, “Commercial AI, Conflict, and Moral Responsibility: A theoretical analysis and practical approach to the moral responsibilities associated with dual-use AI technology,” Jan. 2024. [Online].
- Zurek, J. Kwik, and T. van Engers, “Model of a military autonomous device following International Humanitarian Law,” Ethics and Information Technology, vol. 25, art. 15, Feb. 2023.
- Kim, J. Choi, and J. Baek, “Design of an integrated function model for quantifying ethical decisions in AI fighter pilots,” presented at the 2025 Annual Conference of the Institute of Electronics and Information Engineers (IEIE), Jeju, Korea, June 2025.
- A. Hickey, “The GPT Dilemma: Foundation Models and the Shadow of Dual‑Use,” Jul. 2024. [Online].
- D. Helmer *et al*., “Human‑centred test and evaluation of military AI,” Dec. 2024. [Online].
- K. Cools and C. Maathuis, “Trust or Bust: Ensuring Trustworthiness in Autonomous Weapon Systems,” arXiv:2410.10284, Oct. 14, 2024. [Online].
- H. Khlaaf, S. W. Myers, and M. Whittaker, “Mind the Gap: Foundation Models and the Covert Proliferation of Military Intelligence, Surveillance, and Targeting,” Oct. 18, 2024. [Online].
- A. Nalin Tripodi, “Future Warfare and Responsibility Management in the AI‑based Military Decision‑making Process,” J. Armed Forces Media Univ., vol. 14, no. 1, Spring 2023. [Online].
- N. Upreti and J. Ciupa, “Towards Developing Ethical Reasoners: Integrating Probabilistic Reasoning and Decision‑Making for Complex AI Systems,” Feb. 28, 2025. [Online].
- T. Izumo, “Coarse Set Theory for AI Ethics and Decision‑Making: A Mathematical Framework for Granular Evaluations,” Feb. 2025. [Online].
- “Tractable Probabilistic Models for Ethical AI,” Springer, 2022.
- Y. Wang, Y. Wan, and Z. Wang, “Using Experimental Game Theory to Transit Human Values to Ethical AI,” Nov. 2017. [Online].
