Design an Improved Trust-based Quality of Service Aware Routing in Cognitive Mobile Ad-Hoc Network

Mobile ad hoc networks (MANETs) are wireless networks that can be configured at will. A MANET has no infrastructure or centralized control, so it is suited mainly to provisional communications. In a resource-constrained network with a dynamic topology, ensuring QoS and security is challenging: routing is difficult, and the network is more susceptible to attacks. Conventional security measures such as cryptographic techniques require significant memory, processing speed, and transmission bandwidth, which MANET nodes often lack. Consequently, these methods are unsuitable for identifying malicious or selfish nodes. Nodes that are malicious, selfish, or malfunctioning can instead be identified with a trust method, which calculates how much trust exists between nodes. This paper proposes an improved trust-based QoS-aware routing protocol (I-TQAR) that calculates trust in MANETs. Three important performance metrics are considered for result validation: delay, throughput, and packet delivery ratio (PDR). I-TQAR offers significantly improved performance on all three metrics compared to the existing TQR and TQOR protocols.


INTRODUCTION
In a MANET, wireless mobile nodes spontaneously form a temporary network. Such networks let people and vehicles connect wirelessly in areas without an existing communications infrastructure [1]. Nodes within radio range of each other communicate directly; nodes outside that range communicate through intermediate nodes [2], [3]. This type of wireless network is called a MANET since all communicating nodes automatically form a wireless network [4]. Ad hoc network routing protocols face various requirements, including scalability, security, quality of service, energy efficiency, multicasting, combining, and collaboration between nodes [5]. Security and service quality are considered here as qualitative properties. Mobile nodes form an autonomous distributed system when interconnected by wireless channels, as shown in Figure 1.
Recently, Quality of Service (QoS) has become a major concern for MANETs [6], [7]. Throughput, bandwidth, jitter, and delay are all QoS requirements that must be satisfied in traditional QoS routing. However, a MANET environment does not guarantee QoS once security requirements are taken into account, so security is an important consideration for QoS routing. Despite the importance of trust issues in QoS routing, little research has been conducted on resisting malicious behaviour and addressing trust issues [8]. In this work, a classic routing method is extended with trust and QoS parameter estimation to enhance network security. An indirect trust degree is calculated by incorporating neighbours' recommendations, and a direct trust degree is calculated from direct observations. We consider only link delay as a QoS constraint, since routing under multiple QoS constraints is NP-complete. Until now, cognitive wireless networks have not treated trust management as a cognitive process. The trust-based routing protocol we use for secure routing in cognitive networks takes advantage of the cognitive properties of the cognitive layer. Trust learning makes it easier to detect blackhole and grayhole attackers. Three optimization objectives are considered when solving the routing problem: delay, throughput, and PDR, which are important performance indicators for MANETs [9].

LITERATURE
MANETs are wireless networks in which wireless nodes communicate over multiple hops. MANET nodes must simultaneously act as routers and hosts. MANET routing protocols fall into two types: on-demand (reactive) routing and table-driven (proactive) routing [10]. In many situations, on-demand routing protocols are more efficient than table-driven ones [11]. Most MANET routing protocols assume that all participating nodes are fully cooperative. Because MANETs are open, mobile, and have dynamic topologies and protocol weaknesses, attackers can target them in various ways [2], [12]. Several secure routing protocols have been proposed for MANETs [13][14][15][16]. They commonly assume centralized units or third parties, as a result of which the MANET cannot function independently.
Authentication, confidentiality, data integrity, and other security properties remain at risk from attacks on routing [17]. Attacks are accordingly distinguished as external or internal. The former are launched by nodes outside the network, whereas the latter, such as grayhole and blackhole attacks, are launched by nodes within it [17][18][19][20]. A MANET has two types of trust: nodal trust and route trust [19]. Optimization plays a major role in achieving trust-enabled secure routing, providing one or more solutions depending on the objectives, and in real-world settings several objectives must be optimized simultaneously. Bio-inspired algorithms are also used to determine MANET routing paths. Algorithms in use today include PSO, ACO, bee colony optimization, and genetic algorithms. Algorithms based on ACO include the ant-colony-based routing algorithm [21], HOPNET, AntHocNet [22], the Ant-based Dynamic Zone Routing Protocol [23], and Hybrid ACO Routing [24]; all of them are based on the foraging behaviour of ants [25].
MANET research has mainly focused on developing routing protocols that minimize the hop-count metric [26] and thus incur the least cost. Two categories of protocols can be distinguished: reactive protocols and proactive protocols.
In the reactive AODV protocol [27], [28], nodes discover routes only when they are needed. In the proactive OLSR protocol, nodes continuously exchange network topology information to find routes [29], [30]. However, hop-count metrics do not reflect a network's mobility or other contextual features. Thus, the chosen route may not be optimal in a high-mobility scenario.
DSR was extended with a context-aware inference scheme to punish malicious accusers and accused. According to [31], selfish nodes can be detected by a context-aware mechanism. However, MANET environments with limited resources may not be able to use digital signatures. CORE (COllaborative REputation) is proposed in [32]. It combines a monitoring mechanism with a reputation function that distinguishes direct, indirect, and functional reputation. The protocol decides whether to cooperate with a node or gradually isolate it. Only positive reputation information is exchanged through this mechanism. Because nodes have no option to submit negative feedback, the mechanism is forced to rely on positive reports alone. A newer incentive-based approach to trust management, SORI (Secure and Objective Reputation-based Incentive), is proposed in [33]. It encourages packet forwarding and discourages selfish behaviour using quantified objective measures, together with a one-way hash chain authentication scheme.
This scheme may perform less well in a hostile environment where malicious nodes exist. The author of [34] proposed an extension of AODV (Ad hoc On-Demand Distance Vector) in which each node uses a trust factor and a security level. The approach taken by each node depends on its security level and trust factor. Rather than encrypting all routing information for every request, the scheme applies varying levels of encryption according to node trust factors. As a result, the approach conserves resources by adjusting security levels according to hostility; however, it does not assess trust levels.

METHODOLOGY
Improved Trust-based QoS-Aware Routing (I-TQAR) is implemented at the network layer on top of the AODV protocol [27]. The learning process in the cognitive layer of the cognitive network (CN) consists of two cognitive phases: path (routing) learning and trust learning. The expected-SARSA RL method [35], [36] is used for path learning; it is adopted for its lower update variance and faster convergence [35]. For trust learning, we use an improved version of an existing trust model [37]. Nodes use RL to determine the best path for packet delivery based on their interactions with their environment (path learning). Further, nodes interact to learn each other's trustworthiness (trust learning). Figure 3 explains the path learning and trust learning phases, followed by the I-TQAR protocol. Treating network nodes as agents, our routing protocol is composed of the following RL components:
• Environment: the environment of a node includes all other nodes in the network except that particular one.
• State: each state $s_i^n \in S$, $i \in \{1, \ldots, k\}$, identifies the node currently holding the RREQ packet at time index $n$.
• Action: the action $a_i^n \in A$ selects, at time index $n$, the node to which the RREQ is forwarded among the nodes directly connected to $s_i$.
• Q-value: $Q(s_i^n, a_i^n)$ is the value of taking action $a_i$ in state $s_i$ at time index $n$, ending up in state $s_i^{n+1}$. Agents update this value when nodes select their next hop.
• Reward: the reward $r_i^n \in R$ is directly related to path quality and rate of travel. A node that takes action $a_i^n$ (selecting the next forwarding node) receives a new reward $r_i^{n+1} \in R$ based on the new state $s_i^{n+1}$ of the environment.
To reach its end-to-end goal, each node in the network (e.g. agent j) considers the other nodes (e.g. agent i) as part of its environment.
A node without network knowledge is assumed to have zero Q-values at the beginning of communication. As described in [35], the update rule is a weighted sum of the old Q-value and the learned Q-value, where the learned Q-value combines the immediate reward with the discounted estimate of future rewards:

$$Q(s_i^n, a_i^n) \leftarrow (1 - \alpha)\, Q(s_i^n, a_i^n) + \alpha \Big[ r_i^{n+1} + \gamma \sum_{a_i^{n+1}} \pi(s_i^{n+1}, a_i^{n+1})\, Q(s_i^{n+1}, a_i^{n+1}) \Big] \qquad (1)$$

The learning rate $\alpha$ and the discount factor $\gamma$ are fixed in (1). Here $\pi(s_i^{n+1}, a_i^{n+1})$ is the probability, under the ε-greedy selection policy, of selecting $a_i^{n+1}$ as the next intermediate forwarding node, and $r_i^{n+1}$ is the immediate reward. Because an RL method based on Expected-SARSA is adopted in this study, the future Q-value in (1) is obtained by weighing all possible next actions. The immediate reward is inversely related to a packet's delivery delay through the next intermediate forwarding node. Under this policy, the action with the best Q-value at $s_i^{n+1}$ is weighted with probability $1 - \varepsilon$, and the average of the Q-values of all other actions is weighted with probability $\varepsilon$.
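The Expected-SARSA update described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the dictionary-based Q-table, and the values chosen for the learning rate, discount factor, and ε are hypothetical. The sketch also assumes the standard ε-greedy distribution, in which every action (including the greedy one) receives an equal share of the exploration probability, and it assumes rewards are to be maximized.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return the epsilon-greedy probability of each action.

    q_values maps an action (a next-hop id) to its Q-value.
    Every action gets epsilon / n; the greedy action gets an
    additional 1 - epsilon.
    """
    n = len(q_values)
    greedy = max(q_values, key=q_values.get)
    probs = {a: epsilon / n for a in q_values}
    probs[greedy] += 1.0 - epsilon
    return probs


def expected_sarsa_update(q, state, action, reward, next_state,
                          alpha=0.5, gamma=0.9, epsilon=0.1):
    """Apply one Expected-SARSA update to q[state][action] in place."""
    next_q = q.get(next_state, {})
    if next_q:
        probs = epsilon_greedy_probs(next_q, epsilon)
        expected = sum(probs[a] * next_q[a] for a in next_q)
    else:
        expected = 0.0  # unknown state: Q-values start at zero
    old = q[state][action]
    q[state][action] = (1 - alpha) * old + alpha * (reward + gamma * expected)
    return q[state][action]


# Hypothetical example: node "A" forwards to "B",
# which can in turn forward to "C" or "D".
q_table = {"A": {"B": 0.0}, "B": {"C": 1.0, "D": 0.0}}
new_value = expected_sarsa_update(q_table, "A", "B",
                                  reward=2.0, next_state="B")
print(new_value)
```

Using the expectation over next actions, instead of the single sampled next action used by plain SARSA, is what gives the lower update variance mentioned in the text.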
The path quality through a forwarding node affects the total link delay. Path quality is represented by the Expected Transmission Count (ETX) [38], which uses the packet loss ratios between neighbouring nodes to estimate how many transmissions are required to deliver a packet.
The ETX value is obtained using hello packets. Over a recent time window ending at time $t$, each node keeps a record of how many hello messages it has broadcast and of how many hello messages it has received from each neighbour node $j$. These values are conveyed to neighbours in hello messages, and $ETX_{ij}^t$, the ETX between nodes $i$ and $j$ at time $t$, is computed from them. Link delay comprises transmission, propagation, and queuing delays; in our computations, we ignore the propagation and queuing delays. Thus, the link delay $d_{ij}^t$ is calculated from $ETX_{ij}^t$ and $t_{pkt}$. Using this parameter, we can estimate how long a packet will take, on average, to reach a neighbouring node. The tradeoff between exploration and exploitation is a common challenge in RL algorithms. After evaluating all possible actions, we choose an action using an ε-greedy strategy. An ε-greedy agent determines which action is optimal (the one that maximizes the Q-value); it takes that action with probability $1 - \varepsilon$ and chooses uniformly at random among the available actions with probability $\varepsilon$, where $0 < \varepsilon < 1$. The appropriate value of $\varepsilon$ is determined experimentally.
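As an illustration of how a link delay estimate could be derived from hello-message counts, the sketch below uses the common ETX definition (the reciprocal of the product of the forward and reverse delivery ratios) and multiplies it by a per-packet time. The function names, argument layout, and example numbers are assumptions made for this sketch, and the exact formula used by the protocol may differ. An ε-greedy selection helper is included for the next-hop choice.

```python
import random


def etx(sent_by_i, heard_from_i_at_j, sent_by_j, heard_from_j_at_i):
    """Estimate ETX for link i-j from hello counts in one time window.

    Assumes ETX = 1 / (d_f * d_r), where d_f and d_r are the forward
    and reverse hello delivery ratios.
    """
    if sent_by_i == 0 or sent_by_j == 0:
        return float("inf")  # nothing broadcast yet: link unknown
    d_f = heard_from_i_at_j / sent_by_i
    d_r = heard_from_j_at_i / sent_by_j
    if d_f == 0 or d_r == 0:
        return float("inf")  # no usable link in this window
    return 1.0 / (d_f * d_r)


def link_delay(etx_value, t_pkt):
    """Average link delay, ignoring propagation and queuing delays."""
    return etx_value * t_pkt


def epsilon_greedy_choice(q_values, epsilon, rng=random):
    """Pick the max-Q action with probability 1 - epsilon,
    otherwise a uniformly random action."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)


# Hypothetical window: 10 hellos sent each way, 8 and 5 received.
value = etx(10, 8, 10, 5)
delay = link_delay(value, t_pkt=0.004)
print(value, delay)
```

A lossy link yields a larger ETX and therefore a larger delay estimate, which in turn lowers the reward for choosing that neighbour.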

B. Trust learning phase
• Trust model: Each node learns the trustworthiness of its neighbours during the trust learning phase of the algorithm. Agents i and j interact with one another in time intervals $(t, t + \delta t]$. Consequently, each agent updates its neighbour's trust value every $\delta t$ seconds. A trust threshold $\lambda$ is used to determine a node's trustworthiness: a node is considered trustworthy if its trust value is greater than or equal to the threshold. A node that is found untrustworthy is isolated for the rest of the network's lifetime, and its status is not reconsidered. False decisions may therefore lead to wrongful node isolation, which harms the final result. We plan to address this issue in future work.
Each node uses direct and indirect trust to detect and isolate blackhole and grayhole attackers. In a blackhole attack, a malicious node drops all packets it is supposed to forward, while still participating in the routing process to maintain its apparent trustworthiness. In a grayhole attack, by contrast, a malicious node participates in the routing process but selectively drops data packets at random, with an average probability of 0.5. Trust between neighbour nodes is assumed to be asymmetric. Both historical and current trust evaluations are used to calculate the total trust. Including previous trust evaluations in the calculation prevents abrupt changes in trust level caused by grayhole attacks.
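The attacker behaviours and the threshold rule described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the observed forwarding ratio stands in for the trust value, and the function names, the threshold of 0.8, the packet count, and the random seed are all hypothetical choices made for illustration.

```python
import random


def forwarded_count(num_packets, drop_probability, rng):
    """Count packets forwarded by a node that drops each packet
    independently with the given probability."""
    return sum(1 for _ in range(num_packets)
               if rng.random() >= drop_probability)


def update_isolation(isolated, trust_table, threshold):
    """Add every node whose trust is below the threshold to the
    isolated set. Nodes are never removed (no reconsideration)."""
    for node, trust in trust_table.items():
        if trust < threshold:
            isolated.add(node)
    return isolated


rng = random.Random(7)
# Drop probabilities: honest node, grayhole (0.5), blackhole (1.0).
behaviours = {"honest": 0.0, "grayhole": 0.5, "blackhole": 1.0}
observed = {name: forwarded_count(200, p, rng) / 200
            for name, p in behaviours.items()}
isolated = update_isolation(set(), observed, threshold=0.8)
print(observed, isolated)
```

Because isolated nodes are never reconsidered in this model, a single bad observation window could wrongly exclude an honest node, which is the false-decision risk noted in the text.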
• Trust computation: Direct trust $DT_{i,j}^t$ is computed from the interaction between two neighbours at a certain time $t$. Node $i$ is the trustor and node $j$ is the trustee; $f_{i,j}$ is the number of packets that node $i$ passes to node $j$ for forwarding, and $f_j$ is the number of packets that node $j$ actually forwards during the time interval $(t, t + \delta t]$. Each node also evaluates its trustee (target) node based on neighbour recommendations. Since a recommender can be malicious, node $i$ weights each recommender by its general (total) reputation. The indirect trust $IT_{i,j}^t$ of the trustee node is calculated as the weighted average of the trust values reported by the recommenders, where $T_{i,k}$ is the total trust that node $i$ places in recommender $k$, and $T_{k,j}$ indicates how much recommender $k$ trusts its neighbour $j$. The current trust is then computed by weighting the direct and indirect trusts at the node in question, as follows: $CT_{i,j}^t = w_1 DT_{i,j}^t + w_2 IT_{i,j}^t$. Direct information, obtained first-hand from the node, is considered more reliable. Finally, the total trust is calculated by weighting the current and historical trusts. The current trust is more important than the historical one because it is based on recent information.
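The trust computation chain can be sketched in Python as shown below. The sketch rests on several assumptions that the text does not confirm: direct trust is taken as the ratio of packets forwarded by the trustee to packets handed to it, indirect trust as the recommender-weighted average, and the weight values (0.7/0.3 and 0.6/0.4) are hypothetical, chosen only so that direct trust outweighs indirect trust and current trust outweighs historical trust.

```python
def direct_trust(forwarded_by_j, sent_to_j):
    """Assumed form: fraction of packets handed to j that j forwarded."""
    if sent_to_j == 0:
        return 0.0  # design choice: no interaction, no evidence of trust
    return forwarded_by_j / sent_to_j


def indirect_trust(recommendations):
    """Weighted average of recommenders' opinions about the trustee.

    recommendations is a list of (t_ik, t_kj) pairs: node i's total
    trust in recommender k, and k's trust in the trustee j.
    """
    total_weight = sum(t_ik for t_ik, _ in recommendations)
    if total_weight == 0:
        return 0.0
    weighted = sum(t_ik * t_kj for t_ik, t_kj in recommendations)
    return weighted / total_weight


def current_trust(dt, it, w1=0.7, w2=0.3):
    """CT = w1 * DT + w2 * IT, with w1 > w2 (direct is more reliable)."""
    return w1 * dt + w2 * it


def total_trust(ct, historical, w_current=0.6, w_hist=0.4):
    """Blend current and historical trust to damp abrupt changes."""
    return w_current * ct + w_hist * historical


# Hypothetical interval: j forwarded 45 of 50 packets; two recommenders.
dt = direct_trust(45, 50)
it = indirect_trust([(0.8, 0.9), (0.2, 0.4)])
ct = current_trust(dt, it)
tt = total_trust(ct, historical=0.5)
print(dt, it, ct, tt)
```

Weighting each recommendation by the recommender's own trust limits the influence of a malicious recommender, and the historical term keeps a grayhole node from resetting its score with a short burst of good behaviour.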

RESULT ANALYSIS & DISCUSSION
We examined the performance of I-TQAR by simulation against TQOR and TQR, two recently proposed trust-based QoS-aware routing protocols. Direct and indirect trust are computed so that packets are forwarded through trusted routes that meet the QoS constraint. A random mobility pattern was generated for our simulation using NS-2's setdest utility. Nodes start at random positions in a 1000 m by 1000 m area and move at random speeds between 0 and the maximum specified speed (10 meters per second). Malicious nodes are randomly distributed, and a node's trust value drops as it drops packets more often. The data packet size was 512 bytes.

A. Experimental Results Discussion
End-to-end delay is defined as the average time packets take to travel from a source to a destination. It can also include retransmission, propagation, and transfer delays. TQR and TQOR incur more delay than the proposed I-TQAR protocol. I-TQAR does not exhibit such delays, even in a network topology that contains 6 malicious nodes. Our protocol prevents nodes from interacting with malicious nodes, because all interactions in the proposed model are based on the overall trust value between nodes.
It can be seen from Figure 5 that the PDR of the I-TQAR protocol decreases as the number of malicious agents increases. Packet loss slows communication from sources to destinations. Nevertheless, our protocol maintains a high PDR and normal network operation. Because attackers are detected and mitigated during the route discovery process, a combination of malicious attacks does not significantly affect the PDR, and communication continues as usual even when attacks are present.
Data can be sent from any source, and the results depend on the routing across all mobile nodes. Generally, the simulation result is the accumulated data transmitted during the simulation period (irrespective of whether or not data have been sent and received). The figure below shows the simulation time comparison between the proposed I-TQAR model and the existing TQR and TQOR routing protocols. Latency reflects the propagation time and the other delays present in any communication system: it is the time data takes to travel across the network from end to end, including the time to send and receive packets. Because all nodes behave as both clients and servers over the network channel, the topology exhibits a large end-to-end delay. A typical delay output is shown for different values of network load. For the proposed I-TQAR protocol, the maximum end-to-end delay is 1.26 seconds. For a network to achieve its maximum throughput, its loss rate, or packet delivery ratio, must be determined. Figure 8 relates the number of malicious nodes to the originated packets delivered by each protocol. There is no relation between the number of mobile nodes offered and the packet delivery ratio of TQR, TQOR, and I-TQAR. With 6 malicious nodes, TQOR and I-TQAR performed particularly well and delivered most of the originated packets, whereas TQR delivered about half of them, as shown in Figure 8.
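For reference, the three metrics discussed in this section can be computed from per-packet send and receive records along the following lines. This is a generic sketch, not the authors' NS-2 post-processing script: the function names, the dictionary-based trace format, and the example timestamps are assumptions made for illustration.

```python
def packet_delivery_ratio(num_sent, num_received):
    """Fraction of originated packets that reached their destination."""
    return num_received / num_sent if num_sent else 0.0


def average_end_to_end_delay(send_times, recv_times):
    """Mean delay in seconds over delivered packets only.

    Both arguments map a packet id to a timestamp in seconds.
    """
    delays = [recv_times[p] - send_times[p]
              for p in recv_times if p in send_times]
    return sum(delays) / len(delays) if delays else 0.0


def throughput_kbps(num_received, packet_size_bytes, duration_s):
    """Received payload in kilobits per second over the run."""
    return num_received * packet_size_bytes * 8 / duration_s / 1000.0


# Hypothetical trace: four 512-byte packets sent, three delivered,
# over a 10-second run.
sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
received = {1: 0.5, 2: 1.25, 4: 3.75}

pdr = packet_delivery_ratio(len(sent), len(received))
delay = average_end_to_end_delay(sent, received)
rate = throughput_kbps(len(received), 512, 10.0)
print(pdr, delay, rate)
```

Dropped packets lower the PDR and the throughput but do not enter the delay average, which is why delay and PDR need to be read together when malicious nodes are present.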

CONCLUSION
Since wireless networks and mobile computing hardware can now support ad hoc networking, researchers have been paying greater attention to the topic. Several new routing protocols have recently been proposed for ad hoc networking environments, but node-level performance comparisons and detailed performance information for each protocol have been unavailable. This paper compares TQR, TQOR, and I-TQAR in terms of average end-to-end delay, average throughput, and packet delivery ratio (PDR). The comparison indicates that the proposed I-TQAR protocol outperforms the protocols currently in use. The NS-2 simulator was used to monitor throughput, packet delivery ratio, end-to-end delay, and packet received ratio.

CONFLICTS OF INTEREST
The author declares no conflict of interest.

FIGURE 1. Network topology of the ad-hoc network.

FIGURE 6. Time between end-of-simulation and simulation time (sec).

FIGURE 7. Number of malicious mobile nodes versus average end-to-end delay (sec).