
An Intent-Based Reasoning System for Automatic Generation of Drone Missions for Public Protection and Disaster Relief


Academic year: 2023


Autonomous drones can automate search and rescue operations by removing the human pilot, which can increase efficiency and reduce costs. The development of 5G can provide the infrastructure needed to meet the stringent performance requirements of machine learning models and autonomous drones.

Methodology

Research problem

Scope and Limitations

As shown in Figure 2.4, the 5G network can be divided into network slices with different configurations. Therefore, the prompt contains instructions for generating a MAVLink file using the user's audio recording. The system aims to create drone missions based on the user's intent.

The mission generation test provides a framework for measuring the quality of the drone plan generated from the user's input by assigning a score to the model output. Based on the results, short voice commands are recommended, because they give the system the most accurate translation of the intent. An essential functionality of the system is the ability to inspect the drone mission plan to evaluate whether the intent has been translated correctly.

Edge computing can be used to bring computing closer to the location of a drone.

Figure 2.1: The 4G Network Architecture [18].

Significance and Contribution

Organization

  • System Overview
  • User Equipment
  • E-UTRAN
  • Evolved Packet Core
  • System Overview
  • New Radio User Equipment (NR UE)
  • Next-Generation Radio Access Network (NGRAN)
  • Core Network
  • Network Functions
  • Overview
  • Network Slicing
  • Summary

The Evolved Universal Terrestrial Radio Access Network (E-UTRAN) is the air interface part of the 4G network, allowing UEs to connect to the network core. It consists of a base station called an Evolved Node B (eNodeB) and uses radio signaling to connect the UE to the network.

Cloud Native 5G Core

Drone Technology

Parrot Drone

Parrot also offers an open-source software development kit (SDK), the Parrot SDK [36], which developers can use to interact with the drone through code. 4G connectivity allows the drone to be piloted in remote areas, and compatibility with the Parrot SDK opens up many possibilities for upgrades through programming. The drone also supports uploading and executing drone mission plans, allowing it to operate independently.

Figure 2.5: The Parrot ANAFI Ai with the Skycontroller 4 [27].

MAVLink

The Parrot ANAFI Ai has distinguishing features such as 4G connectivity and Parrot SDK compatibility, which make it suitable for rescue operations.

Machine Learning

Transformer

The key components of the transformer are the self-attention mechanism and the feedforward neural network. The encoder processes the input sequence and generates the weights that capture the contextual information of each token. The output of the final encoder is sent through a linear neural network that acts as a classifier.
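The self-attention step described above can be sketched in a few lines. This is a minimal illustration, assuming scaled dot-product attention with the token vectors used directly as queries, keys, and values; a real transformer first applies learned linear projections, which are omitted here. The function names and the toy token vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy input: three 2-dimensional token vectors (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens, tokens, tokens)
for row in contextual:
    print([round(x, 3) for x in row])
```

Each output row mixes information from all tokens, which is how the encoder captures the context of each token.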

Figure 2.8: The Transformer Architecture [57].

Whisper

The Softmax activation takes the weights and generates a probability distribution for each position of the output sequence. The output of the transformer can represent words or sentences, which supports complex use cases such as chatbots, language completion models, and speech recognition models. The Mel scale [26] is a pitch scale on which steps that listeners perceive as equal in pitch are spaced equally apart.

OpenAI Chat

Humans perceive lower frequencies such as 500 Hz and 1000 Hz as very distinct, while the difference between higher frequencies such as 6000 Hz and 8000 Hz is barely noticeable.
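The perceptual compression of high frequencies can be checked numerically. The sketch below assumes one widely used hertz-to-mel formula, m = 2595 · log10(1 + f / 700); the source does not state which variant reference [26] or Whisper uses, so the constants are an assumption.

```python
import math


def hz_to_mel(frequency_hz: float) -> float:
    """Convert a frequency in hertz to mels (assumed formula variant)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


# A 500 Hz gap at low frequencies versus a 2000 Hz gap at high frequencies.
low_step = hz_to_mel(1000.0) - hz_to_mel(500.0)
high_step = hz_to_mel(8000.0) - hz_to_mel(6000.0)

print(f"500 Hz -> 1000 Hz spans {low_step:.0f} mels")
print(f"6000 Hz -> 8000 Hz spans {high_step:.0f} mels")
```

Under this formula the 500 Hz gap at the low end covers more mels than the 2000 Hz gap at the high end, which matches the perception described above.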

Summary

The user communicates with the system through a mobile device, and the communication is managed through a REST API. A drone mission plan is generated automatically from transcribed voice commands using machine learning models, and the drone can be programmed to follow mission plans. The user input function allows users to express their intent to the system. The communication management functions handle the communication between the system components and are implemented as a REST API.

User Input

Intent Input

This gives the chat completion model an understanding of the MAVLink format before instructing it on how to generate the file from the user's transcribed text. The key components of understanding the system's intentions are the voice recognition model that transcribes the user's intent, and the chat completion model that generates drone missions based on the intent. The evaluation process assesses the correctness of the MAVLink file based on inputs of varying complexity.
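The prompt structure described above (format explanation first, generation instruction second) could look roughly like the following. This is a sketch, not the prompt used in the thesis: `build_prompt` and `MAVLINK_FORMAT_NOTES` are hypothetical names, the wording is invented, and the file format is assumed to be the plain-text mission format documented by the MAVLink project (header `QGC WPL 110`, twelve tab-separated fields per item).

```python
# Assumed description of the plain-text MAVLink mission file format.
MAVLINK_FORMAT_NOTES = "\n".join([
    "A MAVLink plain-text mission file starts with the header line: QGC WPL 110",
    "Every following line describes one mission item using 12 tab-separated fields:",
    "index, current, frame, command, param1, param2, param3, param4, "
    "x (latitude), y (longitude), z (altitude), autocontinue",
])


def build_prompt(transcript: str) -> str:
    """Place the format notes before the instruction and the user's command."""
    return "\n\n".join([
        "File format:\n" + MAVLINK_FORMAT_NOTES,
        "Task: generate a MAVLink mission file that carries out the command "
        "below. Return only the file contents.",
        "Command: " + transcript.strip(),
    ])


# Hypothetical transcribed voice command.
prompt = build_prompt("Take off, fly 20 meters north and land.")
print(prompt)
```

Putting the format notes first means the model has seen the file layout before it reads the instruction that refers to it.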

Figure 3.4: Before and after the mobile application has received coordinates.

Mission Verification

Intent Execution

The intent execution function allows the user to execute the translated intent on the programmable drone. The screen also displays the intent translated from a voice recording to a text transcript, allowing the user to verify that the system has translated their intent correctly. In addition, the raw unparsed drone plan is displayed, allowing the user to inspect the content the drone receives and executes.
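Inspecting the raw plan by eye can be complemented with a structural check. The sketch below assumes the plain-text MAVLink mission format (a `QGC WPL` header followed by items with twelve tab-separated fields); the thesis does not show such a check, and `parse_mission`, `MissionItem`, and the sample coordinates are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class MissionItem(NamedTuple):
    index: int
    current: int
    frame: int
    command: int
    params: Tuple[float, ...]
    latitude: float
    longitude: float
    altitude: float
    autocontinue: int


def parse_mission(text: str) -> List[MissionItem]:
    """Parse a plain-text mission file, raising ValueError if it is malformed."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("QGC WPL"):
        raise ValueError("missing 'QGC WPL' header")
    items = []
    for line_no, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError(f"line {line_no}: expected 12 fields, got {len(fields)}")
        items.append(MissionItem(
            int(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]),
            tuple(float(f) for f in fields[4:8]),
            float(fields[8]), float(fields[9]), float(fields[10]),
            int(fields[11]),
        ))
    return items


# Illustrative plan: takeoff (22), waypoint (16), land (21), frame 3 = relative altitude.
SAMPLE = "\n".join([
    "QGC WPL 110",
    "\t".join(["0", "1", "3", "22", "0", "0", "0", "0", "59.9139", "10.7522", "10", "1"]),
    "\t".join(["1", "0", "3", "16", "0", "0", "0", "0", "59.9141", "10.7522", "10", "1"]),
    "\t".join(["2", "0", "3", "21", "0", "0", "0", "0", "59.9141", "10.7522", "0", "1"]),
])

for item in parse_mission(SAMPLE):
    print(item.index, item.command, item.latitude, item.longitude, item.altitude)
```

A check like this only confirms that the file is well formed; whether the waypoints match the user's intent still has to be judged by the user.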

Communication Management

Mission Generation Communication

Mission generation communication manages the communication between system components needed to generate a drone mission from the user's intent. The web server receives the intent from the mobile application and builds a prompt containing it. The chat completion model then returns the drone's mission plan as a MAVLink file based on that prompt.
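The request flow above can be sketched independently of any web framework. This is an assumption-laden outline: `handle_intent_request` is a hypothetical name, the intent is assumed to arrive as an audio recording, and the transcription and chat completion models are passed in as plain callables so that stand-ins can replace the real services.

```python
from typing import Callable, Dict


def handle_intent_request(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate_plan: Callable[[str], str],
) -> Dict[str, str]:
    """Turn one recorded intent into a response for the mobile application."""
    transcript = transcribe(audio)        # speech recognition model
    mission = generate_plan(transcript)   # chat completion model
    # Return both so the user can verify the transcript and inspect the plan.
    return {"transcript": transcript, "mission": mission}


# Stand-ins for the real models, for illustration only.
def fake_transcribe(audio: bytes) -> str:
    return "take off and land"


def fake_generate_plan(transcript: str) -> str:
    return "QGC WPL 110"


response = handle_intent_request(b"\x00\x01", fake_transcribe, fake_generate_plan)
print(response)
```

Returning the transcript alongside the mission mirrors the verification screen, where the user sees both the transcribed text and the raw plan.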

Drone Programming

Mission Generation

The system uses the chat completion model Text-Davinci-003 to translate the user's intent into a drone mission. Another language model, which was not used, is GPT-3.5-turbo, a chat model that is cheaper but cannot be customized like Text-Davinci-003. Ultimately, Text-Davinci-003 can translate the user's intent into mission plans that can be executed by the drone.

Drone Control

Summary

It describes how the system's ability to generate drone missions based on intent was tested using correctness and latency tests for audio transcription and mission generation. To evaluate the system's ability to translate intentions into drone mission plans, the accuracy and latency of key components are tested.

Testing Intent

Correctness Test

The model correctly transcribed 7 of the 8 words in the sentence: every word was correct except that it transcribed "Drive" instead of "Flyv". The transcription test therefore gets a correctness score of 87.5%, the share of words transcribed correctly. The transcription accuracy results in Table 4.4 show that the audio transcription model can perfectly transcribe audio of short duration (10 seconds).
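The score above is simply the share of correctly transcribed words. A minimal sketch of that arithmetic follows; it assumes a position-by-position comparison (suitable when the only errors are substitutions), and the example sentences are invented because the original test sentence is not reproduced in the text.

```python
def word_correctness(reference: str, hypothesis: str) -> float:
    """Fraction of reference words reproduced at the same position."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # zip stops at the shorter sentence, so missing words count as errors.
    matches = sum(r == h for r, h in zip(ref_words, hyp_words))
    return matches / len(ref_words)


# Hypothetical 8-word command with a single substituted word.
reference = "fly to the park and take a picture"
hypothesis = "drive to the park and take a picture"
score = word_correctness(reference, hypothesis)
print(f"{score:.1%}")
```

With one wrong word out of eight the score is 7/8 = 87.5%. If the transcription inserts or drops words, an edit-distance metric such as word error rate would be a better fit than this positional comparison.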

Latency Test

Mission Generation Test

Correctness Test

The test results from Graph 4.3a show that commands with a temperature of 0.0 have a latency of 16 seconds on easy difficulty, which decreases to 8 seconds on medium difficulty and increases to 124 seconds on hard difficulty. The test results from Graph 4.3b show that commands with a temperature of 0.5 have a latency of 20 seconds on easy difficulty, which decreases to 10 seconds on medium difficulty and increases to 82 seconds on hard difficulty. The test results from Graph 4.4a show that commands with a temperature of 1.0 have a latency of 10.5 seconds on easy difficulty, which increases to 11 seconds on medium difficulty and to 27 seconds on hard difficulty.

(a) Average latency per difficulty at model temperature 1. (b) Average latency per difficulty at all temperatures.
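The per-temperature figures quoted above can be combined into one average per difficulty. The sketch below takes an unweighted mean of the three reported values and assumes the third, unnamed level in each sentence is the hard difficulty; it may not match the thesis's own "all temperatures" panel if that panel covers other temperatures or unequal numbers of runs.

```python
from statistics import mean

# Latencies in seconds as quoted in the text, keyed by model temperature.
reported_latency_s = {
    0.0: {"easy": 16.0, "medium": 8.0, "hard": 124.0},
    0.5: {"easy": 20.0, "medium": 10.0, "hard": 82.0},
    1.0: {"easy": 10.5, "medium": 11.0, "hard": 27.0},
}

# Unweighted mean across temperatures for each difficulty level.
average_latency_s = {
    difficulty: mean(by_temp[difficulty] for by_temp in reported_latency_s.values())
    for difficulty in ("easy", "medium", "hard")
}

for difficulty, seconds in average_latency_s.items():
    print(f"{difficulty}: {seconds:.1f} s")
```

Even in this rough aggregate, latency dips slightly from easy to medium commands and then rises sharply for hard commands.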

Figure 4.2: Model output evaluation process.

Evaluation

Evaluating Intent Capture

This is to be expected: the longer the audio, the more difficult it is for the model to transcribe it correctly, because longer audio increases the chance of an incorrect transcription. Latency results show that the model takes 2 to 5 seconds to transcribe audio of short, medium, and long duration. Additionally, a typical voice command has a duration of 10 to 30 seconds, which the model has demonstrated it can reliably transcribe.

Evaluating Mission Generation

This is because the language model will choose more random and unlikely tokens when generating an answer. The latency results show that the model's latency decreases from commands of easy complexity to commands of medium complexity, and then increases sharply for commands of difficult complexity. Overall, the mission generation evaluation shows that the model can reliably generate drone missions from commands of easy complexity, but not from commands of increased complexity.
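The link between temperature and unlikely tokens can be illustrated with temperature-scaled softmax. This is a generic sketch, not the provider's implementation: the logits are invented, and treating a temperature of 0.0 as greedy selection of the top token is an assumption.

```python
import math


def token_probabilities(logits, temperature):
    """Probability of each candidate token at a given sampling temperature."""
    if temperature <= 0.0:
        # Assumed behavior: temperature 0 always picks the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [4.0, 2.0, 1.0]
for temperature in (0.0, 0.5, 1.0):
    probs = token_probabilities(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature rises, probability mass shifts from the top token to the less likely ones, so sampled output becomes more varied and, for a strict file format, more error-prone.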

Summary

  • User Input
  • Communication Management
  • Mission Generation
  • Drone Programmability

The first chapter discusses the technical feasibility of the system and some design and implementation choices. A solution to this would have been to use the GPS location of the mobile device. This has allowed the drone to be implemented into the system, allowing MAVLink commands to be uploaded to the drone.

System Performance

In addition to 4G connectivity, the ANAFI Ai is part of the Parrot SDK ecosystem, which allows developers to write software to control the drone. One problem with the drone's hardware is its short battery life (30 minutes), which requires frequent charging and limits the range. Another problem is that the drone is not equipped with night vision or thermal cameras, making it difficult to use at night or in low light.

System Viability

Potential

The decision-making process of the drone does not need to use a common language, but can be trained to generate drone missions specifically based on natural language. An example is if a drone has to fly large distances, and the calculation can be shifted based on its position. Drones can be operated with precise control and navigation from completely different locations, like a centralized disaster operations room.

Ethical Concerns

This can allow the drone to recommend routes, optimize routes, and coordinate with other drones to improve coverage and efficiency. Despite the system's potential, it is important to acknowledge that its realization is still a long way off. This can allow drones to access areas that they would not normally be able to reach.

Privacy & Security

Future Work

User Input

It could also display a counter showing the recording time, and it should let the user play back the recording to confirm that the audio was recorded as intended. The MAVLink screen should indicate whether the MAVLink file was successfully uploaded to the drone. Any errors thrown by the drone must be handled by Olympe; if the system cannot handle them, they must be forwarded to the user together with an indication of the steps to take to resolve the problem.

Drone Mission Generation

Another useful feature would be to let the user view the transcript on the recording screen and edit it if it is incorrect. Currently, the system does not give an indication of the connection, which means that the user does not know in advance whether the MAVLink upload will succeed.

Experiments

Summary

Research Problem: IBR systems for controlling autonomous drones using 5G technology can be first responders during PPDR missions.

“Integrating Network Slicing and Machine Learning into Edge Networks for Low Latency Services in 5G and Beyond Systems.” In: Applied Sciences.

“Implementation of Drone Technology for Farm Monitoring and Pesticide Spraying: An Overview.” In: Information Processing in Agriculture pp.

The 4G Network Architecture [18]

The Network Functions of the 5G Core [29]

Standalone and Non-Standalone Deployment Models, retrieved

The Parrot ANAFI Ai with the Skycontroller 4 [27]

The MAVLink file format

A MAVLink file

The Transformer Architecture [57]

The Whisper Architecture [46]

The System Implementation

The Mobile Application Tab Navigator

The Mobile Application Recording Screen

Before and after the mobile application has received coordinates

Before and after the system has processed the user's intent

Transcription latency of varying audio duration

Model output evaluation process

Average Latency per Command Difficulty of Model Tempera-

Average Latency per Command Difficulty of Model Tempera-

Server specifications

Mobile device specifications

Model specifications

Transcription correctness results from the audio transcription

Correctness results from varying degrees of temperatures


Figure 2.1: The 4G Network Architecture [18].
Figure 2.2: The Network Functions of the 5G Core [29].
Figure 2.3: Standalone and Non-Standalone Deployment Models, retrieved from STL Partners [40]
Figure 2.4: 5G Network Slicing [13].