
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI


In all the above definitions, comprehensibility emerges as the most essential concept in XAI. Both transparency and interpretability are strongly tied to this concept, since it relies on the capability of the audience to understand the knowledge contained in the model. A model can be explained, but the interpretability of the model is something that comes from the design of the model itself. The number of variables may be too high and/or the similarity measure too complex for the model to be fully simulated, yet the similarity measure and the set of variables can still be decomposed and analyzed separately.

The similarity measure cannot be decomposed and/or the number of variables is so high that the user must rely on mathematical and statistical tools to analyze the model. In rule-based systems, rules may become so complicated (and the ruleset so large) that mathematical tools are needed to inspect the model behavior. In General Additive Models, the variables and the interactions between them, as captured by the smooth functions involved in the model, must be constrained within human capacities for understanding.

Interactions may become too complex to be simulated, so decomposition techniques are needed to analyze the model. It is important to observe that predictions generated by KNN models rest on the notion of distance and similarity between examples, which can be tailored depending on the specific problem being addressed. Interestingly, this prediction approach resembles experience-based human decision making, which draws on the outcome of past, similar cases. Therein lies the reason why KNN has also been widely adopted in contexts where model interpretability is a requirement [186-189]. Furthermore, besides being simple to explain, the ability to inspect the reasons why a new sample has been classified into a group, and to examine the neighbors from which that prediction has been drawn, strengthens the interaction between the users and the model.
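
As an illustration of this example-based reasoning, the following sketch (using scikit-learn and the Iris dataset purely as a stand-in) shows how a KNN prediction can be reported together with the neighbors it rests upon:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Sketch: a KNN prediction explained by inspecting the neighbors it was based on.
data = load_iris()
knn = KNeighborsClassifier(n_neighbors=3).fit(data.data, data.target)

query = data.data[[50]]                     # one sample to be classified (illustrative choice)
pred = knn.predict(query)[0]
dist, idx = knn.kneighbors(query)           # the stored cases the decision rests upon
print(f"Predicted class: {data.target_names[pred]}")
for d, i in zip(dist[0], idx[0]):
    print(f"  neighbor #{i} (class {data.target_names[data.target[i]]}), distance {d:.2f}")
```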

Fig. 2. Diagram showing the different purposes of explainability in ML models sought by different audience profiles

Post-hoc explainability techniques for machine learning models: Taxonomy, shallow models and deep learning

Again, Bayesian models fall under the umbrella of transparent models: their formulation renders them simulatable, decomposable and algorithmically transparent. It is worth noting, however, that under certain circumstances (overly complicated or cumbersome variables) a model may relinquish the first two properties. Bayesian models have been used in domains such as cognitive modeling [201,202], fisheries [196,203], games [204], climate [205], econometrics [206] or robotics [207]. Furthermore, they have also been used to explain other models, such as tree ensembles [208]. Several trends emerge from our literature analysis. To begin with, rule extraction techniques prevail in model-agnostic contributions under the umbrella of post-hoc explainability. This might have been intuitively expected given the wide use of rule-based learning as an explanation vehicle, rather than as part of the complex model itself. Similarly, another large group of contributions deals with feature relevance. Lately, these techniques have received much attention from the community when dealing with DL models, with hybrid approaches that take advantage of particular aspects of this class of models and thereby compromise the model-agnostic nature of the method. Finally, as another noteworthy technique under post-hoc explainability, [240] proposes a framework that issues recommendations which, if adopted, would transform an example from one class to another, for instance indicating changes (such as a higher income) that would lead to a more favorable prediction.
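
To make the idea of model-agnostic rule extraction concrete, the sketch below trains a shallow decision tree to mimic an opaque model and prints the extracted rules; this is a generic surrogate-tree illustration, not any of the specific methods cited above, and the dataset and depth are arbitrary assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
black_box = GradientBoostingClassifier(random_state=0).fit(iris.data, iris.target)

# Fit the surrogate on the black box's predictions, not on the ground-truth labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(iris.data, black_box.predict(iris.data))

fidelity = (surrogate.predict(iris.data) == black_box.predict(iris.data)).mean()
print(f"Fidelity to the black box: {fidelity:.2f}")
print(export_text(surrogate, feature_names=iris.feature_names))  # human-readable rules
```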

Among the explanations by simplification, four classes of simplification can be distinguished, each differing from the others in how deeply it reaches into the inner structure of the algorithm. First, some authors propose techniques to build rule-based models around the trained SVM, for instance by deriving rules directly from the support vectors of the trained model using a modified sequential covering algorithm. In [57] the same authors propose an eclectic rule extraction approach, still considering only the support vectors of the trained model; arguing that long antecedents reduce comprehensibility, they adopt a fuzzy approach for more linguistically comprehensible results. A second class of simplifications is illustrated by [98], who add the separating hyperplane of the SVM, together with the support vectors, as components for rule creation, thereby capturing the interactions between the support vectors and the hyperplane. In a third model simplification approach, another group of authors considers adding the actual training data as a component for building the rules, deriving ellipsoids and hyper-rectangles in the input space. Similar to [106], the authors propose hyper-rectangle rule extraction, an algorithm based on support vector clustering (SVC) to find prototype vectors for each class and then define small hyper-rectangles around them. In [105] the authors present a novel technique as a component of a multi-kernel SVM; this multi-kernel method consists of feature selection, prediction modeling and prototype extraction. Some model simplification techniques have been proposed for neural networks with a single hidden layer, but few works address neural networks with multiple hidden layers. One such approach is presented in [259], which simplifies multi-layer neural networks by means of decision trees and rules. For instance, [56] presents a simple distillation method called Interpretable Mimic Learning to extract an interpretable model by means of Gradient Boosting trees. In the same direction, the authors in [135] propose a hierarchical partitioning of the feature space that reveals the iterative rejection of unlikely class labels until the final prediction is reached. Other works address the distillation of knowledge from an ensemble of models into a single model. Simplifying multi-layer neural networks becomes harder as the number of layers increases, which explains why post-hoc methods that do not require simplifying the whole model have become very popular.
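
The distillation recipe behind Interpretable Mimic Learning can be sketched as follows; the dataset, model sizes and the 0.5 threshold are assumptions chosen for illustration, and the code only conveys the general idea of fitting Gradient Boosting trees to the soft predictions of a neural network.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
teacher = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0).fit(X, y)
soft_targets = teacher.predict_proba(X)[:, 1]          # the knowledge to be distilled

# The interpretable student mimics the teacher's soft outputs rather than the raw labels.
student = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
student.fit(X, soft_targets)

fidelity = np.mean((student.predict(X) > 0.5) == (soft_targets > 0.5))
print(f"Fidelity to the teacher: {fidelity:.2f}")
print("Most relevant features:", np.argsort(student.feature_importances_)[::-1][:5])
```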

These methods consider each neuron as an object that can be decomposed and expanded, and then aggregate these decompositions and propagate them back through the network, resulting in a deep Taylor decomposition. To increase the interpretability of classical CNNs, the authors in [113] use a loss for each filter in high-level convolutional layers to force each filter to learn very specific object components; the activation patterns obtained are much more interpretable due to their exclusivity with respect to the different labels to be predicted. Other works attribute relevance to each pixel of the input image in the form of a heat map, using LRP (Layer-wise Relevance Propagation), which relies on a Taylor series expansion close to the prediction point rather than on partial derivatives at the prediction point itself. To further improve the quality of the visualization, attribution methods such as heat maps, saliency maps or class activation methods (Grad-CAM [292]) are used (see Fig. 7). In particular, the authors of [292] proposed Gradient-weighted Class Activation Mapping (Grad-CAM), which uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map that highlights the important regions of the image for predicting that concept. While many of the techniques explored above use local explanations to obtain a global explanation of a CNN model, others explicitly focus on building global explanations from locally found prototypes, without changing the way these low-level representations are captured. All in all, visualization combined with feature relevance methods is perhaps the most widely adopted approach to explainability in CNNs.
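
The Grad-CAM recipe summarized above can be sketched in a few lines; here torchvision's resnet18 with random weights and a random tensor stand in for a real network and a preprocessed image, so the heat map itself is meaningless and only the mechanics are shown.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()                  # stand-in network (untrained)
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))  # keep last conv feature maps

x = torch.randn(1, 3, 224, 224)                        # stand-in for a preprocessed image
scores = model(x)
target = scores[0, scores.argmax()]                    # score of the predicted class

# Gradients of the target score w.r.t. the feature maps, pooled into per-channel weights.
grads = torch.autograd.grad(target, feats["a"])[0]
weights = grads.mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))   # weighted combination + ReLU
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
heatmap = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)    # normalized localization map
print(heatmap.shape)
```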

Instead of using a single interpretation technique, the framework proposed in [295] combines several methods to provide much richer information about the network. For example, combining feature visualization (what is the neuron looking for?) with attribution (how does it affect the output?) allows investigating how the network decides between labels. The resulting interpretability interface displays different blocks, such as feature visualization and attribution, depending on the visualization goal. This interface can be thought of as a composition of individual elements along several axes: layers (input, hidden, output), atoms (a neuron, a channel, a spatial position or a group of neurons), content (activations, i.e. the amount of neurons fired, and attribution, i.e. which classes a spatial position contributes to the most, which tends to become more meaningful in later layers) and presentation (information visualization, feature visualization). Figure 8 shows some examples. In the first group, the authors of [280] extend the use of LRP to RNNs. They propose a specific propagation rule that works with multiplicative connections as found in LSTMs (Long Short-Term Memory) and GRUs (Gated Recurrent Units). The authors of [281] propose a visualization technique based on finite-horizon n-grams that discriminates interpretable cells within LSTM and GRU networks. Under the assumption that the architecture is not modified, [296] extends the interpretable mimic distillation method used for CNN models to LSTM networks, extracting interpretable features by fitting a Gradient Boosting Tree to a trained LSTM network. Beyond approaches that do not change the inner workings of RNNs, [285] presents the RETAIN (REverse Time AttentIoN) model, which detects influential past patterns using a two-level neural attention model. To create an interpretable RNN, the authors of [283] propose an RNN based on SISTA (Sequential Iterative Soft-Thresholding Algorithm) that models a sequence of correlated observations with a sequence of sparse vectors, which makes its weights interpretable as the parameters of a principled statistical model. Finally, [284] constructs a combination of an HMM (Hidden Markov Model) and an RNN, so that the whole approach takes advantage of the interpretability of the HMM and the accuracy of the RNN.
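
For RNNs, a much simpler gradient-times-input attribution (not the LRP rule of [280]) already illustrates how per-time-step relevance can be obtained; the toy LSTM classifier, its sizes and the random token sequence below are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, hidden=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, tokens):
        emb = self.embed(tokens)               # (batch, time, emb_dim)
        out, _ = self.lstm(emb)
        return self.head(out[:, -1]), emb      # logits, plus embeddings kept for attribution

model = LSTMClassifier()
tokens = torch.randint(0, 1000, (1, 12))       # one random sequence of 12 token ids
logits, emb = model(tokens)
target = logits[0, logits.argmax()]            # score of the predicted class

grad = torch.autograd.grad(target, emb)[0]
relevance = (grad * emb).sum(dim=-1).squeeze(0)  # one relevance score per time step
print(relevance)
```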

A different perspective on hybrid XAI models consists of enriching the knowledge of black-box models with that of transparent ones, as proposed in [24] and further refined in [169] and [307], in particular by jointly encompassing white-box and black-box models [307]. Another hybrid approach consists of mapping a non-interpretable black-box system to a more interpretable white-box twin. For example, an opaque neural network can be combined with a transparent Case Based Reasoning (CBR) system [316,317] while maintaining the same accuracy. The explanation by example consists of analyzing the feature weights of the DNN, which are then used in the CBR system to retrieve the nearest-neighbor cases that explain the DNN's prediction (Fig. 10). Some methods [82,144] are classified into a single category (explanation by simplification of multi-layer neural networks) in Fig. 6, while they actually span two different categories (among them, explanation of deep network processing with decision trees), as shown in Fig. 11.
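
A minimal sketch of this twin-system idea follows, under the assumption that the network's first hidden-layer representation is used as the similarity space for case retrieval; the cited works may differ in how the feature weights are actually extracted and used.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import NearestNeighbors

X, y = load_breast_cancer(return_X_y=True)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def hidden(X_):
    # First hidden layer of the trained MLP (default ReLU activation) as a learned feature space.
    return np.maximum(X_ @ mlp.coefs_[0] + mlp.intercepts_[0], 0.0)

case_base = NearestNeighbors(n_neighbors=3).fit(hidden(X))   # transparent CBR "twin"
query = X[:1]
pred = mlp.predict(query)[0]
_, idx = case_base.kneighbors(hidden(query))                 # retrieve similar past cases
print(f"Predicted class {pred}; most similar training cases: {idx[0]} with labels {y[idx[0]]}")
```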

Fig. 6. Taxonomy of the reviewed literature and trends identified for explainability techniques related to different ML models

XAI: Opportunities, challenges and future research needs

In this race for performance, when performance comes hand in hand with complexity, interpretability finds itself on a downward slope that has so far seemed inevitable. However, the emergence of more sophisticated explanation methods could reverse, or at least flatten, that slope, as represented in Fig. 12, where XAI demonstrates its potential to improve the trade-off between model interpretability and performance. At the same time, the information revealed by XAI techniques can be used both to craft more effective attacks in adversarial contexts aimed at confusing the model, and to develop techniques that better protect against the exposure of private information. For example, with respect to a supervised ML classification model, adversarial attacks seek the minimum changes that should be applied to the input data to cause a different classification: perturbations barely perceptible to the human eye can lead guided vehicles to detect a 45 mph speed-limit signal instead of the actual traffic sign [359]. For the special case of DL models, available tools such as CleverHans [360] seek to detect adversarial vulnerabilities and provide different approaches to harden the model against them. Other examples include AlfaSVMLib [361] for SVM models and AdversariaLib [362] for evasion attacks. While XAI techniques can be used to mount more efficient adversarial attacks or to reveal confidential aspects of the model itself, some recent contributions have capitalized on the capabilities of Generative Adversarial Networks (GANs [364]) and other generative models such as variational autoencoders: once trained, generative models can produce instances of what they have learned from a noise input vector, which can be interpreted as a latent representation of the data at hand. This has been adopted by several recent studies [366,367], mainly as an attribution method to relate a particular output of a Deep Learning model to its input variables. Another interesting research direction is the use of generative models to create counterfactuals, i.e. modifications to the input data that would ultimately change the original prediction of the model [368], so that the audience can contrast the model's behavior under such changes, for his/her improved confidence and informed criticism. In light of this recent trend, we firmly believe that generative ML models have a part to play in scenarios that require understandable machine decisions.
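
The notion of a minimal input change that flips a classification, mentioned above, can be illustrated with a basic FGSM-style perturbation on a toy model; this is a generic sketch rather than the attack of [359], the model and input are random stand-ins, and the step size may need tuning for the class to actually flip.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))  # stand-in classifier
x = torch.randn(1, 4, requires_grad=True)
orig_class = model(x).argmax(dim=1)

# Increase the loss w.r.t. the currently predicted class, then step in that direction.
loss = nn.CrossEntropyLoss()(model(x), orig_class)
loss.backward()
epsilon = 0.5
x_adv = x + epsilon * x.grad.sign()

print("original:", orig_class.item(), "perturbed:", model(x_adv).argmax(dim=1).item())
print("max per-feature change:", (x_adv - x).abs().max().item())
```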

In parallel, research has been conducted towards minimizing both the risk and the uncertainty of harms derived from decisions made on the output of an ML model. As a result, many techniques have been reported to reduce such risk, among which we pause at the evaluation of the model's output confidence as a basis to decide upon it. In this case, inspecting the share of epistemic uncertainty (namely, the uncertainty due to lack of knowledge) of the input data and its correspondence with the model's output confidence can inform the user and eventually trigger his/her rejection of the model's output [370,371]. To this end, explaining via XAI techniques which region of the input data the model focuses on when producing a given output can discriminate possible sources of epistemic uncertainty within the input domain.

Likewise, it has been noted that reproducibility is subject not only to the mere sharing of data, models and results with the community, but also to the availability of information about the data processing involved [372]. In other words, in order to transform data into a valuable, actionable asset, the adoption of XAI techniques is encouraged due to their powerful ability to describe black-box models in an understandable, hence conveyable, fashion towards colleagues from the social sciences, politics, the humanities and related fields. Many examples of the implementation of this approach are currently available with promising results. The studies in [375-382] were carried out in diverse fields, showcasing the potential of this new paradigm for data science. Above all, it is relevant to notice the resemblance that all concepts and requirements of Theory-guided Data Science share with XAI. All the additions presented in [374] push toward techniques that would eventually render a model explainable and, furthermore, knowledge-consistent. The concept of knowledge from the beginning, central to Theory-guided Data Science, must also consider how the knowledge captured by a model should be explained in order to assess its compliance with theoretical principles known beforehand. This, again, opens a magnificent window of opportunity for XAI.
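
Returning to the uncertainty-based rejection idea discussed above, a rough sketch is given below; it uses the disagreement across the members of a random forest as a crude proxy for epistemic uncertainty, which is an assumption of ours rather than the estimators of [370,371], and the threshold is arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def predict_or_reject(x, threshold=0.2):
    # Spread of per-tree probabilities as a rough proxy for epistemic uncertainty.
    per_tree = np.stack([t.predict_proba(x)[:, 1] for t in forest.estimators_])
    spread = per_tree.std(axis=0)
    labels = forest.predict(x)
    return [label if s < threshold else "REJECT" for label, s in zip(labels, spread)]

print(predict_or_reject(X[:5]))
```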

When designing an approach to interpretability, contextual factors, potential impacts and domain-specific needs must be taken into account: these include a thorough understanding of the purpose for which the AI model is built, the depth of the explanations the audience needs, and the performance and interpretability of existing technologies, models and methods, which provide a reference point for the AI system that could be deployed instead. Where possible, interpretable techniques should be given preference: when considering explainability in the development of an AI system, deciding which XAI approach to choose should take into account domain-specific risks and needs, available data sources and existing domain knowledge, and the suitability of the ML model for the computational task to be addressed. In practice, the aforementioned aspects (contextual factors, impacts and domain-specific needs) may make transparent models preferable to complex modeling alternatives whose interpretability requires the application of post-hoc XAI techniques; the choice should favor the option that best matches the characteristics of the problem.

Fig. 12. Trade-off between model interpretability and performance, and a representation of the area of improvement where the potential of XAI techniques and tools resides.

Toward responsible AI: Principles of artificial intelligence, fairness, privacy and data fusion

We anticipate that all the guidelines proposed in [383] and summarized above will be further complemented and enriched by future methodological studies, ultimately leading toward a more responsible use of AI. Such methodological principles should ensure that explainability is pursued alongside other universal aspects of equal importance, such as non-discrimination, sustainability, privacy or accountability. Challenges remain in exploiting the potential of XAI to realize Responsible AI, as discussed in the next section. Closely related to the concept of fairness, much attention has recently been paid to the concept of data diversity, which essentially refers to the ability of an algorithmic model to ensure that all the different types of objects are represented in its output [403]. By scoring the output of the model in this regard, one can determine the tendency of the model to produce heterogeneous results rather than merely highly accurate predictions. Diversity comes into play in human-centered applications with ethical constraints that pervade the modeling phase [404]. Trade-offs: should any tension arise from the application of the above requirements, trade-offs should only be considered if they are ethically acceptable. Such trade-offs must be reasoned, explicitly acknowledged and documented, the decision maker must be accountable for making the appropriate trade-off, and the trade-off decision must be continually reviewed to ensure the appropriateness of the decision.
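
As a toy illustration of the diversity scoring mentioned above (an assumption on our part, not the metric of [403]), one could normalize the entropy of the predicted class distribution so that 1.0 denotes a perfectly heterogeneous output:

```python
import numpy as np

def output_diversity(predictions):
    # Normalized Shannon entropy of the predicted classes: 0.0 = homogeneous, 1.0 = fully diverse.
    _, counts = np.unique(predictions, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(len(p))) if len(p) > 1 else 0.0

print(output_diversity([0, 0, 0, 1, 2, 2]))  # moderately diverse output
print(output_diversity([1, 1, 1, 1]))        # no diversity at all
```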

Through the fusion of heterogeneous information, data fusion has been proven to improve the performance of ML models in many applications, such as industrial prognosis [348], cyber-physical social systems [407] or the Internet of Things [408], among others. To frame this discussion, we first give a brief overview of the different data fusion paradigms, and later analyze them from the perspective of data privacy. As we argue below, although data privacy is relevant in the context of Responsible AI, the interplay between XAI and data fusion remains an uncharted research area in the current research mainstream. We depart from the different levels of data fusion identified in comprehensive surveys on the theme [409-412]. In the context of this subsection, we distinguish between fusion at the data level, fusion at the model level and fusion at the knowledge level, as well as between centralized and distributed methods for data fusion. In a centralized approach, nodes deliver their locally captured data to a centralized processing system that aggregates them. In contrast, in a distributed approach, each node fuses its locally captured information and ultimately shares the result of the local fusion with its peers. Other data fusion approaches exist besides those presented in Fig. 13. For example, data-level fusion can be performed either by a technique specifically dedicated to it (as depicted in Fig. 13.b) or, instead, alongside the learning process of an ML or DL model, or by combining the decisions of different models (as done in tree ensembles).
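
The difference between fusing at the data level and fusing at the decision level can be sketched as follows; the two synthetic "sources" below are stand-ins for real heterogeneous data, and logistic regression is an arbitrary choice of learner.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_a, y = make_classification(n_samples=300, n_features=5, random_state=1)
X_b = X_a + np.random.default_rng(1).normal(scale=0.5, size=X_a.shape)  # noisy second "sensor"

# (a) Data-level fusion: concatenate features from both sources before learning.
fused = np.hstack([X_a, X_b])
early = LogisticRegression(max_iter=1000).fit(fused, y)

# (b) Decision-level fusion: train one model per source and average their outputs.
m_a = LogisticRegression(max_iter=1000).fit(X_a, y)
m_b = LogisticRegression(max_iter=1000).fit(X_b, y)
late = (m_a.predict_proba(X_a) + m_b.predict_proba(X_b)) / 2

print("data-level accuracy:", early.score(fused, y))
print("decision-level accuracy:", (late.argmax(axis=1) == y).mean())
```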

In Big Data fusion (Fig. 13.d), local models are learned from partitions of the original data sources, each of which is submitted to a worker node responsible for carrying out this learning process (Map task). The aim is to spread the complexity of learning the ML model over a pool of worker nodes, in which the strategy defining how information/models are fused between the Map and the Reduce tasks determines the quality of the finally generated output [413]. By contrast, in Federated Learning [414-416], the computation of ML models is performed on data captured locally by remote client devices (Fig. 13.e). After local model training, clients send encrypted information about the knowledge they have learned to a central server, in a form that may depend on the ML model adopted. The central server aggregates (fuses) the knowledge contributions received from all clients to produce an updated model built from the collected information of the pool of clients. It is important to note that no client data is delivered to the central server, which underlines the privacy-preserving nature of Federated Learning. Notwithstanding this express concern of regulatory bodies, privacy has been shown to be compromised by DL methods even in scenarios where no data fusion is performed. For example, a few images are enough to threaten users' privacy even in the presence of image obfuscation [420], and the model parameters of a DNN can themselves expose private information. One such study quantifies the loss of privacy by means of subjective privacy-loss and intent-loss scores: the former provides a subjective measure of the severity of the privacy violation depending on the role of the face in the image, while the latter captures the intent of the bystanders to appear in the image. With respect to privacy, we strongly advocate for more effort to be invested in this direction, namely, ensuring that XAI methods do not pose a threat to the privacy of the data used for training the ML model under scrutiny.
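
The federated aggregation loop described above can be sketched for a toy linear model as follows; real deployments add encryption, client sampling and far more elaborate models, so this is only a minimal FedAvg-style illustration with synthetic client data.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for _ in range(4):                             # four clients, each holding private local data
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

global_w = np.zeros(3)
for _ in range(20):                            # communication rounds
    updates = []
    for X, y in clients:                       # each client trains locally on its own data
        w = global_w.copy()
        for _ in range(5):                     # a few local gradient steps
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= 0.05 * grad
        updates.append(w)                      # only model parameters leave the client
    global_w = np.mean(updates, axis=0)        # the server aggregates (fuses) client knowledge

print(global_w)                                # approaches true_w without sharing raw data
```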

When data fusion is brought into the picture, several complications arise in connection with the notion of explainability covered in this survey. First, classical techniques for data-level fusion operate only on the data and are not tied to the ML model, so they preserve the separation between information fusion and predictive modeling. In DL architectures, by contrast, the first layers are tasked with learning high-level features from the raw data that are relevant to the task being solved. In this context, many techniques in the field of XAI have been proposed to deal with correlation analysis between features. Ultimately, the information gained about the fusion can not only improve the usability of the models by improving the users' understanding of them, but can also help identify other data sources of potential interest that could be included in the model. Unfortunately, the data-level fusion concept mentioned above envisions data under certain constraints of known form and provenance. A further concern in this context is that an XAI technique could be expressive enough to compromise the confidentiality of private data; this may eventually happen if sensitive information (e.g. ownership) can be extracted from what is disclosed about data otherwise protected by the fusion.

If fusion is performed at the knowledge level, a similar reasoning applies: XAI comprises techniques that extract knowledge from ML model(s). This ability to explain models could serve the need to discover new knowledge from the complex interactions formed within ML models. If so, XAI can enrich knowledge fusion paradigms by providing them with new knowledge extractors relevant to the task. For this purpose, it is of utmost importance that the knowledge extracted from a model using XAI techniques can be understood and extrapolated to the domain in which the knowledge extractors operate. This concept aligns naturally with that of transfer learning as described in [425].

Fig. 13. Diagrams showing different levels at which data fusion can be performed: (a) data level; (b) model level; (c) knowledge level; (d) Big Data fusion; (e) Federated Learning and (f) Multiview Learning.

Conclusions and outlook

References

Bottou, Discovering causal signals in images, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, p.
Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, in: KDD, p.
Rao, Rule extraction from linear support vector machines, in: ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ACM, 2005, p.

Zhang, Improving interpretability of deep neural networks with semantic information, in: IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp.
Behnke, Interpretable and fine-grained visual explanations for convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp.
Hooker, Discovering additive structure in black box functions, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2004, pp.

Liang, Understanding black box predictions via influence functions, in: Proceedings of the 34th International Conference on Machine Learning, Volume 70, JMLR.
Lalmas, Interpretable tree-based ensemble predictions via actionable feature tweaking, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2017, p.
Proceedings of the Eleventh International ACM SIGKDD Conference on Knowledge Discovery in Data Mining, ACM, 2005, p.

Janssen, Deep rule extraction from deep neural networks, in: International Conference on Discovery Science, Springer, 2016, p.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, p.
Tang, Residual attention network for image classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp.

Niculescu-Mizil, Model compression, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2006, pp.
Torr, Conditional random fields as recurrent neural networks, in: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp.
Konukoglu, Visual feature attribution using Wasserstein GANs, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp.

