As stated in [@lundbergUnifiedApproachInterpreting] and explained in the Fidelity-Interpretability trade-off section, deep learning models are opaque and therefore not interpretable by default, unlike intrinsically interpretable models such as linear regression or decision trees.
Because of this, and since “the best explanation of a simple model is the model itself” [@lundbergUnifiedApproachInterpreting], a good way of explaining an opaque model is to build a simpler explanation model: an interpretable approximation of the original model.
This approach is also known as local surrogate modeling, and it is explored in papers such as LIME [@ribeiroWhyShouldTrust2016] and LORE [@guidottiLocalRuleBasedExplanations2018].
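The idea behind a local surrogate can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the method of any cited paper): it treats a nonlinear function as a stand-in for an opaque model, samples perturbations around an instance of interest, weights them by proximity, and fits a weighted linear model, whose slope is then an interpretable local explanation. All names (`black_box`, `local_surrogate`, the kernel width) are assumptions made for the example.

```python
import math
import random

def black_box(x):
    # Opaque model to be explained; here a simple nonlinear function
    # stands in for e.g. a neural network.
    return x ** 2

def local_surrogate(f, x0, n_samples=500, width=0.5, seed=0):
    """Fit a weighted linear model y ~ a + b*x around x0 (a local-surrogate sketch)."""
    rng = random.Random(seed)
    # Perturb the instance of interest and query the opaque model.
    xs = [x0 + rng.gauss(0.0, width) for _ in range(n_samples)]
    ys = [f(x) for x in xs]
    # Proximity kernel: samples closer to x0 count more in the fit.
    ws = [math.exp(-((x - x0) ** 2) / (2 * width ** 2)) for x in xs]
    # Closed-form weighted least squares for the line a + b*x.
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    b = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    a = my - b * mx
    return a, b

a, b = local_surrogate(black_box, x0=2.0)
# The slope b approximates the local behaviour of black_box near x0
# (the true gradient of x**2 at x0 = 2 is 4).
print(f"local surrogate: y ≈ {a:.2f} + {b:.2f}·x")
```

The surrogate is faithful only near `x0`; globally it would be a poor approximation of the quadratic, which is exactly the locality trade-off these methods accept in exchange for interpretability.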