Deep Learning (DL) is becoming a popular paradigm in a broad category of decision systems that are crucial to the well-being of our society. Self-driving vehicles, online dating, social network content recommendation, and chest X-ray screening are all examples that show how the quality of our lives is tied to the decisions of these systems. We must take into account that malicious actors may game these systems into making favorable decisions for unqualified instances. For instance, a self-driving car's traffic-sign detection model may classify a stop sign as a speed-limit sign whenever the pattern that triggers the faulty behavior is present.
Our initial investigation results show that, given the ability to generate or access a rich, high-quality dataset of random images, we may be able to build meta-models that distinguish poisoned models from clean ones with acceptable performance.