Hello everyone! This past week I spent my time building the setup for my project, which involved creating and training two different networks. The first is an autoencoder trained on the MNIST dataset (a large dataset of handwritten digits). An autoencoder is a network that takes an image as input and then has to recreate it as its output. The catch is that information loss is introduced by a bottleneck in the network architecture. For my autoencoder, I settled on reducing the 784 input values (one for each pixel in the image) down to 14 numbers (this value is called the latent dimension). The output of the network can be seen below:
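To make the bottleneck idea concrete, here is a minimal sketch of an autoencoder forward pass in numpy. The weights here are random placeholders (a real network would learn them during training), and the single-layer encoder/decoder is my simplification; the point is just to show how 784 pixel values get squeezed through a 14-value latent vector and back.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder weights for a single-layer encoder/decoder pair;
# a trained autoencoder would learn these, here they are random.
W_enc = rng.normal(0, 0.01, size=(784, 14))   # 784 pixels -> 14 latent values
W_dec = rng.normal(0, 0.01, size=(14, 784))   # 14 latent values -> 784 pixels

def encode(x):
    # Compress the flattened 28x28 image down to the latent dimension.
    return np.tanh(x @ W_enc)

def decode(z):
    # Reconstruct the image from the latent vector; the sigmoid keeps
    # pixel values in [0, 1] like normalized MNIST inputs.
    return 1 / (1 + np.exp(-(z @ W_dec)))

image = rng.random((1, 784))      # stand-in for a flattened MNIST digit
latent = encode(image)            # shape (1, 14) -- the bottleneck
reconstruction = decode(latent)   # shape (1, 784), same size as the input
```

Training would then minimize the reconstruction error (e.g. mean squared error) between `image` and `reconstruction`.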
I also tried experimenting with reducing the amount of information further:
Latent Dimension 7:
Latent Dimension 3:
Latent Dimension 1:
Overall, I found that a latent dimension of 14 was just enough to represent the digits accurately without significant blur, while still discarding most of the input information.
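One way to see how aggressive each of these settings is: every latent vector replaces all 784 pixel values, so the compression factor is simply 784 divided by the latent dimension.

```python
# Each MNIST image has 28 * 28 = 784 pixel values; a latent vector of
# size d replaces them with d numbers, for a compression factor of 784/d.
for d in (14, 7, 3, 1):
    print(f"latent dim {d:2d}: {784 / d:.0f}x compression")
```

So even the "safe" choice of 14 already compresses each image by a factor of 56.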
The other network I created was a simpler MNIST classifier. This is the model that the adversarial attacks and defenses will be tested on. Through experimentation, I found an architecture that achieved 99.25% accuracy, which is on par with industry standards. I created some visuals of the network's confidence scores next to the input image (the highest score is the predicted digit). Note that the scores are on a logarithmic scale.
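For anyone curious how those confidence scores come about: a classifier like this typically ends in a softmax layer, which turns the network's raw outputs (logits) into probabilities that sum to 1. The sketch below uses made-up logits for one image; the log scale in my visuals exists because the non-predicted classes usually get confidences many orders of magnitude smaller than the winner.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating.
    shifted = logits - logits.max()
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical raw outputs of the classifier for one image,
# one logit per digit class 0-9 (these values are made up).
logits = np.array([1.2, -0.5, 0.3, 8.9, 0.1, -1.0, 0.4, 2.1, 0.0, -0.3])

confidences = softmax(logits)             # probabilities, sum to 1
prediction = int(np.argmax(confidences))  # digit with the highest confidence

# Plotting np.log(confidences) (or using a log axis) spreads out the tiny
# scores so the non-predicted classes remain visible in the chart.
```

The softmax exaggerates differences between logits, which is exactly why a linear-scale bar chart would show one bar and nine invisible ones.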
Thank you for reading!