{ "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" }, "name": "", "signature": "sha256:f03b615a793c73761f110550c43e43c6ed2df6f8b5f102c5d553454a252ee84f" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MLP for Fashion MNIST in PyTorch\n", "We will extend our previous MLP from scratch example by re-implementing the same content in PyTorch. This may seem like a tour-de-force, but will show just exactly how much of the complicated underlying implementation is abstracted away from the user in modern Deep Learning frameworks." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Dataset class extended to use directly in PyTorch\n", "We can basically take our given dataset loader and use it almost as is.\n", "There is one modification that we absolutely have to make which is converting the numpy arrays to torch tensors.\n", "The function \"torch.from_numpy()\" can be used for this purpose. \n", "\n", "Two additional features we can add is the use of PyTorch dataset and dataloader structures that are very convenient to use and highly efficient. \n", "These are called \"torch.utils.data.TensorDataset\" and \"torch.utils.data.DataLoader\" and allow for the use of a multi-threaded dataset loader. " ] }, { "cell_type": "code", "collapsed": true, "input": [ "import torch\n", "import torch.utils.data\n", "import torchvision.datasets as datasets\n", "import os\n", "import struct\n", "import gzip\n", "import errno\n", "import numpy as np\n", "\n", "class FashionMNIST:\n", " \"\"\"\n", " Fashion MNIST dataset featuring gray-scale 28x28 images of\n", " fashion items belonging to ten different classes.\n", " Dataloader adapted from MNIST.\n", " We do not define __getitem__ and __len__ in this class\n", " as we are using torch.utils.data.TensorDataSet which\n", " already implements these methods.\n", "\n", " Parameters:\n", " args (dict): Dictionary of (command line) arguments.\n", " Needs to contain batch_size (int) and workers(int).\n", " is_gpu (bool): True if CUDA is enabled.\n", " Sets value of pin_memory in DataLoader.\n", "\n", " Attributes:\n", " trainset (torch.utils.data.TensorDataset): Training set wrapper.\n", " valset (torch.utils.data.TensorDataset): Validation set wrapper.\n", " train_loader (torch.utils.data.DataLoader): Training set loader with shuffling.\n", " val_loader (torch.utils.data.DataLoader): Validation set loader.\n", " \"\"\"\n", "\n", " def __init__(self, is_gpu, batch_size, workers):\n", " self.path = os.path.expanduser('datasets/FashionMNIST')\n", " self.__download()\n", "\n", " self.trainset, self.valset = self.get_dataset()\n", "\n", " self.train_loader, self.val_loader = self.get_dataset_loader(batch_size, workers, is_gpu)\n", "\n", " self.val_loader.dataset.class_to_idx = {'T-shirt/top': 0,\n", " 'Trouser': 1,\n", " 'Pullover': 2,\n", " 'Dress': 3,\n", " 'Coat': 4,\n", " 'Sandal': 5,\n", " 'Shirt': 6,\n", " 'Sneaker': 7,\n", " 'Bag': 8,\n", " 'Ankle boot': 9}\n", "\n", " def __check_exists(self):\n", " \"\"\"\n", " Checks if dataset has already been downloaded\n", "\n", " Returns:\n", " bool: True if downloaded dataset has been found\n", " \"\"\"\n", "\n", " return 
{ "cell_type": "code", "collapsed": true, "input": [ "import torch\n", "import torch.utils.data\n", "import os\n", "import struct\n", "import gzip\n", "import errno\n", "import numpy as np\n", "\n", "class FashionMNIST:\n", "    \"\"\"\n", "    Fashion MNIST dataset featuring gray-scale 28x28 images of\n", "    fashion items belonging to ten different classes.\n", "    Dataloader adapted from MNIST.\n", "    We do not define __getitem__ and __len__ in this class\n", "    as we are using torch.utils.data.TensorDataset which\n", "    already implements these methods.\n", "\n", "    Parameters:\n", "        is_gpu (bool): True if CUDA is enabled.\n", "            Sets value of pin_memory in DataLoader.\n", "        batch_size (int): Batch size for both data loaders.\n", "        workers (int): Number of parallel worker processes used by the data loaders.\n", "\n", "    Attributes:\n", "        trainset (torch.utils.data.TensorDataset): Training set wrapper.\n", "        valset (torch.utils.data.TensorDataset): Validation set wrapper.\n", "        train_loader (torch.utils.data.DataLoader): Training set loader with shuffling.\n", "        val_loader (torch.utils.data.DataLoader): Validation set loader.\n", "    \"\"\"\n", "\n", "    def __init__(self, is_gpu, batch_size, workers):\n", "        self.path = os.path.expanduser('datasets/FashionMNIST')\n", "        self.__download()\n", "\n", "        self.trainset, self.valset = self.get_dataset()\n", "\n", "        self.train_loader, self.val_loader = self.get_dataset_loader(batch_size, workers, is_gpu)\n", "\n", "        self.val_loader.dataset.class_to_idx = {'T-shirt/top': 0,\n", "                                                'Trouser': 1,\n", "                                                'Pullover': 2,\n", "                                                'Dress': 3,\n", "                                                'Coat': 4,\n", "                                                'Sandal': 5,\n", "                                                'Shirt': 6,\n", "                                                'Sneaker': 7,\n", "                                                'Bag': 8,\n", "                                                'Ankle boot': 9}\n", "\n", "    def __check_exists(self):\n", "        \"\"\"\n", "        Checks if dataset has already been downloaded\n", "\n", "        Returns:\n", "            bool: True if downloaded dataset has been found\n", "        \"\"\"\n", "\n", "        return os.path.exists(os.path.join(self.path, 'train-images-idx3-ubyte.gz')) and \\\n", "            os.path.exists(os.path.join(self.path, 'train-labels-idx1-ubyte.gz')) and \\\n", "            os.path.exists(os.path.join(self.path, 't10k-images-idx3-ubyte.gz')) and \\\n", "            os.path.exists(os.path.join(self.path, 't10k-labels-idx1-ubyte.gz'))\n", "\n", "    def __download(self):\n", "        \"\"\"\n", "        Downloads the Fashion-MNIST dataset from the web if dataset\n", "        hasn't already been downloaded.\n", "        \"\"\"\n", "\n", "        from six.moves import urllib\n", "\n", "        if self.__check_exists():\n", "            return\n", "\n", "        print(\"Downloading FashionMNIST dataset\")\n", "        urls = [\n", "            'https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/train-images-idx3-ubyte.gz',\n", "            'https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/train-labels-idx1-ubyte.gz',\n", "            'https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/t10k-images-idx3-ubyte.gz',\n", "            'https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/t10k-labels-idx1-ubyte.gz',\n", "        ]\n", "\n", "        # create the download directory if it does not exist yet\n", "        try:\n", "            os.makedirs(self.path)\n", "        except OSError as e:\n", "            if e.errno == errno.EEXIST:\n", "                pass\n", "            else:\n", "                raise\n", "\n", "        for url in urls:\n", "            print('Downloading ' + url)\n", "            data = urllib.request.urlopen(url)\n", "            filename = url.rpartition('/')[2]\n", "            file_path = os.path.join(self.path, filename)\n", "            with open(file_path, 'wb') as f:\n", "                f.write(data.read())\n", "\n", "        print('Done!')\n", "\n", "    def __get_fashion_mnist(self, path, kind='train'):\n", "        \"\"\"\n", "        Load Fashion-MNIST data\n", "\n", "        Parameters:\n", "            path (str): Base directory path containing .gz files for\n", "                the Fashion-MNIST dataset\n", "            kind (str): Accepted types are 'train' and 't10k' for\n", "                training and validation set stored in .gz files\n", "\n", "        Returns:\n", "            numpy.array: images, labels\n", "        \"\"\"\n", "\n", "        labels_path = os.path.join(path,\n", "                                   '%s-labels-idx1-ubyte.gz'\n", "                                   % kind)\n", "        images_path = os.path.join(path,\n", "                                   '%s-images-idx3-ubyte.gz'\n", "                                   % kind)\n", "\n", "        with gzip.open(labels_path, 'rb') as lbpath:\n", "            struct.unpack('>II', lbpath.read(8))\n", "            labels = np.frombuffer(lbpath.read(), dtype=np.uint8)\n", "\n", "        with gzip.open(images_path, 'rb') as imgpath:\n", "            struct.unpack(\">IIII\", imgpath.read(16))\n", "            images = np.frombuffer(imgpath.read(), dtype=np.uint8).reshape(len(labels), 784)\n", "\n", "        return images, labels\n", "\n", "    def get_dataset(self):\n", "        \"\"\"\n", "        Loads and wraps training and validation datasets\n", "\n", "        Returns:\n", "            torch.utils.data.TensorDataset: trainset, valset\n", "        \"\"\"\n", "\n", "        x_train, y_train = self.__get_fashion_mnist(self.path, kind='train')\n", "        x_val, y_val = self.__get_fashion_mnist(self.path, kind='t10k')\n", "\n", "        # convert to torch tensors in range [0, 1]; class labels become\n", "        # LongTensors as expected by PyTorch's classification losses\n", "        x_train = torch.from_numpy(x_train).float() / 255\n", "        y_train = torch.from_numpy(y_train).long()\n", "        x_val = torch.from_numpy(x_val).float() / 255\n", "        y_val = torch.from_numpy(y_val).long()\n", "\n", "        # reshape the flattened images to standard (N, C, H, W) image format;\n", "        # our MLP below flattens them again in its forward pass\n", "        x_train.resize_(x_train.size(0), 1, 28, 28)\n", "        x_val.resize_(x_val.size(0), 1, 28, 28)\n", "\n", "        # TensorDataset wrapper\n", "        trainset = torch.utils.data.TensorDataset(x_train, y_train)\n", "        valset = torch.utils.data.TensorDataset(x_val, y_val)\n", "\n", "        return trainset, valset\n", "\n", "    def get_dataset_loader(self, batch_size, workers, is_gpu):\n", 
\"\"\"\n", " Defines the dataset loader for wrapped dataset\n", "\n", " Parameters:\n", " batch_size (int): Defines the batch size in data loader\n", " workers (int): Number of parallel threads to be used by data loader\n", " is_gpu (bool): True if CUDA is enabled so pin_memory is set to True\n", "\n", " Returns:\n", " torch.utils.data.TensorDataset: trainset, valset\n", " \"\"\"\n", "\n", " train_loader = torch.utils.data.DataLoader(self.trainset, batch_size=batch_size, shuffle=True,\n", " num_workers=workers, pin_memory=is_gpu, sampler=None)\n", " test_loader = torch.utils.data.DataLoader(self.valset, batch_size=batch_size, shuffle=True,\n", " num_workers=workers, pin_memory=is_gpu, sampler=None)\n", "\n", " return train_loader, test_loader\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "is_gpu = torch.cuda.is_available()\n", "batch_size = 128\n", "workers = 4\n", " \n", "dataset = FashionMNIST(is_gpu, batch_size, workers)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Downloading FashionMNIST dataset\n", "Downloading https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/train-images-idx3-ubyte.gz\n", "Downloading https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/train-labels-idx1-ubyte.gz" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Downloading https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/t10k-images-idx3-ubyte.gz" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Downloading https://cdn.rawgit.com/zalandoresearch/fashion-mnist/ed8e4f3b/data/fashion/t10k-labels-idx1-ubyte.gz" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Done!" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The MLP model in PyTorch\n", "We now show how to implement a 2 hidden layer MLP in PyTorch. \n", "Suitable hidden-layer sizes for this task could be 300 and 100. \n", "Depending on the optimization criterion you use, you may want to add something like a Softmax function to your network. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import torch.nn as nn\n", "\n", "class MLP(nn.Module):\n", " def __init__(self, img_size, num_classes):\n", " super(MLP, self).__init__()\n", " \n", " self.img_size = img_size\n", " \n", " self.fc1 = nn.Linear(img_size, 300)\n", " self.act1 = nn.ReLU()\n", "\n", " self.fc2 = nn.Linear(300, 100)\n", " self.act2 = nn.ReLU()\n", " \n", " self.fc3 = nn.Linear(100, num_classes)\n", "\n", " def forward(self, x):\n", " # The view flattens the data to a vector (the representation needed by the MLP)\n", " x = x.view(-1, self.img_size)\n", " x = self.act1(self.fc1(x))\n", " x = self.act2(self.fc2(x))\n", " x = self.fc3(x)\n", " return x" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "img_size = 28*28\n", "num_classes = 10\n", "\n", "model = MLP(img_size, num_classes)\n", "if is_gpu:\n", " model = model.cuda()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defining optimization criterion and optimizer\n", "A good baseline is a Cross Entropy loss (Log Softmax + negative log-likelihood) and a stochastic gradient descent (SGD) algorithm with a baseline learning rate of 0.01. If we want to we can use additional momenta or regularization terms (such as L2 - Tikhonov regularization commonly reffered to as weight-decay in ML). " ] }, { "cell_type": "code", "collapsed": true, "input": [ "# Define optimizer and loss function (criterion)\n", "criterion = nn.CrossEntropyLoss()\n", "if is_gpu:\n", " criterion = criterion.cuda()\n", "\n", "optimizer = torch.optim.SGD(model.parameters(), lr = 0.01,\n", " momentum=0.9,\n", " weight_decay=5e-4)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Monitoring and calculating accuracy\n", "We add a convenience class to keep track and average concepts such as processing or data loading speeds, losses and accuracies. For this we need to define a function to define accuracy, which could be based on the absolute accuracy, or top-1 accuracy. Often times in Machine Learning other metrics are employed. For example, in the ImageNet ILSVRC challenge with a classification problem containing a 1000 classes, it is common to report the top-5 accuracy. Here a prediction is counted as accurate if the correct class lies within the top-5 most likely output classes. 
" ] }, { "cell_type": "code", "collapsed": true, "input": [ "class AverageMeter(object):\n", " \"\"\"\n", " Computes and stores the average and current value\n", " \"\"\"\n", " def __init__(self):\n", " self.reset()\n", "\n", " def reset(self):\n", " self.val = 0\n", " self.avg = 0\n", " self.sum = 0\n", " self.count = 0\n", "\n", " def update(self, val, n=1):\n", " self.val = val\n", " self.sum += val * n\n", " self.count += n\n", " self.avg = self.sum / self.count\n", "\n", "\n", "def accuracy(output, target, topk=(1,)):\n", " \"\"\"\n", " Evaluates a model's top k accuracy\n", "\n", " Parameters:\n", " output (torch.autograd.Variable): model output\n", " target (torch.autograd.Variable): ground-truths/labels\n", " topk (list): list of integers specifying top-k precisions\n", " to be computed\n", "\n", " Returns:\n", " float: percentage of correct predictions\n", " \"\"\"\n", "\n", " maxk = max(topk)\n", " batch_size = target.size(0)\n", "\n", " _, pred = output.topk(maxk, 1, True, True)\n", " pred = pred.t()\n", " correct = pred.eq(target.view(1, -1).expand_as(pred))\n", "\n", " res = []\n", " for k in topk:\n", " correct_k = correct[:k].view(-1).float().sum(0)\n", " res.append(correct_k.mul_(100.0 / batch_size))\n", " return res" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training function (sometimes referred to as \"hook\")\n", "The training function needs to loop through the entire dataset in steps of mini-batches (for SGD). For each mini-batch the output of the model and losses are calculated and a \"backward\" pass is done in order to do an update to the model's weights. When the entire dataset has been processed once, one epoch of the training has been conducted. It is common to shuffle the dataset after each epoch. In this implementation this is handled by the \"sampler\" of the dataset loader. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def train(train_loader, model, criterion, optimizer, is_gpu):\n", " \"\"\"\n", " Trains/updates the model for one epoch on the training dataset.\n", "\n", " Parameters:\n", " train_loader (torch.utils.data.DataLoader): The trainset dataloader\n", " model (torch.nn.module): Model to be trained\n", " criterion (torch.nn.criterion): Loss function\n", " optimizer (torch.optim.optimizer): optimizer instance like SGD or Adam\n", " is_gpu (bool): True if CUDA is enabled so pin_memory is set to True\n", " \"\"\"\n", "\n", " losses = AverageMeter()\n", " top1 = AverageMeter()\n", "\n", " # switch to train mode\n", " model.train()\n", "\n", " for i, (input, target) in enumerate(train_loader):\n", "\n", " if is_gpu:\n", " input = input.cuda()\n", " target = target.cuda()\n", "\n", " input_var = torch.autograd.Variable(input)\n", " target_var = torch.autograd.Variable(target)\n", "\n", " # compute output\n", " output = model(input_var)\n", " loss = criterion(output, target_var)\n", "\n", " # measure accuracy and record loss\n", " prec1, _ = accuracy(output.data, target, topk=(1, 5))\n", " losses.update(loss.data[0], input.size(0))\n", " top1.update(prec1[0], input.size(0))\n", "\n", " # compute gradient and do SGD step\n", " optimizer.zero_grad()\n", " loss.backward()\n", " optimizer.step()\n", "\n", "\n", " if i % 100 == 0:\n", " print('Loss {loss.val:.4f} ({loss.avg:.4f})\\t'\n", " 'Prec@1 {top1.val:.3f} ({top1.avg:.3f})'.format(\n", " loss=losses, top1=top1))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validation function\n", "Validation is similar to the training loop, but on a separate dataset with the exception that no update to the weights is performed. This way we can monitor the generalization ability of our model and check whether it is overfitting (memorizing) the training dataset. 
" ] }, { "cell_type": "code", "collapsed": true, "input": [ "from torchnet import meter\n", "\n", "def validate(val_loader, model, criterion, is_gpu):\n", " \"\"\"\n", " Evaluates/validates the model\n", "\n", " Parameters:\n", " val_loader (torch.utils.data.DataLoader): The validation or testset dataloader\n", " model (torch.nn.module): Model to be evaluated/validated\n", " criterion (torch.nn.criterion): Loss function\n", " is_gpu (bool): True if CUDA is enabled so pin_memory is set to True\n", " \"\"\"\n", "\n", " losses = AverageMeter()\n", " top1 = AverageMeter()\n", "\n", " confusion = meter.ConfusionMeter(len(val_loader.dataset.class_to_idx))\n", "\n", " # switch to evaluate mode\n", " model.eval()\n", "\n", " for i, (input, target) in enumerate(val_loader):\n", " if is_gpu:\n", " input = input.cuda()\n", " target = target.cuda()\n", "\n", " input_var = torch.autograd.Variable(input, volatile=True)\n", " target_var = torch.autograd.Variable(target, volatile=True)\n", "\n", " # compute output\n", " output = model(input_var)\n", "\n", " # compute loss\n", " loss = criterion(output, target_var)\n", "\n", " # measure accuracy and record loss\n", " prec1, _ = accuracy(output.data, target, topk=(1, 5))\n", " losses.update(loss.data[0], input.size(0))\n", " top1.update(prec1[0], input.size(0))\n", "\n", " # add to confusion matrix\n", " confusion.add(output.data, target)\n", "\n", " print(' * Validation accuracy: Prec@1 {top1.avg:.3f} '.format(top1=top1))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Running the training of the model" ] }, { "cell_type": "code", "collapsed": false, "input": [ "total_epochs = 10\n", "for epoch in range(total_epochs):\n", " print(\"EPOCH:\", epoch + 1)\n", " print(\"TRAIN\")\n", " train(dataset.train_loader, model, criterion, optimizer, is_gpu)\n", " print(\"VALIDATION\")\n", " validate(dataset.val_loader, model, criterion, is_gpu)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "('EPOCH:', 1)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 2.3009 (2.3009)\tPrec@1 2.344 (2.344)\n", "Loss 0.8279 (1.5875)\tPrec@1 68.750 (48.275)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.6766 (1.1792)\tPrec@1 75.000 (59.507)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.5668 (1.0043)\tPrec@1 78.906 (65.085)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4821 (0.8982)\tPrec@1 81.250 (68.779)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 78.800 \n", "('EPOCH:', 2)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.5039 (0.5039)\tPrec@1 84.375 (84.375)\n", "Loss 0.4623 (0.5168)\tPrec@1 86.719 (81.869)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4028 (0.5109)\tPrec@1 89.844 (82.070)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3846 (0.5059)\tPrec@1 84.375 (82.335)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.5033 (0.4965)\tPrec@1 78.125 (82.643)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": 
"stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 84.720 \n", "('EPOCH:', 3)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.5558 (0.5558)\tPrec@1 82.812 (82.812)\n", "Loss 0.4738 (0.4476)\tPrec@1 84.375 (83.718)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4228 (0.4463)\tPrec@1 84.375 (83.924)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.5838 (0.4418)\tPrec@1 75.781 (84.245)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4309 (0.4373)\tPrec@1 87.500 (84.381)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 85.690 \n", "('EPOCH:', 4)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.4047 (0.4047)\tPrec@1 89.844 (89.844)\n", "Loss 0.3747 (0.4173)\tPrec@1 90.625 (85.442)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3793 (0.4115)\tPrec@1 85.156 (85.386)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3039 (0.4066)\tPrec@1 92.188 (85.504)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3928 (0.4022)\tPrec@1 85.938 (85.597)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 86.390 \n", "('EPOCH:', 5)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.4020 (0.4020)\tPrec@1 82.812 (82.812)\n", "Loss 0.3836 (0.3764)\tPrec@1 86.719 (86.541)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4085 (0.3801)\tPrec@1 83.594 (86.400)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3086 (0.3807)\tPrec@1 89.844 (86.415)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.2858 (0.3789)\tPrec@1 94.531 (86.427)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 87.310 \n", "('EPOCH:', 6)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.3019 (0.3019)\tPrec@1 89.062 (89.062)\n", "Loss 0.3542 (0.3500)\tPrec@1 87.500 (87.647)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3979 (0.3614)\tPrec@1 85.156 (87.166)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4257 (0.3595)\tPrec@1 86.719 (87.113)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3802 (0.3608)\tPrec@1 85.938 (87.064)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 87.450 \n", "('EPOCH:', 7)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.3142 (0.3142)\tPrec@1 89.844 (89.844)\n", "Loss 0.2976 (0.3349)\tPrec@1 87.500 (88.011)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3600 
(0.3408)\tPrec@1 85.156 (87.718)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.2525 (0.3419)\tPrec@1 93.750 (87.669)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.2975 (0.3459)\tPrec@1 92.188 (87.519)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 87.960 \n", "('EPOCH:', 8)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.1874 (0.1874)\tPrec@1 94.531 (94.531)\n", "Loss 0.2994 (0.3332)\tPrec@1 87.500 (88.188)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.4080 (0.3356)\tPrec@1 84.375 (87.994)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3204 (0.3336)\tPrec@1 84.375 (87.928)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3679 (0.3359)\tPrec@1 86.719 (87.779)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 87.770 \n", "('EPOCH:', 9)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.2874 (0.2874)\tPrec@1 88.281 (88.281)\n", "Loss 0.4169 (0.3298)\tPrec@1 86.719 (88.219)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3329 (0.3284)\tPrec@1 87.500 (88.036)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3841 (0.3254)\tPrec@1 88.281 (88.193)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.2389 (0.3253)\tPrec@1 92.969 (88.106)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 88.400 \n", "('EPOCH:', 10)\n", "TRAIN\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loss 0.3129 (0.3129)\tPrec@1 85.156 (85.156)\n", "Loss 0.2555 (0.3128)\tPrec@1 90.625 (88.676)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.2810 (0.3100)\tPrec@1 89.062 (88.662)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3115 (0.3112)\tPrec@1 87.500 (88.655)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "Loss 0.3738 (0.3136)\tPrec@1 88.281 (88.603)" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "VALIDATION" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " * Validation accuracy: Prec@1 88.480 \n" ] } ], "prompt_number": 9 }, 
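{ "cell_type": "markdown", "metadata": {}, "source": [ "Before comparing against published results it can be instructive to look at per-class errors. The sketch below recomputes a confusion matrix on the validation set with torchnet's ConfusionMeter, as already used inside \"validate\" (we assume here that its \"value()\" method returns the accumulated class-by-class matrix)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "confusion = meter.ConfusionMeter(num_classes)\n", "\n", "# switch to evaluate mode and accumulate predictions over the validation set\n", "model.eval()\n", "for input, target in dataset.val_loader:\n", "    if is_gpu:\n", "        input = input.cuda()\n", "        target = target.cuda()\n", "    output = model(torch.autograd.Variable(input, volatile=True))\n", "    confusion.add(output.data, target)\n", "\n", "# rows correspond to true classes, columns to predicted classes\n", "print(confusion.value())" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": null }, 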
{ "cell_type": "markdown", "metadata": {}, "source": [ "### How well did the model do?\n", "In Machine Learning research it is crucial to compare and contrast a model with other researchers' implementations. Many of the current Machine Learning datasets are posed as benchmarks where results are rigorously tracked in order to examine the efficiency and efficacy of a proposed model or algorithm.\n", "\n", "For the Fashion-MNIST dataset you can check how well both of your models (from scratch and in PyTorch) perform here:\n", "http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/#\n", "\n", "Do keep in mind that in order to analyze the usefulness of a method one should always compare and contrast on a variety of different datasets with varying tasks and complexities. " ] }, { "cell_type": "code", "collapsed": true, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": null } ], "metadata": {} } ] }