2019 March 11

Location: Coors Tech

Attending:

Thomas, Iga, Jihyun, Hayden, Andy, Antoine, Bin

Key takeaways

  • Variability in NN performance observed when using tanh vs. sigmoid
  • Hard to verify real-world applicability of Bin's project
    • Bin needs real data before the results on synthetic data can be interpreted
  • Consider using one-hot encoding to turn the regression problem into a classification problem

Problems

  • Hayden is not sure whether to use tanh or sigmoid
  • tanh shows more variability than sigmoid
  • Bin may have a dynamic range problem on his inputs (see the normalization sketch after this list)
    • Encoded input values run 0, 1, 2, … 254, 255
      • Corresponding underlying values span roughly 1E-12 to 1
  • Bin’s validation and training data are extremely similar so cross-validation is probably biased.
  • Without prior setup, Google Cloud will use CPU by default.
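
A minimal sketch of one way to tame the dynamic range problem, assuming the inputs are slip-rate values spanning roughly 1E-12 to 1 (the notes don't record Bin's actual preprocessing): log-scale the values so each decade gets equal weight, then rescale to [0, 1].

    import numpy as np

    # Hypothetical normalization: raw values spanning ~12 decades are
    # log-compressed, then rescaled to [0, 1] so the network sees a
    # well-conditioned input range instead of mostly near-zero values.
    def log_normalize(x, lo=1e-12, hi=1.0):
        x = np.clip(np.asarray(x, dtype=np.float64), lo, hi)
        return (np.log10(x) - np.log10(lo)) / (np.log10(hi) - np.log10(lo))

    print(log_normalize([1e-12, 1e-6, 1e-3, 1.0]))  # [0.   0.5  0.75 1.  ]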

Comments

  • tanh outputs [-1, 1] whereas sigmoid outputs [0, 1], perhaps explaining the accuracy disparity Hayden is observing
  • Bin did a PhD involving earthquake mechanics and finite element modelling.
  • He is currently trying to map slip-rate outputs to time to failure.
  • Inputs to FEM: material properties and physical parameters
  • Outputs of FEM: slip rate in a volume
  • Inputs to NN: slip rate on a sliced portion of volume
  • Outputs of NN: time to failure
  • Using a transfer learning approach (TLA)
    • TLA option 1: reuse the architecture
    • TLA option 2: reuse the architecture + learned weights
    • TLA option 3: forward pass to extract features, then use those features to train a separate, simpler NN (see the feature-extraction sketch after this list)
  • Bin’s process notes:
    • Wants time to failure accuracy of one year
    • InceptionV3 on Google Cloud CPU: forward pass on 600 images takes 1.5 minutes
    • VGG16 on Google Cloud CPU: forward pass on 600 images takes 5 minutes
    • Avoids the per-epoch forward cost: the pretrained network runs once to extract features, and a separate simple NN is trained on those features.
    • Duplicates the single-channel input along the channel axis so both image-based pre-trained models can treat it as RGB data
    • Summarizes each model's output by average pooling at the last layer
    • Bin is currently using Google Cloud
    • Make sure Google Cloud knows you want GPUs (see the GPU check after this list)
    • InceptionV3 takes longer to train, but its output is bigger
    • VGG16 is faster to train, but its output is smaller
  • Freezing layers and adding an initial layer (to help with dynamic range) is a bad idea.
    • All downstream weights are influenced by the weights before them: the frozen weights were trained against the original input distribution, not the output of a new first layer.
  • Consider using one-hot encoding to turn the regression problem into a classification problem (see the encoding sketch after this list)
    • Year 1 = [1, 0, 0, … 0]
    • Year 2 = [0, 1, 0, … 0]
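
The feature-extraction variant (TLA option 3) might look like the following, assuming a Keras/TensorFlow stack (the notes don't name the framework) and hypothetical data shapes: single-channel slices are duplicated to three channels so an ImageNet-pretrained VGG16 accepts them, average pooling summarizes each image as a 512-vector, and a small dense network trains on those cached features.

    import numpy as np
    import tensorflow as tf

    # Frozen ImageNet-pretrained VGG16; pooling="avg" collapses the last
    # feature maps to a 512-vector per image.
    base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                       pooling="avg", input_shape=(224, 224, 3))
    base.trainable = False

    # Hypothetical data: N single-channel 224x224 slip-rate slices and
    # time-to-failure labels (years).
    slices = np.random.rand(32, 224, 224, 1).astype("float32")
    ttf = np.random.rand(32).astype("float32") * 10

    rgb = np.repeat(slices, 3, axis=-1)  # duplicate along the channel axis
    pre = tf.keras.applications.vgg16.preprocess_input(rgb * 255.0)
    features = base.predict(pre)         # one expensive forward pass, cached

    # Separate simple NN trained on the cached features.
    head = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(512,)),
        tf.keras.layers.Dense(1),        # regression: time to failure
    ])
    head.compile(optimizer="adam", loss="mse")
    head.fit(features, ttf, epochs=5, verbose=0)

Because the features are computed once and cached, the expensive forward passes in the timing notes above are paid a single time rather than on every training epoch.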
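On the GPU point, a quick sanity check before a long run, assuming TensorFlow 2.x (on the TF 1.x releases current at the time, tf.test.is_gpu_available() serves the same purpose):

    import tensorflow as tf

    # Google Cloud instances run on CPU unless an accelerator was attached;
    # confirm TensorFlow can actually see a GPU before starting a long job.
    gpus = tf.config.list_physical_devices("GPU")
    print("GPUs visible:", gpus)
    if not gpus:
        print("CPU only; check the instance's accelerator configuration.")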
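And a sketch of the one-hot reframing, assuming time to failure is bucketed into one-year bins (max_years is a hypothetical horizon); with one-year bins, a correct classification directly meets the one-year accuracy target.

    import numpy as np

    # Bucket continuous time-to-failure (in years) into one-year bins and
    # one-hot encode the bin index, turning regression into classification.
    def to_one_hot(ttf_years, max_years=10):
        bins = np.clip(np.floor(ttf_years).astype(int), 0, max_years - 1)
        encoded = np.zeros((len(bins), max_years))
        encoded[np.arange(len(bins)), bins] = 1.0
        return encoded

    print(to_one_hot(np.array([0.4, 1.7, 3.2]), max_years=4))
    # [[1. 0. 0. 0.]    <- year 1
    #  [0. 1. 0. 0.]    <- year 2
    #  [0. 0. 0. 1.]]   <- year 4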

Topics of discussion for next week

  • TBD
