2019 March 11
Location: Coors Tech
Attending:
Thomas, Iga, Jihyun, Hayden, Andy, Antoine, Bin
Key takeaways
- Variability in NN performance observed when using tanh vs. sigmoid
- Hard to verify real-world applicability of Bin's project
- Bin needs real data before the results on synthetic data can be interpreted
- Consider using one-hot encoding to turn the regression problem into a classification problem
Problems
- Hayden is not sure whether to use tanh or sigmoid
- tanh shows more variability than sigmoid
- Bin may have a dynamic range problem on his inputs, whose values span very different ranges (see the normalization sketch after this list)
- 0, 1, 2, … 254, 255
- 1E-12 to 1
- Bin’s validation and training data are extremely similar, so cross-validation is probably biased.
- Without prior setup, Google Cloud will use CPU by default.
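One common remedy for this kind of dynamic-range mismatch is to rescale each input type separately before it reaches the network. The sketch below uses made-up arrays matching the ranges listed above; it is only an illustration, not something decided in the meeting.

```python
import numpy as np

# Hypothetical inputs matching the two ranges noted above.
pixel_like = np.array([0.0, 1.0, 2.0, 254.0, 255.0])
slip_rates = np.array([1e-12, 1e-9, 1e-6, 1e-3, 1.0])

# Rescale the 0-255 values to [0, 1].
pixel_scaled = pixel_like / 255.0

# Log-scale the slip rates so that twelve decades of magnitude map onto a
# comparable [0, 1] range; eps guards against log10(0).
eps = 1e-15
log_slip = np.log10(slip_rates + eps)                        # roughly [-12, 0]
slip_scaled = (log_slip - log_slip.min()) / (log_slip.max() - log_slip.min())

print(pixel_scaled)  # values now in [0, 1]
print(slip_scaled)   # values spread across [0, 1] instead of clustering near 0
```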
Comments
- tanh outputs values in [-1, 1] whereas sigmoid outputs values in [0, 1], which may explain the accuracy disparity Hayden is observing (see the sketch below)
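A quick numerical sketch of that range difference; the values are arbitrary and not from Hayden's network, they just show why targets scaled to [0, 1] pair more naturally with a sigmoid output.

```python
import numpy as np

x = np.linspace(-4.0, 4.0, 9)
tanh_out = np.tanh(x)                    # outputs in (-1, 1)
sigmoid_out = 1.0 / (1.0 + np.exp(-x))   # outputs in (0, 1)

# If targets are scaled to [0, 1], half of tanh's output range can never be
# matched, which is one plausible source of the extra variability observed.
print(tanh_out.min(), tanh_out.max())        # ~ -0.999, 0.999
print(sigmoid_out.min(), sigmoid_out.max())  # ~ 0.018, 0.982
```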
- Bin did a PhD involving earthquake mechanics and finite element modelling.
- He is currently trying to map slip-rate outputs to time to failure.
- Inputs to FEM: material properties and physical parameters
- Outputs of FEM: slip rate in a volume
- Inputs to NN: slip rate on a sliced portion of volume
- Outputs of NN: time to failure
- Using a transfer learning approach (TLA); see the feature-extraction sketch after the options below
- TLA option 1: reuse the architecture only
- TLA option 2: reuse the architecture + learned weights
- TLA option 3: forward pass through the pre-trained network, extract features, and use those features to train a separate, simpler NN
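A hedged sketch of option 3, assuming tf.keras and a VGG16 backbone; the shapes, placeholder data, and small head network are illustrative, not Bin's actual pipeline.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.applications import VGG16

# Option 3: run images through a frozen pre-trained backbone once, cache the
# pooled features, then train a much smaller network on those features.
backbone = VGG16(weights="imagenet", include_top=False,
                 pooling="avg", input_shape=(224, 224, 3))
backbone.trainable = False

images = np.random.rand(8, 224, 224, 3).astype("float32")  # placeholder batch
features = backbone.predict(images)        # shape (8, 512) for VGG16

# Small, cheap head trained on the cached features only.
head = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(features.shape[1],)),
    keras.layers.Dense(1),                 # e.g. predicted time to failure
])
head.compile(optimizer="adam", loss="mse")
# head.fit(features, targets, ...)         # targets are not defined in this sketch
```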
- Bin’s process notes:
- Wants time to failure accuracy of one year
- InceptionV3 on a Google Cloud CPU forwards 600 images in 1.5 minutes
- VGG16 on a Google Cloud CPU forwards 600 images in 5 minutes
- He avoids paying the high forward cost repeatedly by running the pre-trained network once to extract features, which are then used to train a separate, simple NN
- Duplicates the single input channel three times so each slice can be fed to both image-based pre-trained models as if it were RGB data (see the sketch below)
- Summarizes each model's output with average pooling at the last layer to get a fixed-length feature vector
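A small sketch of the channel-duplication step, assuming each slip-rate slice is a single-channel 2-D array; the shape is illustrative.

```python
import numpy as np

# Hypothetical single-channel slip-rate slice (height x width).
slice_2d = np.random.rand(224, 224).astype("float32")

# Duplicate the single channel three times so the slice looks like an RGB
# image, which is what ImageNet-pretrained models (InceptionV3, VGG16) expect.
slice_rgb = np.repeat(slice_2d[..., np.newaxis], 3, axis=-1)

print(slice_rgb.shape)  # (224, 224, 3)

# With average pooling at the last layer, each image is then summarized as one
# fixed-length feature vector: 2048 values for InceptionV3, 512 for VGG16.
```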
- Bin is currently using Google Cloud
- Make sure Google Cloud knows you want GPUs (see the GPU check sketch below)
- InceptionV3 takes longer to train, but its output is bigger
- VGG16 is faster to train, but its output is smaller
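A minimal check, assuming a TensorFlow 2.x environment on the Google Cloud instance, to confirm a GPU is actually visible before kicking off training (on the 1.x releases current at the time, tf.test.is_gpu_available() serves the same purpose).

```python
import tensorflow as tf

# An empty list here means the instance is running on CPU only, so the VM /
# driver configuration needs to be fixed before training.
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)
```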
- Freezing layers and adding a new initial layer (to help with the dynamic range) is a bad idea.
- All downstream weights were learned from the outputs of the layers before them, so inserting a new initial layer changes what the frozen layers see and invalidates those learned weights.
- Consider using one-hot encoding to turn the regression problem into a classification problem (see the sketch below)
- Year 1 = [1, 0, 0, … 0]
- Year 2 = [0, 1, 0, … 0]
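A short sketch of that suggestion, assuming time-to-failure labels in years and an arbitrary 10-class cap; keras.utils.to_categorical is one standard way to build the one-hot vectors.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Hypothetical continuous time-to-failure values, in years.
ttf_years = np.array([0.4, 1.7, 2.2, 8.9])

# Bin into integer year classes (class 0 -> "year 1", class 1 -> "year 2", ...),
# matching the desired one-year accuracy, then one-hot encode so the regression
# target becomes a classification target.
num_classes = 10                                     # arbitrary upper bound
year_class = np.clip(ttf_years.astype(int), 0, num_classes - 1)
one_hot = to_categorical(year_class, num_classes=num_classes)

print(one_hot[0])  # one-hot vector for year 1
print(one_hot[1])  # one-hot vector for year 2
```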
Topics of discussion for next week
- TBD