Autoencoders

Autoencoders can be used to reduce the dimensionality of a dataset, cluster, denoise, interpolate, and separate signals, among other digital signal processing tasks. Examples of how to build, train, and use autoencoders are collected here. For now, there is just one simple 1D sequence-to-sequence autoencoder for denoising monochromatic signals.

Autoencoding example: sine waves


This notebook shows how to autoencode 3 Hz sine waves with varying phases. After training, the denoising properties of the autoencoder are showcased. The code is heavily commented for those just starting with Keras.

In [1]:
import numpy as np 
import matplotlib.pyplot as plt

from keras.models import Model, load_model
from keras.layers import Input, Dense
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.utils import plot_model

from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import scale, StandardScaler, MinMaxScaler

import petname
Using TensorFlow backend.
In [2]:
# generate training, test, and validation data
n = 4096                 # number of sine waves
nt = 128                 # number of time samples per wave
f = 3.0                  # frequency in Hz
t = np.linspace(0,1,nt)  # time stamps in s
x = np.zeros((n,nt))
phase = np.random.uniform(-np.pi, np.pi, size=n)
for i in range(n):
    x[i,:] = np.sin(2*np.pi*f*t + phase[i] )
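The loop above can also be written with NumPy broadcasting. The sketch below is only an illustration: it rebuilds the same kind of data with a seeded generator (the seed and the names `x_loop` and `x_vec` are assumptions, not part of the notebook) and checks that both versions agree.

```python
import numpy as np

n, nt, f = 4096, 128, 3.0            # same sizes and frequency as the cell above
t = np.linspace(0, 1, nt)            # time stamps in s
rng = np.random.default_rng(0)       # seed chosen only for reproducibility (assumption)
phase = rng.uniform(-np.pi, np.pi, size=n)

# loop version, one row per phase
x_loop = np.zeros((n, nt))
for i in range(n):
    x_loop[i, :] = np.sin(2*np.pi*f*t + phase[i])

# broadcast version: phase[:, None] has shape (n, 1), t has shape (nt,),
# so the sum has shape (n, nt) without an explicit loop
x_vec = np.sin(2*np.pi*f*t + phase[:, None])

print(np.allclose(x_loop, x_vec))
```

For 4096 short rows the speed difference hardly matters; the broadcast form mainly saves the preallocation and the index bookkeeping.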
In [3]:
# QC: generated waves are phase shifted but share one frequency
plt.figure(figsize=(8,2))
for i in range(3):
    plt.plot(t,x[np.random.randint(0,n), :])  # pick any of the n waves (upper bound is exclusive)
plt.show()
In [4]:
# QC generated phase in [-pi,pi]
plt.figure(figsize=(8,2))
plt.hist(phase,bins=31)
plt.xlabel('phase')
plt.ylabel('number of occurrences')
plt.show()
In [5]:
# split into test, validation, and training sets
x_temp, x_test, _, _ = train_test_split(x, x, test_size=0.05)
x_train, x_valid, _, _ = train_test_split(x_temp,
                                          x_temp,
                                          test_size=0.1)
n_train = len(x_train)
n_valid = len(x_valid)
n_test = len(x_test)
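As a rough check on the two splits, the set sizes can be worked out by hand. The sketch below assumes that a fractional `test_size` is turned into a sample count by rounding up, which is consistent with the 3501 training and 390 validation samples reported in the training log; `split_sizes` is a hypothetical helper, not a scikit-learn function.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size, rounding the test set up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

# first split: hold out 5% of the 4096 waves for testing
n_temp, n_test = split_sizes(4096, 0.05)

# second split: hold out 10% of the remainder for validation
n_train, n_valid = split_sizes(n_temp, 0.1)

print(n_train, n_valid, n_test)
```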
In [6]:
# construct autoencoder network structure
encoding_dim = 11

# input layer is full time series of length nt
inputs = Input((nt,))

# add more hidden layers here
encoded = Dense(64, activation='tanh')(inputs) 
encoded = Dense(32, activation='tanh')(encoded)
encoded = Dense(encoding_dim, activation='tanh')(encoded)
decoded = Dense(32, activation='tanh')(encoded)
decoded = Dense(64, activation='tanh')(decoded)

# output layer is same length as input
outputs = Dense(nt, activation='tanh')(decoded)

# consolidate to define autoencoder model inputs and outputs
ae = Model(inputs=inputs, outputs=outputs)

# specify encoder and decoder model for easy encoding and decoding later
encoder = Model(inputs=inputs, outputs=encoded)
encoded_input = Input(shape=(encoding_dim,))

# string together decoder layers
# TODO: pythonic way to do this?
decoded_output = ae.layers[-3](encoded_input)
decoded_output = ae.layers[-2](decoded_output)
decoded_output = ae.layers[-1](decoded_output)
decoder = Model(inputs=encoded_input, outputs=decoded_output)
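One possible answer to the TODO above: Keras layers are callables, so stringing them together is a fold over a list of layers. The sketch below shows the idea with plain Python functions standing in for layers (the helper `chain` and the toy functions are hypothetical), so it runs without Keras.

```python
from functools import reduce

def chain(layers, x):
    """Apply each callable in `layers` to the output of the previous one."""
    return reduce(lambda out, layer: layer(out), layers, x)

# stand-ins for layers (hypothetical): plain functions instead of Dense layers
toy_layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(chain(toy_layers, 5))   # computes ((5 + 1) * 2) - 3

# With the objects defined in the cell above, the same idea would read:
# decoder = Model(inputs=encoded_input,
#                 outputs=chain(ae.layers[-3:], encoded_input))
```

The slice `ae.layers[-3:]` relies on the decoder having exactly three layers, just as the hand-written version does.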
In [7]:
print('Full autoencoder')
print(ae.summary())
print('\n Encoder portion of autoencoder')
print(encoder.summary())
Full autoencoder
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256      
_________________________________________________________________
dense_2 (Dense)              (None, 32)                2080      
_________________________________________________________________
dense_3 (Dense)              (None, 11)                363       
_________________________________________________________________
dense_4 (Dense)              (None, 32)                384       
_________________________________________________________________
dense_5 (Dense)              (None, 64)                2112      
_________________________________________________________________
dense_6 (Dense)              (None, 128)               8320      
=================================================================
Total params: 21,515
Trainable params: 21,515
Non-trainable params: 0
_________________________________________________________________
None

 Encoder portion of autoencoder
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256      
_________________________________________________________________
dense_2 (Dense)              (None, 32)                2080      
_________________________________________________________________
dense_3 (Dense)              (None, 11)                363       
=================================================================
Total params: 10,699
Trainable params: 10,699
Non-trainable params: 0
_________________________________________________________________
None
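The parameter counts in the summaries can be reproduced by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A small sketch, with the layer widths copied from the network definition above (`dense_params` is a hypothetical helper):

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# layer widths from input to output: 128 -> 64 -> 32 -> 11 -> 32 -> 64 -> 128
widths = [128, 64, 32, 11, 32, 64, 128]
per_layer = [dense_params(a, b) for a, b in zip(widths[:-1], widths[1:])]

print(per_layer)            # one entry per Dense layer
print(sum(per_layer))       # full autoencoder
print(sum(per_layer[:3]))   # encoder only (first three Dense layers)
```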
In [8]:
# specify optimization strategy: Adam optimizer with mean squared error loss
ae.compile(optimizer='adam',
            loss='mse',
            metrics=['mse'])
In [9]:
# specify training parameters and callback functions

# batch size for stochastic solver  
batch_size = 16

# number of times entire dataset is considered in stochastic solver
epochs = 100

# unique name for the network for saving
unique_name = petname.name()
model_filename = 'aen_sin_%03dHz_n=%05d_'%(int(f),nt)+unique_name+'.h5'

# training history file name
history_filename = 'results_'+unique_name+'.npz'

# stop early if val_loss has not improved for `patience` epochs and be verbose
# (with patience equal to the number of epochs, this never triggers in this run)
earlystopper = EarlyStopping(patience=100, verbose=1)

# checkpoint and save model when improvement occurs 
checkpointer = ModelCheckpoint(model_filename, verbose=1, save_best_only=True)

# consolidate callback functions for convenience 
callbacks = [earlystopper, checkpointer]
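To make the "save when improvement occurs" rule concrete, the sketch below marks which epochs would trigger a save, assuming the rule is simply "validation loss is lower than every earlier value". The five numbers are the `val_loss` values of epochs 9 to 13 from this notebook's training log; the variable names are assumptions for illustration.

```python
import numpy as np

# val_loss for epochs 9..13, copied from the training log
val_loss = np.array([2.6537e-04, 2.1948e-04, 2.5081e-04, 1.7316e-04, 1.6057e-04])

# best value seen before each epoch (infinity before the first one in this slice)
best_before = np.concatenate(([np.inf], np.minimum.accumulate(val_loss)[:-1]))

# a save happens only when the current value beats the best so far
saved = val_loss < best_before
print(saved)
```

The third entry comes out `False`, matching the "did not improve" message the log prints for epoch 11.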
In [10]:
# train autoencoder
results = ae.fit(x_train,x_train,
                  batch_size = batch_size, 
                  epochs = epochs,
                  validation_data = (x_valid,x_valid),
                  callbacks = callbacks)
Train on 3501 samples, validate on 390 samples
Epoch 1/100
3501/3501 [==============================] - 1s 355us/step - loss: 0.0459 - mean_squared_error: 0.0459 - val_loss: 0.0041 - val_mean_squared_error: 0.0041

Epoch 00001: val_loss improved from inf to 0.00415, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 2/100
3501/3501 [==============================] - 1s 144us/step - loss: 0.0028 - mean_squared_error: 0.0028 - val_loss: 0.0015 - val_mean_squared_error: 0.0015

Epoch 00002: val_loss improved from 0.00415 to 0.00153, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 3/100
3501/3501 [==============================] - 1s 154us/step - loss: 0.0010 - mean_squared_error: 0.0010 - val_loss: 8.3525e-04 - val_mean_squared_error: 8.3525e-04

Epoch 00003: val_loss improved from 0.00153 to 0.00084, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 4/100
3501/3501 [==============================] - 1s 145us/step - loss: 7.1871e-04 - mean_squared_error: 7.1871e-04 - val_loss: 6.5505e-04 - val_mean_squared_error: 6.5505e-04

Epoch 00004: val_loss improved from 0.00084 to 0.00066, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 5/100
3501/3501 [==============================] - 1s 148us/step - loss: 6.2149e-04 - mean_squared_error: 6.2149e-04 - val_loss: 5.7450e-04 - val_mean_squared_error: 5.7450e-04

Epoch 00005: val_loss improved from 0.00066 to 0.00057, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 6/100
3501/3501 [==============================] - 0s 142us/step - loss: 5.4520e-04 - mean_squared_error: 5.4520e-04 - val_loss: 4.9975e-04 - val_mean_squared_error: 4.9975e-04

Epoch 00006: val_loss improved from 0.00057 to 0.00050, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 7/100
3501/3501 [==============================] - 1s 144us/step - loss: 5.7018e-04 - mean_squared_error: 5.7018e-04 - val_loss: 3.9644e-04 - val_mean_squared_error: 3.9644e-04

Epoch 00007: val_loss improved from 0.00050 to 0.00040, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 8/100
3501/3501 [==============================] - 1s 143us/step - loss: 3.5918e-04 - mean_squared_error: 3.5918e-04 - val_loss: 3.1523e-04 - val_mean_squared_error: 3.1523e-04

Epoch 00008: val_loss improved from 0.00040 to 0.00032, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 9/100
3501/3501 [==============================] - 1s 148us/step - loss: 2.8607e-04 - mean_squared_error: 2.8607e-04 - val_loss: 2.6537e-04 - val_mean_squared_error: 2.6537e-04

Epoch 00009: val_loss improved from 0.00032 to 0.00027, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 10/100
3501/3501 [==============================] - 0s 140us/step - loss: 3.0162e-04 - mean_squared_error: 3.0162e-04 - val_loss: 2.1948e-04 - val_mean_squared_error: 2.1948e-04

Epoch 00010: val_loss improved from 0.00027 to 0.00022, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 11/100
3501/3501 [==============================] - 1s 153us/step - loss: 2.0970e-04 - mean_squared_error: 2.0970e-04 - val_loss: 2.5081e-04 - val_mean_squared_error: 2.5081e-04

Epoch 00011: val_loss did not improve from 0.00022
Epoch 12/100
3501/3501 [==============================] - 0s 139us/step - loss: 2.0760e-04 - mean_squared_error: 2.0760e-04 - val_loss: 1.7316e-04 - val_mean_squared_error: 1.7316e-04

Epoch 00012: val_loss improved from 0.00022 to 0.00017, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 13/100
3501/3501 [==============================] - 1s 150us/step - loss: 1.7070e-04 - mean_squared_error: 1.7070e-04 - val_loss: 1.6057e-04 - val_mean_squared_error: 1.6057e-04

Epoch 00013: val_loss improved from 0.00017 to 0.00016, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 14/100
3501/3501 [==============================] - 1s 153us/step - loss: 1.6828e-04 - mean_squared_error: 1.6828e-04 - val_loss: 1.5692e-04 - val_mean_squared_error: 1.5692e-04

Epoch 00014: val_loss improved from 0.00016 to 0.00016, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 15/100
3501/3501 [==============================] - 1s 147us/step - loss: 2.1605e-04 - mean_squared_error: 2.1605e-04 - val_loss: 1.6475e-04 - val_mean_squared_error: 1.6475e-04

Epoch 00015: val_loss did not improve from 0.00016
Epoch 16/100
3501/3501 [==============================] - 1s 145us/step - loss: 1.4884e-04 - mean_squared_error: 1.4884e-04 - val_loss: 1.3565e-04 - val_mean_squared_error: 1.3565e-04

Epoch 00016: val_loss improved from 0.00016 to 0.00014, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 17/100
3501/3501 [==============================] - 1s 144us/step - loss: 1.5588e-04 - mean_squared_error: 1.5588e-04 - val_loss: 3.6032e-04 - val_mean_squared_error: 3.6032e-04

Epoch 00017: val_loss did not improve from 0.00014
Epoch 18/100
3501/3501 [==============================] - 1s 149us/step - loss: 1.4744e-04 - mean_squared_error: 1.4744e-04 - val_loss: 1.3256e-04 - val_mean_squared_error: 1.3256e-04

Epoch 00018: val_loss improved from 0.00014 to 0.00013, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 19/100
3501/3501 [==============================] - 1s 145us/step - loss: 1.8761e-04 - mean_squared_error: 1.8761e-04 - val_loss: 1.8393e-04 - val_mean_squared_error: 1.8393e-04

Epoch 00019: val_loss did not improve from 0.00013
Epoch 20/100
3501/3501 [==============================] - 1s 151us/step - loss: 1.2678e-04 - mean_squared_error: 1.2678e-04 - val_loss: 9.2544e-05 - val_mean_squared_error: 9.2544e-05

Epoch 00020: val_loss improved from 0.00013 to 0.00009, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 21/100
3501/3501 [==============================] - 1s 155us/step - loss: 1.1560e-04 - mean_squared_error: 1.1560e-04 - val_loss: 1.2224e-04 - val_mean_squared_error: 1.2224e-04

Epoch 00021: val_loss did not improve from 0.00009
Epoch 22/100
3501/3501 [==============================] - 1s 152us/step - loss: 1.1742e-04 - mean_squared_error: 1.1742e-04 - val_loss: 1.2506e-04 - val_mean_squared_error: 1.2506e-04

Epoch 00022: val_loss did not improve from 0.00009
Epoch 23/100
3501/3501 [==============================] - 1s 150us/step - loss: 1.2605e-04 - mean_squared_error: 1.2605e-04 - val_loss: 9.8797e-05 - val_mean_squared_error: 9.8797e-05

Epoch 00023: val_loss did not improve from 0.00009
Epoch 24/100
3501/3501 [==============================] - 1s 149us/step - loss: 2.1077e-04 - mean_squared_error: 2.1077e-04 - val_loss: 2.1052e-04 - val_mean_squared_error: 2.1052e-04

Epoch 00024: val_loss did not improve from 0.00009
Epoch 25/100
3501/3501 [==============================] - 1s 147us/step - loss: 1.5182e-04 - mean_squared_error: 1.5182e-04 - val_loss: 6.7037e-05 - val_mean_squared_error: 6.7037e-05

Epoch 00025: val_loss improved from 0.00009 to 0.00007, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 26/100
3501/3501 [==============================] - 1s 145us/step - loss: 6.6275e-05 - mean_squared_error: 6.6275e-05 - val_loss: 6.4966e-05 - val_mean_squared_error: 6.4966e-05

Epoch 00026: val_loss improved from 0.00007 to 0.00006, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 27/100
3501/3501 [==============================] - 0s 141us/step - loss: 6.3563e-05 - mean_squared_error: 6.3563e-05 - val_loss: 6.3247e-05 - val_mean_squared_error: 6.3247e-05

Epoch 00027: val_loss improved from 0.00006 to 0.00006, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 28/100
3501/3501 [==============================] - 0s 140us/step - loss: 6.5669e-05 - mean_squared_error: 6.5669e-05 - val_loss: 7.6097e-05 - val_mean_squared_error: 7.6097e-05

Epoch 00028: val_loss did not improve from 0.00006
Epoch 29/100
3501/3501 [==============================] - 0s 140us/step - loss: 1.8151e-04 - mean_squared_error: 1.8151e-04 - val_loss: 7.0479e-04 - val_mean_squared_error: 7.0479e-04

Epoch 00029: val_loss did not improve from 0.00006
Epoch 30/100
3501/3501 [==============================] - 1s 146us/step - loss: 4.1471e-04 - mean_squared_error: 4.1471e-04 - val_loss: 5.4083e-05 - val_mean_squared_error: 5.4083e-05

Epoch 00030: val_loss improved from 0.00006 to 0.00005, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 31/100
3501/3501 [==============================] - 1s 146us/step - loss: 5.3065e-05 - mean_squared_error: 5.3065e-05 - val_loss: 5.1097e-05 - val_mean_squared_error: 5.1097e-05

Epoch 00031: val_loss improved from 0.00005 to 0.00005, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 32/100
3501/3501 [==============================] - 1s 168us/step - loss: 5.1508e-05 - mean_squared_error: 5.1508e-05 - val_loss: 4.9787e-05 - val_mean_squared_error: 4.9787e-05

Epoch 00032: val_loss improved from 0.00005 to 0.00005, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 33/100
3501/3501 [==============================] - 1s 176us/step - loss: 5.0157e-05 - mean_squared_error: 5.0157e-05 - val_loss: 4.8304e-05 - val_mean_squared_error: 4.8304e-05

Epoch 00033: val_loss improved from 0.00005 to 0.00005, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 34/100
3501/3501 [==============================] - 0s 141us/step - loss: 6.3884e-05 - mean_squared_error: 6.3884e-05 - val_loss: 1.5991e-04 - val_mean_squared_error: 1.5991e-04

Epoch 00034: val_loss did not improve from 0.00005
Epoch 35/100
3501/3501 [==============================] - 1s 235us/step - loss: 8.3365e-05 - mean_squared_error: 8.3365e-05 - val_loss: 6.0017e-05 - val_mean_squared_error: 6.0017e-05

Epoch 00035: val_loss did not improve from 0.00005
Epoch 36/100
3501/3501 [==============================] - 1s 281us/step - loss: 5.7621e-05 - mean_squared_error: 5.7621e-05 - val_loss: 4.7106e-05 - val_mean_squared_error: 4.7106e-05

Epoch 00036: val_loss improved from 0.00005 to 0.00005, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 37/100
3501/3501 [==============================] - 1s 172us/step - loss: 5.5007e-05 - mean_squared_error: 5.5007e-05 - val_loss: 5.9474e-05 - val_mean_squared_error: 5.9474e-05

Epoch 00037: val_loss did not improve from 0.00005
Epoch 38/100
3501/3501 [==============================] - 1s 200us/step - loss: 3.2148e-04 - mean_squared_error: 3.2148e-04 - val_loss: 6.6656e-04 - val_mean_squared_error: 6.6656e-04

Epoch 00038: val_loss did not improve from 0.00005
Epoch 39/100
3501/3501 [==============================] - 1s 148us/step - loss: 1.5987e-04 - mean_squared_error: 1.5987e-04 - val_loss: 4.3525e-05 - val_mean_squared_error: 4.3525e-05

Epoch 00039: val_loss improved from 0.00005 to 0.00004, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 40/100
3501/3501 [==============================] - 1s 143us/step - loss: 4.0375e-05 - mean_squared_error: 4.0375e-05 - val_loss: 3.8799e-05 - val_mean_squared_error: 3.8799e-05

Epoch 00040: val_loss improved from 0.00004 to 0.00004, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 41/100
3501/3501 [==============================] - 1s 173us/step - loss: 3.8916e-05 - mean_squared_error: 3.8916e-05 - val_loss: 3.7477e-05 - val_mean_squared_error: 3.7477e-05

Epoch 00041: val_loss improved from 0.00004 to 0.00004, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 42/100
3501/3501 [==============================] - 0s 142us/step - loss: 3.8079e-05 - mean_squared_error: 3.8079e-05 - val_loss: 3.7386e-05 - val_mean_squared_error: 3.7386e-05

Epoch 00042: val_loss improved from 0.00004 to 0.00004, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 43/100
3501/3501 [==============================] - 1s 181us/step - loss: 3.8363e-05 - mean_squared_error: 3.8363e-05 - val_loss: 4.6886e-05 - val_mean_squared_error: 4.6886e-05

Epoch 00043: val_loss did not improve from 0.00004
Epoch 44/100
3501/3501 [==============================] - 1s 207us/step - loss: 9.3752e-05 - mean_squared_error: 9.3752e-05 - val_loss: 8.5156e-05 - val_mean_squared_error: 8.5156e-05

Epoch 00044: val_loss did not improve from 0.00004
Epoch 45/100
3501/3501 [==============================] - 1s 234us/step - loss: 1.9868e-04 - mean_squared_error: 1.9868e-04 - val_loss: 4.4134e-05 - val_mean_squared_error: 4.4134e-05

Epoch 00045: val_loss did not improve from 0.00004
Epoch 46/100
3501/3501 [==============================] - 1s 245us/step - loss: 3.5457e-05 - mean_squared_error: 3.5457e-05 - val_loss: 3.3888e-05 - val_mean_squared_error: 3.3888e-05

Epoch 00046: val_loss improved from 0.00004 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 47/100
3501/3501 [==============================] - 1s 244us/step - loss: 3.5797e-05 - mean_squared_error: 3.5797e-05 - val_loss: 4.5024e-05 - val_mean_squared_error: 4.5024e-05

Epoch 00047: val_loss did not improve from 0.00003
Epoch 48/100
3501/3501 [==============================] - 1s 150us/step - loss: 3.4105e-05 - mean_squared_error: 3.4105e-05 - val_loss: 3.1702e-05 - val_mean_squared_error: 3.1702e-05

Epoch 00048: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 49/100
3501/3501 [==============================] - 1s 162us/step - loss: 3.0570e-04 - mean_squared_error: 3.0570e-04 - val_loss: 0.0013 - val_mean_squared_error: 0.0013

Epoch 00049: val_loss did not improve from 0.00003
Epoch 50/100
3501/3501 [==============================] - 1s 213us/step - loss: 1.1956e-04 - mean_squared_error: 1.1956e-04 - val_loss: 3.1140e-05 - val_mean_squared_error: 3.1140e-05

Epoch 00050: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 51/100
3501/3501 [==============================] - 1s 159us/step - loss: 3.1376e-05 - mean_squared_error: 3.1376e-05 - val_loss: 3.0275e-05 - val_mean_squared_error: 3.0275e-05

Epoch 00051: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 52/100
3501/3501 [==============================] - 1s 150us/step - loss: 4.8290e-05 - mean_squared_error: 4.8290e-05 - val_loss: 1.2244e-04 - val_mean_squared_error: 1.2244e-04

Epoch 00052: val_loss did not improve from 0.00003
Epoch 53/100
3501/3501 [==============================] - 1s 158us/step - loss: 4.6348e-05 - mean_squared_error: 4.6348e-05 - val_loss: 3.0070e-05 - val_mean_squared_error: 3.0070e-05

Epoch 00053: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 54/100
3501/3501 [==============================] - 1s 144us/step - loss: 4.4264e-05 - mean_squared_error: 4.4264e-05 - val_loss: 3.8164e-05 - val_mean_squared_error: 3.8164e-05

Epoch 00054: val_loss did not improve from 0.00003
Epoch 55/100
3501/3501 [==============================] - 1s 147us/step - loss: 4.0062e-05 - mean_squared_error: 4.0062e-05 - val_loss: 7.6668e-05 - val_mean_squared_error: 7.6668e-05

Epoch 00055: val_loss did not improve from 0.00003
Epoch 56/100
3501/3501 [==============================] - 1s 150us/step - loss: 8.1536e-05 - mean_squared_error: 8.1536e-05 - val_loss: 3.6079e-05 - val_mean_squared_error: 3.6079e-05

Epoch 00056: val_loss did not improve from 0.00003
Epoch 57/100
3501/3501 [==============================] - 1s 146us/step - loss: 1.0047e-04 - mean_squared_error: 1.0047e-04 - val_loss: 2.0018e-04 - val_mean_squared_error: 2.0018e-04

Epoch 00057: val_loss did not improve from 0.00003
Epoch 58/100
3501/3501 [==============================] - 1s 212us/step - loss: 1.0954e-04 - mean_squared_error: 1.0954e-04 - val_loss: 3.3508e-05 - val_mean_squared_error: 3.3508e-05

Epoch 00058: val_loss did not improve from 0.00003
Epoch 59/100
3501/3501 [==============================] - 1s 197us/step - loss: 3.8543e-05 - mean_squared_error: 3.8543e-05 - val_loss: 2.6846e-05 - val_mean_squared_error: 2.6846e-05

Epoch 00059: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 60/100
3501/3501 [==============================] - 1s 169us/step - loss: 3.0555e-05 - mean_squared_error: 3.0555e-05 - val_loss: 4.8613e-05 - val_mean_squared_error: 4.8613e-05

Epoch 00060: val_loss did not improve from 0.00003
Epoch 61/100
3501/3501 [==============================] - 1s 146us/step - loss: 2.5328e-04 - mean_squared_error: 2.5328e-04 - val_loss: 4.4946e-04 - val_mean_squared_error: 4.4946e-04

Epoch 00061: val_loss did not improve from 0.00003
Epoch 62/100
3501/3501 [==============================] - 1s 147us/step - loss: 7.1338e-05 - mean_squared_error: 7.1338e-05 - val_loss: 2.5187e-05 - val_mean_squared_error: 2.5187e-05

Epoch 00062: val_loss improved from 0.00003 to 0.00003, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 63/100
3501/3501 [==============================] - 1s 217us/step - loss: 2.6905e-05 - mean_squared_error: 2.6905e-05 - val_loss: 3.0878e-05 - val_mean_squared_error: 3.0878e-05

Epoch 00063: val_loss did not improve from 0.00003
Epoch 64/100
3501/3501 [==============================] - 1s 266us/step - loss: 2.6612e-05 - mean_squared_error: 2.6612e-05 - val_loss: 2.9492e-05 - val_mean_squared_error: 2.9492e-05

Epoch 00064: val_loss did not improve from 0.00003
Epoch 65/100
3501/3501 [==============================] - 1s 167us/step - loss: 7.0185e-05 - mean_squared_error: 7.0185e-05 - val_loss: 3.9116e-05 - val_mean_squared_error: 3.9116e-05

Epoch 00065: val_loss did not improve from 0.00003
Epoch 66/100
3501/3501 [==============================] - 0s 140us/step - loss: 8.2021e-05 - mean_squared_error: 8.2021e-05 - val_loss: 2.5925e-04 - val_mean_squared_error: 2.5925e-04

Epoch 00066: val_loss did not improve from 0.00003
Epoch 67/100
3501/3501 [==============================] - 1s 190us/step - loss: 7.5346e-05 - mean_squared_error: 7.5346e-05 - val_loss: 4.5074e-05 - val_mean_squared_error: 4.5074e-05

Epoch 00067: val_loss did not improve from 0.00003
Epoch 68/100
3501/3501 [==============================] - 1s 188us/step - loss: 3.0216e-05 - mean_squared_error: 3.0216e-05 - val_loss: 2.6978e-05 - val_mean_squared_error: 2.6978e-05

Epoch 00068: val_loss did not improve from 0.00003
Epoch 69/100
3501/3501 [==============================] - 1s 163us/step - loss: 1.0620e-04 - mean_squared_error: 1.0620e-04 - val_loss: 4.7271e-05 - val_mean_squared_error: 4.7271e-05

Epoch 00069: val_loss did not improve from 0.00003
Epoch 70/100
3501/3501 [==============================] - 1s 195us/step - loss: 2.8511e-04 - mean_squared_error: 2.8511e-04 - val_loss: 8.3107e-05 - val_mean_squared_error: 8.3107e-05

Epoch 00070: val_loss did not improve from 0.00003
Epoch 71/100
3501/3501 [==============================] - 1s 199us/step - loss: 3.4888e-05 - mean_squared_error: 3.4888e-05 - val_loss: 2.1939e-05 - val_mean_squared_error: 2.1939e-05

Epoch 00071: val_loss improved from 0.00003 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 72/100
3501/3501 [==============================] - 1s 151us/step - loss: 2.2851e-05 - mean_squared_error: 2.2851e-05 - val_loss: 2.8569e-05 - val_mean_squared_error: 2.8569e-05

Epoch 00072: val_loss did not improve from 0.00002
Epoch 73/100
3501/3501 [==============================] - 1s 153us/step - loss: 2.3349e-05 - mean_squared_error: 2.3349e-05 - val_loss: 2.2561e-05 - val_mean_squared_error: 2.2561e-05

Epoch 00073: val_loss did not improve from 0.00002
Epoch 74/100
3501/3501 [==============================] - 0s 138us/step - loss: 3.5593e-05 - mean_squared_error: 3.5593e-05 - val_loss: 4.1796e-05 - val_mean_squared_error: 4.1796e-05

Epoch 00074: val_loss did not improve from 0.00002
Epoch 75/100
3501/3501 [==============================] - 1s 147us/step - loss: 3.6719e-05 - mean_squared_error: 3.6719e-05 - val_loss: 8.7785e-05 - val_mean_squared_error: 8.7785e-05

Epoch 00075: val_loss did not improve from 0.00002
Epoch 76/100
3501/3501 [==============================] - 0s 140us/step - loss: 1.3555e-04 - mean_squared_error: 1.3555e-04 - val_loss: 2.8934e-05 - val_mean_squared_error: 2.8934e-05

Epoch 00076: val_loss did not improve from 0.00002
Epoch 77/100
3501/3501 [==============================] - 1s 149us/step - loss: 2.8458e-05 - mean_squared_error: 2.8458e-05 - val_loss: 3.1365e-05 - val_mean_squared_error: 3.1365e-05

Epoch 00077: val_loss did not improve from 0.00002
Epoch 78/100
3501/3501 [==============================] - 1s 144us/step - loss: 5.7461e-05 - mean_squared_error: 5.7461e-05 - val_loss: 3.6815e-05 - val_mean_squared_error: 3.6815e-05

Epoch 00078: val_loss did not improve from 0.00002
Epoch 79/100
3501/3501 [==============================] - 1s 148us/step - loss: 6.4324e-05 - mean_squared_error: 6.4324e-05 - val_loss: 8.9769e-05 - val_mean_squared_error: 8.9769e-05

Epoch 00079: val_loss did not improve from 0.00002
Epoch 80/100
3501/3501 [==============================] - 1s 150us/step - loss: 5.9422e-05 - mean_squared_error: 5.9422e-05 - val_loss: 9.6168e-05 - val_mean_squared_error: 9.6168e-05

Epoch 00080: val_loss did not improve from 0.00002
Epoch 81/100
3501/3501 [==============================] - 1s 144us/step - loss: 2.4627e-04 - mean_squared_error: 2.4627e-04 - val_loss: 6.1650e-04 - val_mean_squared_error: 6.1650e-04

Epoch 00081: val_loss did not improve from 0.00002
Epoch 82/100
3501/3501 [==============================] - 0s 139us/step - loss: 6.7262e-05 - mean_squared_error: 6.7262e-05 - val_loss: 2.0998e-05 - val_mean_squared_error: 2.0998e-05

Epoch 00082: val_loss improved from 0.00002 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 83/100
3501/3501 [==============================] - 1s 144us/step - loss: 2.1539e-05 - mean_squared_error: 2.1539e-05 - val_loss: 4.2927e-05 - val_mean_squared_error: 4.2927e-05

Epoch 00083: val_loss did not improve from 0.00002
Epoch 84/100
3501/3501 [==============================] - 1s 150us/step - loss: 6.0519e-05 - mean_squared_error: 6.0519e-05 - val_loss: 3.9719e-05 - val_mean_squared_error: 3.9719e-05

Epoch 00084: val_loss did not improve from 0.00002
Epoch 85/100
3501/3501 [==============================] - 1s 146us/step - loss: 2.9836e-05 - mean_squared_error: 2.9836e-05 - val_loss: 2.8181e-05 - val_mean_squared_error: 2.8181e-05

Epoch 00085: val_loss did not improve from 0.00002
Epoch 86/100
3501/3501 [==============================] - 0s 141us/step - loss: 5.5841e-05 - mean_squared_error: 5.5841e-05 - val_loss: 2.7892e-05 - val_mean_squared_error: 2.7892e-05

Epoch 00086: val_loss did not improve from 0.00002
Epoch 87/100
3501/3501 [==============================] - 0s 140us/step - loss: 4.5011e-05 - mean_squared_error: 4.5011e-05 - val_loss: 8.4272e-05 - val_mean_squared_error: 8.4272e-05

Epoch 00087: val_loss did not improve from 0.00002
Epoch 88/100
3501/3501 [==============================] - 1s 152us/step - loss: 9.3875e-05 - mean_squared_error: 9.3875e-05 - val_loss: 3.2714e-04 - val_mean_squared_error: 3.2714e-04

Epoch 00088: val_loss did not improve from 0.00002
Epoch 89/100
3501/3501 [==============================] - 1s 147us/step - loss: 1.0599e-04 - mean_squared_error: 1.0599e-04 - val_loss: 2.5802e-05 - val_mean_squared_error: 2.5802e-05

Epoch 00089: val_loss did not improve from 0.00002
Epoch 90/100
3501/3501 [==============================] - 1s 146us/step - loss: 2.1064e-05 - mean_squared_error: 2.1064e-05 - val_loss: 2.1859e-05 - val_mean_squared_error: 2.1859e-05

Epoch 00090: val_loss did not improve from 0.00002
Epoch 91/100
3501/3501 [==============================] - 1s 155us/step - loss: 2.4710e-05 - mean_squared_error: 2.4710e-05 - val_loss: 3.3201e-05 - val_mean_squared_error: 3.3201e-05

Epoch 00091: val_loss did not improve from 0.00002
Epoch 92/100
3501/3501 [==============================] - 1s 143us/step - loss: 1.9832e-04 - mean_squared_error: 1.9832e-04 - val_loss: 0.0012 - val_mean_squared_error: 0.0012

Epoch 00092: val_loss did not improve from 0.00002
Epoch 93/100
3501/3501 [==============================] - 0s 138us/step - loss: 1.8404e-04 - mean_squared_error: 1.8404e-04 - val_loss: 1.9404e-05 - val_mean_squared_error: 1.9404e-05

Epoch 00093: val_loss improved from 0.00002 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 94/100
3501/3501 [==============================] - 0s 141us/step - loss: 1.7544e-05 - mean_squared_error: 1.7544e-05 - val_loss: 1.6940e-05 - val_mean_squared_error: 1.6940e-05

Epoch 00094: val_loss improved from 0.00002 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 95/100
3501/3501 [==============================] - 0s 139us/step - loss: 1.7011e-05 - mean_squared_error: 1.7011e-05 - val_loss: 1.6420e-05 - val_mean_squared_error: 1.6420e-05

Epoch 00095: val_loss improved from 0.00002 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
Epoch 96/100
3501/3501 [==============================] - 1s 145us/step - loss: 1.6887e-05 - mean_squared_error: 1.6887e-05 - val_loss: 1.6874e-05 - val_mean_squared_error: 1.6874e-05

Epoch 00096: val_loss did not improve from 0.00002
Epoch 97/100
3501/3501 [==============================] - 0s 143us/step - loss: 1.7039e-05 - mean_squared_error: 1.7039e-05 - val_loss: 2.4081e-05 - val_mean_squared_error: 2.4081e-05

Epoch 00097: val_loss did not improve from 0.00002
Epoch 98/100
3501/3501 [==============================] - 1s 149us/step - loss: 1.3684e-04 - mean_squared_error: 1.3684e-04 - val_loss: 0.0012 - val_mean_squared_error: 0.0012

Epoch 00098: val_loss did not improve from 0.00002
Epoch 99/100
3501/3501 [==============================] - 1s 147us/step - loss: 2.4728e-04 - mean_squared_error: 2.4728e-04 - val_loss: 1.8387e-05 - val_mean_squared_error: 1.8387e-05

Epoch 00099: val_loss did not improve from 0.00002
Epoch 100/100
3501/3501 [==============================] - 1s 145us/step - loss: 1.6816e-05 - mean_squared_error: 1.6816e-05 - val_loss: 1.5990e-05 - val_mean_squared_error: 1.5990e-05

Epoch 00100: val_loss improved from 0.00002 to 0.00002, saving model to aen_sin_003Hz_n=00128_rodent.h5
In [11]:
# QC training and validation curves (should follow each other)
plt.figure(figsize=(8,2))
plt.plot(results.history['val_loss'], label='val')
plt.plot(results.history['loss'], label='train')
plt.xlabel('epoch index')
plt.ylabel('loss value (MSE)')
plt.legend()
plt.show()
In [12]:
# QC that autoencoder can autoencode a sine wave from the test set
decoded_sin = ae.predict(x_test)
plt.figure(figsize=(8,2))
plt.plot(t,decoded_sin[np.random.randint(0,n_test),:])  # upper bound is exclusive
plt.xlabel('time (s)')
plt.ylabel('amplitude')
plt.show()

Inspect latent space


  1. What does the encoded representation of our sine wave look like?
  2. What does the decoded version of each latent dimension look like?
In [13]:
# encoded representation of the test set
encoded_test = encoder.predict(x_test)

# decoded representation of each dimension of the latent space
# (each row of the identity matrix activates one latent dimension at a time)
decoded_latent = decoder.predict( np.eye(encoding_dim) ) 
In [14]:
# QC each latent space dimension distribution and corresponding decoded representation
cmap = plt.cm.get_cmap('Dark2',encoding_dim)

plt.figure(figsize=(10,5))

plt.subplot(121)
violins = plt.violinplot(encoded_test, vert=False)
for i,violin in enumerate(violins['bodies']):
    violin.set_color(cmap(i))
plt.title('Distribution of latent space values')
plt.xlabel('latent space value')
plt.ylabel('latent space dimension index')

plt.subplot(122)
for i in range(encoding_dim):
    plt.plot(t, decoded_latent[i,:]*0.5+i, c=cmap(i))
plt.title('Decoded latent space dimensions')
plt.xlabel('time (s)')
plt.gca().yaxis.set_ticklabels([])

plt.show()

Showcase denoising properties of autoencoder


Here is where the fun begins with denoising using autoencoders. Let's add Gaussian noise with standard deviations from zero to six, then observe how well the autoencoder denoises.

Note that we have basically trained a nonlinear combination of connected numbers (a network) that collectively excels at encoding and decoding 3 Hz sine waves.

So, we expect a 3 Hz sine wave at the output for a wide variety of inputs to this network. Recall that the input signal is a sine wave with a peak-to-peak amplitude of 2, so the noise levels used here are high relative to the signal.
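To put a number on how high these noise levels are, the signal-to-noise ratio can be estimated. The sketch below assumes a unit-amplitude sine (mean power of about 0.5) and white Gaussian noise of standard deviation sigma (power sigma squared); `snr_db` is a hypothetical helper. The noise-free case is skipped because its ratio is infinite.

```python
import numpy as np

def snr_db(sigma, signal_power=0.5):
    """Signal-to-noise ratio in dB for noise of standard deviation sigma."""
    return 10 * np.log10(signal_power / sigma**2)

# noise standard deviations used for the noisy plots
for sigma in range(1, 7):
    print('sigma = %d   SNR = %6.1f dB' % (sigma, snr_db(sigma)))
```

Every value is negative, meaning the noise carries more power than the sine wave in all of the noisy cases.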

(We could improve denoising by including noisy sine waves in our training data.)
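A minimal sketch of what such training data could look like, assuming noisy waves as inputs and the clean waves as targets. The set size, the seed, and the noise range of 0 to 2 are assumptions for illustration only, and the training call is left as a comment because it needs the compiled model.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seed chosen for reproducibility (assumption)
nt, f = 128, 3.0
t = np.linspace(0, 1, nt)
phase = rng.uniform(-np.pi, np.pi, size=256)   # small stand-in set
x_clean = np.sin(2*np.pi*f*t + phase[:, None])

# one noise level per wave, drawn from an assumed range of [0, 2)
sigma = rng.uniform(0.0, 2.0, size=(len(x_clean), 1))
x_noisy = x_clean + sigma * rng.standard_normal(x_clean.shape)

# the network would then map noisy inputs to clean targets, for example:
# ae.fit(x_noisy, x_clean, batch_size=16, epochs=100)
print(x_noisy.shape, x_clean.shape)
```

Keeping the clean waves as targets means the `tanh` output layer still only has to produce values between -1 and 1.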

In [15]:
# specific sine wave index of interest
sin_index = np.random.randint(0,n_test-1)
In [17]:
# plot effects of noise on input
slope = 3
fcos = 10
for i in range(7):
    plt.figure(figsize=(8,2))

    sin_noisy = x_test + np.random.randn(n_test,nt)*i
    #sin_noisy = x_test + i*np.cos(2*np.pi*np.linspace(0,1,128)*2)
    #sin_noisy = x_test + np.random.randn(n_test,nt)*i + np.cos(2*np.pi*np.linspace(0,1,128)*fcos)
    decoded_sin = ae.predict(sin_noisy)

    plt.plot(t,x_test[sin_index,:], 'k', lw=10, alpha=0.2, label='no noise')
    plt.plot(t, sin_noisy[sin_index,:], label='noisy input')
    plt.plot(t,decoded_sin[sin_index,:], label='denoised')

    plt.title(r'noise $\sigma$ = %2.1f'%i)
    plt.legend(loc='upper left')

    plt.show()