Machine learning with tensorflow#
Aim(s) for this section 🎯#
get practical experience with ANNs, specifically CNNs
build, train and evaluate a CNN
discuss important building blocks and learn how to interpret outcomes
Outline for this section 📝#
The tutorial dataset
preparing the data
Building and training an ANN - a 2D CNN example
Python and deep learning
defining the basics
building an ANN
how to train your network
Evaluating an ANN
The test set
Confusion matrix
Generalization
Transfer learning
import random
random.seed(0)
The tutorial dataset#
In order to demonstrate how you can build and train an ANN we need a dataset that fits several requirements:
we are all here with laptops that most likely don’t have the computational power of HPCs and graphics cards (if you do, good for you!), thus the dataset needs to be small enough so that we can actually train our ANN within a short amount of time and without GPUs
thinking this further: we also might not want to test the simplest ANN, but one with a few hidden layers
it would be cool to use a dataset with at least some real world feeling to demonstrate a somewhat typical workflow
We thus decided on a small fMRI dataset from Zhang et al. with the following specs:
two resting-state sessions from 48 participants
one with eyes-closed and one with eyes-open
we will use a subset of volumes of each session

This will allow us to:
address a (somewhat) realistic image processing task via supervised learning for which we can employ a CNN
showcase how parameters might change the ANN
evaluate representations across layers
Let’s gather our dataset
import urllib.request
url = 'https://github.com/miykael/workshop_pybrain/raw/master/workshop/notebooks/data/dataset_ML.nii.gz'
urllib.request.urlretrieve(url, 'dataest_ML.nii.gz')
('dataest_ML.nii.gz', <http.client.HTTPMessage at 0x10d21c640>)
and check its dimensions as well as visually inspect it:
import nibabel as nb
data = nb.load('dataest_ML.nii.gz')
data.shape
(40, 51, 41, 384)
data.orthoview()
<OrthoSlicer3D: dataest_ML.nii.gz (40, 51, 41, 384)>

We can also plot the mean image across time to get an idea about signal variation:
from nilearn.image import mean_img
from nilearn.plotting import view_img
view_img(mean_img(data))
Well well well, there should be something in there that an ANN can learn…
the task:
we know that there are images where participants had their eyes open or closed
we now want to build an ANN to train it to recognize and distinguish the respective images
we also want to know what representations our ANN learns
thus, we have a supervised learning problem which we want to solve via image processing
what we need to do:
prepare the data
decide on a model, build and train it
Preparing the data
From our adventures in "classic" machine learning we know that we need labels to address a supervised learning problem. Checking the dimensions of our dataset again:
data.shape
(40, 51, 41, 384)
We see that we have a 4-dimensional dataset, with the first three dimensions being spatial, i.e. x, y and z, and the fourth being time. So we need to specify during which of the images participants had their eyes closed and during which they had their eyes open. Without going into further detail, we know that it’s always 4 volumes of eyes closed, followed by 4 volumes of eyes open, etc. and given that we have 48 participants, we can define our labels as follows:
import numpy as np
labels = np.ravel([[['closed'] * 4, ['open'] * 4] for i in range(48)])
labels[:20]
array(['closed', 'closed', 'closed', 'closed', 'open', 'open', 'open',
'open', 'closed', 'closed', 'closed', 'closed', 'open', 'open',
'open', 'open', 'closed', 'closed', 'closed', 'closed'],
dtype='<U6')
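As a quick sanity check (a minimal sketch, not part of the original workflow), one can verify that there is exactly one label per volume and that both classes are balanced. The value 384 is taken from the dataset shape shown above; the names `n_volumes`, `classes` and `counts` are only illustrative.

```python
import numpy as np

# Recreate the labels as defined above: 4 x 'closed', then 4 x 'open', for 48 participants
labels = np.ravel([[['closed'] * 4, ['open'] * 4] for _ in range(48)])

# Number of volumes, i.e. the size of the fourth dimension of the dataset
n_volumes = 384

# Supervised learning needs exactly one label per volume
assert labels.shape == (n_volumes,)

# Count how often each class occurs
classes, counts = np.unique(labels, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))
```

If the assertion fails, the label definition and the data are out of sync, which is worth catching before any training starts.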
Going back to the aspect of computation time and resources, as well as given that this is a showcase, it might be a good idea to not utilize the entire fMRI volume, but only certain parts where we expect some things to happen. (Please note: this is of course a form of inductive bias comparable to feature engineering in "classic" machine learning and something you won’t do in a “real-world situation” (depending on the data and goal of course)).
In our case, we could try to train the neural network only on one very thin slab (a few slices) of the brain. So, instead of taking the data matrix of the whole brain, we just take a few slices in the region that we think is most likely to be predictive for the question at hand.
We know (or suspect) that the regions with the most predictive power are probably somewhere around the eyes and in the visual cortex. So, let’s try to take just a few slices that cover those regions:
from nilearn.plotting import plot_img
plot_img(mean_img(data).slicer[...,5:-25], cmap='magma', colorbar=False,
display_mode='x', vmax=2, annotate=False, cut_coords=range(0, 49, 12),
title='Slab of the mean image');

This worked only so-so, but with a few lines of code the mighty power of python and its packages can help us achieve a better training dataset. For example, we could rotate the volume (depending on the data and goal, this sort of image processing is actually sometimes done in “real-world situations”):
# Rotation parameters
phi = 0.35
cos = np.cos(phi)
sin = np.sin(phi)
# Compute rotation matrix around x-axis
rotation_affine = np.array([[1, 0, 0, 0],
[0, cos, -sin, 0],
[0, sin, cos, 0],
[0, 0, 0, 1]])
new_affine = rotation_affine.dot(data.affine)
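To see that the matrix above really describes a pure rotation around the x-axis, a small check can help. This is only a sketch under the assumption that the same angle `phi = 0.35` (in radians) is used; the names `rotation`, `is_orthonormal` and `x_axis_rotated` are illustrative and not part of the original code.

```python
import numpy as np

phi = 0.35  # rotation angle in radians, as used above
c, s = np.cos(phi), np.sin(phi)

# Same 4x4 affine as above: rotation around the x-axis, no translation
rotation_affine = np.array([[1, 0, 0, 0],
                            [0, c, -s, 0],
                            [0, s, c, 0],
                            [0, 0, 0, 1]])

# The upper-left 3x3 block is the pure rotation part of the affine
rotation = rotation_affine[:3, :3]

# A proper rotation is orthonormal and has determinant +1
is_orthonormal = np.allclose(rotation @ rotation.T, np.eye(3))
determinant = np.linalg.det(rotation)

# A point on the x-axis (in homogeneous coordinates) is left unchanged
x_axis = np.array([1, 0, 0, 1])
x_axis_rotated = rotation_affine @ x_axis

print(is_orthonormal, round(float(determinant), 6), round(float(np.degrees(phi)), 2))
```

An angle of 0.35 radians corresponds to roughly 20 degrees, which is the tilt applied to the volumes.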
Now we can use this new affine to resample our volumes:
from nilearn.image import resample_img
new_img = nb.Nifti1Image(data.get_fdata(), new_affine)
img_rot = resample_img(new_img, data.affine, interpolation='continuous')
How do our volumes look now?
plot_img(mean_img(img_rot).slicer[...,5:-25], cmap='magma', colorbar=False,
display_mode='x', vmax=2, annotate=False, cut_coords=range(0, 49, 12),
title='Slab of the mean rotated image');

Coolio! Now we can check what set of slices of our volumes might constitute feasible inputs to our ANN:
from nilearn.plotting import plot_stat_map
img_slab = img_rot.slicer[..., 12:15, :]
plot_stat_map(mean_img(img_slab), cmap='magma', bg_img=mean_img(img_slab), colorbar=False,
display_mode='x', vmax=2, annotate=False, cut_coords=range(-20, 30, 12),
title='Slices of the rotated image');

Now this is something we can definitely work with, even if we have only limited time and resources.
Building and training an ANN - a 2D CNN example#
Now that we have checked and further prepared our dataset, it’s finally time to get to work. Given that we’re working with fMRI volumes, i.e. images, and what we’ve heard about the different ANN architectures, using a CNN might be a good idea.
But where to start? Is there any software I can use that makes the building, training and evaluating of ANNs “comparably easy”?
Well, say no more… Python obviously also has your back when it’s about deep learning (gotta love python, eh?)! It actually has not only one but a bunch of different packages that focus on deep learning. Let’s have a brief look at the things that are out there.
Python and deep learning#
As outlined before, python is a very powerful all-purpose language, including a broad user base and support for machine learning, both “classic” and deep learning.
https://miro.medium.com/max/1400/1*RIrPOCyMFwFC-XULbja3rw.png
lots of well documented and tested libraries
lots of tutorials to learn things (you + the ANN):
youtube videos
blog posts
other open workshops
jupyter notebooks
lots of pre-trained models to use for your research
lots of support in forums
completely free and open source!

all work a bit differently, but the basic concepts and steps are comparable
nevertheless: always check the documentation as e.g. default values might vary
crucial in all: tensors
have a look at this great introduction to tensors from tensorflow
the question which one to choose is of course not an easy one and might also depend on external factors:
the type and amount of data you have
the time and computational resources available to you
specific functionality that only exists in a certain package
utilization of pre-trained ANNs
what you’ve heard about and others show you (that’s obviously on us…)
here we will use keras which is built on top of tensorflow because:
high-level API
easy to grasp implementation of ANN building blocks
fast experimentation
for a fantastic resource that includes all things we talked about/will talk about and way more in much greater detail, please check the deep learning part of Neuromatch Academy
important: we’re not saying that keras/tensorflow is better than the other python deep learning libraries, it just works very well for tutorials/workshops like the one you’re currently at given the very limited time we have
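To make the notion of tensors mentioned above a bit more concrete: for our purposes a tensor can be thought of as an n-dimensional array with a shape and a data type. The following sketch uses plain numpy arrays as stand-ins (an assumption made to keep the example independent of any deep learning library); the toy shapes are illustrative and much smaller than the real dataset.

```python
import numpy as np

# Arrays of increasing rank (number of dimensions)
scalar = np.array(5.0)               # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])   # rank 1: e.g. the signal of one voxel over time
matrix = np.zeros((40, 51))          # rank 2: e.g. a single 2D slice
volumes = np.zeros((4, 5, 3, 8))     # rank 4: toy stand-in for (x, y, z, time)

# Rank, shape and data type are the properties you will check most often
for name, tensor in [('scalar', scalar), ('vector', vector),
                     ('matrix', matrix), ('volumes', volumes)]:
    print(name, tensor.ndim, tensor.shape, tensor.dtype)
```

Most shape-related errors when feeding data into an ANN come down to one of these three properties not matching what a layer expects.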
Now it’s finally go time, get your machines ready!

Defining the basics#
Before we can actually assemble our ANN, we need to set a few things. However, first things first: importing modules and classes:
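# Import everything from the public tensorflow.keras API; mixing it with the
# private tensorflow.python.keras path can lead to incompatible objects
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, AvgPool2D, BatchNormalization
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam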
Next, we need to take a look at our dataset again, specifically its dimensions:
img_slab.shape
(40, 56, 3, 384)
Again, we have the x, y and z of our images, i.e. the images themselves, in the first three dimensions, which are stacked in the fourth dimension. For this type of data to work with keras/tensorflow we actually need to adapt, that is swap, some of the dimensions, as these modules/functions expect them in a different order. This part of getting your data ready as input into a given ANN is crucial and can cause the occasional problem. Therefore, always make sure to carefully read the documentation of a class, module or pre-trained model you want to use. They are usually very good and include examples of how to get data ready for the ANN.
That being said, here we basically only need to make the last dimension the first, so that we have the volumes/images stacked in the first dimension and the images themselves within the subsequent three:
data = np.rollaxis(img_slab.get_fdata(), 3, 0)
data.shape
(384, 40, 56, 3)
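The axis swap can be checked on a small toy array. The sketch below assumes random numbers in place of the real image data and only 8 "volumes"; it also shows `np.moveaxis` and `np.transpose` as equivalent alternatives to `np.rollaxis` for this particular reordering.

```python
import numpy as np

# Toy stand-in for the image data: (x, y, z, volumes); random values, not real fMRI data
toy = np.random.default_rng(0).normal(size=(40, 56, 3, 8))

# Three equivalent ways to move the last axis to the front
rolled = np.rollaxis(toy, 3, 0)
moved = np.moveaxis(toy, -1, 0)
transposed = np.transpose(toy, (3, 0, 1, 2))

print(rolled.shape)

# All three give the same result ...
assert np.array_equal(rolled, moved)
assert np.array_equal(rolled, transposed)

# ... and the first sample is still the first volume of the original array
assert np.array_equal(moved[0], toy[..., 0])
```

The last assertion is the important one: reordering axes must not mix up which image belongs to which sample, otherwise the labels would no longer match.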
Specifically, the last dimension, here 3, is treated as channels.
There are some central parameters we can set before building the ANN itself. For example, we know the shape of the input. That is, the dimensions our input layer will receive:
data_shape = tuple(data.shape[1:])
data_shape
(40, 56, 3)
We also want to set the kernel size of our convolutional kernel. As heard before, this can be a tremendously important hyperparameter that can drastically affect the behavior of your ANN. It is thus something you have to carefully think about and might even want to evaluate via cross-validation. Here, we will use a kernel size of (3,3).
kernel_size = (3, 3)
The same holds true for the filters we want our convolutional layers to use:
filters = 32
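To get a feeling for what these two settings imply, the following sketch computes the output shape and the number of trainable parameters of a single 2D convolutional layer applied to our input. The helper functions `conv_output_size` and `conv2d_param_count` are hypothetical names written for this illustration, not part of keras; they assume no padding by default, a stride of 1 and one bias term per filter.

```python
import math

def conv_output_size(n, kernel, stride=1, dilation=1, padding="valid"):
    """Output length along one spatial dimension of a convolution."""
    if padding == "same":
        # zero-padding keeps the size (divided by the stride, rounded up)
        return math.ceil(n / stride)
    # "valid": the (possibly dilated) kernel has to fit completely into the input
    effective_kernel = dilation * (kernel - 1) + 1
    return (n - effective_kernel) // stride + 1

def conv2d_param_count(kernel_size, in_channels, filters, use_bias=True):
    """Trainable parameters: one kernel per filter spanning all input channels, plus biases."""
    kernel_height, kernel_width = kernel_size
    return filters * (kernel_height * kernel_width * in_channels + int(use_bias))

# Values used in this section
data_shape = (40, 56, 3)
kernel_size = (3, 3)
filters = 32

out_shape = tuple(conv_output_size(n, k)
                  for n, k in zip(data_shape[:2], kernel_size)) + (filters,)
n_params = conv2d_param_count(kernel_size, data_shape[-1], filters)

print(out_shape, n_params)
```

With a 3x3 kernel and no padding each spatial dimension shrinks by 2, while the number of filters determines the number of output channels. More filters or larger kernels increase the parameter count and thus the computation time.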
Given that we want to work with a supervised learning problem and know that there are 2 classes we want our ANN to learn to distinguish, we can set the number of classes accordingly:
n_classes = 2
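An ANN works with numbers, not strings, so the labels `'closed'` and `'open'` have to be encoded numerically at some point. A minimal sketch using only numpy is shown below; whether integer codes or one-hot vectors are needed depends on the loss function chosen later, so treat this as one possible approach under that assumption. The names `class_names`, `y` and `y_onehot` are illustrative.

```python
import numpy as np

# Labels as defined earlier in this section
labels = np.ravel([[['closed'] * 4, ['open'] * 4] for _ in range(48)])
n_classes = 2

# np.unique sorts the class names alphabetically and returns, for every label,
# the index of its class: 'closed' -> 0, 'open' -> 1
class_names, y = np.unique(labels, return_inverse=True)

# One-hot encoding: one column per class, a 1 marks the class of each sample
y_onehot = np.eye(n_classes)[y]

print(class_names.tolist(), y[:8].tolist(), y_onehot.shape)
```

Keeping `class_names` around makes it easy to translate numeric predictions back into readable labels later on.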
With that, we’re ready to start building our ANN!
Building an ANN#
You heard right, it’s finally ANN time! Initially, we have to decide on an architecture, that is the type of ANN we want to build. As we want to test a simple CNN, a feedforward ANN without multiple inputs and/or outputs, we will employ what is called a sequential model in keras/tensorflow, within which we define layer by layer. Note: It’s the easiest but also the most restrictive one.
model = Sequential()
Now that the basic structure is defined, we can start adding layers to our ANN. This is achieved by the following syntax (pseudocode):
model.add(layer_type(layer_settings, layer_parameters))
Defining the input layer#
The first step? Obviously defining an input layer, i.e. the layer that receives the external input. We want to build a CNN, so let’s make it a convolutional layer. What do we need for that?
help(Conv2D)
Help on class Conv2D in module keras.layers.convolutional.conv2d:
class Conv2D(keras.layers.convolutional.base_conv.Conv)
| Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), groups=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, **kwargs)
|
| 2D convolution layer (e.g. spatial convolution over images).
|
| This layer creates a convolution kernel that is convolved
| with the layer input to produce a tensor of
| outputs. If `use_bias` is True,
| a bias vector is created and added to the outputs. Finally, if
| `activation` is not `None`, it is applied to the outputs as well.
|
| When using this layer as the first layer in a model,
| provide the keyword argument `input_shape`
| (tuple of integers or `None`, does not include the sample axis),
| e.g. `input_shape=(128, 128, 3)` for 128x128 RGB pictures
| in `data_format="channels_last"`. You can use `None` when
| a dimension has variable size.
|
| Examples:
|
| >>> # The inputs are 28x28 RGB images with `channels_last` and the batch
| >>> # size is 4.
| >>> input_shape = (4, 28, 28, 3)
| >>> x = tf.random.normal(input_shape)
| >>> y = tf.keras.layers.Conv2D(
| ... 2, 3, activation='relu', input_shape=input_shape[1:])(x)
| >>> print(y.shape)
| (4, 26, 26, 2)
|
| >>> # With `dilation_rate` as 2.
| >>> input_shape = (4, 28, 28, 3)
| >>> x = tf.random.normal(input_shape)
| >>> y = tf.keras.layers.Conv2D(
| ... 2, 3,
| ... activation='relu',
| ... dilation_rate=2,
| ... input_shape=input_shape[1:])(x)
| >>> print(y.shape)
| (4, 24, 24, 2)
|
| >>> # With `padding` as "same".
| >>> input_shape = (4, 28, 28, 3)
| >>> x = tf.random.normal(input_shape)
| >>> y = tf.keras.layers.Conv2D(
| ... 2, 3, activation='relu', padding="same", input_shape=input_shape[1:])(x)
| >>> print(y.shape)
| (4, 28, 28, 2)
|
| >>> # With extended batch shape [4, 7]:
| >>> input_shape = (4, 7, 28, 28, 3)
| >>> x = tf.random.normal(input_shape)
| >>> y = tf.keras.layers.Conv2D(
| ... 2, 3, activation='relu', input_shape=input_shape[2:])(x)
| >>> print(y.shape)
| (4, 7, 26, 26, 2)
|
|
| Args:
| filters: Integer, the dimensionality of the output space (i.e. the number
| of output filters in the convolution).
| kernel_size: An integer or tuple/list of 2 integers, specifying the height
| and width of the 2D convolution window. Can be a single integer to
| specify the same value for all spatial dimensions.
| strides: An integer or tuple/list of 2 integers, specifying the strides of
| the convolution along the height and width. Can be a single integer to
| specify the same value for all spatial dimensions. Specifying any stride
| value != 1 is incompatible with specifying any `dilation_rate` value !=
| 1.
| padding: one of `"valid"` or `"same"` (case-insensitive).
| `"valid"` means no padding. `"same"` results in padding with zeros
| evenly to the left/right or up/down of the input. When `padding="same"`
| and `strides=1`, the output has the same size as the input.
| data_format: A string, one of `channels_last` (default) or
| `channels_first`. The ordering of the dimensions in the inputs.
| `channels_last` corresponds to inputs with shape `(batch_size, height,
| width, channels)` while `channels_first` corresponds to inputs with
| shape `(batch_size, channels, height, width)`. It defaults to the
| `image_data_format` value found in your Keras config file at
| `~/.keras/keras.json`. If you never set it, then it will be
| `channels_last`. Note that the `channels_first` format is currently not
| supported by TensorFlow on CPU.
| dilation_rate: an integer or tuple/list of 2 integers, specifying the
| dilation rate to use for dilated convolution. Can be a single integer to
| specify the same value for all spatial dimensions. Currently, specifying
| any `dilation_rate` value != 1 is incompatible with specifying any
| stride value != 1.
| groups: A positive integer specifying the number of groups in which the
| input is split along the channel axis. Each group is convolved
| separately with `filters / groups` filters. The output is the
| concatenation of all the `groups` results along the channel axis. Input
| channels and `filters` must both be divisible by `groups`.
| activation: Activation function to use. If you don't specify anything, no
| activation is applied (see `keras.activations`).
| use_bias: Boolean, whether the layer uses a bias vector.
| kernel_initializer: Initializer for the `kernel` weights matrix (see
| `keras.initializers`). Defaults to 'glorot_uniform'.
| bias_initializer: Initializer for the bias vector (see
| `keras.initializers`). Defaults to 'zeros'.
| kernel_regularizer: Regularizer function applied to the `kernel` weights
| matrix (see `keras.regularizers`).
| bias_regularizer: Regularizer function applied to the bias vector (see
| `keras.regularizers`).
| activity_regularizer: Regularizer function applied to the output of the
| layer (its "activation") (see `keras.regularizers`).
| kernel_constraint: Constraint function applied to the kernel matrix (see
| `keras.constraints`).
| bias_constraint: Constraint function applied to the bias vector (see
| `keras.constraints`).
|
| Input shape:
| 4+D tensor with shape: `batch_shape + (channels, rows, cols)` if
| `data_format='channels_first'`
| or 4+D tensor with shape: `batch_shape + (rows, cols, channels)` if
| `data_format='channels_last'`.
|
| Output shape:
| 4+D tensor with shape: `batch_shape + (filters, new_rows, new_cols)` if
| `data_format='channels_first'` or 4+D tensor with shape: `batch_shape +
| (new_rows, new_cols, filters)` if `data_format='channels_last'`. `rows`
| and `cols` values might have changed due to padding.
|
| Returns:
| A tensor of rank 4+ representing
| `activation(conv2d(inputs, kernel) + bias)`.
|
| Raises:
| ValueError: if `padding` is `"causal"`.
| ValueError: when both `strides > 1` and `dilation_rate > 1`.
|
| Method resolution order:
| Conv2D
| keras.layers.convolutional.base_conv.Conv
| keras.engine.base_layer.Layer
| tensorflow.python.module.module.Module
| tensorflow.python.trackable.autotrackable.AutoTrackable
| tensorflow.python.trackable.base.Trackable
| keras.utils.version_utils.LayerVersionSelector
| builtins.object
|
| Methods defined here:
|
| __init__(self, filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), groups=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, **kwargs)
|
| ----------------------------------------------------------------------
| Methods inherited from keras.layers.convolutional.base_conv.Conv:
|
| build(self, input_shape)
| Creates the variables of the layer (for subclass implementers).
|
| This is a method that implementers of subclasses of `Layer` or `Model`
| can override if they need a state-creation step in-between
| layer instantiation and layer call. It is invoked automatically before
| the first execution of `call()`.
|
| This is typically used to create the weights of `Layer` subclasses
| (at the discretion of the subclass implementer).
|
| Args:
| input_shape: Instance of `TensorShape`, or list of instances of
| `TensorShape` if the layer expects a list of inputs
| (one instance per input).
|
| call(self, inputs)
| This is where the layer's logic lives.
|
| The `call()` method may not create state (except in its first
| invocation, wrapping the creation of variables or other resources in
| `tf.init_scope()`). It is recommended to create state, including
| `tf.Variable` instances and nested `Layer` instances,
| in `__init__()`, or in the `build()` method that is
| called automatically before `call()` executes for the first time.
|
| Args:
| inputs: Input tensor, or dict/list/tuple of input tensors.
| The first positional `inputs` argument is subject to special rules:
| - `inputs` must be explicitly passed. A layer cannot have zero
| arguments, and `inputs` cannot be provided via the default value
| of a keyword argument.
| - NumPy array or Python scalar values in `inputs` get cast as
| tensors.
| - Keras mask metadata is only collected from `inputs`.
| - Layers are built (`build(input_shape)` method)
| using shape info from `inputs` only.
| - `input_spec` compatibility is only checked against `inputs`.
| - Mixed precision input casting is only applied to `inputs`.
| If a layer has tensor arguments in `*args` or `**kwargs`, their
| casting behavior in mixed precision should be handled manually.
| - The SavedModel input specification is generated using `inputs`
| only.
| - Integration with various ecosystem packages like TFMOT, TFLite,
| TF.js, etc is only supported for `inputs` and not for tensors in
| positional and keyword arguments.
| *args: Additional positional arguments. May contain tensors, although
| this is not recommended, for the reasons above.
| **kwargs: Additional keyword arguments. May contain tensors, although
| this is not recommended, for the reasons above.
| The following optional keyword arguments are reserved:
| - `training`: Boolean scalar tensor of Python boolean indicating
| whether the `call` is meant for training or inference.
| - `mask`: Boolean input mask. If the layer's `call()` method takes a
| `mask` argument, its default value will be set to the mask
| generated for `inputs` by the previous layer (if `input` did come
| from a layer that generated a corresponding mask, i.e. if it came
| from a Keras layer with masking support).
|
| Returns:
| A tensor or list/tuple of tensors.
|
| compute_output_shape(self, input_shape)
| Computes the output shape of the layer.
|
| This method will cause the layer's state to be built, if that has not
| happened before. This requires that the layer will later be used with
| inputs that match the input shape provided here.
|
| Args:
| input_shape: Shape tuple (tuple of integers) or `tf.TensorShape`,
| or structure of shape tuples / `tf.TensorShape` instances
| (one per output tensor of the layer).
| Shape tuples can include None for free dimensions,
| instead of an integer.
|
| Returns:
| A `tf.TensorShape` instance
| or structure of `tf.TensorShape` instances.
|
| convolution_op(self, inputs, kernel)
|
| get_config(self)
| Returns the config of the layer.
|
| A layer config is a Python dictionary (serializable)
| containing the configuration of a layer.
| The same layer can be reinstantiated later
| (without its trained weights) from this configuration.
|
| The config of a layer does not include connectivity
| information, nor the layer class name. These are handled
| by `Network` (one layer of abstraction above).
|
| Note that `get_config()` does not guarantee to return a fresh copy of
| dict every time it is called. The callers should make a copy of the
| returned dict if they want to modify it.
|
| Returns:
| Python dictionary.
|
| ----------------------------------------------------------------------
| Methods inherited from keras.engine.base_layer.Layer:
|
| __call__(self, *args, **kwargs)
| Wraps `call`, applying pre- and post-processing steps.
|
| Args:
| *args: Positional arguments to be passed to `self.call`.
| **kwargs: Keyword arguments to be passed to `self.call`.
|
| Returns:
| Output tensor(s).
|
| Note:
| - The following optional keyword arguments are reserved for specific
| uses:
| * `training`: Boolean scalar tensor of Python boolean indicating
| whether the `call` is meant for training or inference.
| * `mask`: Boolean input mask.
| - If the layer's `call` method takes a `mask` argument (as some Keras
| layers do), its default value will be set to the mask generated
| for `inputs` by the previous layer (if `input` did come from
| a layer that generated a corresponding mask, i.e. if it came from
| a Keras layer with masking support.
| - If the layer is not built, the method will call `build`.
|
| Raises:
| ValueError: if the layer's `call` method returns None (an invalid
| value).
| RuntimeError: if `super().__init__()` was not called in the
| constructor.
|
| __delattr__(self, name)
| Implement delattr(self, name).
|
| __getstate__(self)
|
| __setattr__(self, name, value)
| Support self.foo = trackable syntax.
|
| __setstate__(self, state)
|
| add_loss(self, losses, **kwargs)
| Add loss tensor(s), potentially dependent on layer inputs.
|
| Some losses (for instance, activity regularization losses) may be
| dependent on the inputs passed when calling a layer. Hence, when reusing
| the same layer on different inputs `a` and `b`, some entries in
| `layer.losses` may be dependent on `a` and some on `b`. This method
| automatically keeps track of dependencies.
|
| This method can be used inside a subclassed layer or model's `call`
| function, in which case `losses` should be a Tensor or list of Tensors.
|
| Example:
|
| ```python
| class MyLayer(tf.keras.layers.Layer):
| def call(self, inputs):
| self.add_loss(tf.abs(tf.reduce_mean(inputs)))
| return inputs
| ```
|
| The same code works in distributed training: the input to `add_loss()`
| is treated like a regularization loss and averaged across replicas
| by the training loop (both built-in `Model.fit()` and compliant custom
| training loops).
|
| The `add_loss` method can also be called directly on a Functional Model
| during construction. In this case, any loss Tensors passed to this Model
| must be symbolic and be able to be traced back to the model's `Input`s.
| These losses become part of the model's topology and are tracked in
| `get_config`.
|
| Example:
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| # Activity regularization.
| model.add_loss(tf.abs(tf.reduce_mean(x)))
| ```
|
| If this is not the case for your loss (if, for example, your loss
| references a `Variable` of one of the model's layers), you can wrap your
| loss in a zero-argument lambda. These losses are not tracked as part of
| the model's topology since they can't be serialized.
|
| Example:
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| d = tf.keras.layers.Dense(10)
| x = d(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| # Weight regularization.
| model.add_loss(lambda: tf.reduce_mean(d.kernel))
| ```
|
| Args:
| losses: Loss tensor, or list/tuple of tensors. Rather than tensors,
| losses may also be zero-argument callables which create a loss
| tensor.
| **kwargs: Used for backwards compatibility only.
|
| add_metric(self, value, name=None, **kwargs)
| Adds metric tensor to the layer.
|
| This method can be used inside the `call()` method of a subclassed layer
| or model.
|
| ```python
| class MyMetricLayer(tf.keras.layers.Layer):
| def __init__(self):
| super(MyMetricLayer, self).__init__(name='my_metric_layer')
| self.mean = tf.keras.metrics.Mean(name='metric_1')
|
| def call(self, inputs):
| self.add_metric(self.mean(inputs))
| self.add_metric(tf.reduce_sum(inputs), name='metric_2')
| return inputs
| ```
|
| This method can also be called directly on a Functional Model during
| construction. In this case, any tensor passed to this Model must
| be symbolic and be able to be traced back to the model's `Input`s. These
| metrics become part of the model's topology and are tracked when you
| save the model via `save()`.
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| model.add_metric(math_ops.reduce_sum(x), name='metric_1')
| ```
|
| Note: Calling `add_metric()` with the result of a metric object on a
| Functional Model, as shown in the example below, is not supported. This
| is because we cannot trace the metric result tensor back to the model's
| inputs.
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| model.add_metric(tf.keras.metrics.Mean()(x), name='metric_1')
| ```
|
| Args:
| value: Metric tensor.
| name: String metric name.
| **kwargs: Additional keyword arguments for backward compatibility.
| Accepted values:
| `aggregation` - When the `value` tensor provided is not the result
| of calling a `keras.Metric` instance, it will be aggregated by
| default using a `keras.Metric.Mean`.
|
| add_update(self, updates)
| Add update op(s), potentially dependent on layer inputs.
|
| Weight updates (for instance, the updates of the moving mean and
| variance in a BatchNormalization layer) may be dependent on the inputs
| passed when calling a layer. Hence, when reusing the same layer on
| different inputs `a` and `b`, some entries in `layer.updates` may be
| dependent on `a` and some on `b`. This method automatically keeps track
| of dependencies.
|
| This call is ignored when eager execution is enabled (in that case,
| variable updates are run on the fly and thus do not need to be tracked
| for later execution).
|
| Args:
| updates: Update op, or list/tuple of update ops, or zero-arg callable
| that returns an update op. A zero-arg callable should be passed in
| order to disable running the updates by setting `trainable=False`
| on this Layer, when executing in Eager mode.
|
| add_variable(self, *args, **kwargs)
| Deprecated, do NOT use! Alias for `add_weight`.
|
| add_weight(self, name=None, shape=None, dtype=None, initializer=None, regularizer=None, trainable=None, constraint=None, use_resource=None, synchronization=<VariableSynchronization.AUTO: 0>, aggregation=<VariableAggregationV2.NONE: 0>, **kwargs)
| Adds a new variable to the layer.
|
| Args:
| name: Variable name.
| shape: Variable shape. Defaults to scalar if unspecified.
| dtype: The type of the variable. Defaults to `self.dtype`.
| initializer: Initializer instance (callable).
| regularizer: Regularizer instance (callable).
| trainable: Boolean, whether the variable should be part of the layer's
| "trainable_variables" (e.g. variables, biases)
| or "non_trainable_variables" (e.g. BatchNorm mean and variance).
| Note that `trainable` cannot be `True` if `synchronization`
| is set to `ON_READ`.
| constraint: Constraint instance (callable).
| use_resource: Whether to use a `ResourceVariable` or not.
| See [this guide](
| https://www.tensorflow.org/guide/migrate/tf1_vs_tf2#resourcevariables_instead_of_referencevariables)
| for more information.
| synchronization: Indicates when a distributed a variable will be
| aggregated. Accepted values are constants defined in the class
| `tf.VariableSynchronization`. By default the synchronization is set
| to `AUTO` and the current `DistributionStrategy` chooses when to
| synchronize. If `synchronization` is set to `ON_READ`, `trainable`
| must not be set to `True`.
| aggregation: Indicates how a distributed variable will be aggregated.
| Accepted values are constants defined in the class
| `tf.VariableAggregation`.
| **kwargs: Additional keyword arguments. Accepted values are `getter`,
| `collections`, `experimental_autocast` and `caching_device`.
|
| Returns:
| The variable created.
|
| Raises:
| ValueError: When giving unsupported dtype and no initializer or when
| trainable has been set to True with synchronization set as
| `ON_READ`.
|
| build_from_config(self, config)
|
| compute_mask(self, inputs, mask=None)
| Computes an output mask tensor.
|
| Args:
| inputs: Tensor or list of tensors.
| mask: Tensor or list of tensors.
|
| Returns:
| None or a tensor (or list of tensors,
| one per output tensor of the layer).
|
| compute_output_signature(self, input_signature)
| Compute the output tensor signature of the layer based on the inputs.
|
| Unlike a TensorShape object, a TensorSpec object contains both shape
| and dtype information for a tensor. This method allows layers to provide
| output dtype information if it is different from the input dtype.
| For any layer that doesn't implement this function,
| the framework will fall back to use `compute_output_shape`, and will
| assume that the output dtype matches the input dtype.
|
| Args:
| input_signature: Single TensorSpec or nested structure of TensorSpec
| objects, describing a candidate input for the layer.
|
| Returns:
| Single TensorSpec or nested structure of TensorSpec objects,
| describing how the layer would transform the provided input.
|
| Raises:
| TypeError: If input_signature contains a non-TensorSpec object.
|
| count_params(self)
| Count the total number of scalars composing the weights.
|
| Returns:
| An integer count.
|
| Raises:
| ValueError: if the layer isn't yet built
| (in which case its weights aren't yet defined).
|
| finalize_state(self)
| Finalizes the layers state after updating layer weights.
|
| This function can be subclassed in a layer and will be called after
| updating a layer weights. It can be overridden to finalize any
| additional layer state after a weight update.
|
| This function will be called after weights of a layer have been restored
| from a loaded model.
|
| get_build_config(self)
|
| get_input_at(self, node_index)
| Retrieves the input tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first input node of the layer.
|
| Returns:
| A tensor (or list of tensors if the layer has multiple inputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_input_mask_at(self, node_index)
| Retrieves the input mask tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A mask tensor
| (or list of tensors if the layer has multiple inputs).
|
| get_input_shape_at(self, node_index)
| Retrieves the input shape(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A shape tuple
| (or list of shape tuples if the layer has multiple inputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_output_at(self, node_index)
| Retrieves the output tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first output node of the layer.
|
| Returns:
| A tensor (or list of tensors if the layer has multiple outputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_output_mask_at(self, node_index)
| Retrieves the output mask tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A mask tensor
| (or list of tensors if the layer has multiple outputs).
|
| get_output_shape_at(self, node_index)
| Retrieves the output shape(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A shape tuple
| (or list of shape tuples if the layer has multiple outputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_weights(self)
| Returns the current weights of the layer, as NumPy arrays.
|
| The weights of a layer represent the state of the layer. This function
| returns both trainable and non-trainable weight values associated with
| this layer as a list of NumPy arrays, which can in turn be used to load
| state into similarly parameterized layers.
|
| For example, a `Dense` layer returns a list of two values: the kernel
| matrix and the bias vector. These can be used to set the weights of
| another `Dense` layer:
|
| >>> layer_a = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(1.))
| >>> a_out = layer_a(tf.convert_to_tensor([[1., 2., 3.]]))
| >>> layer_a.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(2.))
| >>> b_out = layer_b(tf.convert_to_tensor([[10., 20., 30.]]))
| >>> layer_b.get_weights()
| [array([[2.],
| [2.],
| [2.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b.set_weights(layer_a.get_weights())
| >>> layer_b.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
|
| Returns:
| Weights values as a list of NumPy arrays.
|
| set_weights(self, weights)
| Sets the weights of the layer, from NumPy arrays.
|
| The weights of a layer represent the state of the layer. This function
| sets the weight values from numpy arrays. The weight values should be
| passed in the order they are created by the layer. Note that the layer's
| weights must be instantiated before calling this function, by calling
| the layer.
|
| For example, a `Dense` layer returns a list of two values: the kernel
| matrix and the bias vector. These can be used to set the weights of
| another `Dense` layer:
|
| >>> layer_a = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(1.))
| >>> a_out = layer_a(tf.convert_to_tensor([[1., 2., 3.]]))
| >>> layer_a.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(2.))
| >>> b_out = layer_b(tf.convert_to_tensor([[10., 20., 30.]]))
| >>> layer_b.get_weights()
| [array([[2.],
| [2.],
| [2.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b.set_weights(layer_a.get_weights())
| >>> layer_b.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
|
| Args:
| weights: a list of NumPy arrays. The number
| of arrays and their shape must match
| number of the dimensions of the weights
| of the layer (i.e. it should match the
| output of `get_weights`).
|
| Raises:
| ValueError: If the provided weights list does not match the
| layer's specifications.
|
| ----------------------------------------------------------------------
| Class methods inherited from keras.engine.base_layer.Layer:
|
| from_config(config) from builtins.type
| Creates a layer from its config.
|
| This method is the reverse of `get_config`,
| capable of instantiating the same layer from the config
| dictionary. It does not handle layer connectivity
| (handled by Network), nor weights (handled by `set_weights`).
|
| Args:
| config: A Python dictionary, typically the
| output of get_config.
|
| Returns:
| A layer instance.
|
| ----------------------------------------------------------------------
| Static methods inherited from keras.engine.base_layer.Layer:
|
| __new__(cls, *args, **kwargs)
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Readonly properties inherited from keras.engine.base_layer.Layer:
|
| compute_dtype
| The dtype of the layer's computations.
|
| This is equivalent to `Layer.dtype_policy.compute_dtype`. Unless
| mixed precision is used, this is the same as `Layer.dtype`, the dtype of
| the weights.
|
| Layers automatically cast their inputs to the compute dtype, which
| causes computations and the output to be in the compute dtype as well.
| This is done by the base Layer class in `Layer.__call__`, so you do not
| have to insert these casts if implementing your own layer.
|
| Layers often perform certain internal computations in higher precision
| when `compute_dtype` is float16 or bfloat16 for numeric stability. The
| output will still typically be float16 or bfloat16 in such cases.
|
| Returns:
| The layer's compute dtype.
|
| dtype
| The dtype of the layer weights.
|
| This is equivalent to `Layer.dtype_policy.variable_dtype`. Unless
| mixed precision is used, this is the same as `Layer.compute_dtype`, the
| dtype of the layer's computations.
|
| dtype_policy
| The dtype policy associated with this layer.
|
| This is an instance of a `tf.keras.mixed_precision.Policy`.
|
| dynamic
| Whether the layer is dynamic (eager-only); set in the constructor.
|
| inbound_nodes
| Return Functional API nodes upstream of this layer.
|
| input
| Retrieves the input tensor(s) of a layer.
|
| Only applicable if the layer has exactly one input,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Input tensor or list of input tensors.
|
| Raises:
| RuntimeError: If called in Eager mode.
| AttributeError: If no inbound nodes are found.
|
| input_mask
| Retrieves the input mask tensor(s) of a layer.
|
| Only applicable if the layer has exactly one inbound node,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Input mask tensor (potentially None) or list of input
| mask tensors.
|
| Raises:
| AttributeError: if the layer is connected to
| more than one incoming layers.
|
| input_shape
| Retrieves the input shape(s) of a layer.
|
| Only applicable if the layer has exactly one input,
| i.e. if it is connected to one incoming layer, or if all inputs
| have the same shape.
|
| Returns:
| Input shape, as an integer shape tuple
| (or list of shape tuples, one tuple per input tensor).
|
| Raises:
| AttributeError: if the layer has no defined input_shape.
| RuntimeError: if called in Eager mode.
|
| losses
| List of losses added using the `add_loss()` API.
|
| Variable regularization tensors are created when this property is
| accessed, so it is eager safe: accessing `losses` under a
| `tf.GradientTape` will propagate gradients back to the corresponding
| variables.
|
| Examples:
|
| >>> class MyLayer(tf.keras.layers.Layer):
| ... def call(self, inputs):
| ... self.add_loss(tf.abs(tf.reduce_mean(inputs)))
| ... return inputs
| >>> l = MyLayer()
| >>> l(np.ones((10, 1)))
| >>> l.losses
| [1.0]
|
| >>> inputs = tf.keras.Input(shape=(10,))
| >>> x = tf.keras.layers.Dense(10)(inputs)
| >>> outputs = tf.keras.layers.Dense(1)(x)
| >>> model = tf.keras.Model(inputs, outputs)
| >>> # Activity regularization.
| >>> len(model.losses)
| 0
| >>> model.add_loss(tf.abs(tf.reduce_mean(x)))
| >>> len(model.losses)
| 1
|
| >>> inputs = tf.keras.Input(shape=(10,))
| >>> d = tf.keras.layers.Dense(10, kernel_initializer='ones')
| >>> x = d(inputs)
| >>> outputs = tf.keras.layers.Dense(1)(x)
| >>> model = tf.keras.Model(inputs, outputs)
| >>> # Weight regularization.
| >>> model.add_loss(lambda: tf.reduce_mean(d.kernel))
| >>> model.losses
| [<tf.Tensor: shape=(), dtype=float32, numpy=1.0>]
|
| Returns:
| A list of tensors.
|
| metrics
| List of metrics added using the `add_metric()` API.
|
| Example:
|
| >>> input = tf.keras.layers.Input(shape=(3,))
| >>> d = tf.keras.layers.Dense(2)
| >>> output = d(input)
| >>> d.add_metric(tf.reduce_max(output), name='max')
| >>> d.add_metric(tf.reduce_min(output), name='min')
| >>> [m.name for m in d.metrics]
| ['max', 'min']
|
| Returns:
| A list of `Metric` objects.
|
| name
| Name of the layer (string), set in the constructor.
|
| non_trainable_variables
| Sequence of non-trainable variables owned by this module and its submodules.
|
| Note: this method uses reflection to find variables on the current instance
| and submodules. For performance reasons you may wish to cache the result
| of calling this method if you don't expect the return value to change.
|
| Returns:
| A sequence of variables for the current module (sorted by attribute
| name) followed by variables from all submodules recursively (breadth
| first).
|
| non_trainable_weights
| List of all non-trainable weights tracked by this layer.
|
| Non-trainable weights are *not* updated during training. They are
| expected to be updated manually in `call()`.
|
| Returns:
| A list of non-trainable variables.
|
| outbound_nodes
| Return Functional API nodes downstream of this layer.
|
| output
| Retrieves the output tensor(s) of a layer.
|
| Only applicable if the layer has exactly one output,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Output tensor or list of output tensors.
|
| Raises:
| AttributeError: if the layer is connected to more than one incoming
| layers.
| RuntimeError: if called in Eager mode.
|
| output_mask
| Retrieves the output mask tensor(s) of a layer.
|
| Only applicable if the layer has exactly one inbound node,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Output mask tensor (potentially None) or list of output
| mask tensors.
|
| Raises:
| AttributeError: if the layer is connected to
| more than one incoming layers.
|
| output_shape
| Retrieves the output shape(s) of a layer.
|
| Only applicable if the layer has one output,
| or if all outputs have the same shape.
|
| Returns:
| Output shape, as an integer shape tuple
| (or list of shape tuples, one tuple per output tensor).
|
| Raises:
| AttributeError: if the layer has no defined output shape.
| RuntimeError: if called in Eager mode.
|
| trainable_variables
| Sequence of trainable variables owned by this module and its submodules.
|
| Note: this method uses reflection to find variables on the current instance
| and submodules. For performance reasons you may wish to cache the result
| of calling this method if you don't expect the return value to change.
|
| Returns:
| A sequence of variables for the current module (sorted by attribute
| name) followed by variables from all submodules recursively (breadth
| first).
|
| trainable_weights
| List of all trainable weights tracked by this layer.
|
| Trainable weights are updated via gradient descent during training.
|
| Returns:
| A list of trainable variables.
|
| updates
|
| variable_dtype
| Alias of `Layer.dtype`, the dtype of the weights.
|
| variables
| Returns the list of all layer variables/weights.
|
| Alias of `self.weights`.
|
| Note: This will not track the weights of nested `tf.Modules` that are
| not themselves Keras layers.
|
| Returns:
| A list of variables.
|
| weights
| Returns the list of all layer variables/weights.
|
| Returns:
| A list of variables.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from keras.engine.base_layer.Layer:
|
| activity_regularizer
| Optional regularizer function for the output of this layer.
|
| input_spec
| `InputSpec` instance(s) describing the input format for this layer.
|
| When you create a layer subclass, you can set `self.input_spec` to
| enable the layer to run input compatibility checks when it is called.
| Consider a `Conv2D` layer: it can only be called on a single input
| tensor of rank 4. As such, you can set, in `__init__()`:
|
| ```python
| self.input_spec = tf.keras.layers.InputSpec(ndim=4)
| ```
|
| Now, if you try to call the layer on an input that isn't rank 4
| (for instance, an input of shape `(2,)`, it will raise a
| nicely-formatted error:
|
| ```
| ValueError: Input 0 of layer conv2d is incompatible with the layer:
| expected ndim=4, found ndim=1. Full shape received: [2]
| ```
|
| Input checks that can be specified via `input_spec` include:
| - Structure (e.g. a single input, a list of 2 inputs, etc)
| - Shape
| - Rank (ndim)
| - Dtype
|
| For more information, see `tf.keras.layers.InputSpec`.
|
| Returns:
| A `tf.keras.layers.InputSpec` instance, or nested structure thereof.
|
| stateful
|
| supports_masking
| Whether this layer supports computing a mask using `compute_mask`.
|
| trainable
|
| ----------------------------------------------------------------------
| Class methods inherited from tensorflow.python.module.module.Module:
|
| with_name_scope(method) from builtins.type
| Decorator to automatically enter the module name scope.
|
| >>> class MyModule(tf.Module):
| ... @tf.Module.with_name_scope
| ... def __call__(self, x):
| ... if not hasattr(self, 'w'):
| ... self.w = tf.Variable(tf.random.normal([x.shape[1], 3]))
| ... return tf.matmul(x, self.w)
|
| Using the above module would produce `tf.Variable`s and `tf.Tensor`s whose
| names included the module name:
|
| >>> mod = MyModule()
| >>> mod(tf.ones([1, 2]))
| <tf.Tensor: shape=(1, 3), dtype=float32, numpy=..., dtype=float32)>
| >>> mod.w
| <tf.Variable 'my_module/Variable:0' shape=(2, 3) dtype=float32,
| numpy=..., dtype=float32)>
|
| Args:
| method: The method to wrap.
|
| Returns:
| The original method wrapped such that it enters the module's name scope.
|
| ----------------------------------------------------------------------
| Readonly properties inherited from tensorflow.python.module.module.Module:
|
| name_scope
| Returns a `tf.name_scope` instance for this class.
|
| submodules
| Sequence of all sub-modules.
|
| Submodules are modules which are properties of this module, or found as
| properties of modules which are properties of this module (and so on).
|
| >>> a = tf.Module()
| >>> b = tf.Module()
| >>> c = tf.Module()
| >>> a.b = b
| >>> b.c = c
| >>> list(a.submodules) == [b, c]
| True
| >>> list(b.submodules) == [c]
| True
| >>> list(c.submodules) == []
| True
|
| Returns:
| A sequence of all submodules.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from tensorflow.python.trackable.base.Trackable:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
Ok, there are quite a few parameters to set. However, we are going to keep it light and breezy, setting only a few of the things we’ve talked about: the number of filters, the kernel size, the activation function and the shape of the input, which in our case is the shape of our images.
model.add(Conv2D(filters, kernel_size, activation='relu', input_shape=data_shape))
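To get a feeling for what these parameters do to the data, here is a minimal sketch of the shape and parameter arithmetic of a 2D convolution. It assumes the Conv2D defaults (no padding, stride of 1). The helper functions and the example values for the input shape, number of filters and kernel size are hypothetical and only serve as illustration; the values used in this tutorial may differ.

```python
def conv2d_output_shape(input_shape, filters, kernel_size, stride=1):
    """Output shape (height, width, channels) of a 2D convolution without padding."""
    height, width, _ = input_shape
    k_h, k_w = kernel_size
    out_h = (height - k_h) // stride + 1
    out_w = (width - k_w) // stride + 1
    # Every filter produces exactly one output channel.
    return (out_h, out_w, filters)


def conv2d_param_count(input_shape, filters, kernel_size):
    """Trainable parameters: one kernel spanning all input channels plus one bias per filter."""
    _, _, channels = input_shape
    k_h, k_w = kernel_size
    return (k_h * k_w * channels + 1) * filters


# Hypothetical values, chosen for illustration only.
example_input_shape = (40, 51, 41)  # height, width, channels
example_filters = 32
example_kernel_size = (3, 3)

conv_shape = conv2d_output_shape(example_input_shape, example_filters, example_kernel_size)
conv_params = conv2d_param_count(example_input_shape, example_filters, example_kernel_size)
print(conv_shape)   # (38, 49, 32)
print(conv_params)  # 11840
```

Note how the number of filters sets the number of output channels, while the kernel size determines how much the image shrinks at its borders.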
Batch normalization layer#
As briefly addressed before, batch normalization can be very helpful: it speeds up the training, addresses internal covariate shift (highly debated), smooths the loss function, etc. It does so by re-centering and re-scaling the inputs of a given layer. Thus, we are going to include batch normalization layers in our ANN as well:
model.add(BatchNormalization())
As you can see, we added the batch normalization layer right after the convolutional layer so that the latter’s output will be re-centered and re-scaled.
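The re-centering and re-scaling can be sketched in a few lines of plain Python. This is a simplified illustration for a single feature during training, not the layer's actual implementation: the function name is hypothetical, `gamma` and `beta` stand in for the learnable scale and offset, and `epsilon` is a small constant that avoids division by zero.

```python
import math


def batch_normalize(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Re-center and re-scale one feature across a batch (training-time view)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # Subtract the batch mean, divide by the batch standard deviation,
    # then apply the (here fixed, normally learned) scale and offset.
    return [gamma * (v - mean) / math.sqrt(variance + epsilon) + beta for v in values]


# Hypothetical activations of one feature for a batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(activations)

normalized_mean = sum(normalized) / len(normalized)
normalized_var = sum((v - normalized_mean) ** 2 for v in normalized) / len(normalized)
print(normalized)
print(normalized_mean, normalized_var)  # close to 0 and close to 1
```

At inference time the layer would typically use running averages of the mean and variance collected during training instead of the statistics of the current batch.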
Pooling layer#
Another important part of CNN architectures is the pooling layer, i.e. the layer that reduces the spatial size of the representation computed in the previous layer, here the convolutional layer. In turn, we can reduce the number of parameters and thus the computation our ANN needs to perform. Out of the two pooling options, max pooling and average pooling, CNNs typically utilize max pooling because it helps to detect certain features more easily and, as the representation becomes more abstract, also helps to reduce overfitting. Sounds like a good idea, eh?
model.add(MaxPooling2D())
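What max pooling does can be illustrated with a small, pure Python sketch. It assumes non-overlapping 2x2 windows, which corresponds to the MaxPooling2D defaults; the function name and the toy feature map are hypothetical.

```python
def max_pool_2d(image, pool_size=2):
    """Keep the maximum of every non-overlapping pool_size x pool_size window."""
    rows = len(image) // pool_size
    cols = len(image[0]) // pool_size
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            window = [
                image[r * pool_size + i][c * pool_size + j]
                for i in range(pool_size)
                for j in range(pool_size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Toy 4x4 feature map, for illustration only.
feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled_map = max_pool_2d(feature_map)
print(pooled_map)  # [[6, 8], [14, 16]]
```

The 4x4 map shrinks to 2x2, i.e. each spatial dimension is halved while the strongest response within each window is kept.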
Getting more fine-grained#
In order to get our ANN and the features it works on more fine-grained, we will double the number of filters for the next step, i.e. layer(s).
filters *= 2
Along this line of thought, we will repeat the succession of convolutional, batch normalization and pooling layers, followed by doubling the number of filters, two more times:
model.add(Conv2D(filters, kernel_size, activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
filters *= 2
model.add(Conv2D(filters, kernel_size, activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
filters *= 2
Please note: we removed the input_shape parameter from the Conv2D layers as they are not input layers.
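To see how the representation changes across the three blocks, the following sketch traces the shapes through the stack. It assumes no padding, a stride of 1 and 2x2 max pooling, and it uses a hypothetical input shape, starting number of filters and kernel size; batch normalization is left out because it does not change the shape.

```python
def trace_shapes(input_shape, filters, kernel_size, n_blocks=3, pool_size=2):
    """Return the (height, width, channels) shape after each conv and pooling layer."""
    height, width, channels = input_shape
    shapes = []
    for _ in range(n_blocks):
        # Convolution without padding: each spatial dimension shrinks by kernel_size - 1.
        height, width = height - kernel_size + 1, width - kernel_size + 1
        channels = filters
        shapes.append(("conv", (height, width, channels)))
        # Max pooling: each spatial dimension is divided by pool_size (rounded down).
        height, width = height // pool_size, width // pool_size
        shapes.append(("pool", (height, width, channels)))
        # Double the number of filters for the next block.
        filters *= 2
    return shapes


# Hypothetical values, chosen for illustration only.
layer_shapes = trace_shapes(input_shape=(40, 51, 41), filters=32, kernel_size=3)
for layer_name, shape in layer_shapes:
    print(layer_name, shape)
# conv (38, 49, 32)
# pool (19, 24, 32)
# conv (17, 22, 64)
# pool (8, 11, 64)
# conv (6, 9, 128)
# pool (3, 4, 128)
```

The spatial extent shrinks with every block while the number of channels grows, which is the typical pattern in this kind of architecture.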
It’s getting dense#
Now that we’ve sent our input through several layers aimed at obtaining representations, it might be worth a try to think about how we can achieve our supervised learning goal. Given that we want to have a binary outcome, i.e. eyes open or eyes closed, we want to classify. We can achieve this via dense or fully connected layers (think about MLPs again). However, for this to work, we need to add a flatten layer before that. The reason: even though we convolved and pooled our input quite a bit, it’s still multidimensional and we need it as a one-dimensional vector to pass it through a dense/fully connected layer.
model.add(Flatten())
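Conceptually, flattening just unrolls the multidimensional representation into one long vector without changing any values. A minimal sketch with a tiny, hypothetical feature volume (the helper function is for illustration only):

```python
import math


def flatten(nested):
    """Recursively unroll nested lists into one flat list (last axis varies fastest)."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


# Tiny hypothetical feature volume of shape (2, 2, 3): height, width, channels.
feature_volume = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
flat_vector = flatten(feature_volume)
flattened_length = math.prod((2, 2, 3))

print(flat_vector)       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(flattened_length)  # 12
```

The length of the flattened vector is simply the product of the dimensions, which is also the number of inputs the following dense layer receives per sample.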
Another thing we need to remember is regularization, that is, we need to address overfitting. A brief recap: given that our ANN will have a large number of parameters, together with the universal approximation theorem, there is for example the possibility that our ANN will just “memorize” the dataset without capturing the information we want to obtain, thus failing to generalize to new data. And while that’s cool in theory (the memorizing part, not the failed generalization part), we obviously want to avoid it. Therefore, we need to apply regularization by imposing constraints on the ANN’s parameters or adapting the cost function. One way to go is the application of dropout layers that randomly and temporarily set nodes in our layers to 0, i.e. deleting them during the training.
model.add(Dropout(0.5))
The parameter we added here, 0.5, specifies the dropout rate or, in other words, the fraction of the input units, i.e. nodes, to drop. This is a commonly applied value, but that does not mean it should also be your default!
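The effect of the dropout rate can be sketched in plain Python. This is a simplified illustration, not the layer's implementation: the function name and values are hypothetical, and it follows the common "inverted dropout" scheme in which the kept units are scaled by 1 / (1 - rate) during training so that the expected sum stays the same, while nothing is dropped at inference time.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Randomly zero a fraction `rate` of the values during training."""
    if not training or rate == 0:
        # At inference time all nodes are kept and left unchanged.
        return list(values)
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - rate)
    # Each value is dropped independently with probability `rate`;
    # the surviving values are scaled up to compensate.
    return [0.0 if rng.random() < rate else v * scale for v in values]


rng = random.Random(0)  # seeded only to make the sketch reproducible
node_outputs = [1.0] * 10
dropped = dropout(node_outputs, rate=0.5, rng=rng)
kept_at_inference = dropout(node_outputs, rate=0.5, training=False)

print(dropped)            # a mix of 0.0 (dropped) and 2.0 (kept and scaled)
print(kept_at_inference)  # unchanged
```

Because a different random subset of nodes is silenced in every training step, the network cannot rely on any single node, which is what makes dropout act as a regularizer.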
Time to go dense and start with our first respective layer. As with the other layer types, there are a bunch of parameters we can define:
help(Dense)
Help on class Dense in module tensorflow.python.keras.layers.core:
class Dense(tensorflow.python.keras.engine.base_layer.Layer)
| Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, **kwargs)
|
| Just your regular densely-connected NN layer.
|
| `Dense` implements the operation:
| `output = activation(dot(input, kernel) + bias)`
| where `activation` is the element-wise activation function
| passed as the `activation` argument, `kernel` is a weights matrix
| created by the layer, and `bias` is a bias vector created by the layer
| (only applicable if `use_bias` is `True`). These are all attributes of
| `Dense`.
|
| Note: If the input to the layer has a rank greater than 2, then `Dense`
| computes the dot product between the `inputs` and the `kernel` along the
| last axis of the `inputs` and axis 0 of the `kernel` (using `tf.tensordot`).
| For example, if input has dimensions `(batch_size, d0, d1)`,
| then we create a `kernel` with shape `(d1, units)`, and the `kernel` operates
| along axis 2 of the `input`, on every sub-tensor of shape `(1, 1, d1)`
| (there are `batch_size * d0` such sub-tensors).
| The output in this case will have shape `(batch_size, d0, units)`.
|
| Besides, layer attributes cannot be modified after the layer has been called
| once (except the `trainable` attribute).
| When a popular kwarg `input_shape` is passed, then keras will create
| an input layer to insert before the current layer. This can be treated
| equivalent to explicitly defining an `InputLayer`.
|
| Example:
|
| >>> # Create a `Sequential` model and add a Dense layer as the first layer.
| >>> model = tf.keras.models.Sequential()
| >>> model.add(tf.keras.Input(shape=(16,)))
| >>> model.add(tf.keras.layers.Dense(32, activation='relu'))
| >>> # Now the model will take as input arrays of shape (None, 16)
| >>> # and output arrays of shape (None, 32).
| >>> # Note that after the first layer, you don't need to specify
| >>> # the size of the input anymore:
| >>> model.add(tf.keras.layers.Dense(32))
| >>> model.output_shape
| (None, 32)
|
| Args:
| units: Positive integer, dimensionality of the output space.
| activation: Activation function to use.
| If you don't specify anything, no activation is applied
| (ie. "linear" activation: `a(x) = x`).
| use_bias: Boolean, whether the layer uses a bias vector.
| kernel_initializer: Initializer for the `kernel` weights matrix.
| bias_initializer: Initializer for the bias vector.
| kernel_regularizer: Regularizer function applied to
| the `kernel` weights matrix.
| bias_regularizer: Regularizer function applied to the bias vector.
| activity_regularizer: Regularizer function applied to
| the output of the layer (its "activation").
| kernel_constraint: Constraint function applied to
| the `kernel` weights matrix.
| bias_constraint: Constraint function applied to the bias vector.
|
| Input shape:
| N-D tensor with shape: `(batch_size, ..., input_dim)`.
| The most common situation would be
| a 2D input with shape `(batch_size, input_dim)`.
|
| Output shape:
| N-D tensor with shape: `(batch_size, ..., units)`.
| For instance, for a 2D input with shape `(batch_size, input_dim)`,
| the output would have shape `(batch_size, units)`.
|
| Method resolution order:
| Dense
| tensorflow.python.keras.engine.base_layer.Layer
| tensorflow.python.module.module.Module
| tensorflow.python.trackable.autotrackable.AutoTrackable
| tensorflow.python.trackable.base.Trackable
| tensorflow.python.keras.utils.version_utils.LayerVersionSelector
| builtins.object
|
| Methods defined here:
|
| __init__(self, units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, **kwargs)
|
| build(self, input_shape)
| Creates the variables of the layer (optional, for subclass implementers).
|
| This is a method that implementers of subclasses of `Layer` or `Model`
| can override if they need a state-creation step in-between
| layer instantiation and layer call.
|
| This is typically used to create the weights of `Layer` subclasses.
|
| Args:
| input_shape: Instance of `TensorShape`, or list of instances of
| `TensorShape` if the layer expects a list of inputs
| (one instance per input).
|
| call(self, inputs)
| This is where the layer's logic lives.
|
| Note here that `call()` method in `tf.keras` is little bit different
| from `keras` API. In `keras` API, you can pass support masking for
| layers as additional arguments. Whereas `tf.keras` has `compute_mask()`
| method to support masking.
|
| Args:
| inputs: Input tensor, or dict/list/tuple of input tensors.
| The first positional `inputs` argument is subject to special rules:
| - `inputs` must be explicitly passed. A layer cannot have zero
| arguments, and `inputs` cannot be provided via the default value
| of a keyword argument.
| - NumPy array or Python scalar values in `inputs` get cast as tensors.
| - Keras mask metadata is only collected from `inputs`.
| - Layers are built (`build(input_shape)` method)
| using shape info from `inputs` only.
| - `input_spec` compatibility is only checked against `inputs`.
| - Mixed precision input casting is only applied to `inputs`.
| If a layer has tensor arguments in `*args` or `**kwargs`, their
| casting behavior in mixed precision should be handled manually.
| - The SavedModel input specification is generated using `inputs` only.
| - Integration with various ecosystem packages like TFMOT, TFLite,
| TF.js, etc is only supported for `inputs` and not for tensors in
| positional and keyword arguments.
| *args: Additional positional arguments. May contain tensors, although
| this is not recommended, for the reasons above.
| **kwargs: Additional keyword arguments. May contain tensors, although
| this is not recommended, for the reasons above.
| The following optional keyword arguments are reserved:
| - `training`: Boolean scalar tensor of Python boolean indicating
| whether the `call` is meant for training or inference.
| - `mask`: Boolean input mask. If the layer's `call()` method takes a
| `mask` argument, its default value will be set to the mask generated
| for `inputs` by the previous layer (if `input` did come from a layer
| that generated a corresponding mask, i.e. if it came from a Keras
| layer with masking support).
|
| Returns:
| A tensor or list/tuple of tensors.
|
| compute_output_shape(self, input_shape)
| Computes the output shape of the layer.
|
| If the layer has not been built, this method will call `build` on the
| layer. This assumes that the layer will later be used with inputs that
| match the input shape provided here.
|
| Args:
| input_shape: Shape tuple (tuple of integers)
| or list of shape tuples (one per output tensor of the layer).
| Shape tuples can include None for free dimensions,
| instead of an integer.
|
| Returns:
| An input shape tuple.
|
| get_config(self)
| Returns the config of the layer.
|
| A layer config is a Python dictionary (serializable)
| containing the configuration of a layer.
| The same layer can be reinstantiated later
| (without its trained weights) from this configuration.
|
| The config of a layer does not include connectivity
| information, nor the layer class name. These are handled
| by `Network` (one layer of abstraction above).
|
| Note that `get_config()` does not guarantee to return a fresh copy of dict
| every time it is called. The callers should make a copy of the returned dict
| if they want to modify it.
|
| Returns:
| Python dictionary.
|
| ----------------------------------------------------------------------
| Methods inherited from tensorflow.python.keras.engine.base_layer.Layer:
|
| __call__(self, *args, **kwargs)
| Wraps `call`, applying pre- and post-processing steps.
|
| Args:
| *args: Positional arguments to be passed to `self.call`.
| **kwargs: Keyword arguments to be passed to `self.call`.
|
| Returns:
| Output tensor(s).
|
| Note:
| - The following optional keyword arguments are reserved for specific uses:
| * `training`: Boolean scalar tensor of Python boolean indicating
| whether the `call` is meant for training or inference.
| * `mask`: Boolean input mask.
| - If the layer's `call` method takes a `mask` argument (as some Keras
| layers do), its default value will be set to the mask generated
| for `inputs` by the previous layer (if `input` did come from
| a layer that generated a corresponding mask, i.e. if it came from
| a Keras layer with masking support.
| - If the layer is not built, the method will call `build`.
|
| Raises:
| ValueError: if the layer's `call` method returns None (an invalid value).
| RuntimeError: if `super().__init__()` was not called in the constructor.
|
| __delattr__(self, name)
| Implement delattr(self, name).
|
| __getstate__(self)
|
| __setattr__(self, name, value)
| Support self.foo = trackable syntax.
|
| __setstate__(self, state)
|
| add_loss(self, losses, **kwargs)
| Add loss tensor(s), potentially dependent on layer inputs.
|
| Some losses (for instance, activity regularization losses) may be dependent
| on the inputs passed when calling a layer. Hence, when reusing the same
| layer on different inputs `a` and `b`, some entries in `layer.losses` may
| be dependent on `a` and some on `b`. This method automatically keeps track
| of dependencies.
|
| This method can be used inside a subclassed layer or model's `call`
| function, in which case `losses` should be a Tensor or list of Tensors.
|
| Example:
|
| ```python
| class MyLayer(tf.keras.layers.Layer):
| def call(self, inputs):
| self.add_loss(tf.abs(tf.reduce_mean(inputs)))
| return inputs
| ```
|
| This method can also be called directly on a Functional Model during
| construction. In this case, any loss Tensors passed to this Model must
| be symbolic and be able to be traced back to the model's `Input`s. These
| losses become part of the model's topology and are tracked in `get_config`.
|
| Example:
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| # Activity regularization.
| model.add_loss(tf.abs(tf.reduce_mean(x)))
| ```
|
| If this is not the case for your loss (if, for example, your loss references
| a `Variable` of one of the model's layers), you can wrap your loss in a
| zero-argument lambda. These losses are not tracked as part of the model's
| topology since they can't be serialized.
|
| Example:
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| d = tf.keras.layers.Dense(10)
| x = d(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| # Weight regularization.
| model.add_loss(lambda: tf.reduce_mean(d.kernel))
| ```
|
| Args:
| losses: Loss tensor, or list/tuple of tensors. Rather than tensors, losses
| may also be zero-argument callables which create a loss tensor.
| **kwargs: Additional keyword arguments for backward compatibility.
| Accepted values:
| inputs - Deprecated, will be automatically inferred.
|
| add_metric(self, value, name=None, **kwargs)
| Adds metric tensor to the layer.
|
| This method can be used inside the `call()` method of a subclassed layer
| or model.
|
| ```python
| class MyMetricLayer(tf.keras.layers.Layer):
| def __init__(self):
| super(MyMetricLayer, self).__init__(name='my_metric_layer')
| self.mean = tf.keras.metrics.Mean(name='metric_1')
|
| def call(self, inputs):
| self.add_metric(self.mean(inputs))
| self.add_metric(tf.reduce_sum(inputs), name='metric_2')
| return inputs
| ```
|
| This method can also be called directly on a Functional Model during
| construction. In this case, any tensor passed to this Model must
| be symbolic and be able to be traced back to the model's `Input`s. These
| metrics become part of the model's topology and are tracked when you
| save the model via `save()`.
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| model.add_metric(math_ops.reduce_sum(x), name='metric_1')
| ```
|
| Note: Calling `add_metric()` with the result of a metric object on a
| Functional Model, as shown in the example below, is not supported. This is
| because we cannot trace the metric result tensor back to the model's inputs.
|
| ```python
| inputs = tf.keras.Input(shape=(10,))
| x = tf.keras.layers.Dense(10)(inputs)
| outputs = tf.keras.layers.Dense(1)(x)
| model = tf.keras.Model(inputs, outputs)
| model.add_metric(tf.keras.metrics.Mean()(x), name='metric_1')
| ```
|
| Args:
| value: Metric tensor.
| name: String metric name.
| **kwargs: Additional keyword arguments for backward compatibility.
| Accepted values:
| `aggregation` - When the `value` tensor provided is not the result of
| calling a `keras.Metric` instance, it will be aggregated by default
| using a `keras.Metric.Mean`.
|
| add_update(self, updates, inputs=None)
| Add update op(s), potentially dependent on layer inputs.
|
| Weight updates (for instance, the updates of the moving mean and variance
| in a BatchNormalization layer) may be dependent on the inputs passed
| when calling a layer. Hence, when reusing the same layer on
| different inputs `a` and `b`, some entries in `layer.updates` may be
| dependent on `a` and some on `b`. This method automatically keeps track
| of dependencies.
|
| This call is ignored when eager execution is enabled (in that case, variable
| updates are run on the fly and thus do not need to be tracked for later
| execution).
|
| Args:
| updates: Update op, or list/tuple of update ops, or zero-arg callable
| that returns an update op. A zero-arg callable should be passed in
| order to disable running the updates by setting `trainable=False`
| on this Layer, when executing in Eager mode.
| inputs: Deprecated, will be automatically inferred.
|
| add_variable(self, *args, **kwargs)
| Deprecated, do NOT use! Alias for `add_weight`.
|
| add_weight(self, name=None, shape=None, dtype=None, initializer=None, regularizer=None, trainable=None, constraint=None, use_resource=None, synchronization=<VariableSynchronization.AUTO: 0>, aggregation=<VariableAggregation.NONE: 0>, **kwargs)
| Adds a new variable to the layer.
|
| Args:
| name: Variable name.
| shape: Variable shape. Defaults to scalar if unspecified.
| dtype: The type of the variable. Defaults to `self.dtype`.
| initializer: Initializer instance (callable).
| regularizer: Regularizer instance (callable).
| trainable: Boolean, whether the variable should be part of the layer's
| "trainable_variables" (e.g. variables, biases)
| or "non_trainable_variables" (e.g. BatchNorm mean and variance).
| Note that `trainable` cannot be `True` if `synchronization`
| is set to `ON_READ`.
| constraint: Constraint instance (callable).
| use_resource: Whether to use `ResourceVariable`.
| synchronization: Indicates when a distributed a variable will be
| aggregated. Accepted values are constants defined in the class
| `tf.VariableSynchronization`. By default the synchronization is set to
| `AUTO` and the current `DistributionStrategy` chooses
| when to synchronize. If `synchronization` is set to `ON_READ`,
| `trainable` must not be set to `True`.
| aggregation: Indicates how a distributed variable will be aggregated.
| Accepted values are constants defined in the class
| `tf.VariableAggregation`.
| **kwargs: Additional keyword arguments. Accepted values are `getter`,
| `collections`, `experimental_autocast` and `caching_device`.
|
| Returns:
| The variable created.
|
| Raises:
| ValueError: When giving unsupported dtype and no initializer or when
| trainable has been set to True with synchronization set as `ON_READ`.
|
| apply(self, inputs, *args, **kwargs)
| Deprecated, do NOT use!
|
| This is an alias of `self.__call__`.
|
| Args:
| inputs: Input tensor(s).
| *args: additional positional arguments to be passed to `self.call`.
| **kwargs: additional keyword arguments to be passed to `self.call`.
|
| Returns:
| Output tensor(s).
|
| compute_mask(self, inputs, mask=None)
| Computes an output mask tensor.
|
| Args:
| inputs: Tensor or list of tensors.
| mask: Tensor or list of tensors.
|
| Returns:
| None or a tensor (or list of tensors,
| one per output tensor of the layer).
|
| compute_output_signature(self, input_signature)
| Compute the output tensor signature of the layer based on the inputs.
|
| Unlike a TensorShape object, a TensorSpec object contains both shape
| and dtype information for a tensor. This method allows layers to provide
| output dtype information if it is different from the input dtype.
| For any layer that doesn't implement this function,
| the framework will fall back to use `compute_output_shape`, and will
| assume that the output dtype matches the input dtype.
|
| Args:
| input_signature: Single TensorSpec or nested structure of TensorSpec
| objects, describing a candidate input for the layer.
|
| Returns:
| Single TensorSpec or nested structure of TensorSpec objects, describing
| how the layer would transform the provided input.
|
| Raises:
| TypeError: If input_signature contains a non-TensorSpec object.
|
| count_params(self)
| Count the total number of scalars composing the weights.
|
| Returns:
| An integer count.
|
| Raises:
| ValueError: if the layer isn't yet built
| (in which case its weights aren't yet defined).
|
| finalize_state(self)
| Finalizes the layers state after updating layer weights.
|
| This function can be subclassed in a layer and will be called after updating
| a layer weights. It can be overridden to finalize any additional layer state
| after a weight update.
|
| get_input_at(self, node_index)
| Retrieves the input tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first input node of the layer.
|
| Returns:
| A tensor (or list of tensors if the layer has multiple inputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_input_mask_at(self, node_index)
| Retrieves the input mask tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A mask tensor
| (or list of tensors if the layer has multiple inputs).
|
| get_input_shape_at(self, node_index)
| Retrieves the input shape(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A shape tuple
| (or list of shape tuples if the layer has multiple inputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_losses_for(self, inputs)
| Deprecated, do NOT use!
|
| Retrieves losses relevant to a specific set of inputs.
|
| Args:
| inputs: Input tensor or list/tuple of input tensors.
|
| Returns:
| List of loss tensors of the layer that depend on `inputs`.
|
| get_output_at(self, node_index)
| Retrieves the output tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first output node of the layer.
|
| Returns:
| A tensor (or list of tensors if the layer has multiple outputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_output_mask_at(self, node_index)
| Retrieves the output mask tensor(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A mask tensor
| (or list of tensors if the layer has multiple outputs).
|
| get_output_shape_at(self, node_index)
| Retrieves the output shape(s) of a layer at a given node.
|
| Args:
| node_index: Integer, index of the node
| from which to retrieve the attribute.
| E.g. `node_index=0` will correspond to the
| first time the layer was called.
|
| Returns:
| A shape tuple
| (or list of shape tuples if the layer has multiple outputs).
|
| Raises:
| RuntimeError: If called in Eager mode.
|
| get_updates_for(self, inputs)
| Deprecated, do NOT use!
|
| Retrieves updates relevant to a specific set of inputs.
|
| Args:
| inputs: Input tensor or list/tuple of input tensors.
|
| Returns:
| List of update ops of the layer that depend on `inputs`.
|
| get_weights(self)
| Returns the current weights of the layer, as NumPy arrays.
|
| The weights of a layer represent the state of the layer. This function
| returns both trainable and non-trainable weight values associated with this
| layer as a list of NumPy arrays, which can in turn be used to load state
| into similarly parameterized layers.
|
| For example, a `Dense` layer returns a list of two values: the kernel matrix
| and the bias vector. These can be used to set the weights of another
| `Dense` layer:
|
| >>> layer_a = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(1.))
| >>> a_out = layer_a(tf.convert_to_tensor([[1., 2., 3.]]))
| >>> layer_a.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(2.))
| >>> b_out = layer_b(tf.convert_to_tensor([[10., 20., 30.]]))
| >>> layer_b.get_weights()
| [array([[2.],
| [2.],
| [2.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b.set_weights(layer_a.get_weights())
| >>> layer_b.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
|
| Returns:
| Weights values as a list of NumPy arrays.
|
| set_weights(self, weights)
| Sets the weights of the layer, from NumPy arrays.
|
| The weights of a layer represent the state of the layer. This function
| sets the weight values from numpy arrays. The weight values should be
| passed in the order they are created by the layer. Note that the layer's
| weights must be instantiated before calling this function, by calling
| the layer.
|
| For example, a `Dense` layer returns a list of two values: the kernel matrix
| and the bias vector. These can be used to set the weights of another
| `Dense` layer:
|
| >>> layer_a = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(1.))
| >>> a_out = layer_a(tf.convert_to_tensor([[1., 2., 3.]]))
| >>> layer_a.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b = tf.keras.layers.Dense(1,
| ... kernel_initializer=tf.constant_initializer(2.))
| >>> b_out = layer_b(tf.convert_to_tensor([[10., 20., 30.]]))
| >>> layer_b.get_weights()
| [array([[2.],
| [2.],
| [2.]], dtype=float32), array([0.], dtype=float32)]
| >>> layer_b.set_weights(layer_a.get_weights())
| >>> layer_b.get_weights()
| [array([[1.],
| [1.],
| [1.]], dtype=float32), array([0.], dtype=float32)]
|
| Args:
| weights: a list of NumPy arrays. The number
| of arrays and their shape must match
| number of the dimensions of the weights
| of the layer (i.e. it should match the
| output of `get_weights`).
|
| Raises:
| ValueError: If the provided weights list does not match the
| layer's specifications.
|
| ----------------------------------------------------------------------
| Class methods inherited from tensorflow.python.keras.engine.base_layer.Layer:
|
| from_config(config) from builtins.type
| Creates a layer from its config.
|
| This method is the reverse of `get_config`,
| capable of instantiating the same layer from the config
| dictionary. It does not handle layer connectivity
| (handled by Network), nor weights (handled by `set_weights`).
|
| Args:
| config: A Python dictionary, typically the
| output of get_config.
|
| Returns:
| A layer instance.
|
| ----------------------------------------------------------------------
| Readonly properties inherited from tensorflow.python.keras.engine.base_layer.Layer:
|
| compute_dtype
| The dtype of the layer's computations.
|
| This is equivalent to `Layer.dtype_policy.compute_dtype`. Unless
| mixed precision is used, this is the same as `Layer.dtype`, the dtype of
| the weights.
|
| Layers automatically cast their inputs to the compute dtype, which causes
| computations and the output to be in the compute dtype as well. This is done
| by the base Layer class in `Layer.__call__`, so you do not have to insert
| these casts if implementing your own layer.
|
| Layers often perform certain internal computations in higher precision when
| `compute_dtype` is float16 or bfloat16 for numeric stability. The output
| will still typically be float16 or bfloat16 in such cases.
|
| Returns:
| The layer's compute dtype.
|
| dtype
| The dtype of the layer weights.
|
| This is equivalent to `Layer.dtype_policy.variable_dtype`. Unless
| mixed precision is used, this is the same as `Layer.compute_dtype`, the
| dtype of the layer's computations.
|
| dtype_policy
| The dtype policy associated with this layer.
|
| This is an instance of a `tf.keras.mixed_precision.Policy`.
|
| dynamic
| Whether the layer is dynamic (eager-only); set in the constructor.
|
| inbound_nodes
| Deprecated, do NOT use! Only for compatibility with external Keras.
|
| input
| Retrieves the input tensor(s) of a layer.
|
| Only applicable if the layer has exactly one input,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Input tensor or list of input tensors.
|
| Raises:
| RuntimeError: If called in Eager mode.
| AttributeError: If no inbound nodes are found.
|
| input_mask
| Retrieves the input mask tensor(s) of a layer.
|
| Only applicable if the layer has exactly one inbound node,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Input mask tensor (potentially None) or list of input
| mask tensors.
|
| Raises:
| AttributeError: if the layer is connected to
| more than one incoming layers.
|
| input_shape
| Retrieves the input shape(s) of a layer.
|
| Only applicable if the layer has exactly one input,
| i.e. if it is connected to one incoming layer, or if all inputs
| have the same shape.
|
| Returns:
| Input shape, as an integer shape tuple
| (or list of shape tuples, one tuple per input tensor).
|
| Raises:
| AttributeError: if the layer has no defined input_shape.
| RuntimeError: if called in Eager mode.
|
| losses
| List of losses added using the `add_loss()` API.
|
| Variable regularization tensors are created when this property is accessed,
| so it is eager safe: accessing `losses` under a `tf.GradientTape` will
| propagate gradients back to the corresponding variables.
|
| Examples:
|
| >>> class MyLayer(tf.keras.layers.Layer):
| ... def call(self, inputs):
| ... self.add_loss(tf.abs(tf.reduce_mean(inputs)))
| ... return inputs
| >>> l = MyLayer()
| >>> l(np.ones((10, 1)))
| >>> l.losses
| [1.0]
|
| >>> inputs = tf.keras.Input(shape=(10,))
| >>> x = tf.keras.layers.Dense(10)(inputs)
| >>> outputs = tf.keras.layers.Dense(1)(x)
| >>> model = tf.keras.Model(inputs, outputs)
| >>> # Activity regularization.
| >>> len(model.losses)
| 0
| >>> model.add_loss(tf.abs(tf.reduce_mean(x)))
| >>> len(model.losses)
| 1
|
| >>> inputs = tf.keras.Input(shape=(10,))
| >>> d = tf.keras.layers.Dense(10, kernel_initializer='ones')
| >>> x = d(inputs)
| >>> outputs = tf.keras.layers.Dense(1)(x)
| >>> model = tf.keras.Model(inputs, outputs)
| >>> # Weight regularization.
| >>> model.add_loss(lambda: tf.reduce_mean(d.kernel))
| >>> model.losses
| [<tf.Tensor: shape=(), dtype=float32, numpy=1.0>]
|
| Returns:
| A list of tensors.
|
| metrics
| List of metrics added using the `add_metric()` API.
|
| Example:
|
| >>> input = tf.keras.layers.Input(shape=(3,))
| >>> d = tf.keras.layers.Dense(2)
| >>> output = d(input)
| >>> d.add_metric(tf.reduce_max(output), name='max')
| >>> d.add_metric(tf.reduce_min(output), name='min')
| >>> [m.name for m in d.metrics]
| ['max', 'min']
|
| Returns:
| A list of `Metric` objects.
|
| name
| Name of the layer (string), set in the constructor.
|
| non_trainable_variables
| Sequence of non-trainable variables owned by this module and its submodules.
|
| Note: this method uses reflection to find variables on the current instance
| and submodules. For performance reasons you may wish to cache the result
| of calling this method if you don't expect the return value to change.
|
| Returns:
| A sequence of variables for the current module (sorted by attribute
| name) followed by variables from all submodules recursively (breadth
| first).
|
| non_trainable_weights
| List of all non-trainable weights tracked by this layer.
|
| Non-trainable weights are *not* updated during training. They are expected
| to be updated manually in `call()`.
|
| Returns:
| A list of non-trainable variables.
|
| outbound_nodes
| Deprecated, do NOT use! Only for compatibility with external Keras.
|
| output
| Retrieves the output tensor(s) of a layer.
|
| Only applicable if the layer has exactly one output,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Output tensor or list of output tensors.
|
| Raises:
| AttributeError: if the layer is connected to more than one incoming
| layers.
| RuntimeError: if called in Eager mode.
|
| output_mask
| Retrieves the output mask tensor(s) of a layer.
|
| Only applicable if the layer has exactly one inbound node,
| i.e. if it is connected to one incoming layer.
|
| Returns:
| Output mask tensor (potentially None) or list of output
| mask tensors.
|
| Raises:
| AttributeError: if the layer is connected to
| more than one incoming layers.
|
| output_shape
| Retrieves the output shape(s) of a layer.
|
| Only applicable if the layer has one output,
| or if all outputs have the same shape.
|
| Returns:
| Output shape, as an integer shape tuple
| (or list of shape tuples, one tuple per output tensor).
|
| Raises:
| AttributeError: if the layer has no defined output shape.
| RuntimeError: if called in Eager mode.
|
| trainable_variables
| Sequence of trainable variables owned by this module and its submodules.
|
| Note: this method uses reflection to find variables on the current instance
| and submodules. For performance reasons you may wish to cache the result
| of calling this method if you don't expect the return value to change.
|
| Returns:
| A sequence of variables for the current module (sorted by attribute
| name) followed by variables from all submodules recursively (breadth
| first).
|
| trainable_weights
| List of all trainable weights tracked by this layer.
|
| Trainable weights are updated via gradient descent during training.
|
| Returns:
| A list of trainable variables.
|
| updates
|
| variable_dtype
| Alias of `Layer.dtype`, the dtype of the weights.
|
| variables
| Returns the list of all layer variables/weights.
|
| Alias of `self.weights`.
|
| Note: This will not track the weights of nested `tf.Modules` that are not
| themselves Keras layers.
|
| Returns:
| A list of variables.
|
| weights
| Returns the list of all layer variables/weights.
|
| Returns:
| A list of variables.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from tensorflow.python.keras.engine.base_layer.Layer:
|
| activity_regularizer
| Optional regularizer function for the output of this layer.
|
| input_spec
| `InputSpec` instance(s) describing the input format for this layer.
|
| When you create a layer subclass, you can set `self.input_spec` to enable
| the layer to run input compatibility checks when it is called.
| Consider a `Conv2D` layer: it can only be called on a single input tensor
| of rank 4. As such, you can set, in `__init__()`:
|
| ```python
| self.input_spec = tf.keras.layers.InputSpec(ndim=4)
| ```
|
| Now, if you try to call the layer on an input that isn't rank 4
| (for instance, an input of shape `(2,)`, it will raise a nicely-formatted
| error:
|
| ```
| ValueError: Input 0 of layer conv2d is incompatible with the layer:
| expected ndim=4, found ndim=1. Full shape received: [2]
| ```
|
| Input checks that can be specified via `input_spec` include:
| - Structure (e.g. a single input, a list of 2 inputs, etc)
| - Shape
| - Rank (ndim)
| - Dtype
|
| For more information, see `tf.keras.layers.InputSpec`.
|
| Returns:
| A `tf.keras.layers.InputSpec` instance, or nested structure thereof.
|
| stateful
|
| supports_masking
| Whether this layer supports computing a mask using `compute_mask`.
|
| trainable
|
| ----------------------------------------------------------------------
| Class methods inherited from tensorflow.python.module.module.Module:
|
| with_name_scope(method) from builtins.type
| Decorator to automatically enter the module name scope.
|
| >>> class MyModule(tf.Module):
| ... @tf.Module.with_name_scope
| ... def __call__(self, x):
| ... if not hasattr(self, 'w'):
| ... self.w = tf.Variable(tf.random.normal([x.shape[1], 3]))
| ... return tf.matmul(x, self.w)
|
| Using the above module would produce `tf.Variable`s and `tf.Tensor`s whose
| names included the module name:
|
| >>> mod = MyModule()
| >>> mod(tf.ones([1, 2]))
| <tf.Tensor: shape=(1, 3), dtype=float32, numpy=..., dtype=float32)>
| >>> mod.w
| <tf.Variable 'my_module/Variable:0' shape=(2, 3) dtype=float32,
| numpy=..., dtype=float32)>
|
| Args:
| method: The method to wrap.
|
| Returns:
| The original method wrapped such that it enters the module's name scope.
|
| ----------------------------------------------------------------------
| Readonly properties inherited from tensorflow.python.module.module.Module:
|
| name_scope
| Returns a `tf.name_scope` instance for this class.
|
| submodules
| Sequence of all sub-modules.
|
| Submodules are modules which are properties of this module, or found as
| properties of modules which are properties of this module (and so on).
|
| >>> a = tf.Module()
| >>> b = tf.Module()
| >>> c = tf.Module()
| >>> a.b = b
| >>> b.c = c
| >>> list(a.submodules) == [b, c]
| True
| >>> list(b.submodules) == [c]
| True
| >>> list(c.submodules) == []
| True
|
| Returns:
| A sequence of all submodules.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from tensorflow.python.trackable.base.Trackable:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Static methods inherited from tensorflow.python.keras.utils.version_utils.LayerVersionSelector:
|
| __new__(cls, *args, **kwargs)
| Create and return a new object. See help(type) for accurate signature.
For now, we will focus on the output size/dimensionality of the layer, the activation function, as well as the kernel and bias initializers. As every node in a dense/fully connected layer receives input from all nodes of the previous layer, one of its drawbacks is computation time, which grows with the number of parameters. However, due to its underlying matrix-vector multiplication and its output in n-dimensional vectors, we can use it to change the dimensions of the vector, downscaling it from the multidimensional input it receives from the convolutional layer(s). Here, we will set it to 1024. The activation function might be old news to you now, but just to be sure: our dense/fully connected layer will have a non-linear activation function, specifically ReLU. Based on that, we can also choose a kernel initializer that is optimized for this activation function: Kaiming He initialization. The bias initializers will be set to zeros, following common practice backed up by various studies.
model.add(Dense(1024, activation='relu', kernel_initializer='he_normal', bias_initializer='zeros'))
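To make the numbers behind this layer a bit more concrete, here is a minimal pure-Python sketch of what a dense layer with ReLU computes and how many parameters it carries. It is an illustration, not the Keras implementation: the helper names he_normal_init, dense_relu and dense_params are hypothetical, the toy sizes (8 inputs, 4 units) are chosen only so it runs instantly, and a plain Gaussian stands in for the truncated normal that Keras uses for he_normal. The input size of 1920 is an assumption taken from the flattened output listed in the model summary of this tutorial.

```python
import math
import random

random.seed(0)

def he_normal_init(fan_in, units):
    """Draw a fan_in x units weight matrix with std sqrt(2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[random.gauss(0.0, std) for _ in range(units)] for _ in range(fan_in)]

def dense_relu(x, weights, biases):
    """Compute relu(x @ W + b) for a single input vector x."""
    out = []
    for j in range(len(biases)):
        z = biases[j] + sum(x[i] * weights[i][j] for i in range(len(x)))
        out.append(max(0.0, z))  # ReLU: negative values become 0
    return out

def dense_params(fan_in, units):
    """One weight per input-output pair plus one bias per unit."""
    return fan_in * units + units

# Toy forward pass (hypothetical sizes).
fan_in, units = 8, 4
W = he_normal_init(fan_in, units)
b = [0.0] * units  # corresponds to bias_initializer='zeros'
x = [random.random() for _ in range(fan_in)]
y = dense_relu(x, W, b)

# Parameter count for the layer used in this tutorial (1920 inputs assumed).
n_params = dense_params(1920, 1024)
print(len(y), n_params)
```

With 1920 inputs and 1024 units this gives 1,967,104 parameters, which is the value listed for the first dense layer in the model summary.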
Before we go to the next dense/fully connected layer, we will again integrate a few of the things we talked about, namely batch normalization and dropout layers.
model.add(BatchNormalization())
model.add(Dropout(0.5))
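As a reminder of what Dropout(0.5) does, here is a minimal sketch of inverted dropout in plain Python. The function name dropout and the toy activations are hypothetical, and the code illustrates the idea rather than the Keras internals: during training each value is zeroed with probability rate and the survivors are scaled by 1 / (1 - rate), while at inference the input passes through unchanged.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout: zero with probability `rate`, rescale the rest."""
    if not training:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)
activations = [1.0] * 10

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
test_out = dropout(activations, rate=0.5, training=False, rng=rng)

print(train_out)  # every entry is either 0.0 or 2.0
print(test_out)   # identical to the input
```

The rescaling keeps the expected sum of the activations the same in training and inference, which is why nothing needs to be adjusted when the trained network is later used for prediction.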
To further reduce the number of dimensions for our final layer, i.e. the output layer, we will create a short succession, as we’ve done with the convolutional layers, by repeating the dense/fully connected - batch normalization - dropout layer sequence two times, each time reducing the dimensions of the output by a factor of 4:
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
It’s the final countdown….sorry, layer#
We’ve reached the end of our ANN: the output layer. Again, we are confronted with a supervised learning problem in which we want to train our ANN to perform a binary classification between eyes closed and eyes open. Thus, the final layer will again be a dense/fully connected layer, which has as many outputs as we have classes: 2. Additionally, we change the activation function to softmax so that we obtain a normalized probability distribution with values ranging between 0 and 1, indicating the probability of belonging to either class. These will then be compared to the true labels.
model.add(Dense(n_classes, activation='softmax'))
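To illustrate what the softmax activation does with the two raw outputs of this layer, here is a small sketch in plain Python. The function softmax is written out by hand and the logits are made-up example values, so treat it as an illustration of the formula rather than the Keras implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5]  # hypothetical raw outputs for the two classes
probs = softmax(logits)
predicted_class = probs.index(max(probs))  # index of the most probable class

print(probs, sum(probs), predicted_class)
```

The resulting values lie between 0 and 1 and sum to 1, and the class with the largest value is taken as the prediction.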
There’s a steep learning curve when you curve without learning#
While all of this is definitely amazing and already hard to comprehend (at least for me), one aspect of machine/deep learning, actually THE ONE, is missing: we haven’t told our ANN how it should learn. In more detail, we need to tell our ANN how it should compare the probabilities computed in the output layer to the true labels and learn via a loss function and an optimizer to minimize the respective error. Given our learning problem and dataset, we will go rather “classic” and use accuracy as our metric, sparse_categorical_crossentropy as our loss function and adam as our optimizer. Importantly, these parameters will be defined during the compile step, which will finally build our ANN.
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
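To give an intuition for the loss and metric chosen above, here is a minimal sketch of sparse categorical cross-entropy and accuracy in plain Python. The predicted probabilities and labels are invented for illustration, the function names are our own, and the sketch omits details of the Keras version such as clipping probabilities away from 0.

```python
import math

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(p[true class]); labels are integer class indices."""
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

def accuracy(probs, labels):
    """Fraction of samples whose most probable class equals the label."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)

# Hypothetical softmax outputs for three samples and their true labels.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]

loss = sparse_categorical_crossentropy(probs, labels)
acc = accuracy(probs, labels)
print(round(loss, 4), acc)
```

The "sparse" part refers to the labels being plain integers (0 or 1) rather than one-hot vectors. Confident correct predictions contribute little to the loss, while the third sample, which is misclassified, contributes the most.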
We know how it feels…

A fresh start#
We really dragged this one out, didn’t we? Sorry folks, we thought it might be a good idea to really go step by step… To see everything at once, however, here is a version with all the necessary code in one cell.
n_classes = 2
filters = 32
kernel_size = (3, 3)
model = Sequential()
model.add(Conv2D(filters, kernel_size, activation='relu', input_shape=data_shape))
model.add(BatchNormalization())
model.add(MaxPooling2D())
filters *= 2
model.add(Conv2D(filters, kernel_size, activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
filters *= 2
model.add(Conv2D(filters, kernel_size, activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
filters *= 2
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(1024, activation='relu', kernel_initializer='he_normal', bias_initializer='zeros'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(n_classes, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
Still, that’s a lot. Isn’t there an easy way to check things more conveniently?
There is, or more precisely, there are, as there is more than one option.
The first one is rather simple: our ANN has a .summary() function which will provide us with a nice overview as well as details about its architecture. (This is also a great way to check out pre-trained models.)
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
module_wrapper_12 (ModuleWra (None, 38, 54, 32) 896
_________________________________________________________________
module_wrapper_13 (ModuleWra (None, 38, 54, 32) 128
_________________________________________________________________
module_wrapper_14 (ModuleWra (None, 19, 27, 32) 0
_________________________________________________________________
module_wrapper_15 (ModuleWra (None, 17, 25, 64) 18496
_________________________________________________________________
module_wrapper_16 (ModuleWra (None, 17, 25, 64) 256
_________________________________________________________________
module_wrapper_17 (ModuleWra (None, 8, 12, 64) 0
_________________________________________________________________
module_wrapper_18 (ModuleWra (None, 6, 10, 128) 73856
_________________________________________________________________
module_wrapper_19 (ModuleWra (None, 6, 10, 128) 512
_________________________________________________________________
module_wrapper_20 (ModuleWra (None, 3, 5, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 1920) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 1920) 0
_________________________________________________________________
dense_4 (Dense) (None, 1024) 1967104
_________________________________________________________________
module_wrapper_21 (ModuleWra (None, 1024) 4096
_________________________________________________________________
dropout_5 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_5 (Dense) (None, 256) 262400
_________________________________________________________________
module_wrapper_22 (ModuleWra (None, 256) 1024
_________________________________________________________________
dropout_6 (Dropout) (None, 256) 0
_________________________________________________________________
dense_6 (Dense) (None, 64) 16448
_________________________________________________________________
module_wrapper_23 (ModuleWra (None, 64) 256
_________________________________________________________________
dropout_7 (Dropout) (None, 64) 0
_________________________________________________________________
dense_7 (Dense) (None, 2) 130
=================================================================
Total params: 2,345,602
Trainable params: 2,342,466
Non-trainable params: 3,136
_________________________________________________________________
We can nicely see all of our layers and their dimensions, as well as how they change along the ANN and, with them, the respective representations.
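The output shapes in the summary can be reproduced by hand. The following is a minimal sketch, assuming that each convolution uses a 3x3 kernel without padding and that each pooling step uses a 2x2 window. The summary only shows module_wrapper names, so this is inferred from the numbers rather than confirmed. The helper names conv_valid and pool are made up for illustration.

```python
def conv_valid(size, kernel=3):
    # an unpadded ("valid") convolution shrinks each side by kernel - 1
    return size - kernel + 1

def pool(size, window=2):
    # non-overlapping pooling divides the size by the window (rounded down)
    return size // window

# input images have shape (40, 56, 3), see the shape of X_train
height, width = 40, 56
shapes = []
for n_filters in (32, 64, 128):
    height, width = conv_valid(height), conv_valid(width)
    shapes.append((height, width, n_filters))  # after convolution (and batch norm)
    height, width = pool(height), pool(width)
    shapes.append((height, width, n_filters))  # after pooling

for shape in shapes:
    print(shape)

# the flatten layer turns the last feature maps into one long vector
flattened = height * width * 128
print("flattened:", flattened)
```

Under these assumptions the printed shapes agree with the Output Shape column, including the 1920 values that enter the first dense layer.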
One thing we haven’t really talked about so far for our ANN, but which becomes abundantly clear here, is the high number of parameters: 2,345,602. Kinda wild, isn’t it? Especially considering that our ANN isn’t “that complex”. Others have waaaaaay more…
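To see where these parameters come from, here is a small sketch that recomputes the Param # column. It assumes 3x3 convolutions on 3 input channels and that the module_wrapper layers with small parameter counts are batch normalization layers. Both are inferences from the summary, not something it states. The function names are hypothetical helpers.

```python
def conv2d_params(kernel, in_channels, filters):
    # each filter has kernel * kernel * in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(n_inputs, n_units):
    # each unit has one weight per input plus one bias
    return (n_inputs + 1) * n_units

def batchnorm_params(n_features):
    # scale and shift are learned; moving mean and variance are not
    return 2 * n_features, 2 * n_features

conv = [conv2d_params(3, 3, 32), conv2d_params(3, 32, 64), conv2d_params(3, 64, 128)]
dense = [dense_params(1920, 1024), dense_params(1024, 256),
         dense_params(256, 64), dense_params(64, 2)]
bn = [batchnorm_params(n) for n in (32, 64, 128, 1024, 256, 64)]

trainable = sum(conv) + sum(dense) + sum(t for t, _ in bn)
non_trainable = sum(n for _, n in bn)
print("trainable:", trainable)
print("non-trainable:", non_trainable)
print("total:", trainable + non_trainable)
```

Under these assumptions the three sums agree with the totals at the bottom of the summary. Most of the parameters sit in the first dense layer (1920 inputs times 1024 units), not in the convolutions.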

Another cool option to inspect our ANN is to use tensorboard, which we will evaluate after the next step.
How to train your network#
As some might say: this is where the real fun starts. We have built and checked our ANN. Now, it’s time to let it learn. Comparable to the models we utilized in the first part of the workshop, the “classic” machine learning models, we need to fit it in order to train it. Or more accurately: to let it learn representations that are helpful to achieve its given task. Going back to the previous section, we discussed two important parameters we can define for this endeavor: the epochs and the batch size.
A brief recap:
an epoch refers to one cycle through the entire training dataset, i.e. our ANN went through the entire training dataset once. Thus, the number of epochs describes how often the ANN worked through the entire training dataset during the fitting.
a batch refers to the group of samples the ANN goes through before it updates its weights based on the loss function and the optimizer (the metric is only monitored); the batch size is the number of samples in such a group. Thus, the number of batches defines how often the weights are updated during an epoch.
Both epoch and batch are thus parameters for the learning and not parameters obtained by learning.
In order to apply this to our example dataset and understand it there, we need to define a training and a test dataset as we did before. (The same things about training, testing and validating we talked about during "classic" machine learning also hold true here.) We can use our old friend scikit-learn for this.
We define our y based on our labels, simply converting them to True for eyes open and False for eyes closed:
y = labels =='open'
y.shape
(384,)
y[:10]
array([False, False, False, False, True, True, True, True, False,
False])
With that we can split our dataset into training and test sets:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.2, random_state=0, shuffle=False)
print('Shapes of X:', X_train.shape, X_test.shape)
print('Shapes of y:', y_train.shape, y_test.shape)
Shapes of X: (307, 40, 56, 3) (77, 40, 56, 3)
Shapes of y: (307,) (77,)
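Because shuffle=False is used here, train_test_split does not draw samples at random but simply cuts the data: the first samples become the training set and the last ones the test set. The sketch below illustrates this on toy stand-ins. toy_labels, toy_X and toy_y are hypothetical and only mimic the size of our data (384 samples in alternating blocks of four), not the real images or labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# hypothetical labels: blocks of 4 x 'closed' followed by 4 x 'open', 384 in total
toy_labels = np.tile(np.repeat(['closed', 'open'], 4), 48)
toy_y = toy_labels == 'open'
# one column holding the sample index, so we can see which samples end up where
toy_X = np.arange(384).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    toy_X, toy_y, test_size=0.2, random_state=0, shuffle=False)

print('train/test sizes:', X_tr.shape[0], X_te.shape[0])
print('test set spans samples', X_te[0, 0], 'to', X_te[-1, 0])
# proportion of "eyes open" samples in both sets
print('open in train:', y_tr.mean(), 'open in test:', y_te.mean())
```

Checking the class proportions in both sets like this is a cheap sanity check whenever a split is not shuffled or stratified.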
Ok, we got 307 samples in the training set and 77 samples in the test set.
Back to epochs and batches: if we set our batch size to e.g. 32 and our epochs to e.g. 125, the ANN would go through 32 images before updating its weights. Since we will additionally hold out 20% of the 307 training images for validation during the fitting (the validation_split=0.2 in the .fit() call below), 245 images remain for the actual training. This results in 8 batches within each of the 125 epochs (245 training images/32 images per batch; the last batch is smaller and contains only 21 images). Also, the ANN will go through the entire training dataset 125 times and thus through 1000 batches, i.e. 1000 weight updates. Please note: while this sounds like a lot, the number of epochs is usually waaaaay higher, in the hundreds and thousands! However, once more: within our setting here and the computational resources we have, we have to keep it short. Additionally, determining the “correct” batch size and number of epochs is far from being easy and may even present an ill-posed question. That being said, we will use the example sizes we went through.
batch_size = 32
nEpochs = 125
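To make the arithmetic behind batch size and epochs concrete, here is a minimal sketch. It assumes that Keras holds out a fraction of the training samples when validation_split is passed to .fit() and rounds the number of remaining samples down. The variable names are chosen for illustration only.

```python
import math

n_train_total = 307      # samples in X_train
validation_split = 0.2   # fraction held out for validation during fitting
batch_size = 32
n_epochs = 125

# samples that are actually used for weight updates
n_fit = math.floor(n_train_total * (1 - validation_split))
n_val = n_train_total - n_fit

# number of weight updates per epoch; the last batch may be smaller
steps_per_epoch = math.ceil(n_fit / batch_size)
last_batch = n_fit - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * n_epochs

print('training/validation samples:', n_fit, n_val)
print('batches per epoch:', steps_per_epoch, '(last batch:', last_batch, 'samples)')
print('weight updates in total:', total_updates)
```

The number of batches per epoch computed here is what the progress bar of the training log counts in each epoch (the 8/8).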
Folks, it’s finally time to train our ANN and let it learn. To keep track of the things that happen, we will set a few things up so that we can utilize tensorboard later on. For this to work we need to load the respective jupyter extension, define a directory where we can save the logs of the training and then define the so-called callback which will be included in the .fit() function.
%load_ext tensorboard
import datetime, os
import tensorflow as tf
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
To kick things off, we use the .fit() function of our model and start the training.
%time fit = model.fit(X_train, y_train, epochs=nEpochs, batch_size=batch_size, validation_split=0.2, callbacks=[tensorboard_callback])
WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... Layer ModuleWrapper has arguments in `__init__` and therefore must override `get_config`.
Epoch 1/125
8/8 [==============================] - 3s 148ms/step - loss: 1.1988 - accuracy: 0.5224 - val_loss: 0.6909 - val_accuracy: 0.5000
Epoch 2/125
8/8 [==============================] - 1s 93ms/step - loss: 1.0281 - accuracy: 0.5633 - val_loss: 0.6865 - val_accuracy: 0.5323
Epoch 3/125
8/8 [==============================] - 1s 106ms/step - loss: 1.0745 - accuracy: 0.5837 - val_loss: 0.7057 - val_accuracy: 0.5000
Epoch 4/125
8/8 [==============================] - 1s 103ms/step - loss: 1.0321 - accuracy: 0.5429 - val_loss: 0.7426 - val_accuracy: 0.5000
Epoch 5/125
8/8 [==============================] - 1s 106ms/step - loss: 0.9214 - accuracy: 0.5837 - val_loss: 0.7860 - val_accuracy: 0.5000
Epoch 6/125
8/8 [==============================] - 1s 92ms/step - loss: 0.8949 - accuracy: 0.6163 - val_loss: 0.8379 - val_accuracy: 0.5000
Epoch 7/125
8/8 [==============================] - 1s 88ms/step - loss: 0.9346 - accuracy: 0.6204 - val_loss: 0.8406 - val_accuracy: 0.5000
Epoch 8/125
8/8 [==============================] - 1s 87ms/step - loss: 0.6763 - accuracy: 0.6816 - val_loss: 0.8114 - val_accuracy: 0.5000
Epoch 9/125
8/8 [==============================] - 1s 85ms/step - loss: 0.7967 - accuracy: 0.6694 - val_loss: 0.7574 - val_accuracy: 0.5000
Epoch 10/125
8/8 [==============================] - 1s 91ms/step - loss: 0.7194 - accuracy: 0.6735 - val_loss: 0.8152 - val_accuracy: 0.5000
Epoch 11/125
8/8 [==============================] - 1s 148ms/step - loss: 0.6206 - accuracy: 0.7388 - val_loss: 0.9906 - val_accuracy: 0.5000
Epoch 12/125
8/8 [==============================] - 1s 108ms/step - loss: 0.4687 - accuracy: 0.7918 - val_loss: 1.0829 - val_accuracy: 0.5000
Epoch 13/125
8/8 [==============================] - 1s 132ms/step - loss: 0.5812 - accuracy: 0.7551 - val_loss: 1.0731 - val_accuracy: 0.5000
Epoch 14/125
8/8 [==============================] - 1s 185ms/step - loss: 0.6212 - accuracy: 0.7388 - val_loss: 0.9336 - val_accuracy: 0.5000
Epoch 15/125
8/8 [==============================] - 1s 134ms/step - loss: 0.3744 - accuracy: 0.8449 - val_loss: 0.8622 - val_accuracy: 0.5000
Epoch 16/125
8/8 [==============================] - 1s 128ms/step - loss: 0.3555 - accuracy: 0.8571 - val_loss: 0.8386 - val_accuracy: 0.5000
Epoch 17/125
8/8 [==============================] - 1s 156ms/step - loss: 0.2891 - accuracy: 0.8612 - val_loss: 0.8144 - val_accuracy: 0.5000
Epoch 18/125
8/8 [==============================] - 1s 145ms/step - loss: 0.3114 - accuracy: 0.8531 - val_loss: 0.7844 - val_accuracy: 0.5000
Epoch 19/125
8/8 [==============================] - 1s 124ms/step - loss: 0.2898 - accuracy: 0.9020 - val_loss: 0.8410 - val_accuracy: 0.5000
Epoch 20/125
8/8 [==============================] - 2s 200ms/step - loss: 0.2093 - accuracy: 0.9102 - val_loss: 0.7736 - val_accuracy: 0.5000
Epoch 21/125
8/8 [==============================] - 1s 153ms/step - loss: 0.2491 - accuracy: 0.8980 - val_loss: 0.6871 - val_accuracy: 0.5484
Epoch 22/125
8/8 [==============================] - 1s 92ms/step - loss: 0.2191 - accuracy: 0.9184 - val_loss: 0.6331 - val_accuracy: 0.6290
Epoch 23/125
8/8 [==============================] - 1s 161ms/step - loss: 0.1634 - accuracy: 0.9306 - val_loss: 0.6182 - val_accuracy: 0.6774
Epoch 24/125
8/8 [==============================] - 1s 149ms/step - loss: 0.1767 - accuracy: 0.9388 - val_loss: 0.6489 - val_accuracy: 0.6290
Epoch 25/125
8/8 [==============================] - 1s 99ms/step - loss: 0.1296 - accuracy: 0.9265 - val_loss: 0.7525 - val_accuracy: 0.5645
Epoch 26/125
8/8 [==============================] - 1s 108ms/step - loss: 0.1152 - accuracy: 0.9551 - val_loss: 0.7848 - val_accuracy: 0.5645
Epoch 27/125
8/8 [==============================] - 1s 94ms/step - loss: 0.1103 - accuracy: 0.9551 - val_loss: 0.7708 - val_accuracy: 0.5645
Epoch 28/125
8/8 [==============================] - 1s 98ms/step - loss: 0.1294 - accuracy: 0.9592 - val_loss: 0.7075 - val_accuracy: 0.5968
Epoch 29/125
8/8 [==============================] - 1s 94ms/step - loss: 0.1243 - accuracy: 0.9388 - val_loss: 0.6703 - val_accuracy: 0.6452
Epoch 30/125
8/8 [==============================] - 1s 94ms/step - loss: 0.0901 - accuracy: 0.9755 - val_loss: 0.6868 - val_accuracy: 0.6452
Epoch 31/125
8/8 [==============================] - 1s 95ms/step - loss: 0.1340 - accuracy: 0.9469 - val_loss: 0.6536 - val_accuracy: 0.6774
Epoch 32/125
8/8 [==============================] - 1s 94ms/step - loss: 0.0608 - accuracy: 0.9837 - val_loss: 0.6334 - val_accuracy: 0.7258
Epoch 33/125
8/8 [==============================] - 1s 98ms/step - loss: 0.0364 - accuracy: 0.9959 - val_loss: 0.6279 - val_accuracy: 0.7258
Epoch 34/125
8/8 [==============================] - 1s 94ms/step - loss: 0.0969 - accuracy: 0.9551 - val_loss: 0.6540 - val_accuracy: 0.7419
Epoch 35/125
8/8 [==============================] - 1s 94ms/step - loss: 0.0781 - accuracy: 0.9878 - val_loss: 0.6256 - val_accuracy: 0.7581
Epoch 36/125
8/8 [==============================] - 1s 91ms/step - loss: 0.0574 - accuracy: 0.9837 - val_loss: 0.5470 - val_accuracy: 0.7419
Epoch 37/125
8/8 [==============================] - 1s 93ms/step - loss: 0.0514 - accuracy: 0.9959 - val_loss: 0.5013 - val_accuracy: 0.7581
Epoch 38/125
8/8 [==============================] - 1s 90ms/step - loss: 0.0581 - accuracy: 0.9673 - val_loss: 0.4893 - val_accuracy: 0.7903
Epoch 39/125
8/8 [==============================] - 1s 89ms/step - loss: 0.0609 - accuracy: 0.9673 - val_loss: 0.4950 - val_accuracy: 0.7581
Epoch 40/125
8/8 [==============================] - 1s 102ms/step - loss: 0.0958 - accuracy: 0.9510 - val_loss: 0.4732 - val_accuracy: 0.7903
Epoch 41/125
8/8 [==============================] - 1s 102ms/step - loss: 0.0256 - accuracy: 1.0000 - val_loss: 0.4667 - val_accuracy: 0.7742
Epoch 42/125
8/8 [==============================] - 1s 134ms/step - loss: 0.0509 - accuracy: 0.9755 - val_loss: 0.4755 - val_accuracy: 0.7742
Epoch 43/125
8/8 [==============================] - 1s 121ms/step - loss: 0.0999 - accuracy: 0.9551 - val_loss: 0.4992 - val_accuracy: 0.7581
Epoch 44/125
8/8 [==============================] - 1s 129ms/step - loss: 0.0894 - accuracy: 0.9673 - val_loss: 0.5082 - val_accuracy: 0.7742
Epoch 45/125
8/8 [==============================] - 1s 138ms/step - loss: 0.0438 - accuracy: 0.9837 - val_loss: 0.5381 - val_accuracy: 0.7581
Epoch 46/125
8/8 [==============================] - 1s 113ms/step - loss: 0.0239 - accuracy: 0.9959 - val_loss: 0.5523 - val_accuracy: 0.7581
Epoch 47/125
8/8 [==============================] - 1s 121ms/step - loss: 0.0557 - accuracy: 0.9796 - val_loss: 0.5675 - val_accuracy: 0.7581
Epoch 48/125
8/8 [==============================] - 1s 121ms/step - loss: 0.0460 - accuracy: 0.9796 - val_loss: 0.6193 - val_accuracy: 0.6935
Epoch 49/125
8/8 [==============================] - 1s 128ms/step - loss: 0.0349 - accuracy: 0.9878 - val_loss: 0.6213 - val_accuracy: 0.7258
Epoch 50/125
8/8 [==============================] - 1s 118ms/step - loss: 0.0465 - accuracy: 0.9878 - val_loss: 0.5724 - val_accuracy: 0.7742
Epoch 51/125
8/8 [==============================] - 1s 117ms/step - loss: 0.0545 - accuracy: 0.9796 - val_loss: 0.5927 - val_accuracy: 0.7742
Epoch 52/125
8/8 [==============================] - 1s 107ms/step - loss: 0.0584 - accuracy: 0.9878 - val_loss: 0.6230 - val_accuracy: 0.7419
Epoch 53/125
8/8 [==============================] - 1s 110ms/step - loss: 0.0740 - accuracy: 0.9633 - val_loss: 0.6691 - val_accuracy: 0.7419
Epoch 54/125
8/8 [==============================] - 1s 96ms/step - loss: 0.0392 - accuracy: 0.9918 - val_loss: 0.6813 - val_accuracy: 0.7419
Epoch 55/125
8/8 [==============================] - 1s 101ms/step - loss: 0.0727 - accuracy: 0.9714 - val_loss: 0.6525 - val_accuracy: 0.7419
Epoch 56/125
8/8 [==============================] - 1s 111ms/step - loss: 0.0252 - accuracy: 0.9918 - val_loss: 0.6955 - val_accuracy: 0.7581
Epoch 57/125
8/8 [==============================] - 1s 92ms/step - loss: 0.0367 - accuracy: 0.9918 - val_loss: 0.7208 - val_accuracy: 0.7419
Epoch 58/125
8/8 [==============================] - 1s 88ms/step - loss: 0.0282 - accuracy: 0.9918 - val_loss: 0.6846 - val_accuracy: 0.7258
Epoch 59/125
8/8 [==============================] - 1s 90ms/step - loss: 0.0570 - accuracy: 0.9796 - val_loss: 0.6084 - val_accuracy: 0.7581
Epoch 60/125
8/8 [==============================] - 1s 103ms/step - loss: 0.0489 - accuracy: 0.9796 - val_loss: 0.6389 - val_accuracy: 0.7581
Epoch 61/125
8/8 [==============================] - 1s 96ms/step - loss: 0.0592 - accuracy: 0.9714 - val_loss: 0.6275 - val_accuracy: 0.7742
Epoch 62/125
8/8 [==============================] - 1s 89ms/step - loss: 0.0359 - accuracy: 0.9837 - val_loss: 0.6744 - val_accuracy: 0.7581
Epoch 63/125
8/8 [==============================] - 1s 109ms/step - loss: 0.0191 - accuracy: 0.9918 - val_loss: 0.7881 - val_accuracy: 0.6774
Epoch 64/125
8/8 [==============================] - 1s 99ms/step - loss: 0.0253 - accuracy: 0.9918 - val_loss: 0.7989 - val_accuracy: 0.6935
Epoch 65/125
8/8 [==============================] - 1s 107ms/step - loss: 0.0373 - accuracy: 0.9878 - val_loss: 0.7248 - val_accuracy: 0.7903
Epoch 66/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0294 - accuracy: 0.9918 - val_loss: 0.6788 - val_accuracy: 0.7903
Epoch 67/125
8/8 [==============================] - 1s 109ms/step - loss: 0.0592 - accuracy: 0.9878 - val_loss: 0.6800 - val_accuracy: 0.7581
Epoch 68/125
8/8 [==============================] - 1s 98ms/step - loss: 0.0312 - accuracy: 0.9837 - val_loss: 0.7186 - val_accuracy: 0.7742
Epoch 69/125
8/8 [==============================] - 1s 112ms/step - loss: 0.0266 - accuracy: 0.9918 - val_loss: 0.7660 - val_accuracy: 0.7581
Epoch 70/125
8/8 [==============================] - 1s 137ms/step - loss: 0.0552 - accuracy: 0.9755 - val_loss: 0.7835 - val_accuracy: 0.7742
Epoch 71/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0501 - accuracy: 0.9837 - val_loss: 0.7226 - val_accuracy: 0.7742
Epoch 72/125
8/8 [==============================] - 1s 108ms/step - loss: 0.0372 - accuracy: 0.9878 - val_loss: 0.6700 - val_accuracy: 0.7581
Epoch 73/125
8/8 [==============================] - 1s 111ms/step - loss: 0.0158 - accuracy: 0.9959 - val_loss: 0.6627 - val_accuracy: 0.7581
Epoch 74/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0235 - accuracy: 0.9959 - val_loss: 0.7018 - val_accuracy: 0.7581
Epoch 75/125
8/8 [==============================] - 1s 96ms/step - loss: 0.0122 - accuracy: 0.9959 - val_loss: 0.7287 - val_accuracy: 0.7419
Epoch 76/125
8/8 [==============================] - 1s 102ms/step - loss: 0.0141 - accuracy: 1.0000 - val_loss: 0.7926 - val_accuracy: 0.7419
Epoch 77/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0104 - accuracy: 1.0000 - val_loss: 0.8550 - val_accuracy: 0.7258
Epoch 78/125
8/8 [==============================] - 1s 106ms/step - loss: 0.0319 - accuracy: 0.9837 - val_loss: 0.8847 - val_accuracy: 0.7419
Epoch 79/125
8/8 [==============================] - 1s 95ms/step - loss: 0.0310 - accuracy: 0.9878 - val_loss: 0.8905 - val_accuracy: 0.7742
Epoch 80/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0092 - accuracy: 0.9959 - val_loss: 0.8897 - val_accuracy: 0.7742
Epoch 81/125
8/8 [==============================] - 1s 93ms/step - loss: 0.0167 - accuracy: 1.0000 - val_loss: 0.8847 - val_accuracy: 0.7742
Epoch 82/125
8/8 [==============================] - 1s 102ms/step - loss: 0.0226 - accuracy: 0.9959 - val_loss: 0.8840 - val_accuracy: 0.7742
Epoch 83/125
8/8 [==============================] - 1s 107ms/step - loss: 0.0138 - accuracy: 0.9959 - val_loss: 0.8748 - val_accuracy: 0.7581
Epoch 84/125
8/8 [==============================] - 1s 119ms/step - loss: 0.0698 - accuracy: 0.9673 - val_loss: 0.9306 - val_accuracy: 0.7419
Epoch 85/125
8/8 [==============================] - 1s 93ms/step - loss: 0.0145 - accuracy: 0.9959 - val_loss: 0.9167 - val_accuracy: 0.7419
Epoch 86/125
8/8 [==============================] - 1s 105ms/step - loss: 0.0323 - accuracy: 0.9878 - val_loss: 0.8489 - val_accuracy: 0.7581
Epoch 87/125
8/8 [==============================] - 1s 144ms/step - loss: 0.0322 - accuracy: 0.9878 - val_loss: 0.9211 - val_accuracy: 0.8065
Epoch 88/125
8/8 [==============================] - 1s 120ms/step - loss: 0.0432 - accuracy: 0.9918 - val_loss: 1.0342 - val_accuracy: 0.7742
Epoch 89/125
8/8 [==============================] - 1s 129ms/step - loss: 0.0446 - accuracy: 0.9918 - val_loss: 1.1327 - val_accuracy: 0.7742
Epoch 90/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0413 - accuracy: 0.9878 - val_loss: 1.0143 - val_accuracy: 0.7903
Epoch 91/125
8/8 [==============================] - 1s 117ms/step - loss: 0.0378 - accuracy: 0.9878 - val_loss: 0.6984 - val_accuracy: 0.8387
Epoch 92/125
8/8 [==============================] - 1s 115ms/step - loss: 0.0238 - accuracy: 0.9878 - val_loss: 0.5896 - val_accuracy: 0.8548
Epoch 93/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0112 - accuracy: 0.9959 - val_loss: 0.6759 - val_accuracy: 0.7903
Epoch 94/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0688 - accuracy: 0.9673 - val_loss: 0.7024 - val_accuracy: 0.8226
Epoch 95/125
8/8 [==============================] - 1s 114ms/step - loss: 0.0224 - accuracy: 0.9918 - val_loss: 0.7362 - val_accuracy: 0.8548
Epoch 96/125
8/8 [==============================] - 1s 106ms/step - loss: 0.0420 - accuracy: 0.9837 - val_loss: 0.7038 - val_accuracy: 0.8548
Epoch 97/125
8/8 [==============================] - 1s 104ms/step - loss: 0.0420 - accuracy: 0.9918 - val_loss: 0.7143 - val_accuracy: 0.8387
Epoch 98/125
8/8 [==============================] - 1s 99ms/step - loss: 0.0291 - accuracy: 0.9878 - val_loss: 0.7362 - val_accuracy: 0.8387
Epoch 99/125
8/8 [==============================] - 1s 109ms/step - loss: 0.0224 - accuracy: 0.9918 - val_loss: 0.7105 - val_accuracy: 0.8710
Epoch 100/125
8/8 [==============================] - 1s 103ms/step - loss: 0.0960 - accuracy: 0.9673 - val_loss: 0.7976 - val_accuracy: 0.8387
Epoch 101/125
8/8 [==============================] - 1s 95ms/step - loss: 0.0423 - accuracy: 0.9878 - val_loss: 0.7529 - val_accuracy: 0.8548
Epoch 102/125
8/8 [==============================] - 1s 98ms/step - loss: 0.0225 - accuracy: 0.9918 - val_loss: 0.8484 - val_accuracy: 0.7903
Epoch 103/125
8/8 [==============================] - 1s 109ms/step - loss: 0.0210 - accuracy: 0.9918 - val_loss: 0.9916 - val_accuracy: 0.7097
Epoch 104/125
8/8 [==============================] - 1s 108ms/step - loss: 0.0431 - accuracy: 0.9878 - val_loss: 0.9552 - val_accuracy: 0.7258
Epoch 105/125
8/8 [==============================] - 1s 104ms/step - loss: 0.0523 - accuracy: 0.9796 - val_loss: 0.7122 - val_accuracy: 0.8226
Epoch 106/125
8/8 [==============================] - 1s 110ms/step - loss: 0.0260 - accuracy: 0.9878 - val_loss: 0.7808 - val_accuracy: 0.8065
Epoch 107/125
8/8 [==============================] - 1s 105ms/step - loss: 0.0468 - accuracy: 0.9837 - val_loss: 0.7889 - val_accuracy: 0.8387
Epoch 108/125
8/8 [==============================] - 1s 126ms/step - loss: 0.0494 - accuracy: 0.9796 - val_loss: 0.7407 - val_accuracy: 0.8226
Epoch 109/125
8/8 [==============================] - 1s 116ms/step - loss: 0.0185 - accuracy: 0.9918 - val_loss: 0.7320 - val_accuracy: 0.8065
Epoch 110/125
8/8 [==============================] - 1s 98ms/step - loss: 0.0290 - accuracy: 0.9878 - val_loss: 0.7552 - val_accuracy: 0.8226
Epoch 111/125
8/8 [==============================] - 1s 105ms/step - loss: 0.0244 - accuracy: 0.9878 - val_loss: 0.7473 - val_accuracy: 0.8387
Epoch 112/125
8/8 [==============================] - 1s 114ms/step - loss: 0.1309 - accuracy: 0.9551 - val_loss: 0.7397 - val_accuracy: 0.8548
Epoch 113/125
8/8 [==============================] - 1s 111ms/step - loss: 0.0168 - accuracy: 0.9918 - val_loss: 1.0046 - val_accuracy: 0.7581
Epoch 114/125
8/8 [==============================] - 1s 112ms/step - loss: 0.0792 - accuracy: 0.9633 - val_loss: 0.8232 - val_accuracy: 0.8065
Epoch 115/125
8/8 [==============================] - 1s 179ms/step - loss: 0.1146 - accuracy: 0.9551 - val_loss: 0.7225 - val_accuracy: 0.8387
Epoch 116/125
8/8 [==============================] - 1s 100ms/step - loss: 0.0719 - accuracy: 0.9796 - val_loss: 1.2870 - val_accuracy: 0.6290
Epoch 117/125
8/8 [==============================] - 1s 171ms/step - loss: 0.0321 - accuracy: 0.9796 - val_loss: 1.7005 - val_accuracy: 0.5968
Epoch 118/125
8/8 [==============================] - 1s 113ms/step - loss: 0.0489 - accuracy: 0.9796 - val_loss: 1.0943 - val_accuracy: 0.6935
Epoch 119/125
8/8 [==============================] - 1s 122ms/step - loss: 0.0150 - accuracy: 1.0000 - val_loss: 0.6636 - val_accuracy: 0.7742
Epoch 120/125
8/8 [==============================] - 1s 131ms/step - loss: 0.0270 - accuracy: 0.9878 - val_loss: 0.6675 - val_accuracy: 0.8387
Epoch 121/125
8/8 [==============================] - 1s 132ms/step - loss: 0.0216 - accuracy: 0.9959 - val_loss: 0.6930 - val_accuracy: 0.8226
Epoch 122/125
8/8 [==============================] - 1s 124ms/step - loss: 0.0193 - accuracy: 0.9959 - val_loss: 0.8273 - val_accuracy: 0.7581
Epoch 123/125
8/8 [==============================] - 1s 168ms/step - loss: 0.0277 - accuracy: 0.9918 - val_loss: 1.0150 - val_accuracy: 0.7419
Epoch 124/125
8/8 [==============================] - 1s 156ms/step - loss: 0.0239 - accuracy: 0.9959 - val_loss: 1.0616 - val_accuracy: 0.7419
Epoch 125/125
8/8 [==============================] - 1s 167ms/step - loss: 0.0179 - accuracy: 0.9918 - val_loss: 0.9316 - val_accuracy: 0.7903
CPU times: user 6min 6s, sys: 3min 47s, total: 9min 53s
Wall time: 3min 20s
How does it feel, having built and trained your first ANN? Isn’t it beautiful and wild? Seeing it in action after all this (hopefully not too terrible) theoretical content and preparation is definitely something else. Y’all obviously deserve to party for a minute!

Ok, time to get back to work. We might have built and trained our ANN, but we actually have no idea how it performed during the training. There were some hints (actually all the information we’re interested in) in the output we saw during the training, but let’s visualize it to better grasp it. We will start with the metric:
import plotly.graph_objects as go
import numpy as np
from plotly.offline import plot
# import from IPython.display; importing display from IPython.core.display is deprecated
from IPython.display import display, HTML
# x-axis: epochs numbered from 1 to nEpochs
epoch = np.arange(nEpochs) + 1
fig = go.Figure()
# Add traces: accuracy on the training and on the validation set
fig.add_trace(go.Scatter(x=epoch, y=fit.history['accuracy'],
                         mode='lines+markers',
                         name='training set'))
fig.add_trace(go.Scatter(x=epoch, y=fit.history['val_accuracy'],
                         mode='lines+markers',
                         name='validation set'))
fig.update_layout(title="Accuracy in training and validation set",
                  template='plotly_white')
fig.update_xaxes(title_text='Epoch')
fig.update_yaxes(title_text='Accuracy')
#fig.show()
# save the figure as an HTML file and embed it in the notebook
plot(fig, filename = 'acc_eyes.html')
display(HTML('acc_eyes.html'))
Question: what do you see and how do you interpret it?
After checking the accuracy metric of our ANN, we will have a look at the loss function.
import plotly.graph_objects as go
import numpy as np
epoch = np.arange(nEpochs) + 1
fig = go.Figure()
# Add traces
fig.add_trace(go.Scatter(x=epoch, y=fit.history['loss'],
mode='lines+markers',
name='training set'))
fig.add_trace(go.Scatter(x=epoch, y=fit.history['val_loss'],
mode='lines+markers',
name='validation set'))
fig.update_layout(title="Loss in training and validation set",
template='plotly_white')
fig.update_xaxes(title_text='Epoch')
fig.update_yaxes(title_text='Loss')
#fig.show()
plot(fig, filename = 'loss_eyes.html')
display(HTML('loss_eyes.html'))
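Besides looking at the curves, it can help to pull a few numbers out of the training history. The following is a small sketch of how that might look. toy_history is a stand-in that only contains the values of the first five epochs from the training log, and best_epoch is a hypothetical helper. With the real results, fit.history could be passed instead.

```python
import numpy as np

# stand-in for fit.history: values of epochs 1-5 copied from the training log
toy_history = {
    'loss': [1.1988, 1.0281, 1.0745, 1.0321, 0.9214],
    'accuracy': [0.5224, 0.5633, 0.5837, 0.5429, 0.5837],
    'val_loss': [0.6909, 0.6865, 0.7057, 0.7426, 0.7860],
    'val_accuracy': [0.5000, 0.5323, 0.5000, 0.5000, 0.5000],
}

def best_epoch(history, key, mode):
    """Return (epoch, value) of the best entry; epochs are counted from 1."""
    values = np.asarray(history[key])
    idx = values.argmin() if mode == 'min' else values.argmax()
    return int(idx) + 1, float(values[idx])

best_loss = best_epoch(toy_history, 'val_loss', 'min')
best_acc = best_epoch(toy_history, 'val_accuracy', 'max')
print('lowest validation loss (epoch, value):', best_loss)
print('highest validation accuracy (epoch, value):', best_acc)

# difference between training and validation accuracy per epoch
gap = np.asarray(toy_history['accuracy']) - np.asarray(toy_history['val_accuracy'])
print('accuracy gap per epoch:', np.round(gap, 4))
```

A growing gap between training and validation accuracy, or a validation loss that rises while the training loss keeps falling, is something to look out for when interpreting the two plots.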