Commit 0f8dd845 authored by Nicolas Schuler

Update MockupFFNN.py

"""Created on 14.07.2021
@author: Nicolas Schuler
Small mockup FFNN.
Since this is a mockup we do not handle errors and the program
simply stops working if the given values are not of the correct
datatype or if the inputData can not be computed.
Contains a Network and Layer class.
A network consists of Layers. The Layers have a set number n of neurons,
represented as a collection of n-dimensional vectors for weights and
another n-dimensional vector for the biases of the neurons.
Since the neurons are always the same number and feeding all the other neurons,
the inputData used must have the same dimension as the number of neurons.
Contains different activation functions that can be used to transform the
output data of a given layer, before feeding it to the next layer. Note that
in this implementation we use the same function for all layers of a given
network.
Contains also some example runs with a network and a single inputData, activated
via different functions."""
import numpy as np
########################
# Activation functions #
########################
def identity(x, derivate=False):
    if derivate:
        return 1
    return x

def rectLinU(x, derivate=False):
    if derivate:
        # Not defined for x == 0, we simply return 1
        # Alternatively, throw an exception if x == 0
        if x < 0:
            return 0
        return 1
    if x <= 0:
        return 0
    return x

def sigmoid(x, derivate=False):
    if derivate:
        # Uses the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
        return sigmoid(x) * (1 - sigmoid(x))
    return 1 / (1 + np.exp(-x))

def softmax(x):
    return np.exp(x) / np.sum(np.exp(x))
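
# Optional sanity check (added for illustration, not part of the original
# exercise): evaluates the activation functions above on small example values
# and confirms that a softmax output sums to 1. Call it manually if needed.
def _checkActivations():
    print("identity(2.0)           =", identity(2.0))                # expected: 2.0
    print("rectLinU(-1.5)          =", rectLinU(-1.5))               # expected: 0
    print("sigmoid(0.0)            =", sigmoid(0.0))                 # expected: 0.5
    print("sigmoid(0.0, derivate)  =", sigmoid(0.0, derivate=True))  # expected: 0.25
    print("sum(softmax([1, 2, 3])) =", np.sum(softmax(np.array([1.0, 2.0, 3.0]))))  # expected: 1.0
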
###########
# Classes #
###########
class Layer:
    """Layer of a network with weights and biases.
    Each layer consists of numberOfNeurons neurons.
    For each neuron, generate a weight vector of numberOfNeurons random
    elements from [0,1). Generate a bias for each neuron in the layer as a
    random element from [0,1)."""

    def __init__(self, numberOfNeurons):
        # Weights in the layer: one numberOfNeurons x 1 vector per neuron
        self.weights = [np.random.default_rng().random((numberOfNeurons, 1)) for _ in range(numberOfNeurons)]
        self.biases = np.random.default_rng().random((numberOfNeurons, 1))
        # self.biases = np.array(range(numberOfNeurons)) * 1000000

class Network:
    """Mockup of a neural network with layers (see the Layer class).

    inputData():
        Sets the inputData used to feed through the network.
        NumPy copies the arrays it handles, so the inputData stays unchanged
        across different feeding passes.
        Since the number of neurons is the same for each layer and every
        neuron feeds all neurons of the next layer, the inputData used must
        have the same dimension as the number of neurons, meaning
        numberOfNeurons == dimension of inputData.
    feedData():
        Main function of the module. After a neural network has been created
        and fed with the correct inputData, we can feed that data through the
        different layers. The resulting output, after going through all
        layers, can be retrieved with outputData().
    activateOutput():
        Applies the given activation function to the provided data as the
        result of each layer.
        The argument activationFunction can be "Identity", "RectLinUnit",
        "Sigmoid" or "Softmax".
    showStatus():
        Left over from debugging; simply shows the weights and biases of each
        layer. Since the network does not learn, these values stay the same
        once the network is initialized.
    """

    def __init__(self, numberOfLayers=5, numberOfNeurons=10):
        self.input = None
        self.output = None
        self.layers = [Layer(numberOfNeurons) for _ in range(numberOfLayers)]

    def inputData(self, data):
        self.input = data

    def outputData(self):
        return self.output

    def feedData(self, activationFunction="Sigmoid"):
        inputData = np.array(self.input, copy=True, dtype="float64")
        for layer in self.layers:
            numberOfNeurons = len(layer.biases)
            output = np.zeros(numberOfNeurons, dtype="float64")
            for neuron in range(numberOfNeurons):
                # Multiply inputData with the weights via dot product, then add the bias of the given neuron
                # Finally, calculate the output value with the chosen activation function (exception: softmax)
                output[neuron] = self.activateOutput(np.dot(inputData, layer.weights[neuron]) + layer.biases[neuron], activationFunction)
            # For softmax we calculate the output at the end, since we need the entire output vector
            if activationFunction == "Softmax":
                output = softmax(output)
            inputData = output
        self.output = output

    def activateOutput(self, x, activationFunction):
        if activationFunction == "Identity":
            return identity(x)
        elif activationFunction == "RectLinUnit":
            return rectLinU(x)
        elif activationFunction == "Sigmoid":
            return sigmoid(x)
        # For simplicity, we return x for softmax and handle it later in feedData()
        elif activationFunction == "Softmax":
            return x

    def showStatus(self):
        i = 1
        print("----------")
        for layer in self.layers:
            print("Layer ", i, ":")
            print("\tWeights:", layer.weights)
            print("\tBiases:", layer.biases)
            print("---")
            i += 1
        print("----------")

########
# Test #
########
if __name__ == "__main__":
    network = Network(5, 10)
    network2 = Network(10, 10)
    inputData = np.array(range(10), dtype="float64")
    network.inputData(inputData)
    network2.inputData(inputData)
    print("Input")
    print(network.input)
    # Feed once with the default activation (Sigmoid); the result is not printed
    network.feedData()
    print("\nIdentity")
    network.feedData(activationFunction="Identity")
    print(network.outputData())
    print("\nRectLinUnit")
    network.feedData(activationFunction="RectLinUnit")
    print(network.outputData())
    print("\nSigmoid")
    network.feedData(activationFunction="Sigmoid")
    print(network.outputData())
    print("\nSoftmax")
    network.feedData(activationFunction="Softmax")
    print(network.outputData())
    print("\nSoftmax (network2, 10 layers)")
    network2.feedData(activationFunction="Softmax")
    print(network2.outputData())
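
    # Added for illustration (not part of the original exercise): the Softmax
    # output vector should sum to 1, matching the discussion in the exercise
    # answers below.
    print("\nSum of the Softmax output of network2:", np.sum(network2.outputData()))
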
####################
# Exercise Answers #
####################
"""
What effect do different activation functions have?
Since I only used positive values (between 0 and 1) as weights/biases, the
behavior of Identity and RectLinUnit are identical. Both functions keep
the output (effectively, since we have no negative values), thus each
consecutive layer has a larger and larger output.
Sigmoid converts towards 1 (for x --> infinity). Since, again, we only operate
with positive values the relatively large values will convert towards 1. That is
true for every iteration, so the values of each layer will always be squashed to
1 and then be given to the next layer.
Finally, Softmax is calculated over the entire output. Like Sigmoid, its range is
between 0 and 1. But since the sum of the entire output (vector) for Softmax is = 1
we are always way below the 1 for the individual values. Example output after
feeding: [0.10320238 0.16048054 0.07592815 0.09503466 0.13095142 0.07203039 0.11736508
0.06868714 0.08664903 0.08967121] = 1
How did you initialize your weights and biases?
In the given exercise, the module does not learn/improve, so I just initialized the weights
and biases with random values between 0 and 1. I also did not initialize with only 0, since
then both weights and values would give the same output, regardless of inputData (since there is
no learning).
To my knowledge, neither method is ideal (random or zero), since especially the random
initialization is prone to lead to the vanishing gradient problem. This is especially true for
functions like Sigmoid, which squash the given values in a small range (of [0,1] in that case).
On the other hand, RectLU can counter this (since we have no squashing down, as seen above).
So ideally, you would like to initialize the values of the weights in a way, that minimizes the
danger for something like that to occur. While I'm aware of this fact, I do not know what the
proper way would be to do so yet.
"""