What do “compile”, “fit” and “predict” do in Keras sequential models?
I am a little confused about the functions of Keras sequential models. Could someone explain exactly what each one does? Is it that compile does the forward pass and calculates the cost function, and fit then does the backward pass, calculating the derivatives and updating the weights? Or is it something else? In some code I have seen, compile was used for some of the LSTMs and fit for others! So I need to know which part of the work of training a neural network each of these functions does. I would also like to know what exactly the predict function does. Thank you very much in advance!
keras prediction backpropagation cost-function methods
asked 1 hour ago by user145959
Why don't you consider reading the docs? – Aditya 45 mins ago
1 Answer
Let's first look at what we need to do when we want to train a model:
- First, we decide on a model architecture: the number of hidden layers, the activation functions, and so on (compile).
- Second, we train the model so that all its parameters take the values that correctly map our inputs to our outputs (fit).
- Last, we use the trained model to run feed-forward passes and predict on novel inputs (predict).
A minimal skeleton of these three calls is sketched just below, and a full worked example follows.
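In code, that division of labor looks roughly like this (a minimal sketch; the layer sizes are placeholders, and x_train, y_train and x_new are assumed to be defined):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(784,)))  # define the architecture
model.add(Dense(10, activation='softmax'))

# compile: choose loss/optimizer and build the backend computation graph
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# fit: forward passes + backpropagation over the data, updating the weights
# model.fit(x_train, y_train, epochs=5)        # x_train, y_train assumed defined

# predict: forward passes only, no weight updates
# predictions = model.predict(x_new)           # x_new assumed defined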
Let's go through a full example using the MNIST dataset.
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K
Let's load our data. Then I normalize the pixel values to lie between 0 and 1.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
Now we need to reshape our data to be compatible with Keras: we add an extra dimension that acts as the channel axis when the data passes through the model. I then one-hot encode the output classes.
# The known number of output classes.
num_classes = 10
# Input image dimensions
img_rows, img_cols = 28, 28
# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
# Convert class vectors to binary class matrices. This uses one-hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
Now let's define our model. We will use a vanilla CNN for this example.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
Now we are ready to compile our model. This creates a Python object that builds the CNN by constructing the computation graph in the format required by the Keras backend you are using (I usually use TensorFlow over Theano). The compile step also asks you to define the loss function and the kind of optimizer you want to use. Which options are best depends on the problem you are trying to solve; you can usually find good techniques by reading the literature in the field. For a classification task, categorical cross-entropy works very well.
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
Now we have a Python object holding the model and all of its parameters at their initial values. If you use predict with this model now, your accuracy will be about 10%: pure random output across the 10 classes.
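You can check this yourself by evaluating the freshly compiled model before any training; the exact number varies with the random initialization, but it should hover around chance:
# Evaluate the untrained model; with 10 balanced classes,
# accuracy should be roughly 0.10.
loss, acc = model.evaluate(x_test_reshaped, y_test_binary, verbose=0)
print('Accuracy before training:', acc)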
You can save this model to disk to use later.
# Save the model architecture to JSON.
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
    json_file.write(model_json)
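The model_from_json import above is the counterpart to this: it rebuilds the architecture from the JSON file, after which trained weights can be restored from a checkpoint file. A sketch (the checkpoint filename is hypothetical, following the naming pattern used below):
# Rebuild the architecture from the saved JSON.
with open("weights/model.json", "r") as json_file:
    loaded_model = model_from_json(json_file.read())
# Restore trained weights from a checkpoint (hypothetical filename):
# loaded_model.load_weights("weights/weights-improvement-04-0.99.hdf5")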
So now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs into the input layer and getting an output; we then calculate the loss from that output and use backpropagation to tune the model parameters. This fits the model parameters to the data.
First, let's define some callback functions so that we can checkpoint our model and save its parameters to file each time we get better results.
# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))
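As an aside, fit returns a History object whose history attribute records the loss and metrics per epoch, which is handy for plotting learning curves. A minimal sketch (note this call would continue training for one more epoch):
# Capture fit's return value; History.history maps metric names
# to per-epoch lists.
history = model.fit(x_train_reshaped, y_train_binary,
                    batch_size=batch_size, epochs=1, verbose=0,
                    validation_data=(x_test_reshaped, y_test_binary))
print(history.history.keys())  # e.g. ['acc', 'loss', 'val_acc', 'val_loss']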
Now we have a model architecture and a file containing all the model parameters with the best values found to map the inputs to the outputs. We are done with the computationally expensive part of deep learning: we can now take our model, run feed-forward passes, and predict on new inputs. I prefer to use predict_classes rather than predict, because it immediately gives me the class rather than the output probability vector.
print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)
Predicted classes: [0 6 9 0 1 5 9 7 3 4]
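Under the hood, predict returns the raw softmax probability vectors, and predict_classes is essentially an argmax over them; a sketch:
import numpy as np

# predict returns one probability vector per sample...
probs = model.predict(x_test_reshaped[10:20])  # shape (10, num_classes)
# ...and the argmax over each vector recovers the class labels.
print(np.argmax(probs, axis=1))  # matches predict_classes above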
The code to display the MNIST images nicely:
import matplotlib.pyplot as plt
%matplotlib inline

# Utility function for showing images.
def show_imgs(x_test, decoded_imgs=None, n=10):
    plt.figure(figsize=(20, 4))
    for i in range(n):
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(x_test[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        if decoded_imgs is not None:
            ax = plt.subplot(2, n, i + 1 + n)
            plt.imshow(decoded_imgs[i].reshape(28, 28))
            plt.gray()
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)
    plt.show()
answered 1 hour ago, edited 50 mins ago by JahKnows
Nice answer +1! – Aditya 44 mins ago