What do “compile” , “fit” and “predict” do in Keras sequential models?Deriving backpropagation...

What is an explicit bijection in combinatorics?

Renting a 2CV in France

How to deal with an underperforming subordinate?

Is there a celebrity culture in academia and should we discourage it?

How to write Muḥammad ibn Mūsā al-Khwārizmī?

Are there historical references that show that "diatonic" is a version of 'di-tonic' meaning 'two tonics'?

How to know if I am a 'Real Developer'

Was Opportunity's last message to Earth "My battery is low and it's getting dark"?

Why is it that Bernie Sanders is always called a "socialist"?

If I tried and failed to start my own business, how do I apply for a job without job experience?

Are all power cords made equal?

How can I handle players killing my NPC outside of combat?

What could cause an entire planet of humans to become aphasic?

Would water spill from a bowl in a Bag of Holding?

In a post-apocalypse world, with no power and few survivors, would Satnav still work?

Performance and power usage for Raspberry Pi in the Stratosphere

Buying a "Used" Router

Can I use a single resistor for multiple LED with different +ve sources?

How do I draw a function along with a particular tangent line at a specific point?

Using time travel without creating plot holes

Boss asked me to sign a resignation paper without a date on it along with my new contract

Sing Baby Shark

Is it really OK to use "because of"?

Tikz: Perpendicular FROM a line



What do “compile” , “fit” and “predict” do in Keras sequential models?


Deriving backpropagation equations “natively” in tensor formMerge two models - KerasNeed help understanding LSTMs' backpropagation and carousel of errorMulti-label classifciation: keras custom metricsKeras: apply masking to non-sequential dataWhat is the difference between fit() and fit_generator() in Keras?Tensorflow dense layers worse than keras sequentialValidation-split of Keras fit functionKeras - Error when using HDF5Matrix to fit the modelModeling keras LSTM sequential vs functional api













3












$begingroup$


I am a little confused between these two parts of Keras sequential models functions. May someone explains what is exactly the job of each one? I mean compile doing forward pass and calculating cost function then pass it through fit to do backward pass and calculating derivatives and updating weights? Or what?



I have seen in some codes, they only used compile function for some of their LSTMs and fit for some other ones! So I need to know each of these functions do what part of the work(training a neural network).



It's also interesting for me to know what exactly do predict function as well.



Very thank you in advanced!










share|improve this question









$endgroup$












  • $begingroup$
    Why don't you consider reading the docs?
    $endgroup$
    – Aditya
    45 mins ago
















3












$begingroup$


I am a little confused between these two parts of Keras sequential models functions. May someone explains what is exactly the job of each one? I mean compile doing forward pass and calculating cost function then pass it through fit to do backward pass and calculating derivatives and updating weights? Or what?



I have seen in some codes, they only used compile function for some of their LSTMs and fit for some other ones! So I need to know each of these functions do what part of the work(training a neural network).



It's also interesting for me to know what exactly do predict function as well.



Very thank you in advanced!










share|improve this question









$endgroup$












  • $begingroup$
    Why don't you consider reading the docs?
    $endgroup$
    – Aditya
    45 mins ago














3












3








3





$begingroup$


I am a little confused between these two parts of Keras sequential models functions. May someone explains what is exactly the job of each one? I mean compile doing forward pass and calculating cost function then pass it through fit to do backward pass and calculating derivatives and updating weights? Or what?



I have seen in some codes, they only used compile function for some of their LSTMs and fit for some other ones! So I need to know each of these functions do what part of the work(training a neural network).



It's also interesting for me to know what exactly do predict function as well.



Very thank you in advanced!










share|improve this question









$endgroup$




I am a little confused between these two parts of Keras sequential models functions. May someone explains what is exactly the job of each one? I mean compile doing forward pass and calculating cost function then pass it through fit to do backward pass and calculating derivatives and updating weights? Or what?



I have seen in some codes, they only used compile function for some of their LSTMs and fit for some other ones! So I need to know each of these functions do what part of the work(training a neural network).



It's also interesting for me to know what exactly do predict function as well.



Very thank you in advanced!







keras prediction backpropagation cost-function methods






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked 1 hour ago









user145959user145959

284




284












  • $begingroup$
    Why don't you consider reading the docs?
    $endgroup$
    – Aditya
    45 mins ago


















  • $begingroup$
    Why don't you consider reading the docs?
    $endgroup$
    – Aditya
    45 mins ago
















$begingroup$
Why don't you consider reading the docs?
$endgroup$
– Aditya
45 mins ago




$begingroup$
Why don't you consider reading the docs?
$endgroup$
– Aditya
45 mins ago










1 Answer
1






active

oldest

votes


















2












$begingroup$

Let's first see what we need to do when we want to train a model.




  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)




Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]






The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$









  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    44 mins ago











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "557"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46124%2fwhat-do-compile-fit-and-predict-do-in-keras-sequential-models%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2












$begingroup$

Let's first see what we need to do when we want to train a model.




  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)




Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]






The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$









  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    44 mins ago
















2












$begingroup$

Let's first see what we need to do when we want to train a model.




  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)




Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]






The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$









  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    44 mins ago














2












2








2





$begingroup$

Let's first see what we need to do when we want to train a model.




  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)




Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]






The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$



Let's first see what we need to do when we want to train a model.




  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)




Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]






The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()






share|improve this answer














share|improve this answer



share|improve this answer








edited 50 mins ago

























answered 1 hour ago









JahKnowsJahKnows

4,747525




4,747525








  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    44 mins ago














  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    44 mins ago








1




1




$begingroup$
Nice answer +1!
$endgroup$
– Aditya
44 mins ago




$begingroup$
Nice answer +1!
$endgroup$
– Aditya
44 mins ago


















draft saved

draft discarded




















































Thanks for contributing an answer to Data Science Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46124%2fwhat-do-compile-fit-and-predict-do-in-keras-sequential-models%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Discografia di Klaus Schulze Indice Album in studio | Album dal vivo | Singoli | Antologie | Colonne...

Armoriale delle famiglie italiane (Car) Indice Armi | Bibliografia | Menu di navigazioneBlasone...

Lupi Siderali Indice Storia | Organizzazione | La Tredicesima Compagnia | Aspetto | Membri Importanti...