TensorFlow: Save and Load a Model in a Serious Way, from Different Files

via: https://kevincodeidea.wordpress.com/2016/08/02/tensorflow-save-and-load-a-model-in-a-serious-way-from-different-files/

It has been a long time since my last post. Recently I have been working in a group developing a deep, online, traceable, better-than-current-methods neural network. After carefully comparing Theano and TensorFlow, we decided to use the latter. The main reason is actually not technical: we simply “predict” that TensorFlow will have a bright future and will be better maintained.
Back to the topic. Since it is an online algorithm, one important requirement is being able to save the model (not just some script-like operations, but also the metadata, the trained weights, and the whole structure) to disk, and being able to load the whole thing back without a problem. The way I construct a model can be simplified as follows:
  1. A basic model class contains all the TensorFlow variables; in this application, the weights for each layer.
  2. Several training functions construct a graph and do the training or other operations. These functions return a basic model object.
  3. A wrapper class calls the functions in 2 and provides APIs for users.
It should be noted that the basic model class does not contain any member variables for operation nodes, only the TensorFlow variables. It is the training function that defines the variables’ roles in the neural network.
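A minimal sketch of this layout (all names here are hypothetical placeholders, not the real project code):

#structure_sketch.py
import tensorflow as tf

class BasicModel(object):
  """Holds only the TensorFlow variables (the weights); no operation nodes."""
  def __init__(self):
    self.W = tf.Variable(tf.zeros([10, 2]), name="weight")
    self.b = tf.Variable(tf.zeros([2]), name="bias")

def train(data):
  """Builds the graph around the variables and runs the training."""
  model = BasicModel()
  # ... construct the operations from model.W and model.b, run the session ...
  return model

class Wrapper(object):
  """User-facing API; delegates to the training functions above."""
  def fit(self, data):
    self.model = train(data)
    return self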
To save the whole thing to disk, I first tried pickle. It failed due to the constraints of the TensorFlow objects. Frankly speaking, I don’t even want to figure out how the TensorFlow variables are actually implemented, since there is a class called tf.train.Saver, with two functions, save() and restore(). That sounds like a good solution already, doesn’t it?
But the documentation of TensorFlow is too sparse to understand this fully. Save variables? Yes, you can do that, but how do you use them in another file? By using the annoying name system? I am not going to use that. Here is a simple example of the problem:
#simplesave.py
import tensorflow as tf

with tf.Graph().as_default() as g:  # yes, you have to have a graph first
  with tf.Session() as sess:
    b = tf.Variable(1.0, name="bias")
    saver = tf.train.Saver()
    sess.run(tf.initialize_all_variables())  # variables must be initialized before saving
    saver.save(sess, 'model')  # b is saved into the 'model' checkpoint file
#simpleload.py
import tensorflow as tf

with tf.Graph().as_default() as g:  # yes, you have to have a graph first
  with tf.Session() as sess:
    # now what?
    saver = tf.train.Saver()  # will complain that there are no variables to save
It becomes confusing. Ideally, in the normal Python case, if you load something with pickle, you only need a variable to hold whatever comes out of the pickle. Here I only want to load things, yet it complains that there are no variables to save.
After scratching my head for a while, I tried a “crazy” idea: maybe TensorFlow’s restore function uses the name (here, “bias”) to link the saved data to the local variable in the current script (maybe with some type checking). The working script actually looks like this:
#simpleload.py

import tensorflow as tf

with tf.Graph().as_default() as g:
  with tf.Session() as sess:
    # still need the definition, again
    b = tf.Variable(0.0, name="bias")
    saver = tf.train.Saver()  # now it is satisfied...
    saver.restore(sess, 'model')
Now whatever was inside the saved b is assigned to the current b, magically.
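If it is really the name string doing the matching rather than the Python variable name, then a variation like the following should restore just as well (a minimal sketch of my own against the file saved by simplesave.py; the script name is made up):

#nametest.py
import tensorflow as tf

with tf.Graph().as_default() as g:
  with tf.Session() as sess:
    # a different Python variable, but the same TensorFlow name "bias"
    something_else = tf.Variable(0.0, name="bias")
    saver = tf.train.Saver()
    saver.restore(sess, 'model')
    print(sess.run(something_else))  # prints the value saved above, 1.0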
The problem is that in a large project with complicated structures, no one will retype everything every time a model is reloaded, and such code is also extremely hard to maintain. It would be best if we could simply save the whole model somewhere and load it whenever we want. But how do we save all the TensorFlow stuff and all the logic?
The answer is extremely simple: we have already saved them. The files that contain all the “TensorFlow stuff” are just the files containing the basic model class and the graph-building functions. All you need to do is import them.
The actual procedure is like this: after building a model, 1. save all the TensorFlow variables; 2. save all the normal member variables of the wrapper class to disk (the members that point to TensorFlow variables need to be set to None first). When loading a model, load the normal member variables first, then reconstruct a basic model class and fill in the values by calling saver.restore(). After that, the operation functions can simply be called.
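As a minimal sketch of this procedure (the two helper names and the '.members' file suffix are my own hypothetical choices; the linear regression example below only demonstrates the TensorFlow half):

#wrapper_io_sketch.py
import pickle
import tensorflow as tf

def save_wrapper(wrapper, saver, sess, path):
  # 1. save the TensorFlow variables into a checkpoint
  saver.save(sess, path)
  # 2. pickle the plain member variables; anything pointing at a
  #    TensorFlow object is set to None, as the procedure requires
  members = {k: (None if isinstance(v, (tf.Variable, tf.Tensor, tf.Session)) else v)
             for k, v in wrapper.__dict__.items()}
  with open(path + '.members', 'wb') as f:
    pickle.dump(members, f)

def load_wrapper(WrapperClass, sess, path):
  # load the normal member variables first
  with open(path + '.members', 'rb') as f:
    members = pickle.load(f)
  # reconstruct the basic model class (this re-defines the TF variables)...
  wrapper = WrapperClass(sess)
  for k, v in members.items():
    if v is not None:
      setattr(wrapper, k, v)
  # ...then fill in their saved values from the checkpoint
  tf.train.Saver().restore(sess, path)
  return wrapper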
Below is an illustration of the save-model part for a linear regression model (note that the optimizer part can be put in a function and also be imported):
#savemodel.py
import tensorflow as tf
import numpy
rng = numpy.random

class Wrapper(object):
  def __init__(self, sess, fakeid='fake'):
    self.fakeid = fakeid
    self.sess = sess
    self.W = tf.Variable(rng.randn(), name="weight")
    self.b = tf.Variable(0.0, name="bias")

  def change(self, n=0):
    self.sess.run(self.b.assign(100 + n))

  def construct_pred(self, X, Y):
    self.pred_op = tf.add(tf.mul(X, self.W), self.b)
    return self.pred_op

  def prediction(self):
    print(self.sess.run(self.W))


if __name__ == '__main__':  # guard, so that importing Wrapper does not re-run all of this

  # Parameters
  learning_rate = 0.01
  training_epochs = 1000
  display_step = 50

  # Training Data
  train_X = numpy.asarray([3.3, 4.4, 5.5, 6.71])
  train_Y = numpy.asarray([1.7, 2.76, 2.09, 3.19])
  n_samples = train_X.shape[0]

  # Launch the graph
  with tf.Graph().as_default() as g:
    X = tf.placeholder("float")
    Y = tf.placeholder("float")
    with tf.Session() as sess:
      model = Wrapper(sess)
      pred = model.construct_pred(X, Y)
      # Mean squared error
      cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
      # Gradient descent (no training loop here; this example only exercises save/restore)
      optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

      saver = tf.train.Saver()
      # Initializing the variables
      init = tf.initialize_all_variables()
      sess.run(init)
      print([v.op.name for v in tf.all_variables()])
      model.change()
      saver.save(sess, 'model')
      model.change(n=1)
      print(sess.run(model.b))  # 101
      saver.restore(sess, 'model')
      print(sess.run(model.b))  # 100
Now we can reuse the model smoothly.
#loadmodel.py
import tensorflow as tf

#*****************************#
from savemodel import Wrapper
#*****************************#


with tf.Graph().as_default() as g:
  with tf.Session() as sess:
    # re-defines the variables under the same names, with dummy values
    model = Wrapper(sess)
    saver = tf.train.Saver()
    saver.restore(sess, 'model')
    print([v.op.name for v in tf.all_variables()])
    print(sess.run(model.b))  # 100
Frankly speaking, I don’t like this design (or this side effect). To some extent, TensorFlow attaches extra anchors to each tf.Variable. It feels like global variables. A better design would provide a special wrapper that allows the whole thing to be stored and accessed through member variables instead of string names. But maybe the wrapper I developed in my project will eventually be just like that…
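Something along these lines is what I have in mind. tf.train.Saver does accept an explicit {name: variable} dictionary, so a sketch of such a wrapper (hypothetical, untested) could look like:

#registry_sketch.py
import tensorflow as tf

class VariableRegistry(object):
  """Stores variables as member entries and builds a Saver from them explicitly."""
  def __init__(self):
    self._vars = {}

  def variable(self, key, initial_value):
    self._vars[key] = tf.Variable(initial_value, name=key)
    return self._vars[key]

  def saver(self):
    # the {name: variable} mapping is owned by this object,
    # rather than relying on global name matching
    return tf.train.Saver(self._vars)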
No matter what, developing a piece of fine code is still much easier than developing a fine algorithm…

