Convert the whole codebase to TensorFlow 2.x to enable eager mode for the Eclipse debugger #7

sekigh opened this issue Jul 21, 2021 · 0 comments

I am almost a beginner with TensorFlow.
I am trying to convert the whole codebase (starting with seq_group.py) to TensorFlow 2.4.1 to enable eager mode (define-by-run).
I followed the instructions up to "Top-level behavioral changes" in the TensorFlow migration guide (https://www.tensorflow.org/guide/migrate) and got the code running successfully on 2.4.1 in session.run() mode. I attach the working 2.4.1 code (still using session.run()) here.
Now I am working through the "Create code for TensorFlow 2.x" section of that guide and trying to remove session.run(), but I do not know how to handle the following three things:

  1. How do I build and manage multiple asynchronous data-loading threads without session.run(), and feed them into the training flow? This question concerns the queue/dequeue and threading parts below (a tf.data sketch of what I am aiming for follows the excerpts):

q = tf.queue.PaddingFIFOQueue(
    50,
    [tf.float32, tf.float32, tf.float32, tf.float32, tf.float32,
     tf.float32, tf.float32, tf.float32, tf.int32],
    shapes=[[None, n_input], [None, n_input], [None, n_output], [None, n_output],
            [None, n_output], [None, n_output], [None, n_output2],
            [None, n_output2], [None]])
enqueue_op = q.enqueue([xr_batch, xi_batch, y1r_batch, y1i_batch, y2r_batch,
                        y2i_batch, est1_batch, est2_batch, seq_len_batch])
queue_size = q.size()

# Define the dequeue operation without a predefined batch size

xr_, xi_, y1r_, y1i_, y2r_, y2i_, est1_, est2_, seq_len_ = q.dequeue()

and

def load_and_enqueue(sess, enqueue_op, coord, queue_index, total_queues):

and

coord = tf.train.Coordinator()
num_threads = 5  # Use 5 threads for data loading; each thread touches a different part of the training data
t = [threading.Thread(target=load_and_enqueue, args=(sess, enqueue_op, coord, i, num_threads))
     for i in range(num_threads)]
for tmp_t in t:
    tmp_t.start()
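
From the migration guide, I understand that the queue plus enqueue threads should become a tf.data input pipeline. Below is a minimal sketch of what I think this looks like, assuming a hypothetical generator load_shard(shard_index, total_shards) that yields one utterance at a time as the same nine arrays the queue held (I am not sure this is the idiomatic structure):

import tensorflow as tf

# Element signature: the nine tensors one utterance contributes
specs = (
    tf.TensorSpec([None, n_input], tf.float32),    # xr
    tf.TensorSpec([None, n_input], tf.float32),    # xi
    tf.TensorSpec([None, n_output], tf.float32),   # y1r
    tf.TensorSpec([None, n_output], tf.float32),   # y1i
    tf.TensorSpec([None, n_output], tf.float32),   # y2r
    tf.TensorSpec([None, n_output], tf.float32),   # y2i
    tf.TensorSpec([None, n_output2], tf.float32),  # est1
    tf.TensorSpec([None, n_output2], tf.float32),  # est2
    tf.TensorSpec([], tf.int32),                   # seq_len
)

num_shards = 5   # plays the role of the five enqueue threads
batch_size = 16  # hypothetical; whatever batch size the loader used

dataset = tf.data.Dataset.range(num_shards).interleave(
    lambda i: tf.data.Dataset.from_generator(
        load_shard, output_signature=specs, args=(i, num_shards)),
    cycle_length=num_shards,
    num_parallel_calls=tf.data.experimental.AUTOTUNE)  # shards load in parallel
dataset = dataset.padded_batch(batch_size)        # pads like PaddingFIFOQueue
dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)

for xr_, xi_, y1r_, y1i_, y2r_, y2i_, est1_, est2_, seq_len_ in dataset:
    ...  # one padded batch per step; no Coordinator, threads, or sess needed

If that is right, interleave with num_parallel_calls would take over the role of the threads and the Coordinator, and padded_batch would do the padding that PaddingFIFOQueue did.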

  2. How do I implement the optimizer and forward/back-propagation for training without session.run()? This question concerns defining the optimizer and running the training step below (a GradientTape sketch follows the excerpts):

# Define loss and optimizer

extra_update_ops = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    if FLAGS.is_adam == 1:
        optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=lr, beta2=0.9).minimize(cost)  # Adam optimizer
    else:
        optimizer = tf.compat.v1.train.MomentumOptimizer(learning_rate=lr, momentum=0.9).minimize(cost)  # Momentum optimizer

and

# Run one batch of training

    _, train_cost, train_cost_pit, q_size = sess.run(
        [optimizer, cost, cost_pit, queue_size],
        feed_dict={lr: learning_rate, keep_prob: FLAGS.keep_prob, is_training: True})
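
For this part, my understanding of the guide is that the training step becomes a function using tf.GradientTape, with the tf.compat.v1 optimizers swapped for their tf.keras.optimizers counterparts. A minimal sketch, assuming the network is wrapped in a tf.keras.Model called model and the loss lives in a hypothetical compute_cost function:

import tensorflow as tf

if FLAGS.is_adam == 1:
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_2=0.9)
else:
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)

@tf.function  # optional: traces the step into a graph for speed
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        # training=True drives batch-norm updates (no more UPDATE_OPS /
        # control_dependencies) and enables dropout (no more keep_prob feed)
        outputs = model(inputs, training=True)
        cost, cost_pit = compute_cost(outputs, targets)  # hypothetical loss fn
    grads = tape.gradient(cost, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return cost, cost_pit

train_cost, train_cost_pit = train_step(inputs, targets)  # one batch of training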
  3. How do I replace the model Saver save/restore without session.run()? This question concerns tf.train.Saver and its save and restore methods below (a tf.train.Checkpoint sketch follows the excerpts):

# Model dir and model saver

model_dir = os.getcwd() + "/exp/" + FLAGS.exp_name + "/models/" + FLAGS.time_stamp
if not os.path.exists(model_dir):
    os.makedirs(model_dir)
saver = tf.compat.v1.train.Saver(max_to_keep=None)

and

# If a pickle file exists, load it to restore the previous training state

with open(training_info_pickle, 'rb') as f_pickle:
    step, best_cv_cost, best_cv_step, learning_rate = pickle.load(f_pickle)
saver.restore(sess, os.getcwd() + "/exp/" + FLAGS.exp_name + "/models/" + FLAGS.time_stamp
              + '/' + FLAGS.exp_name + "_model.ckpt" + "step" + str(step))

and

# Save the model and training specs after evaluation

save_path = saver.save(sess, os.getcwd() + "/exp/" + FLAGS.exp_name + "/models/" + FLAGS.time_stamp
                       + '/' + FLAGS.exp_name + "_model.ckpt" + "step" + str(step))
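
Here my understanding is that tf.compat.v1.train.Saver is replaced by object-based saving with tf.train.Checkpoint and tf.train.CheckpointManager. A minimal sketch, reusing model and optimizer from the training-step sketch above:

import tensorflow as tf

ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
manager = tf.train.CheckpointManager(
    ckpt, directory=model_dir, max_to_keep=None,
    checkpoint_name=FLAGS.exp_name + "_model")

# Restore the latest checkpoint if one exists (replaces saver.restore(sess, ...))
if manager.latest_checkpoint:
    ckpt.restore(manager.latest_checkpoint)

# After evaluation (replaces saver.save(sess, ...)); step comes from the
# pickle file as before
save_path = manager.save(checkpoint_number=step)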

End of excerpts.
seq_group_and_utility_running on tf2.zip
