I'm implementing a classification model using TensorFlow.

The problem that I'm facing is that my weights and error are not being updated when I run the training step. As a result, my network keeps returning the same results.

I've developed my model based on the MNIST example from the TensorFlow website.

import numpy as np
import tensorflow as tf

sess = tf.InteractiveSession()

# load dataset
dataset = np.loadtxt('char8k.txt', dtype='float', comments='#', delimiter=",")
Y = np.asmatrix(dataset[:,0])
X = np.asmatrix(dataset[:,1:1201])
m = 11527
labels = 26

# Y is converted to a one-hot matrix of shape 11527x26
Yt = np.zeros((m,labels))
for i in range(0,m):
    index = int(Y[0,i]) - 1  # labels are loaded as floats; cast before indexing
    Yt[i,index] = 1
Y = Yt
Y = np.asmatrix(Y)

#------------------------------------------------------------------------------
# graph settings
x = tf.placeholder(tf.float32, shape=[None, 1200])
y_ = tf.placeholder(tf.float32, shape=[None, 26])
Wtest = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
b = tf.Variable(tf.zeros([26]))
sess.run(tf.initialize_all_variables())

y = tf.nn.softmax(tf.matmul(x,W) + b)
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Wtest = W

for i in range(10):
    print("iteration:")
    print(i)
    Xbatch = X[np.random.randint(X.shape[0],size=100),:]
    Ybatch = Y[np.random.randint(Y.shape[0],size=100),:]
    train_step.run(feed_dict={x: Xbatch, y_: Ybatch})
    print("weight update")
    print(Wtest==W)  # monitors the weight update
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("accuracy:")
    print(accuracy.eval(feed_dict={x: X, y_: Y}))
    print(" ")
    print(" ")

Solution

The issue probably arises from how you initialize the weight matrix, W. If it is initialized to all zeroes, all of the neurons will follow the same gradient in each step, which leads to the network not training. Replacing the line

W = tf.Variable(tf.zeros([1200,26]))

...with something like

W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))

...should cause it to start training.
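To confirm that training has actually started, it helps to compare the weights against an independent snapshot taken before the update. Assigning one name to another, as in `Wtest = W`, only creates a second reference to the same object, so comparing the two says nothing about whether an update happened. The sketch below illustrates this with plain Python lists standing in for the weights; `sgd_step` and the numbers are hypothetical, and in a TensorFlow session the snapshot would presumably come from evaluating the variable (for example `sess.run(W)`) before and after the training step.

```python
import copy

def sgd_step(weights, grads, lr):
    """Hypothetical in-place gradient-descent update on a flat weight list."""
    for i in range(len(weights)):
        weights[i] -= lr * grads[i]

weights = [0.1, -0.2, 0.3]

alias = weights                # second name for the SAME list
snapshot = copy.copy(weights)  # independent copy taken before the update

sgd_step(weights, grads=[1.0, 0.5, -0.5], lr=0.01)

# The alias always matches the live weights, so it cannot detect an update.
alias_changed = alias != weights
# The snapshot still holds the old values, so the difference is visible.
snapshot_changed = snapshot != weights

print("alias reports a change:", alias_changed)
print("snapshot reports a change:", snapshot_changed)
```

Only the comparison against the snapshot reports a change here, which is the behaviour you want from a weight monitor.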

This question on the CrossValidated site has a good explanation of why you should not initialize all of your weights to zero.
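The symmetry argument can be checked numerically. The following is a minimal sketch under stated assumptions: it uses a tiny hypothetical network with one sigmoid hidden layer, a linear output and a squared-error loss, which is not the single-layer softmax model from the question, and the names `gradients` and `units_identical` are made up for illustration. It computes the gradient for each hidden unit and checks whether all units receive the same one.

```python
import math
import random

def gradients(W1, W2, x, target):
    """Gradients of 0.5 * (out - target)**2 for a 1-hidden-layer network.

    W1[j][k] connects input j to hidden unit k; W2[k] connects hidden
    unit k to the single linear output.
    """
    n_in, n_hidden = len(x), len(W2)
    # Forward pass: sigmoid hidden units, linear output.
    h = [1.0 / (1.0 + math.exp(-sum(x[j] * W1[j][k] for j in range(n_in))))
         for k in range(n_hidden)]
    out = sum(h[k] * W2[k] for k in range(n_hidden))
    err = out - target
    # Backward pass.
    grad_W2 = [err * h[k] for k in range(n_hidden)]
    grad_W1 = [[err * W2[k] * h[k] * (1.0 - h[k]) * x[j]
                for k in range(n_hidden)] for j in range(n_in)]
    return grad_W1, grad_W2

def units_identical(grad_W1, grad_W2):
    """True if every hidden unit receives exactly the same gradient."""
    per_unit = [tuple(row[k] for row in grad_W1) + (grad_W2[k],)
                for k in range(len(grad_W2))]
    return all(g == per_unit[0] for g in per_unit)

x, target, n_hidden = [0.5, -1.2], 1.0, 3

# All-zero initialization: every hidden unit starts out identical.
zero_W1 = [[0.0] * n_hidden for _ in x]
zero_W2 = [0.0] * n_hidden

# Small random initialization breaks the symmetry.
random.seed(0)
rand_W1 = [[random.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in x]
rand_W2 = [random.gauss(0.0, 0.1) for _ in range(n_hidden)]

zero_symmetric = units_identical(*gradients(zero_W1, zero_W2, x, target))
rand_symmetric = units_identical(*gradients(rand_W1, rand_W2, x, target))

print("zero init, identical gradients:", zero_symmetric)
print("random init, identical gradients:", rand_symmetric)
```

With zero initialization every hidden unit gets the same gradient, so the units stay copies of each other after each step. With small random values the gradients differ and the units can learn different features.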
