Preface
I am studying cloud computing, software engineering, and operating systems. Artificial intelligence is an interesting field with a lot of practical value. I would like to either work on problems from my major within artificial intelligence, or use artificial intelligence to solve problems in my own subjects.
Using Deep Neural Network in TensorFlow - TensorFlow Day 1
What is DNN (Deep Neural Network)?
Refer to the WatermelonBook, which I introduced in a past article.
DNN in TensorFlow
We use the iris flower dataset to stand in for real data.
We then build a DNN on top of it.
The CSV file has 4 feature columns, and each column represents one feature of an iris flower.
We use a DNN to train the model. The class that implements the DNN is
DNNClassifier
and its parameters are defined as follows:
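- hidden_units: An iterable giving the number of hidden units in each layer. All layers are fully connected. Note: [64, 32] means the first layer has 64 nodes and the second layer has 32 nodes.
- feature_columns: An iterable containing all the feature columns used by the model. Every item in the collection should be an instance of a class derived from _FeatureColumn.
- model_dir: The directory used to save model parameters, the graph, and so on. It can also be used to load checkpoints from the directory into the estimator in order to continue training a previously saved model.
- n_classes: The number of label classes. Defaults to 2, that is, binary classification. Must be greater than 1.
- weight_column: A string, or a _NumericColumn created by tf.feature_column.numeric_column, that defines the feature column representing weights. During training it is used to down-weight or boost examples, and it is multiplied by the loss of the example. If it is a string, it is used as the key to fetch the weight tensor from the features. If it is a _NumericColumn, the raw tensor is fetched by the key weight_column.key, and weight_column.normalizer_fn is then applied to it to obtain the weight tensor.
- label_vocabulary: A list of strings representing the possible label values. If given, the labels must be of string type and take values from label_vocabulary. If not given, the labels are assumed to be already encoded: as integers or floats within [0, 1] when n_classes=2, and as integer values in {0, 1, …, n_classes-1} when n_classes > 2. An error is also raised if no vocabulary is provided and the labels are strings.
- optimizer: An instance of tf.Optimizer used to train the model. Defaults to the Adagrad optimizer.
- activation_fn: The activation function applied to each layer. If None, tf.nn.relu is used.
- dropout: When not None, the probability that a given coordinate is dropped out.
- input_layer_partitioner: (Optional) The partitioner for the input layer. Defaults to min_max_variable_partitioner with min_slice_size 64 << 20.
- config: A RunConfig object that configures the runtime settings.
The number of hidden layers is one of the important hyperparameters to tune.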
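For example, in iris_training we train with the following code.
Using the DNN named classifier, we ran 2000 training steps.
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
In the code we use hidden layers of 10, 30, and 10 units. The input layer takes 4 features and the output is 1 predicted label.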
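To make the layer layout concrete, here is a minimal pure-Python sketch of a fully connected network with 4 inputs and hidden layers of 10, 30, and 10 units. It is not how TensorFlow implements DNNClassifier. The helper names (build_network, forward, count_parameters), the random initial weights, and the output size of 3 (one score per iris species, with the prediction being the single highest-scoring class) are assumptions made for illustration.

```python
import random


def relu(values):
    """ReLU activation: negative values become 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each row of weights feeds one output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def build_network(layer_sizes, seed=0):
    """Create (weights, biases) for every pair of adjacent layers."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def forward(network, features):
    """Run one example through the network and return the raw class scores."""
    activations = features
    for index, (weights, biases) in enumerate(network):
        activations = dense(activations, weights, biases)
        if index < len(network) - 1:  # ReLU on hidden layers only
            activations = relu(activations)
    return activations


def count_parameters(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 4 input features, hidden layers of 10, 30, 10, and an assumed 3 classes.
LAYER_SIZES = [4, 10, 30, 10, 3]
network = build_network(LAYER_SIZES)
scores = forward(network, [5.1, 3.5, 1.4, 0.2])
predicted_class = max(range(len(scores)), key=lambda i: scores[i])
print(len(scores), count_parameters(LAYER_SIZES))
```

Because the weights are random and untrained, the predicted class here is meaningless. The sketch only shows how the sizes in hidden_units determine the shape of each layer and the total number of trainable parameters.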
In typical TensorFlow training, we set a batch size for each training step. A batch works as follows:
In a deep learning network, we generally sum the error gradients of all examples in the batch, take their mean, and multiply that mean gradient by the learning rate. This value is used to update the network once for the batch. As a result, we run only one forward pass and apply one weight update per batch.
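The batch update described above can be sketched in plain Python. This is a minimal illustration, not TensorFlow's training loop: a one-variable linear model y = weight * x + bias stands in for the DNN, and the function names, the toy data, and the hyperparameter values are all assumptions chosen for the example.

```python
def batch_gradient(weight, bias, batch):
    """Mean gradient of the squared error over one batch of (x, y) pairs."""
    grad_w = 0.0
    grad_b = 0.0
    for x, y in batch:
        error = (weight * x + bias) - y  # forward pass for one example
        grad_w += 2.0 * error * x        # sum the per-example gradients
        grad_b += 2.0 * error
    n = len(batch)
    return grad_w / n, grad_b / n        # average over the batch


def train(data, batch_size, learning_rate, epochs):
    """Mini-batch gradient descent: exactly one weight update per batch."""
    weight, bias = 0.0, 0.0
    updates = 0
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad_w, grad_b = batch_gradient(weight, bias, batch)
            weight -= learning_rate * grad_w  # learning rate * mean gradient
            bias -= learning_rate * grad_b
            updates += 1
    return weight, bias, updates


# Toy data generated from y = 2x + 1, split into 2 batches of 4 examples.
data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
weight, bias, updates = train(data, batch_size=4, learning_rate=0.01, epochs=2000)
print(round(weight, 3), round(bias, 3), updates)
```

With 8 examples and a batch size of 4, every pass over the data produces two updates, so the number of weight updates depends on the batch size as well as on the number of passes.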
Future Work
I will introduce the source code of the DNN and explain how one batch passes data to another batch, as well as the actual running process of a DNN.