
MXNet

A summary of basic MXNet operations and notes from my own use; the everyday operations of the MXNet deep learning framework are collected here.

Part 1: Basics

Introduction to NDArray

In MXNet, NDArray is the core data structure for all computation: every piece of data in MXNet is represented as an NDArray. It resembles numpy's ndarray and is operated on in much the same way, but NDArray offers things numpy.ndarray lacks, such as switching between CPU and GPU and automatic gradient computation. That is why MXNet wraps its own NDArray rather than reusing numpy's.

NDArray is like the bricks for building a house: once you have bricks you can build anything, so with NDArray alone you could implement every deep learning model. But deep learning moves fast, and writing every model from raw NDArray would be tedious, so MXNet and Gluon wrap many functions ready for direct use, which we will get to know one by one. In this section we study MXNet's NDArray in detail.

Note: what follows covers only the basic NDArray operations.
For more, see the MXNet documentation: https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html

==I encourage you to type the code below along with me rather than just read it once; typing it out demonstrably works much better.==

In MXNet, using NDArray requires importing the nd package (nd is short for ndarray):

from mxnet import nd

With nd imported, you can start using NDArray.

Note: in the code below, '->' marks the output.

1. Define x as a range, just as in numpy; the difference is that the result is an NDArray.

x = nd.arange(12)   -> [0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.]

2. We can print an ndarray's shape and size; don't confuse the two.

x.shape      ->  (12,)
x.size -> 12
x.context -> whether x lives on the CPU or a GPU

x.dtype -> the data type of each element
## the element type can also be changed
x = x.astype('float32') ## astype changes the element type


3. What if we want a two-dimensional matrix? Use reshape:

x = nd.arange(12)
x.reshape(shape=(3,4))
# The shape parameter name can be omitted:
x.reshape((3,4)) ## don't forget the parentheses around (3,4)

4. How do we initialize a multi-dimensional array of zeros or ones?

zero = nd.zeros((2,3,4))  # note that the shape is passed as a tuple
one = nd.ones((2,3,4))

5. And a randomly initialized matrix?

S = nd.random.normal(0, 1, shape=(3,4)) # normal samples from a Gaussian distribution

6. We can also build an ndarray directly from a list:

Y = nd.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

7. In computer vision an image is usually 3-dimensional, but for training we often stack several images into a batch, forming a 4-dimensional array. NDArray supports this with concat:

x=nd.random.normal(shape=(2,3,4))   ->    x.shape (2,3,4)
y=nd.random.normal(shape=(2,3,4)) -> y.shape (2,3,4)
z=nd.concat(x,y,dim=0) -> z.shape (4,3,4)
# concat keeps the number of dimensions unchanged; to build a batch, first add a new axis with expand_dims
x = nd.expand_dims(x, axis=0) -> x.shape (1,2,3,4)
y = nd.expand_dims(y, axis=0) -> y.shape (1,2,3,4)
z=nd.concat(x,y,dim=0) -> z.shape (2,2,3,4)
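The same batch-building pattern can be checked in plain NumPy (an illustrative analogue, not MXNet itself): add a leading batch axis, then concatenate along it.

```python
import numpy as np

# NumPy analogue of nd.expand_dims + nd.concat above:
# add a leading batch axis, then concatenate along it.
x = np.random.normal(size=(2, 3, 4))
y = np.random.normal(size=(2, 3, 4))

x4 = np.expand_dims(x, axis=0)            # shape (1, 2, 3, 4)
y4 = np.expand_dims(y, axis=0)            # shape (1, 2, 3, 4)
batch = np.concatenate([x4, y4], axis=0)  # shape (2, 2, 3, 4)
```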

8. ndarray natively supports the usual math operations. ==There are many more methods; the documentation has convenient examples.==

S*Y
S/Y
S.exp()
S == Y
nd.dot(S,Y.T) ## dot products are everywhere in deep learning; if you ever write a network from scratch, you cannot avoid them
nd.floor(S) ## round down
nd.ceil(S) ## round up
nd.argmax() ## index of the maximum value
nd.topk() ## top-k values
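For intuition, here are rough NumPy equivalents of argmax and topk (illustrative only; nd.topk has more options):

```python
import numpy as np

s = np.array([0.1, 0.7, 0.2])
idx = int(np.argmax(s))         # index of the maximum value
top2 = np.argsort(s)[::-1][:2]  # indices of the two largest values
```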

9. Converting between ndarray and numpy

# ndarray -> numpy

C = S.asnumpy()

# numpy -> ndarray

import numpy as np
p = np.ones((2,3))
d = nd.array(p)

# We can also extract a Python scalar directly
x = nd.ones((1,), dtype='int32')
y = x.asscalar() ## note: asscalar only works on a size-1 array; anything larger raises an error

10. As mentioned earlier, NDArray can compute gradients automatically. How is it used?

1. Import the package: from mxnet import autograd
2. Define the variable: x = nd.arange(4).reshape((4,1))
3. Attach the gradient. By default no memory is reserved for a variable's gradient; call attach_grad() so that space is allocated:
x.attach_grad()

4. Record the computation with autograd.record(); below we compute a dot product:
with autograd.record():
    y = 2 * nd.dot(x.T, x)
5. Compute the gradient with backward(): y.backward()

Putting it all together:

from mxnet import autograd
x = nd.arange(4).reshape((4,1))
x.attach_grad()
with autograd.record():
    y = 2 * nd.dot(x.T, x)
y.backward()
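As a quick sanity check without MXNet: for y = 2 * x.T @ x the analytic gradient is 4 * x, which a central finite difference confirms (pure Python, illustrative only):

```python
# Finite-difference check of d(2 * x.T @ x)/dx = 4 * x.
def y_of(x):
    return 2 * sum(v * v for v in x)

x = [0.0, 1.0, 2.0, 3.0]
eps = 1e-5
grad = []
for i in range(len(x)):
    xp = x[:]; xp[i] += eps
    xm = x[:]; xm[i] -= eps
    grad.append((y_of(xp) - y_of(xm)) / (2 * eps))

# grad is close to [4*v for v in x], i.e. [0, 4, 8, 12]
```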

11. Switching between CPU and GPU

x = nd.random.normal(shape=(3,4))  ## lives in CPU memory by default
y = nd.random.normal(shape=(3,4), ctx=mx.gpu(0)) ## ctx places the array on a specific GPU
x = x.as_in_context(mx.gpu(0)) ## as_in_context moves an array from CPU to GPU

12. Copying is common too; copy() avoids the pitfalls of shallow copies

x = nd.array([1,2,3])
y = x.copy()
## Why copy()? In Python, plain assignment only copies the reference: with y = x, changing x also changes y, which leads to bugs.
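The aliasing problem copy() avoids is easy to demonstrate in pure Python:

```python
# Plain assignment aliases; a real copy is independent.
a = [1, 2, 3]
alias = a          # same object: changes to a show up in alias
real_copy = a[:]   # a real copy: unaffected by changes to a

a[0] = 99
```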

Note: the above covers only basic NDArray operations.
For more, see the MXNet documentation: https://mxnet.incubator.apache.org/

Introduction to Symbol

Basic Symbol methods

  • Symbol.infer_type
  • Symbol.infer_shape
  • Symbol.list_arguments
  • Symbol.list_outputs
  • Symbol.list_auxiliary_states

args = input-data symbols + weight-parameter symbols

aux = auxiliary-state symbols, e.g. the moving mean and variance in BatchNorm

An example:

import mxnet as mx
import numpy as np

data = mx.sym.var('data')
fc = mx.sym.FullyConnected(data=data, num_hidden=12)
print('name:', fc.list_arguments(), fc.list_outputs(), fc.list_auxiliary_states())
print('type:', fc.infer_type(data=np.float32))
print('shape:', fc.infer_shape(data=(1,2)))
data = mx.sym.var('data', shape=(1,2))
weight = mx.sym.var('weight', shape=(12,2))
bias = mx.sym.var('bias', shape=(12,))
fc = mx.sym.FullyConnected(data=data, weight=weight, bias=bias, num_hidden=12)
executor = fc.bind(ctx=mx.cpu(), args={'data': mx.nd.ones((1,2)),
                                       'weight': mx.nd.random.normal(shape=(12,2)),
                                       'bias': mx.nd.random.normal(shape=(12,))})
executor.forward()
print(executor.outputs[0].asnumpy())
data = mx.sym.var('data')
fc = mx.sym.FullyConnected(data=data, num_hidden=12)
fc2 = mx.sym.FullyConnected(data=fc, num_hidden=12)
print(fc2.get_internals().list_outputs()) ## get_internals() gives easy access to any node

Introduction to Module

  • Module is MXNet's integrated interface: it wraps almost all of the pieces above into a train-and-evaluate API you can drive in a single call.
  • All Modules inherit from BaseModule.

Introduction to Metric

  • Metric is the interface for evaluating model quality. Below is the official Accuracy implementation:
class Accuracy(EvalMetric):
    """Computes accuracy classification score.
    Examples
    --------
    >>> predicts = [mx.nd.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
    >>> labels = [mx.nd.array([0, 1, 1])]
    >>> acc = mx.metric.Accuracy()
    >>> acc.update(preds = predicts, labels = labels)
    >>> print acc.get()
    ('accuracy', 0.6666666666666666)
    """
    def __init__(self, axis=1, name='accuracy',
                 output_names=None, label_names=None):
        super(Accuracy, self).__init__(
            name, axis=axis,
            output_names=output_names, label_names=label_names,
            has_global_stats=True)
        self.axis = axis

    def update(self, labels, preds):
        """Updates the internal evaluation result.
        Parameters
        ----------
        labels : list of `NDArray`
            The labels of the data with class indices as values, one per sample.
        preds : list of `NDArray`
            Prediction values for samples. Each prediction value can either be the class index,
            or a vector of likelihoods for all classes.
        """
        labels, preds = check_label_shapes(labels, preds, True)

        for label, pred_label in zip(labels, preds):
            if pred_label.shape != label.shape:
                pred_label = ndarray.argmax(pred_label, axis=self.axis)
            pred_label = pred_label.asnumpy().astype('int32')
            label = label.asnumpy().astype('int32')
            # flatten before checking shapes to avoid shape mismatch
            label = label.flat
            pred_label = pred_label.flat

            check_label_shapes(label, pred_label)

            num_correct = (pred_label == label).sum()
            self.sum_metric += num_correct
            self.global_sum_metric += num_correct
            self.num_inst += len(pred_label)
            self.global_num_inst += len(pred_label)
  • How to implement your own metric:

    • Subclass EvalMetric and override update(); pay attention to the types of the incoming arguments.

    • Update sum_metric and num_inst; MXNet reports sum_metric / num_inst as the current metric value.
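The pattern can be sketched without any framework (hypothetical class, not MXNet's EvalMetric): accumulate sum_metric and num_inst in update() and report their ratio in get().

```python
# Minimal metric following the sum_metric / num_inst pattern above.
class MeanAbsoluteError:
    def __init__(self):
        self.sum_metric = 0.0
        self.num_inst = 0

    def update(self, labels, preds):
        # accumulate the absolute error over all samples
        for l, p in zip(labels, preds):
            self.sum_metric += abs(l - p)
            self.num_inst += 1

    def get(self):
        # the reported value is sum_metric / num_inst
        return ('mae', self.sum_metric / max(self.num_inst, 1))

m = MeanAbsoluteError()
m.update([0.0, 1.0], [0.5, 1.0])
```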

Part 2: Usage

Generating lst and rec files with MXNet (MXNet's training-data format)

When generating the lst files, the first argument is the output location and the second is the image directory; the commands below produce train.lst and val.lst.

python im2rec.py --list --recursive --train-ratio 0.95 /path/train  /path/train # changing the first path to /path/lst/data would produce data_train.lst
python im2rec.py --list --recursive --train-ratio 0.95 /path/val /path/val

python im2rec.py --force-resize 400 --num-thread 8 /path/train.lst /path/train
python im2rec.py --force-resize 400 --num-thread 8 /path/val.lst /path/val

#multi-label
python im2rec.py --pack-label --num-thread 8 /path/train_lst.lst /path/train
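For reference, each line of a .lst file is tab-separated: an integer index, one or more labels, then the image path relative to the image root. A toy generator (the file names here are made up):

```python
# Build the text of a toy .lst file: index \t label \t relative_path.
samples = [("cat/001.jpg", 0), ("dog/002.jpg", 1)]
lines = ["{}\t{}\t{}".format(i, label, path)
         for i, (path, label) in enumerate(samples)]
lst_text = "\n".join(lines)
```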

MXNet training demos

Training directly with fit

import mxnet as mx
mnist = mx.test_utils.get_mnist()

batch_size = 100
train_iter = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)

data = mx.sym.var('data')
# Flatten the data from 4-D shape into 2-D (batch_size, num_channel*width*height)
data = mx.sym.flatten(data=data)  # flattening is required before the fully connected layer

# The first fully-connected layer and the corresponding activation function
fc1 = mx.sym.FullyConnected(data=data, num_hidden=128)
act1 = mx.sym.Activation(data=fc1, act_type="relu")
fc2 = mx.sym.FullyConnected(data=act1, num_hidden=64)
act2 = mx.sym.Activation(data=fc2, act_type="relu")
fc3 = mx.sym.FullyConnected(data=act2, num_hidden=10)
# Softmax with cross entropy loss
mlp = mx.sym.SoftmaxOutput(data=fc3, name='softmax')

# create a trainable module on CPU
mlp_model = mx.mod.Module(symbol=mlp, context=mx.cpu())
mlp_model.fit(train_iter,  # train data
              eval_data=val_iter,  # validation data
              optimizer='sgd',  # use SGD to train
              optimizer_params={'learning_rate': 0.1},  # fixed learning rate
              eval_metric='acc',  # report accuracy during training
              batch_end_callback=mx.callback.Speedometer(batch_size, 100),  # output progress for each 100 data batches
              num_epoch=10)  # train for at most 10 dataset passes
The MXNet training loop, step by step

Training a model involves the following steps:

  • bind : prepares environment for the computation by allocating memory
  • init_params : assigns and initializes parameters
  • init_optimizer : initializes optimizers defaults to sgd
  • metric.create : creates evaluation metric from input metric name
  • forward : forward computation
  • update_metric : evaluates and accumulates evaluation metric on outputs of the last forward computation
  • backward: backward computation
  • update: updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch
import logging
logging.getLogger().setLevel(logging.INFO)
import mxnet as mx
import numpy as np

fname = mx.test_utils.download('http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.data')
data = np.genfromtxt(fname, delimiter=',')[:,1:]
label = np.array([ord(l.split(',')[0])-ord('A') for l in open(fname, 'r')])

batch_size = 32
ntrain = int(data.shape[0]*0.8)
train_iter = mx.io.NDArrayIter(data[:ntrain, :], label[:ntrain], batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(data[ntrain:, :], label[ntrain:], batch_size)

### build the network
net = mx.sym.Variable('data')
net = mx.sym.FullyConnected(net, name='fc1', num_hidden=64)
net = mx.sym.Activation(net, name='relu1', act_type="relu")
net = mx.sym.FullyConnected(net, name='fc2', num_hidden=26)
net = mx.sym.SoftmaxOutput(net, name='softmax')
mx.viz.plot_network(net)


mod = mx.mod.Module(symbol=net,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])


# allocate memory given the input data and label shapes
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
# initialize parameters by uniform random numbers
mod.init_params(initializer=mx.init.Uniform(scale=.1))
# use SGD with learning rate 0.1 to train
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))
# use accuracy as the metric
metric = mx.metric.create('acc')
# train 5 epochs, i.e. going over the data iter one pass
for epoch in range(5):
    train_iter.reset()
    metric.reset()
    for batch in train_iter:
        mod.forward(batch, is_train=True)       # compute predictions
        mod.update_metric(metric, batch.label)  # accumulate prediction accuracy
        mod.backward()                          # compute gradients
        mod.update()                            # update parameters
    print('Epoch %d, Training %s' % (epoch, metric.get()))

An MXNet finetuning example


import argparse
import mxnet as mx
from mxnet import nd
import os, sys
import logging
import random
from CosineScheduler import CosineScheduler


def get_fine_tune_model(sym, num_classes, layer_name, fixed_layer_name=None):
    all_layers = sym.get_internals()
    net = all_layers[layer_name + '_output']
    net = mx.symbol.FullyConnected(data=net, num_hidden=num_classes, name='fc')
    net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
    # freeze every argument up to (and including) fixed_layer_name, if one is given
    fixed_param_names = []
    if fixed_layer_name is not None:
        for name in sym.list_arguments():
            fixed_param_names.append(name)
            if name == fixed_layer_name:
                break
    return net, fixed_param_names


def multi_factor_scheduler(args, epoch_size):
    step = range(args.step, args.num_epoch, args.step)
    step_ = [epoch_size * (x - args.begin_epoch) for x in step if x - args.begin_epoch > 0]
    return mx.lr_scheduler.MultiFactorScheduler(step=step_, factor=args.factor) if len(step_) else None


def cos_scheduler(args, epoch_size):
    epoch_steps = epoch_size
    total_steps = args.num_epoch * epoch_steps
    return CosineScheduler(total_steps, epoch_steps, base_lr=args.lr, final_lr=args.lr * 1e-3)


def data_loader(args):
    data_shape_list = [int(item) for item in args.image_shape.split(",")]
    data_shape = tuple(data_shape_list)

    train = mx.io.ImageRecordIter(
        path_imgrec=args.data_train,
        label_width=1,
        mean_r=123.68,
        mean_g=116.779,
        mean_b=103.939,
        std_r=58.393,
        std_g=57.12,
        std_b=57.375,
        data_name='data',
        label_name='softmax_label',
        data_shape=data_shape,
        batch_size=args.batch_size,
        rand_crop=args.random_crop,
        rand_mirror=args.random_mirror,
        # max_aspect_ratio=args.max_aspect_ratio,
        max_random_contrast=args.max_random_contrast,
        max_random_illumination=args.max_random_illumination,
        max_rotate_angle=args.max_rotate_angle,
        shuffle=True,
        num_parts=kv.num_workers,
        resize=args.resize_train,
        part_index=kv.rank)

    val = mx.io.ImageRecordIter(
        path_imgrec=args.data_val,
        label_width=1,
        mean_r=123.68,
        mean_g=116.779,
        mean_b=103.939,
        std_r=58.393,
        std_g=57.12,
        std_b=57.375,
        data_name='data',
        label_name='softmax_label',
        data_shape=data_shape,
        batch_size=args.batch_size,
        rand_crop=0,
        rand_mirror=0,
        shuffle=False,
        num_parts=kv.num_workers,
        resize=args.resize_val,
        part_index=kv.rank)

    return train, val


def train_model(args, kv='device'):
    train, val = data_loader(args)

    prefix = args.model
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, args.begin_epoch)
    # if a network definition is available, the symbol could also be rebuilt
    # here (e.g. sym = get_symbol(...)) instead of using the checkpoint's

    new_sym, fix_names = get_fine_tune_model(
        sym, args.num_classes, args.layer_name)

    epoch_size = max(int(args.num_examples / args.batch_size / kv.num_workers), 1)
    if args.cosine == True:
        lr_scheduler = cos_scheduler(args, epoch_size)
    else:
        lr_scheduler = multi_factor_scheduler(args, epoch_size)

    optimizer_params = {
        'learning_rate': args.lr,
        'momentum': args.mom,
        'wd': args.wd,
        'lr_scheduler': lr_scheduler}
    initializer = mx.init.Xavier(
        rnd_type='gaussian', factor_type="in", magnitude=2)

    if args.gpus == '':
        devs = mx.cpu()
    else:
        devs = [mx.gpu(int(i)) for i in args.gpus.split(',')]

    model = mx.mod.Module(
        context=devs,
        symbol=new_sym,
        label_names=["softmax_label"],
        fixed_param_names=fix_names
    )

    checkpoint = mx.callback.do_checkpoint(args.save_result + args.save_name)

    eval_metric = mx.metric.CompositeEvalMetric()
    eval_metric.add(mx.metric.CrossEntropy())
    eval_metric.add(mx.metric.Accuracy())

    model.fit(train_data=train,
              eval_data=val,
              begin_epoch=args.begin_epoch,
              num_epoch=args.num_epoch,
              eval_metric=eval_metric,
              validation_metric=eval_metric,
              kvstore=kv,
              optimizer='sgd',
              optimizer_params=optimizer_params,
              arg_params=arg_params,
              aux_params=aux_params,
              initializer=initializer,
              allow_missing=True,  # for the new fc layer
              batch_end_callback=mx.callback.Speedometer(args.batch_size, 20),
              epoch_end_callback=checkpoint)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='score a model on a dataset')
    parser.add_argument('--model', type=str, default='./pretrain_models/resnet50v1d-pretrained')
    parser.add_argument('--gpus', type=str, default='0')
    parser.add_argument('--batch-size', type=int, default=64)
    parser.add_argument('--begin-epoch', type=int, default=0)
    parser.add_argument('--image-shape', type=str, default='3,224,224')
    parser.add_argument('--resize-train', type=int, default=256)
    parser.add_argument('--resize-val', type=int, default=224)
    parser.add_argument('--data-train', type=str, default='/app/data/train_filter_modify.rec')
    parser.add_argument('--data-val', type=str, default='/app/data/test_filter_modify.rec')
    parser.add_argument('--num-classes', type=int, default=3)
    parser.add_argument('--lr', type=float, default=0.005)
    parser.add_argument('--num-epoch', type=int, default=25)
    parser.add_argument('--kv-store', type=str, default='device', help='the kvstore type')
    parser.add_argument('--save-result', type=str, help='the save path', default='/app/output/resnext-50-test')
    parser.add_argument('--num-examples', type=int, default=20000)
    parser.add_argument('--mom', type=float, default=0.9, help='momentum for sgd')
    parser.add_argument('--wd', type=float, default=0.0001, help='weight decay for sgd')
    parser.add_argument('--save-name', type=str, help='the save name of model', default='resnext-50')
    parser.add_argument('--random-crop', type=int, default=0, help='whether to randomly crop the image')
    parser.add_argument('--random-mirror', type=int, default=1, help='whether to randomly flip horizontally')
    parser.add_argument('--max-aspect-ratio', type=float, default=0.3, help='width/height')
    parser.add_argument('--max_random_contrast', type=float, default=0.3, help='Change the contrast by a value randomly chosen from [-max, max]')
    parser.add_argument('--max_random_illumination', type=float, default=30, help='Change the illumination by a value randomly chosen from [-max, max]')
    parser.add_argument('--max_rotate_angle', type=float, default=15, help='Rotate by a random degree in [-v,v]')
    parser.add_argument('--layer-name', type=str, default='flatten0', help='the layer name before the fully connected layer')
    parser.add_argument('--factor', type=float, default=0.2, help='factor for learning rate decay')
    parser.add_argument('--step', type=int, default=5, help='step for learning rate decay')
    parser.add_argument('--cosine', type=bool, default=False, help='cosine learning rate decay')
    args = parser.parse_args()

    kv = mx.kvstore.create(args.kv_store)

    if not os.path.exists(args.save_result):
        os.makedirs(args.save_result)

    # create a logger and set the level
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    formatter = logging.Formatter('%(asctime)s - %(message)s')

    # this handler is used to record information in train.log
    hdlr = logging.FileHandler(args.save_result + '/train.log')
    hdlr.setFormatter(formatter)
    logger.addHandler(hdlr)

    # this handler is used to print information in terminal
    console = logging.StreamHandler()
    console.setFormatter(formatter)
    logger.addHandler(console)

    # record the information of args
    logging.info(args)

    train_model(args=args, kv=kv)


How to adjust the learning rate of newly added layers during finetuning

weight = mx.sym.var('weight', lr_mult=5)  ## lr_mult multiplies the global learning rate for this parameter
add_conv_1 = mx.sym.Convolution(data=data, weight=weight, num_filter=1024, kernel=(1,1), name='add_conv_1')

An MXNet inference example

This is my own code and may not fit every situation; adapt it as needed.

import mxnet as mx
import argparse
import os
import cv2
import shutil
import numpy as np


def data_transpose(data, augments):
    for aug in augments:
        data = aug(data)
    return data


def img_read_aug(path):
    try:
        img = cv2.imread(path)  # cv2 is only used to detect unreadable images
        if img is None:
            return None
        img = mx.image.imread(path)
    except:
        print(path)
        return None

    cla_cast_aug = mx.image.CastAug()
    cla_resize_aug = mx.image.ForceResizeAug(size=[224, 224])
    color_norm_aug = mx.image.ColorNormalizeAug(mx.nd.array([123.68, 116.779, 103.939]),
                                                mx.nd.array([58.393, 57.12, 57.375]))
    cla_augmenters = [cla_cast_aug, cla_resize_aug, color_norm_aug]

    img = data_transpose(img, cla_augmenters)
    img = mx.nd.transpose(img, axes=(2, 0, 1))
    img = mx.nd.expand_dims(img, axis=0)
    return img


def predict(img, mod):
    data_batch = mx.io.DataBatch([img])
    mod.forward(data_batch)
    prob = mod.get_outputs()[0].asnumpy()
    return prob


def inference(model, base_path, count, recursion, base_result_path, result_anay, class_dict, outfile):
    print('process:{}'.format(base_path))

    files = os.listdir(base_path)
    for file in files:
        if os.path.isdir(os.path.join(base_path, file)):
            ### recurse into subdirectories
            inference(model, os.path.join(base_path, file), count, recursion + 1,
                      os.path.join(base_result_path, file), result_anay, class_dict, outfile)
        else:
            img_path = os.path.join(base_path, file)
            img = img_read_aug(img_path)
            if img is None:
                continue
            try:
                cla = predict(img, model)
            except:
                print('predict error')
                continue
            max_idx = np.argmax(cla[0])

            result_name = 'cla'
            for i in range(args.num_classes):
                result_name += '-{}-{:.4f}'.format(class_dict[i], cla[0][i])
            result_name += ('-' + file)

            fold_name = base_result_path.rstrip('/').split('/')[-1]
            if recursion == 0:
                result_path = os.path.join(base_result_path, '-' + class_dict[max_idx])
            else:
                result_path = os.path.join(base_result_path, fold_name + '-' + class_dict[max_idx])

            ### result analysis (dict.has_key is Python 2 only; use `in`)
            if fold_name + ':' + class_dict[max_idx] in result_anay:
                result_anay[fold_name + ':' + class_dict[max_idx]] += 1
            else:
                result_anay[fold_name + ':' + class_dict[max_idx]] = 1

            if args.save_txt == 1:
                outfile.writelines(result_path + '\t' + result_name + '\n')
            if args.save_image == 1:
                print('save-image' + str(args.save_image))
                if not os.path.exists(result_path):
                    os.makedirs(result_path)
                shutil.copyfile(img_path, os.path.join(result_path, result_name))
            count += 1

            if count % 100 == 0:
                print('--{}--'.format(count))


def main(args):
    sym, arg_params, aux_params = mx.model.load_checkpoint(args.model_path, args.model_index)
    model = mx.mod.Module(symbol=sym, context=mx.gpu(args.gpu))
    model.bind(data_shapes=[('data', (1, 3, 224, 224))], for_training=False)
    model.set_params(arg_params, aux_params)

    base_path = args.test_images_path
    if not os.path.exists(base_path):
        print('images file path error')
        return

    result_path = args.test_result_path
    if not os.path.exists(result_path):
        os.makedirs(result_path)

    outfile = open(os.path.join(result_path, 'infer_record.txt'), 'w')

    count = 0
    recursion = 0
    result_anay = {}
    class_dict = {i: item for i, item in enumerate(args.class_label.strip().split(','))}

    inference(model, base_path, count, recursion, result_path, result_anay, class_dict, outfile)

    print(result_anay)
    outfile.close()


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='score a model on a dataset')
    parser.add_argument('--model-path', type=str, default='')
    parser.add_argument('--model-index', type=int, default=8)
    parser.add_argument('--num-classes', type=int, default=3)
    parser.add_argument('--class-label', type=str, default='porno,sexy,normal')
    parser.add_argument('--test-images-path', type=str, default='None')
    parser.add_argument('--test-result-path', type=str, default='None')
    parser.add_argument('--save-image', type=int, default=0)
    parser.add_argument('--save-txt', type=int, default=1)
    parser.add_argument('--gpu', type=int, default=0)
    args = parser.parse_args()
    main(args)



A script like the following runs the code above:

python ./src/inference.py \
--gpu 1 \
--model-path ./trainoutput/test-fixed-stage-2/porno-resnetv1d-50 \
--model-index 16 \
--num-classes 3 \
--class-label porno,sexy,normal \
--test-images-path /Data/download-data-normal \
--test-result-path /Data/download-data-normal-result \
--save-image 0 \
--save-txt 1

Several ways to read images for MXNet inference (pick whichever you prefer)

1. Read the image with MXNet's imread() and apply data augmentation:


#### data augment function

def transform(data, augmenters):
    for aug in augmenters:
        data = aug(data)
    return data

#### common augmenters
cast_aug = mx.image.CastAug()
resize_aug = mx.image.ForceResizeAug(size=[width, height])
color_normal_aug = mx.image.ColorNormalizeAug(mx.nd.array([123,117,104]), mx.nd.array([1,1,1]))
argmenters = [cast_aug, resize_aug, color_normal_aug]

#### read image
image = mx.image.imread(path)  ## returns RGB; MXNet works with RGB data by default
image = transform(image, argmenters)  ## MXNet image ops operate on channel-last data
image = mx.nd.transpose(image, axes=(2, 0, 1))
image = mx.nd.expand_dims(image, axis=0)

2. Another way to read images in MXNet: from a byte stream:

Here data is an NDArray (imdecode performs the image decoding).

image_string = open('file_path', 'rb').read()
data = mx.image.imdecode(image_string, flag=1)
data = mx.nd.transpose(data,axes=(2, 0, 1))
data = mx.nd.expand_dims(data, axis=0)
print('shape:{}'.format(data.shape))

3. Read and preprocess the image with OpenCV:


def readImage(path):
    img = cv2.imread(path)
    if img is None:
        print('failed to read image: {}'.format(path))
        return None
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = np.transpose(img, (2, 0, 1))  ## channel first
    img = img[np.newaxis, :]  ## add batch dim -> 4 dims
    return img

==MXNet is restrictive about input image formats; images that are not jpg/jpeg often cause errors==

Part 3: Advanced usage

Getting network parameters in MXNet

model.get_params() returns arg_params and aux_params.

import mxnet as mx
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)  # load the model
mod = mx.mod.Module(symbol=sym, context=mx.gpu(0))  # create the Module
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])
mod.set_params(arg_params, aux_params)
import numpy as np
import cv2
def get_image(filename):
    img = cv2.imread(filename)
    if img is None:
        print('failed to read image: {}'.format(filename))
        return None
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  ## BGR -> RGB
    img = np.transpose(img, (2,0,1))  ## channel first
    img = img[np.newaxis, :]  ## add batch -> 4 dims
    return img
from collections import namedtuple
Batch = namedtuple('Batch', ['data'])
img = get_image('val_1000/0.jpg')  # read the image
data_batch = Batch([mx.nd.array(img)])
mod.forward(data_batch)  # run prediction

################################################
# inspect the weights
keys = mod.get_params()[0].keys()  # list all parameter names
conv_w = mod.get_params()[0]['conv0_weight']

## Option 2: read directly from arg_params, which is itself a key/value dict
fc_weight = arg_params['fc_weight']

# print the weights you want to inspect, e.g. conv0_weight
print(conv_w.asnumpy())  # view the values
################################################

#### fetch the outputs
prob = mod.get_outputs()[0].asnumpy()
y = np.argsort(np.squeeze(prob))[::-1]
print('truth label %d; top-1 predict label %d' % (val_label[0], y[0]))

Getting intermediate-layer features in MXNet

## The key is mx.symbol.Group([...]) for producing multiple outputs


### Option 1: keep the intermediate output when defining the model
import mxnet as mx
net = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
net = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
out = mx.symbol.SoftmaxOutput(data=net, name='softmax')
# group the two symbols so the intermediate layer appears in the outputs
group = mx.symbol.Group([fc1, out])
print(group.list_outputs())


### Option 2: pull intermediate layers out of an existing model

import mxnet as mx
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)  # load the model

args = sym.get_internals().list_outputs()  # all intermediate outputs
internals = sym.get_internals()
fc1 = internals['fc1_output']
conv = internals['stage4_unit3_conv1_output']
group = mx.symbol.Group([fc1, sym, conv])  # group the outputs you need; this exposes the intermediate layers

mod = mx.mod.Module(symbol=group, context=mx.gpu())  # create the Module
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])  # inference only, so for_training=False
mod.set_params(arg_params, aux_params)
import numpy as np
import cv2
def get_image(filename):
    img = cv2.imread(filename)
    if img is None:
        print('failed to read image: {}'.format(filename))
        return None
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  ## BGR -> RGB
    img = np.transpose(img, (2,0,1))  ## channel first
    img = img[np.newaxis, :]  ## add batch -> 4 dims
    return img
from collections import namedtuple
Batch = namedtuple('Batch', ['data'])
img = get_image('val_1000/0.jpg')  # read the image
mod.forward(Batch([mx.nd.array(img)]))  # run prediction
prob = mod.get_outputs()[0].asnumpy()
y = np.argsort(np.squeeze(prob))[::-1]
print('truth label %d; top-1 predict label %d' % (val_label[0], y[0]))

Gluon

Training an MXNet model (params + json) from Gluon

Reference: https://blog.csdn.net/qq_20622615/article/details/89924387

A model built with symbol, or a Gluon model after hybridize, consists of a .json file (the network structure) and a .params file (the weights). Gluon can import both with net = gluon.SymbolBlock.imports(json, ['data'], params, ctx), after which you can test or continue training.

But if you only need part of the model, say only the conv layers with all fc layers removed, or you want to add extra layers on top, a direct import becomes awkward. The right approach is:

sym, arg_params, aux_params = mx.model.load_checkpoint("1.0.3", 40)  # model name and the epoch of the checkpoint
layers = sym.get_internals()  # all layers
outputs = layers['stage4_unit1_conv2_output']  # choose the output layer
inputs = layers['data']  # choose the input layer
net = gluon.SymbolBlock(outputs, inputs)  # wrap it into a new net with the Gluon API
net.load_parameters("1.0.3-0040.params", ignore_extra=True, allow_missing=True)  # load the weights
y = net(data)
print(y.shape)

To add more layers on top of this network, define it as follows:

class PretrainedNetwork(gluon.HybridBlock):
    def __init__(self, pretrained_layer, **kwargs):
        super(PretrainedNetwork, self).__init__(**kwargs)
        with self.name_scope():
            self.pretrained_layer = pretrained_layer  # (n, 4, 4, 128)
            self.fc = nn.HybridSequential()
            self.fc.add(
                nn.Flatten(),
                nn.Dense(256, activation='relu'),
                nn.Dropout(rate=0.5),
                nn.Dense(128)
            )
            self.single_fc = nn.Dense(2)
            self.fusion_fc = nn.Dense(2)

    def hybrid_forward(self, F, x):
        x = self.pretrained_layer(x)
        x = self.fc(x)
        feat = x
        y1 = self.single_fc(x)
        feat = feat.sum(axis=1)
        y2 = self.fusion_fc(feat)
        return y1, y2

Initialize it as follows, so the pretrained part keeps its weights:

net = PretrainedNetwork(pretrained_layer=net)
net.initialize(force_reinit=False, init=init.Xavier())

Note that you must call load_parameters first and only then wrap the result in PretrainedNetwork, otherwise the prefixes will not match.

If you want to freeze part of the parameters and train only the rest, inspect the layer names to find the ones to train.

print(net.collect_params())  # print all parameters, which shows every layer and its weights

In the Trainer, select the parameters to train with a regular expression:

trainer = gluon.Trainer(params = net.collect_params("pretrained*|dense0*"), optimizer = optimizer)

Parameters that are not selected stay fixed and do not change during training.
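The select string passed to collect_params is a Python regular expression matched against parameter names. A plain-Python sketch of the same idea with made-up parameter names (note that in regex syntax the safe wildcard is `.*`, not `*`):

```python
import re

# Mimic collect_params(select): keep names matched by the regex.
param_names = ['pretrained_conv0_weight', 'dense0_weight', 'dense1_bias']
pattern = 'pretrained.*|dense0.*'
selected = [n for n in param_names if re.match(pattern, n)]
# selected -> ['pretrained_conv0_weight', 'dense0_weight']
```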

Exporting GluonCV pretrained models

Use gluoncv's export_block to export a pretrained model from the GluonCV model zoo:

import gluoncv
from gluoncv.utils import export_block

net = gluoncv.model_zoo.get_model('ResNet101_v2', pretrained=True)
export_block('resnet101', net, preprocess=None, layout='CHW')

Training with Gluon

Gluon offers three main sources of pretrained models:

  • Gluon's own model_zoo
  • GluonCV's model_zoo
  • MXNet pretrained checkpoints (*.params, *.json)

Each is covered below.

1. Load a model from the Gluon model_zoo and finetune it

Gluon's models live under gluon.model_zoo.vision; see https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html to find the model that suits you. All model_zoo models are structured as features + output.

Usage:

Option 1: replace only the final fc layer and finetune:

import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn

class_num = 3
ctx = [mx.gpu(0), mx.gpu(1)]

finetune_net = gluon.model_zoo.vision.resnet50_v2(pretrained=True)

with finetune_net.name_scope():
    finetune_net.output = nn.Dense(class_num)
finetune_net.output.initialize(init=mx.init.Xavier(), ctx=ctx)
finetune_net.hybridize()

Option 2: replace the fc layer and also add a few more layers

The approach below first extracts features, then builds the additional Sequential block, and finally joins the two parts in one Sequential.

from mxnet import gluon
pretrained_net = gluon.model_zoo.vision.resnet50_v2(pretrained=True)
pretrained_net_features = pretrained_net.features

class_num = 3
ctx = [mx.gpu(0), mx.gpu(1)]
modify_net = nn.HybridSequential(prefix="")
with modify_net.name_scope():
    modify_net.add(nn.Dense(128, activation='relu'),
                   nn.Dropout(0.5),
                   nn.Dense(class_num))
modify_net.collect_params().initialize(init=mx.init.Xavier(), ctx=ctx)

net = nn.HybridSequential(prefix="")
with net.name_scope():
    net.add(pretrained_net_features)
    net.add(modify_net)
net.hybridize()  ## hybridize switches from dynamic-graph to static-graph execution

You can also modify features directly for the same effect; just remember to initialize the new layers:

from mxnet import gluon
class_num = 3
ctx = [mx.gpu(0), mx.gpu(1)]

finetune_net = gluon.model_zoo.vision.resnet50_v2(pretrained=True)

with finetune_net.name_scope():
    finetune_net.features.add(nn.Dense(128, activation='relu'),
                              nn.Dropout(0.5))
    finetune_net.output = nn.Dense(class_num)
finetune_net.features.initialize(init=mx.init.Xavier(), force_init=False, ctx=ctx)
finetune_net.output.initialize(init=mx.init.Xavier(), ctx=ctx)
finetune_net.hybridize()

2. Load a model from the GluonCV model_zoo and finetune it (==recommended==)

gluoncv is Gluon's powerful vision library and ships many pretrained models; see https://gluon-cv.mxnet.io/model_zoo/classification.html

Using a GluonCV pretrained model is just as convenient and nearly identical to the gluon model_zoo approach; the only difference:

from gluoncv.model_zoo import get_model

finetune_net = get_model('ResNet50_v2', pretrained=True)

Everything else is the same as above.

3. Load an MXNet checkpoint directly (.params + .json)

Sometimes we need to load an MXNet checkpoint from Gluon; currently the only way is gluon.nn.SymbolBlock():

ctx = mx.gpu(0)
sym, arg_params, aux_params = mx.model.load_checkpoint('../model/resnetv1d-101',17) ## model path and model index
internals = sym.get_internals()
net_out = internals['fc1_output']

net = gluon.nn.SymbolBlock(outputs=net_out, inputs=mx.sym.var('data'))

net.load_params(filename='../model/resnetv1d-101-0017.params', ctx=ctx)

With the MXNet model loaded, we can now operate on net; the code below builds a 3-class network:


class_num = 3
finetune_net = nn.HybridSequential(prefix="")
with finetune_net.name_scope():
    finetune_net.add(net)
    finetune_net.add(nn.Dense(class_num))  ## 3-class output
finetune_net.hybridize()  ## hybridize switches from dynamic-graph to static-graph execution

4. The most elegant way: redefine the network and implement anything you need

This is the most elegant and flexible approach: load the model with any of the methods above, then override forward to implement arbitrary behavior.

class PretrainedNetwork(gluon.HybridBlock):
    def __init__(self, pretrained_layer, **kwargs):
        super(PretrainedNetwork, self).__init__(**kwargs)
        with self.name_scope():
            self.pretrained_layer = pretrained_layer
            self.fc = nn.HybridSequential()
            self.fc.add(
                nn.Flatten(),
                nn.Dense(256, activation='relu'),
                nn.Dropout(rate=0.5),
                nn.Dense(128)
            )
            self.output = nn.Dense(2)

    def hybrid_forward(self, F, x):  ## don't forget the F argument
        x = self.pretrained_layer(x)
        x = self.fc(x)
        out = self.output(x)
        return out


### build the network like this:

from gluoncv.model_zoo import get_model

finetune_net = get_model('ResNet50_v2', pretrained=True)
net = PretrainedNetwork(pretrained_layer=finetune_net)
net.initialize(force_reinit=False, init=init.Xavier())  ## initialize


Part 4: Odds and ends


body = mx.sym.Convolution(data=data, num_filter=64, kernel=(7, 7), stride=(2,2), pad=(0, 0),
                          no_bias=True, name="conv1", workspace=workspace)
# max pooling
body = mx.sym.Pooling(data=body, kernel=(3, 3), stride=(2,2), pad=(1,1), pool_type='max')
# average pooling
body = mx.symbol.Pooling(data=body, kernel=(7,7), pool_type='avg', name='avg1')
# global average pooling
pool = mx.sym.Pooling(data=last_fm, kernel=(pool_size, pool_size), stride=(1, 1),
                      pool_type="avg", name="global_pool", global_pool=True)
# BN
body = mx.sym.BatchNorm(data=body, fix_gamma=False, momentum=0.9, eps=2e-5, name='bn1')
# Note: eps is used in cuDNN to avoid division by zero; the default is 0.001, here 0.00002 is used.
# ReLU
body = mx.sym.Activation(data=body, act_type='relu', name='relu0')
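To see why eps matters, here is batch normalization in plain NumPy (a sketch, not MXNet's implementation): with a zero-variance batch, eps is the only thing keeping the division finite.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=2e-5):
    # normalize over the batch axis; eps keeps the division finite
    # even when the variance is exactly zero
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([[1.0], [1.0], [1.0]])  # a zero-variance batch
out = batch_norm(x)                  # finite (all zeros) thanks to eps
```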