Caffe2 - (6) Image Loading and Preprocessing

This post shows, by example, how to load an image from a local file or a URL, and walks through the image preprocessing steps that Caffe2 needs.

Required Python packages:

sudo pip install scikit-image scipy matplotlib
import skimage
import skimage.io as io
import skimage.transform 
import sys
import numpy as np
import math
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
print("Required modules imported.")

1. Image Load

Caffe uses OpenCV's Blue-Green-Red (BGR) channel order rather than the more common Red-Green-Blue (RGB).

Caffe2 also uses BGR.

IMAGE_File = 'https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec=1515142841038&di=56e548070fdf3ed09c1dc95a0ee2204f&imgtype=0&src=http%3A%2F%2Fimg6.ph.126.net%2FEfFAh8rDic7haOAZAQ83ag%3D%3D%2F2497527468371353486.jpg'
img = skimage.img_as_float(skimage.io.imread(IMAGE_File)).astype(np.float32)

# show the original image
plt.figure()
plt.subplot(1,2,1)
plt.imshow(img)
plt.axis('on')
plt.title('Original image = RGB')

# show the image in BGR - just doing RGB->BGR temporarily for display
imgBGR = img[:, :, (2, 1, 0)]
#plt.figure()
plt.subplot(1,2,2)
plt.imshow(imgBGR)
plt.axis('on')
plt.title('OpenCV, Caffe2 = BGR')
plt.show()

In Caffe, CHW stands for:

  • H: Height
  • W: Width
  • C: Channel (as in color)

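GPU code uses CHW, whereas CPU code often uses HWC.

cuDNN, the library used to accelerate computation on GPUs, expects CHW in this setting (the source describes it as supporting only CHW, although cuDNN also defines an NHWC tensor format).

So the reason for using CHW is that it is faster on the GPU.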
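To make the two layouts concrete, here is a minimal sketch that converts HWC to CHW with NumPy. It uses a small synthetic array rather than the image loaded above (that array is an assumption for illustration), and checks that the values are only reordered, not changed.

```python
import numpy as np

# Synthetic 2x3 "image" with 3 channels, stored as HWC (height, width, channel).
hwc = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)

# Move the channel axis to the front: HWC -> CHW.
chw = hwc.transpose(2, 0, 1)

print(hwc.shape)  # (2, 3, 3)
print(chw.shape)  # (3, 2, 3)

# chw[c, y, x] addresses the same value as hwc[y, x, c].
same = all(
    chw[c, y, x] == hwc[y, x, c]
    for c in range(3) for y in range(2) for x in range(3)
)
print(same)  # True
```

The transpose only changes how the axes are ordered, so no pixel data is lost in the conversion.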


2. Image Mirror and Rotate

# Flip, Mirror
imgMirror = np.fliplr(img)
plt.figure()
plt.subplot(1,2,1)
plt.imshow(imgMirror)
plt.axis('off')
plt.title('Mirror image')

# Rotate
imgRotated = np.rot90(img)
plt.subplot(1,2,2)
plt.imshow(imgRotated)
plt.axis('off')
plt.title('Rotated image')
plt.show()
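What `np.fliplr` and `np.rot90` do is easier to see on a tiny array than on a photo. The following sketch uses a made-up 2x3 single-channel array (an assumption for illustration only).

```python
import numpy as np

# Tiny synthetic single-channel "image": 2 rows (height) x 3 columns (width).
tiny = np.array([[1, 2, 3],
                 [4, 5, 6]])

mirrored = np.fliplr(tiny)  # reverse the column order (left <-> right)
rotated = np.rot90(tiny)    # rotate 90 degrees counterclockwise

print(mirrored.tolist())  # [[3, 2, 1], [6, 5, 4]]
print(rotated.tolist())   # [[3, 6], [2, 5], [1, 4]]
print(rotated.shape)      # (3, 2)
```

Note that a 90-degree rotation swaps height and width, so a rotated non-square image has a different shape from the original.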


3. Image Resize

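Resizing is an important part of image preprocessing.

Caffe2 requires square images, so each image has to be resized to a standard height and width. Here the image is resized to 256×256 and then cropped to 224×224, which is the input size of the network (input_height and input_width).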

input_height, input_width = 224, 224
print("Model's input shape is %dx%d" % (input_height, input_width))
img256 = skimage.transform.resize(img, (256, 256))
plt.figure()
plt.imshow(img256)
plt.axis('on')
plt.title('Resized image to 256x256')
print("New image shape:" + str(img256.shape))
plt.show()

Model's input shape is 224x224
New image shape:(256, 256, 3)


4. Image Rescale

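Because images come in different sizes and aspect ratios, rescaling has to be taken into account when resizing and cropping.

For example, if the original image is 1920×1080 and the crop size is 224×224, cropping without rescaling first may cut away important content and lose the meaning of the image.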

print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
print("Model's input shape is %dx%d" % (input_height, input_width))
aspect = img.shape[1]/float(img.shape[0])
print("Original aspect ratio: " + str(aspect))
if aspect > 1:
    # wide image: width is larger than height, so fix the height
    res = int(aspect * input_height)
    imgScaled = skimage.transform.resize(img, (input_height, res))
elif aspect < 1:
    # tall image: height is larger than width, so fix the width
    res = int(input_width/aspect)
    imgScaled = skimage.transform.resize(img, (res, input_width))
else:
    # square image: resize directly to the input size
    imgScaled = skimage.transform.resize(img, (input_height, input_width))
plt.figure()
plt.imshow(imgScaled)
plt.axis('on')
plt.title('Rescaled image')
print("New image shape:" + str(imgScaled.shape) + " in HWC")
plt.show()

Original image shape:(607, 910, 3) and remember it should be in H, W, C!
Model's input shape is 224x224
Original aspect ratio: 1.49917627677
New image shape:(224, 335, 3) in HWC
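The rescaling arithmetic can be checked without loading any image. The helper below, `rescaled_shape`, is a hypothetical name introduced for this sketch. It mirrors the aspect-ratio logic above and applies it to the 1920×1080 example and to the 910×607 image used in this post.

```python
def rescaled_shape(height, width, input_height=224, input_width=224):
    """Return (new_height, new_width) so the shorter side matches the model input."""
    aspect = width / float(height)
    if aspect > 1:
        # wide image: fix the height, scale the width
        return input_height, int(aspect * input_height)
    if aspect < 1:
        # tall image: fix the width, scale the height
        return int(input_width / aspect), input_width
    # square image
    return input_height, input_width

print(rescaled_shape(1080, 1920))  # (224, 398)
print(rescaled_shape(607, 910))    # (224, 335)
```

After rescaling, the shorter side already equals the network input size, so the later center crop only trims the longer side.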


5. Image Crop

# Center crop on the original
print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
def crop_center(img,cropx,cropy):
    y,x,c = img.shape
    startx = x//2-(cropx//2)
    starty = y//2-(cropy//2)    
    return img[starty:starty+cropy,startx:startx+cropx]

plt.figure()
# Original image
imgCenter = crop_center(img,224,224)
plt.subplot(1,3,1)
plt.imshow(imgCenter)
plt.axis('on')
plt.title('Original')

# Now let's see what this does on the distorted image
img256Center = crop_center(img256,224,224)
plt.subplot(1,3,2)
plt.imshow(img256Center)
plt.axis('on')
plt.title('Squeezed')

# Scaled image
imgScaledCenter = crop_center(imgScaled,224,224)
plt.subplot(1,3,3)
plt.imshow(imgScaledCenter)
plt.axis('on')
plt.title('Scaled')
plt.show()
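To see which pixels each of the three crops actually keeps, the crop window can be computed directly. The function `center_crop_bounds` is a hypothetical helper written for this sketch. It uses the same integer arithmetic as `crop_center` and is applied to the three image shapes from this post.

```python
def center_crop_bounds(height, width, crop_h=224, crop_w=224):
    """Return (top, bottom, left, right) of a centered crop window."""
    top = height // 2 - crop_h // 2
    left = width // 2 - crop_w // 2
    return top, top + crop_h, left, left + crop_w

print(center_crop_bounds(607, 910))  # original: (191, 415, 343, 567)
print(center_crop_bounds(256, 256))  # squeezed: (16, 240, 16, 240)
print(center_crop_bounds(224, 335))  # scaled:   (0, 224, 55, 279)
```

The crop on the original keeps only a small central window of the 607×910 image, while the crop on the scaled image keeps the full height and trims only the sides. The sketch does not guard against images smaller than the crop size, where the start index would become negative.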


6. Image Upscale

Convert a small square image to a larger size.

imgSmall128 = skimage.transform.resize(img, (128, 128))
print("Assume original image shape: " + str(imgSmall128.shape))
imgBig224 = skimage.transform.resize(imgSmall128, (224, 224))
print("Upscaled image shape: " + str(imgBig224.shape))
# Plot original
plt.figure()
plt.subplot(1, 2, 1)
plt.imshow(imgSmall128)
plt.axis('on')
plt.title('128x128')
# Plot upscaled
plt.subplot(1, 2, 2)
plt.imshow(imgBig224)
plt.axis('on')
plt.title('224x224')
plt.show()

Assume original image shape: (128, 128, 3)
Upscaled image shape: (224, 224, 3)

The upscaled image is somewhat blurry.

imgSmallSlice = crop_center(imgSmall128, 128, 56)
# Plot original
plt.figure()
plt.subplot(1, 3, 1)
plt.imshow(imgSmall128)
plt.axis('on')
plt.title('Original')
# Plot slice
plt.subplot(1, 3, 2)
plt.imshow(imgSmallSlice)
plt.axis('on')
plt.title('128x56')
# Upscale?
print("Slice image shape: " + str(imgSmallSlice.shape))
imgBigg224 = skimage.transform.resize(imgSmallSlice, (224, 224))
print("Upscaled slice image shape: " + str(imgBigg224.shape))
# Plot upscaled
plt.subplot(1, 3, 3)
plt.imshow(imgBigg224)
plt.axis('on')
plt.title('224x224')
plt.show()

7. Image Preprocessing

imgCropped = crop_center(imgScaled,224,224)
print("Image shape before HWC --> CHW conversion: " + str(imgCropped.shape))
# HWC -> CHW 
imgCropped = imgCropped.swapaxes(1, 2).swapaxes(0, 1)
print("Image shape after HWC --> CHW conversion: " + str(imgCropped.shape))

plt.figure()
for i in range(3):
    plt.subplot(1, 3, i+1)
    plt.imshow(imgCropped[i])
    plt.axis('off')
    plt.title('RGB channel %d' % (i+1))

# RGB -> BGR
imgCropped = imgCropped[(2, 1, 0), :, :]
print("Image shape after BGR conversion: " + str(imgCropped.shape))

# Subtract the mean image
# skimage loads image in the [0, 1] range so we multiply the pixel values first to get them into [0, 255].
# mean_file = os.path.join(CAFFE_ROOT, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
# mean = np.load(mean_file).mean(1).mean(1)
# img = img * 255 - mean[:, np.newaxis, np.newaxis]

plt.figure()
for i in range(3):
    plt.subplot(1, 3, i+1)
    plt.imshow(imgCropped[i])
    plt.axis('off')
    plt.title('BGR channel %d' % (i+1))
# Batch 
# Feed in multiple images
# Make sure image is of type np.float32
imgCropped = imgCropped[np.newaxis, :, :, :].astype(np.float32)
print('Final input shape is: ' + str(imgCropped.shape))
plt.show()
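The steps above can be collected into one function. This is a minimal sketch, not the tutorial's own code: `to_caffe2_input` and its `mean_bgr` parameter are hypothetical names, the input is a random stand-in for the cropped image, and mean subtraction is optional because the source leaves it commented out.

```python
import numpy as np

def to_caffe2_input(img_hwc_rgb, mean_bgr=None):
    """Convert an HWC, RGB, [0, 1] float image into an NCHW, BGR float32 batch."""
    chw = img_hwc_rgb.transpose(2, 0, 1)  # HWC -> CHW
    bgr = chw[[2, 1, 0], :, :]            # RGB -> BGR
    if mean_bgr is not None:
        # Scale to [0, 255] and subtract one mean value per channel
        # (mean_bgr is an assumed per-channel mean in BGR order).
        mean = np.asarray(mean_bgr, dtype=np.float32)
        bgr = bgr * 255 - mean[:, np.newaxis, np.newaxis]
    # Add the batch axis and make sure the dtype is float32.
    return bgr[np.newaxis, :, :, :].astype(np.float32)

# Synthetic stand-in for the cropped 224x224 image.
fake = np.random.rand(224, 224, 3).astype(np.float32)
batch = to_caffe2_input(fake)
print(batch.shape, batch.dtype)  # (1, 3, 224, 224) float32
```

Keeping the steps in a fixed order (layout, channel order, mean, batch) makes it easier to apply the same preprocessing at inference time as was used in training.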
