pytorch-faster-rcnn

Image Pre-Processing

The following pre-processing steps are applied to an image before it is sent through the network. These steps must be identical for both training and inference. The mean vector ($3 \times 1$, one number for each color channel) is not the mean of the pixel values in the current image but a configuration value that is identical across all training and test images.

The default values for the targetSize and maxSize parameters are 600 and 1000 respectively.
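As a rough illustration of how these parameters might be used, the sketch below computes a rescaling factor and subtracts a fixed mean. It assumes the common Faster R-CNN convention that the shorter image side is scaled to targetSize unless that would push the longer side past maxSize; this rule, and the function names compute_scale and subtract_mean, are assumptions made for illustration rather than code taken from the repository.

```python
def compute_scale(height, width, target_size, max_size):
    """Return the factor by which an image of size height x width is resized.

    Assumed rule: bring the shorter side to target_size, but shrink the
    factor if the longer side would otherwise exceed max_size.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = target_size / short_side
    if scale * long_side > max_size:
        scale = max_size / long_side
    return scale


def subtract_mean(pixel, mean):
    """Subtract the fixed per-channel mean (a config value) from one pixel."""
    return [value - m for value, m in zip(pixel, mean)]
```

Because the mean and both size parameters come from configuration rather than from the image, calling these helpers with the same arguments at training and inference time keeps the two pipelines identical.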

Network Organization

An R-CNN uses neural networks to solve two main problems:

Identify promising regions (Regions of Interest, or ROIs) in an input image that are likely to contain foreground objects.
Compute the object class probability distribution of each ROI, i.e., the probability that the ROI contains an object of each class. The user can then select the object class with the highest probability as the classification result.
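The second step can be sketched as follows. This is a minimal example that assumes the network emits one raw score per class for an ROI and that a softmax turns those scores into probabilities; the softmax choice and the names softmax and predict_class are illustrative assumptions, not identifiers from the repository.

```python
import math


def softmax(scores):
    """Convert raw per-class scores into a probability distribution."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores, class_names):
    """Return the most probable class name and its probability for one ROI."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]
```

For example, predict_class([0.1, 2.0, 0.3], ["background", "cat", "dog"]) selects "cat", since it has the largest score and therefore the largest probability.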

R-CNNs consist of three main types of networks:

Head
Region Proposal Network (RPN)
Classification Network
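
One way to picture how these three networks fit together is the sketch below. It assumes the usual Faster R-CNN data flow, in which the head produces a feature map, the RPN proposes ROIs from that feature map, and the classification network scores each ROI. The function detect and its three callable arguments are hypothetical stand-ins, not the repository's classes.

```python
def detect(image, head, rpn, classifier):
    """Run the three stages in sequence and pair each ROI with its class scores.

    head, rpn, and classifier are placeholder callables standing in for the
    real networks.
    """
    feature_map = head(image)   # shared convolutional features
    rois = rpn(feature_map)     # candidate regions likely to contain objects
    return [(roi, classifier(feature_map, roi)) for roi in rois]
```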