Image Pre-Processing
The following pre-processing steps are applied to an image before it is sent through the network. These steps must be identical for training and inference. The mean vector (one number for each color channel) is not the mean of the pixel values in the current image; it is a configuration value that is identical across all training and test images.
The default values of the two pre-processing parameters are 600 and 1200, respectively.
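As a rough illustration, the pre-processing could be sketched as below. This is a minimal sketch under stated assumptions, not the actual implementation: the function names, the parameter names `target_size` and `max_size`, and the interpretation of 600 and 1200 as a shorter-side target and a longer-side cap are all assumptions made for the example. The mean values are placeholders. Plain nested lists are used to keep the sketch dependency-free; real code would normally use an array library.

```python
def subtract_mean(image, mean):
    """Subtract a fixed per-channel mean from an H x W x C image (nested lists).

    `mean` is a configuration value shared by all images, not a statistic
    computed from `image` itself.
    """
    return [
        [[value - m for value, m in zip(pixel, mean)] for pixel in row]
        for row in image
    ]


def rescale_factor(height, width, target_size=600, max_size=1200):
    """Return a scale factor for the image (assumed rescaling rule).

    The shorter side is brought to `target_size`, unless that would push
    the longer side beyond `max_size`; in that case the longer side is
    brought to `max_size` instead.
    """
    scale = target_size / min(height, width)
    if scale * max(height, width) > max_size:
        scale = max_size / max(height, width)
    return scale


# Placeholder per-channel mean, identical for every training and test image.
CHANNEL_MEAN = (100.0, 110.0, 120.0)

tiny_image = [[[110.0, 120.0, 130.0]]]  # a 1 x 1 image with 3 channels
centered = subtract_mean(tiny_image, CHANNEL_MEAN)
scale = rescale_factor(480, 640)
```

Because both functions depend only on configuration values, calling them the same way during training and inference keeps the two pipelines consistent.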
Network Organization
An R-CNN uses neural networks to solve two main problems:
- Identify promising regions (Region of Interest – ROI) in an input image that are likely to contain foreground objects
- Compute the object class probability distribution of each ROI – i.e., compute the probability that the ROI contains an object of a certain class. The user can then select the object class with the highest probability as the classification result.
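The second step can be illustrated with a small sketch. It assumes the classification network emits one raw score per class for an ROI and that these scores are turned into a probability distribution with a softmax; the function names, the class names, and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_roi(scores, class_names):
    """Return the most probable class for one ROI and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical raw scores for a single ROI.
label, prob = classify_roi([0.5, 2.0, 1.0], ["background", "car", "person"])
```

Here `label` is the class with the highest probability, which is what the user would take as the classification result for that ROI.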
R-CNNs consist of three main types of networks:
- Head
- Region Proposal Network (RPN)
- Classification Network