Fig. 4 shows the system framework of the proposed visual servo control design, a hybrid switching architecture consisting of an image-based reactive planning unit, a position-based reactive planning unit, and an inverse kinematics controller. As mentioned in the previous subsection, the system requires two cameras to accomplish the object pick-and-place task, which consists of three stages. In the first stage, the overhead-mounted camera estimates the target center positions in world coordinates to assist the position-based reactive planning process. The camera first captures an image of the entire table, and a color-based image segmentation algorithm is applied to extract all colored objects from this global view. An SVM-based shape classifier then classifies the extracted objects into four shape types: spherical, cubic, spherical hole, and square hole. Based on these classification results, a coordinate conversion approach built on conventional camera calibration techniques converts the center positions of all spherical-hole and square-hole objects from image coordinates to world coordinates. The converted coordinates serve as the desired end-effector positions in world coordinates at which the robot places a grasped object. Moreover, the orientation of each square-hole object is also estimated via an image-based object orientation estimation algorithm, which will be presented in Section III.
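The first-stage pipeline described above (color segmentation, SVM shape classification, and image-to-world coordinate conversion) can be illustrated with a minimal sketch. The snippet below is only an illustrative assembly using OpenCV and scikit-learn: the HSV thresholds, the Hu-moment feature choice, the classifier file name, and the calibration homography are assumptions introduced for the example, not details taken from the paper.

```python
# Sketch of the overhead-camera stage: segment colored objects, classify their
# shapes with a pre-trained SVM, and map hole centers to world coordinates.
import cv2
import numpy as np
import joblib  # assumed: classifier stored with joblib

SHAPE_LABELS = {0: "spherical", 1: "cubic", 2: "spherical hole", 3: "square hole"}

def segment_colored_objects(bgr_img, lower_hsv, upper_hsv):
    """Color-based segmentation: threshold in HSV space and return object contours."""
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower_hsv, upper_hsv)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [c for c in contours if cv2.contourArea(c) > 100.0]  # drop small noise blobs

def classify_shape(contour, svm_classifier):
    """SVM-based shape classification; Hu-moment features are an assumed choice."""
    hu = cv2.HuMoments(cv2.moments(contour)).flatten()
    features = -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)  # log-scale for stability
    label = int(svm_classifier.predict(features.reshape(1, -1))[0])
    return SHAPE_LABELS[label]

def image_to_world(contour, H_img2world):
    """Convert the contour center from image coordinates to world (table-plane)
    coordinates using a homography obtained from camera calibration."""
    m = cv2.moments(contour)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    pt = np.array([[[cx, cy]]], dtype=np.float32)
    wx, wy = cv2.perspectiveTransform(pt, H_img2world)[0, 0]
    return float(wx), float(wy)

# Example usage (file names and thresholds are placeholders):
# svm = joblib.load("shape_svm.pkl")
# H = np.load("H_img2world.npy")
# img = cv2.imread("table_view.png")
# for c in segment_colored_objects(img, np.array([0, 80, 80]), np.array([179, 255, 255])):
#     shape = classify_shape(c, svm)
#     if shape in ("spherical hole", "square hole"):
#         print(shape, image_to_world(c, H))
```

A planar homography is used here for the image-to-world mapping because the holes lie on the table plane; a full pinhole-model back-projection with the calibrated intrinsics and extrinsics would serve equally well.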