Smartphone – Generation NEXT (part 4)


Victor Erukhimov
Vladimir Elin, PhD
A. Bovyarin

 

From Dreams to Reality

In the previous articles, “Smartphone of the Future” parts 1, 2 and 3, we discussed an innovative way of operating a device without using the hands. It may all have looked like science fiction at the time, but now that we have obtained an RF utility model patent, “Electronic Personal Communicator,” we can slightly lift the veil of secrecy over this advanced technology.

In our opinion, a mobile device interface based on the user’s eye movements can simplify and speed up input and communication with the device [1], enable character entry via eye movements [2], control the mouse pointer [3], set focus in the GUI and launch applications [4], and select objects in virtual and augmented reality applications [5] - [7]. The demand for this technology is confirmed by the large number of research works [8] - [10] and by existing commercial developments and products [11] - [13].

The technology is especially vital for people with disabilities: the COGAIN 2008 conference, “Communication, Environment and Mobility Control by Gaze,” focused on exactly this aspect of the technology.

Proposed Solution

The project will focus on developing the technology for controlling a mobile device with the user’s eyes. The work includes selecting the hardware (cameras, options for their placement, and options for illumination), studying algorithms and techniques for tracking the direction of gaze, and market analysis.

The mobile device will be equipped with two cameras pointing in the same direction and forming a stereo pair. With these cameras, the mobile processor can track the direction of the user’s pupils and the time of their fixation on the virtual keyboard of the device. As a result, we could implement the touchscreen function without any physical touch of the screen. Two calibrated cameras significantly improve the visual estimation of gaze: they make it possible to track the position of the pupils and provide a more accurate estimate of the head position, which is necessary to calculate the direction of gaze.

Design of the Prototype

The system, which determines the exact location of the gaze point on the mobile device’s screen, will have a modular structure. The project will be implemented with the following modules:

  1. Detection and tracking of the face in the image
  2. Estimation of the 3D position of the head relative to the coordinate system of the mobile device
  3. Search for the eye regions within the face
  4. Estimation of the exact locations of the pupil centers in the image
  5. Calculation of the gaze point on the screen of the mobile device

Let’s consider these modules separately:

Tracking the user’s face in the image is considered a solved task; this function is already present in many mobile devices. For prototyping, we will use the face detection method described in [14] and implemented in the open-source library OpenCV [15].
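The core of the detection method of [14] is evaluating Haar-like features in constant time via an integral image. A minimal sketch of that idea (our own illustration, not the OpenCV implementation; the function names are ours):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border, so box sums need no edge checks."""
    ii = img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in four lookups, independent of window size."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half of a window."""
    half = w // 2
    return (box_sum(ii, y, x, y + h, x + half)
            - box_sum(ii, y, x + half, y + h, x + w))
```

A cascade of thousands of such features, each computed in constant time, is what makes the detector fast enough for mobile hardware.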

After the user’s face is tracked, the next step is to estimate its position with respect to the coordinate system of the device’s cameras. The key points on the face are the eye corners, the tip of the nose, the mouth corners, etc. Since the device has two calibrated cameras, given the correspondences between facial points in the left and right images we can estimate the distances to these points [16]. Knowing the positions of these points, we can estimate the three head rotation angles and the displacement vector with respect to the mobile device; here the method described in [17] can be applied.
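Estimating the distance to a facial point from two calibrated views reduces to triangulation. A minimal sketch of linear (DLT) triangulation, assuming the 3×4 projection matrices of both cameras are known from calibration (the numbers in the check below are illustrative, not measured device parameters):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two calibrated views.

    P1, P2 : 3x4 projection matrices of the left/right camera.
    x1, x2 : (u, v) pixel coordinates of the same facial key point.
    Returns the 3D point in the device coordinate frame.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

# Sketch check: a synthetic stereo pair with a 60 mm baseline (illustrative).
K = np.array([[800., 0, 320], [0, 800, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[60.], [0.], [0.]])])
X_true = np.array([10., 20., 300.])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
X_est = triangulate(P1, P2, x1, x2)
```

Repeating this for the eye corners, nose and mouth corners yields the 3D point set from which the head rotation and translation are then estimated.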

To locate the facial key points, the Active Shape Model (ASM) method [18] will be used. The classic ASM method, originally designed for a single camera and a frontal face position, will be modified to obtain better results with two calibrated cameras. ASM is based on a statistically learned model of the locations of facial points and their local descriptors. Publicly available databases of annotated faces will be used for training. To localize the eye regions more precisely, the final result will be verified with other eye detection methods described in [14], [19].

 

Solution Technique

For accurate gaze estimation, it is necessary to detect the positions of the pupils in the image precisely. According to the study [20], pupil detection algorithms work unstably on low-resolution cameras; it is also important that infrared illumination significantly improves the quality of pupil detection. We have conducted our own study of several popular modern pupil detection algorithms [9], [10] and found that acceptable detection quality is achieved when the pupil is at least 15 pixels in diameter in the image. By acceptable quality we mean detecting the pupil to within 0.015 cm; such precision is sufficient for controlling the elements of a mobile device interface. This pupil resolution corresponds approximately to a camera resolution of about 1 megapixel, provided the face occupies 50-70% of the image. We also analyzed how the contrast of the pupil against the iris affects detection accuracy. Experiments showed that infrared illumination is needed to raise the contrast to an acceptable level.
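The roughly 1-megapixel figure can be sanity-checked with back-of-the-envelope arithmetic. The physical sizes below (pupil diameter ≈ 4 mm, face width ≈ 150 mm, 4:3 sensor) are our illustrative assumptions, not measurements:

```python
# Assumed physical sizes (illustrative): pupil ~4 mm, face width ~150 mm.
PUPIL_MM, FACE_MM = 4.0, 150.0

def required_megapixels(min_pupil_px=15, face_fraction=0.6, aspect=4 / 3):
    """Camera resolution needed for the pupil to span min_pupil_px pixels."""
    px_per_mm = min_pupil_px / PUPIL_MM              # scale is set by the pupil
    image_width = FACE_MM * px_per_mm / face_fraction
    image_height = image_width / aspect
    return image_width * image_height / 1e6

# For a face filling 50-70% of the frame this lands at roughly 0.5-1 megapixel.
```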

One of the best pupil detection algorithms is the Starburst algorithm [9], originally developed for head-mounted infrared cameras. The input of the algorithm is a grayscale image with the eye in the center and the pupil appearing as a darker region; the output is the exact position of the pupil center in the image. The first step of the algorithm is detecting and removing the glints on the cornea and pupil, which are inevitably present in the image. Then hypotheses are generated for points belonging to the boundary of the pupil; it is assumed that some of the points may be detected incorrectly, for example on the border of the iris or on the upper eyelid. The third step is the search for the ellipse that best fits the detected pupil boundary points. This step uses the well-known Random Sample Consensus (RANSAC) method [21]: the algorithm iteratively selects a subset of the points, fits the optimal ellipse to it by least squares [22], and then evaluates the optimality criterion of the ellipse on all hypothesis points. In a preliminary study, we found that the quality of Starburst is strongly affected by glints and other interference in the pupil area.
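The RANSAC step can be illustrated with a small sketch. For brevity we fit a general conic by unconstrained least squares rather than a proper constrained ellipse, which is a simplification of the actual Starburst fitting step:

```python
import numpy as np

def fit_conic(pts):
    """Least-squares conic a*x^2 + b*xy + c*y^2 + d*x + e*y + f = 0 (SVD null vector)."""
    x, y = pts[:, 0], pts[:, 1]
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    v = np.linalg.svd(D)[2][-1]
    return v / np.linalg.norm(v)

def ransac_ellipse(pts, n_iter=200, eps=0.05, seed=0):
    """RANSAC in the spirit of Starburst's third step: sample minimal 5-point
    subsets, fit a conic, keep the fit with the most inliers (algebraic test)."""
    rng = np.random.default_rng(seed)
    x, y = pts[:, 0], pts[:, 1]
    best, best_inliers = None, np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        c = fit_conic(pts[rng.choice(len(pts), 5, replace=False)])
        resid = np.abs(c[0]*x*x + c[1]*x*y + c[2]*y*y + c[3]*x + c[4]*y + c[5])
        inliers = resid < eps
        if inliers.sum() > best_inliers.sum():
            best, best_inliers = c, inliers
    return best, best_inliers
```

Because each hypothesis is fitted on a minimal subset, boundary points wrongly detected on the iris edge or eyelid simply end up as outliers instead of biasing the fit.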

Another promising pupil detection method is presented in [10]. Its input and output are identical to those of the Starburst algorithm. A key feature of this algorithm is robustness to glints and to partial occlusion of the pupil by the eyelids. This robustness is vital for our scenario of eye-based control of a mobile device under infrared illumination.

Three Steps of the Algorithm:

  1. Approximate localization of the pupil area within the eye image fragment. This step reduces the search area and the amount of computation. The area is detected by convolving the image with specially designed Haar wavelets [14]; the pupil area corresponds to large convolution values.
  2. Within the detected pupil area, the pixels on and around the pupil are segmented with the k-means clustering algorithm [23]. Connected components are then found in the k-means segmentation; the center of the darkest connected component gives the initial approximation of the pupil center. The edges of the pupil image are found with the Canny detector [24].
  3. Using the initial estimate of the ellipse center and the detected pupil edges, an ellipse covering as many image edges as possible is fitted iteratively. As in the Starburst method, the ellipse is found using Random Sample Consensus (RANSAC).
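Step 2 above can be sketched in a few lines. For brevity we run a 1-D k-means (k = 2) on raw intensities only and take the centroid of the dark cluster as the initial pupil estimate, omitting the Canny edge detection and ellipse refinement of steps 2-3:

```python
import numpy as np

def pupil_center_kmeans(patch, n_iter=20):
    """Rough pupil localisation on an eye patch: k-means (k=2) on intensity
    separates dark pupil pixels from the rest; the centroid of the dark
    cluster is the initial estimate (cx, cy) of the pupil centre."""
    vals = patch.ravel().astype(float)
    c = np.array([vals.min(), vals.max()])       # init the two cluster centres
    for _ in range(n_iter):
        labels = np.abs(vals[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = vals[labels == k].mean()
    dark = labels.reshape(patch.shape) == c.argmin()
    ys, xs = np.nonzero(dark)
    return xs.mean(), ys.mean()
```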

We should also note one more pupil search method, implemented in the open-source project ITU Gaze Tracker (http://www.gazegroup.org/downloads/23-gazetracker). The program is designed for tracking the pupil in images obtained with a head-mounted infrared camera.

Although the project is quite popular and widely referenced in articles, the underlying algorithm is simple and consists of the following steps:

  1. The grayscale image from the infrared camera is binarized with a certain threshold, so that areas darker than the threshold are considered pupil candidates;
  2. Connected components are extracted from the binarized image;
  3. Some of the components are filtered out according to the following rules:
       • components on the image boundary are deleted;
       • components with an area above or below predetermined thresholds are deleted;
       • if more than one component remains after filtering, the one closest to the image center is selected;
  4. The center of mass of the remaining component is taken as the center of the pupil.
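The pipeline above is simple enough to sketch end to end (4-connectivity component labeling; the threshold and area limits are arbitrary illustrative values, not those of ITU Gaze Tracker):

```python
import numpy as np
from collections import deque

def pupil_from_binary(img, threshold=60, min_area=20, max_area=500):
    """Binarise, label connected components (BFS, 4-connectivity), filter by
    border contact and area, pick the component nearest the image centre,
    and return its centre of mass as (y, x)."""
    h, w = img.shape
    mask = img < threshold                  # dark pixels are pupil candidates
    seen = np.zeros_like(mask, dtype=bool)
    candidates = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy, sx] or seen[sy, sx]:
                continue
            comp, q = [], deque([(sy, sx)])
            seen[sy, sx] = True
            while q:                        # flood-fill one component
                y, x = q.popleft()
                comp.append((y, x))
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        q.append((ny, nx))
            ys, xs = zip(*comp)
            on_border = min(ys) == 0 or min(xs) == 0 or max(ys) == h - 1 or max(xs) == w - 1
            if not on_border and min_area <= len(comp) <= max_area:
                candidates.append((float(np.mean(ys)), float(np.mean(xs))))
    if not candidates:
        return None
    return min(candidates, key=lambda c: np.hypot(c[0] - h / 2, c[1] - w / 2))
```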

In [20] it is stated that no pupil estimation method works equally well under all shooting conditions. We compared the Starburst and Swirski methods: the Swirski method showed good results on low-contrast images, but it did not outperform Starburst in all tests. Therefore, our project will also include further research toward a hybrid method that combines the strengths of the methods described above and is more stable across shooting conditions. Since the accuracy of the pupil position should be maximal, we plan to create a method that works with sub-pixel precision, making it possible to find pupils at a greater distance from the device’s camera. Using two calibrated cameras will significantly reduce the number of false detections and provide a more accurate estimate of the pupil center. In addition, the pupil tracking system will be designed on the basis of stochastic optimization methods [25], which take the dynamic component of the system into account.

Active infrared illumination on a mobile device guarantees glints on the pupil. By detecting these glints and the movement of the pupil relative to them, together with the data on the position of the head relative to the device, we can estimate the exact direction of gaze.
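One common simple model for this last mapping, used here purely as an illustration and not as the patented method, fits an affine map from the pupil-glint vector to screen coordinates using a few calibration fixations (the user looks at known screen points); real glint-based trackers often use per-user polynomial maps instead:

```python
import numpy as np

def fit_gaze_map(vectors, screen_pts):
    """Least-squares affine map [vx, vy, 1] -> (sx, sy) from calibration
    samples: pupil-glint vectors paired with known screen points."""
    A = np.column_stack([vectors, np.ones(len(vectors))])
    M, *_ = np.linalg.lstsq(A, screen_pts, rcond=None)
    return M                                 # 3x2 matrix

def gaze_point(M, vector):
    """Map one pupil-glint vector to a screen coordinate."""
    return np.append(vector, 1.0) @ M
```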

Conclusion

As part of the present research project, based on RF patent No. 118140, the team can create a prototype gaze-tracking system implementing the eye-contact control function and comprising two high-resolution infrared cameras, infrared illumination, a computing unit, and the appropriate software.




 

