Human-Computer Interface using Gestures based on Neural Network

Aarti Malik^{1}, Shalini Dhingra^{2}

1 _{Department of Electronics and Communication Engineering, I.E.T, Bhaddal, INDIA}

2 _{Assistant Professor, Department of Electronics and Communication Engineering, I.E.T, Bhaddal, INDIA}

Abstract- Gestures are powerful tools for non-verbal communication. Human-computer interface (HCI) is a growing field that reduces the complexity of interaction between human and machine, in which gestures are used to convey information or control the machine. In the present paper, static hand gestures are used for this purpose. The paper presents a novel technique for recognizing hand gestures, i.e. the alphabets A-Z, the numbers 0-9 and 6 additional control signals (for keyboard and mouse control), by extracting various features of the hand, creating a feature vector table and training a neural network. The proposed work has a recognition rate of 99%.

Keywords- human-computer interface (HCI), features, feature vector table, static hand gestures, neural network.

I. Introduction

The current interfaces between humans and machines are the keyboard and the mouse. To make the human-computer interface (HCI) more intuitive, gestures are used, which make communication possible from a distance and without physical contact with the computer. Gestures are expressive, meaningful body motions, i.e., physical movements of the fingers, hands, arms, head, face, or body with the intent to convey information or interact with the environment. Gesture recognition is the process by which gestures made by the user are made known to the system, and it is the new technology used for HCI. Gesture recognition has wide-ranging applications [1] such as the following:

• developing aids for the hearing impaired;

• enabling very young children to interact with computers;

• designing techniques for forensic identification;

• recognizing sign language;

• medically monitoring patients’ emotional states or stress levels;

• lie detection;

• navigating and/or manipulating in virtual environments;

• communicating in video conferencing;

• distance learning/tele-teaching assistance;

II. Proposed methodology

A. Proposed algorithm –

Figure 1. Methodology for gesture classification

Figure 2. System Block Diagram

B. Data acquisition –

The first step in image processing is acquiring an image. We use an external web camera to capture the various hand gestures, and the captured images form the dataset. The input RGB image can be of class uint8, uint16, single or double, and the output image is of the same class as the input image. If the input is a colormap, the input and output colormaps are both of class double. The acquired image is RGB and needs to be processed before its features are extracted and recognition is performed.

Sign classification

Figure 3. Actual Dataset

C. Pre-processing:

Pre-processing consists of two steps:

• Segmentation

• Morphological filtering

Segmentation converts the gray scale image into a binary image so that the image contains only two objects: the hand and the background. The Otsu algorithm [3] is used for this purpose. There are two main approaches to segmentation: 1. Pixel-based or local methods, which include edge detection and boundary detection. 2. Region-based or global approaches, which include region merging and splitting and thresholding (Awcock and Thomas 1995). Thresholding partitions the image histogram using a single threshold, T. Segmentation is then accomplished by scanning the image pixel by pixel and labeling each pixel as object or background depending on whether the gray level of that pixel is greater or less than the value of T (Gonzalez and Woods 2001).
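The thresholding step described above can be sketched as follows. This is a minimal pure-Python illustration of Otsu's method, not the implementation used in this work; the function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes 8-bit gray levels with the hand brighter than the background.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level T that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]            # pixels with gray level <= t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg   # pixels with gray level > t
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray_image, threshold):
    """Label each pixel 1 (object) if above the threshold, else 0 (background).

    Assumption: the hand is the brighter region of the image.
    """
    return [[1 if p > threshold else 0 for p in row] for row in gray_image]
```

For a small image such as `[[10, 12, 200], [11, 210, 205], [9, 10, 220]]`, the threshold falls between the dark and bright clusters, so the bright pixels are labeled as object.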

After converting the gray scale image into a binary image, we have to make sure that there is no noise in the image, so we use a morphological filtering technique. Morphological techniques consist of four operations: dilation, erosion, opening and closing.

MORPHOLOGICAL FILTERING: If the segmentation is not smooth, the background may contain some 1s, which is known as background noise, and the hand gesture may contain some 0s, which is known as gesture noise. These errors have to be removed to avoid problems in detecting the contour of a gesture. A morphological filtering [4] approach is applied, using a sequence of dilation and erosion to obtain a smooth, closed, and complete contour of a gesture.
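A sequence of dilation and erosion of this kind can be sketched as below. This is an illustrative pure-Python version under stated assumptions: a 3x3 square structuring element, and a closing followed by an opening (the exact sequence and structuring element used in this work are not given). All function names are hypothetical.

```python
def _window(img, r, c, pad):
    """Return the 3x3 neighbourhood of (r, c); out-of-image pixels take the value pad."""
    rows, cols = len(img), len(img[0])
    return [
        img[r + dr][c + dc] if 0 <= r + dr < rows and 0 <= c + dc < cols else pad
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    ]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    return [[1 if any(_window(img, r, c, 0)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def erode(img):
    """A pixel stays 1 only if every pixel in its 3x3 neighbourhood is 1."""
    return [[1 if all(_window(img, r, c, 1)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def closing(img):
    """Dilation then erosion: fills small holes (gesture noise) in the hand."""
    return erode(dilate(img))


def opening(img):
    """Erosion then dilation: removes small isolated 1s (background noise)."""
    return dilate(erode(img))


def clean(img):
    """Assumed order for this sketch: close first, then open."""
    return opening(closing(img))
```

On a binary image containing a 3x3 hand region with a one-pixel hole and a stray 1 in the background, `clean` returns the filled 3x3 region with the stray pixel removed.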

D. Feature extraction:

Various features are now computed from the pre-processed image. The features extracted are: orientation, number of fingers raised and Euclidean distance.

length of the bounding box is greater than the width of the bounding box and their ratio is greater than 1. If the hand is horizontal, the width of the bounding box is greater than its length and their ratio is less than 1.
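The bounding-box ratio test can be sketched as follows. This is a minimal illustration that assumes "length" means the vertical extent of the box and "width" the horizontal extent; a ratio of exactly 1 is not covered by the text and is treated as horizontal here. The function names are hypothetical.

```python
def bounding_box(binary):
    """Return (length, width) of the box enclosing all 1-pixels.

    Assumption: length is the vertical extent, width the horizontal extent.
    """
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for row in binary for c, v in enumerate(row) if v]
    length = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return length, width


def orientation(binary):
    """Classify the hand as vertical when length / width exceeds 1."""
    length, width = bounding_box(binary)
    # A ratio of exactly 1 is an undefined case in the text; treated as horizontal.
    return "vertical" if length / width > 1 else "horizontal"
```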

2. Number of fingers raised: A gesture is distinguished from the others by its orientation and the number of fingers raised. To get the total number of fingers raised in a hand gesture, we need to process only the finger region of the hand, which was obtained in the previous step by computing the centroid. To do this, we trace the entire boundary matrix of the hand. Vertical and horizontal hand images are processed differently. For a vertical hand image, we consider only the y coordinates of the boundary matrix. When the y coordinates of the boundary start increasing after a sharp decrease, we take this as the tip of a finger and mark it as a peak. Similarly, for a horizontal image, only the x coordinates of the boundary matrix are traced. When the x coordinates of the boundary start decreasing after a continuous increase, we mark this point as the tip of a finger in the horizontal hand and set it as a peak. In this way we find the tips of all raised and folded fingers in the image, but we still need to separate significant peaks from insignificant ones. For this we proceed to the next step and calculate the Euclidean distance.
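The peak search along the traced boundary can be sketched as below. The sketch assumes the boundary is an ordered list of (x, y) points in image coordinates (y grows downward), so that a fingertip in a vertical hand is a local minimum of y and a fingertip in a horizontal hand is a local maximum of x. Flat runs of equal coordinates are not handled, and the function name `find_peaks` is hypothetical.

```python
def find_peaks(boundary, vertical=True):
    """Return boundary points where the traced coordinate reverses direction.

    boundary: ordered list of (x, y) points along the hand contour.
    vertical: True  -> trace y, peak where y decreases then increases;
              False -> trace x, peak where x increases then decreases.
    """
    axis = 1 if vertical else 0
    peaks = []
    for prev, cur, nxt in zip(boundary, boundary[1:], boundary[2:]):
        if vertical:
            is_tip = prev[axis] > cur[axis] and nxt[axis] > cur[axis]
        else:
            is_tip = prev[axis] < cur[axis] and nxt[axis] < cur[axis]
        if is_tip:
            peaks.append(cur)
    return peaks
```

The returned list contains the tips of both raised and folded fingers; separating the two requires the distance step that follows.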

3. Euclidean distance: it is calculated by the following formula

$$d(a, b) = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}$$

Here ‘a’ represents each boundary point and ‘b’ represents the reference point, which is the centroid itself. Using this distance formula we can find the length of each raised or folded finger with the centroid as the reference point; this is done in order to extract the exact number of fingers raised in the image.
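The distance computation with the centroid as the reference point can be sketched as follows. The rule used here to separate significant from insignificant peaks (keep peaks whose distance is at least 75% of the largest distance) is an assumption made only for illustration; the text does not state the criterion. The names `centroid`, `euclidean` and `count_raised_fingers` and the `ratio` parameter are hypothetical.

```python
import math


def centroid(binary):
    """Return the (x, y) centroid of all 1-pixels in a binary image."""
    pts = [(c, r) for r, row in enumerate(binary) for c, v in enumerate(row) if v]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def euclidean(a, b):
    """Euclidean distance between points a = (x_a, y_a) and b = (x_b, y_b)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def count_raised_fingers(peaks, centre, ratio=0.75):
    """Count peaks that lie far enough from the centroid to be raised fingers.

    Assumption: a peak is significant if its distance is at least
    ratio * (largest peak distance); the 0.75 value is illustrative only.
    """
    dists = [euclidean(p, centre) for p in peaks]
    if not dists:
        return 0
    cutoff = ratio * max(dists)
    return sum(1 for d in dists if d >= cutoff)
```

For example, peaks at distances 10, 9 and 3 from the centroid give two raised fingers under this assumed cutoff, with the third peak treated as a folded finger.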

4. Classification: the neural network used for classification is shown below:

Figure 4: The neural network

The classification phase includes defining the network architecture, creating the network and training the network. A feed-forward back-propagation network with supervised learning is used.
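A feed-forward network trained by back-propagation can be sketched as below. This is a toy pure-Python illustration, not the network used in this work: the single hidden layer, the sigmoid activations, the squared-error loss, the layer sizes, the learning rate and the sample feature vectors are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class FeedForwardNet:
    """One hidden layer; each weight row ends with the neuron's bias weight."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
                  for ws in self.w_hidden]
        output = [sigmoid(sum(w * v for w, v in zip(ws, hidden + [1.0])))
                  for ws in self.w_out]
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        """One supervised update: forward pass, then propagate the error back."""
        hidden, output = self.forward(x)
        d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
        d_hid = [h * (1 - h) * sum(d * self.w_out[k][j] for k, d in enumerate(d_out))
                 for j, h in enumerate(hidden)]
        for k, d in enumerate(d_out):
            for j, v in enumerate(hidden + [1.0]):
                self.w_out[k][j] -= lr * d * v
        for j, d in enumerate(d_hid):
            for i, v in enumerate(x + [1.0]):
                self.w_hidden[j][i] -= lr * d * v

    def loss(self, samples):
        """Mean squared error per sample."""
        total = 0.0
        for x, target in samples:
            _, output = self.forward(x)
            total += sum((o - t) ** 2 for o, t in zip(output, target))
        return total / len(samples)


def train(net, samples, epochs=1000, lr=0.5):
    for _ in range(epochs):
        for x, target in samples:
            net.train_step(x, target, lr)
```

In this sketch each input is a small feature vector (for instance an orientation flag and a normalized finger count, both hypothetical encodings) and each target is a one-hot class label.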

III. EXPERIMENT AND RESULT

The network is trained on 10 samples of each sign. Samples of the same size, with similar distance, rotation and lighting and with a uniform background, are taken into consideration, while the others are discarded. 690 gestures are taken in total and 42 gestures are classified.

Table 1: Result chart of recognition of various gestures

Simulation result using GUI

Figure 5: Example of GUI with an image

Figure 6: Identification of sign “1” selected from the dataset

C. Neural network parameters:

ROC: The receiver operating characteristic (ROC) is a metric used to check the quality of classifiers. For each class of a classifier, threshold values across the interval [0, 1] are applied to the outputs. For each threshold, two values are calculated: the True Positive Ratio (the number of outputs for one targets that are greater than or equal to the threshold, divided by the number of one targets) and the False Positive Ratio (the number of outputs for zero targets that are greater than or equal to the threshold, divided by the number of zero targets). The colored lines in each axis represent the ROC curves. The ROC curve is a plot of the true positive rate (sensitivity) versus the false positive rate (1 - specificity) as the threshold is varied. A perfect test would show points in the upper-left corner, with 100% sensitivity and 100% specificity.
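The points of an ROC curve for one class can be computed as in the sketch below. It uses the standard definitions: the true positive rate is the fraction of one-target outputs at or above the threshold, and the false positive rate is the fraction of zero-target outputs at or above the threshold. The function name `roc_points` and the sample values are hypothetical.

```python
def roc_points(outputs, targets, thresholds):
    """Return a list of (false_positive_rate, true_positive_rate) pairs.

    outputs: classifier scores in [0, 1] for one class.
    targets: 1 if the sample belongs to the class, else 0.
    """
    positives = sum(1 for t in targets if t == 1)
    negatives = sum(1 for t in targets if t == 0)
    points = []
    for th in thresholds:
        tp = sum(1 for o, t in zip(outputs, targets) if t == 1 and o >= th)
        fp = sum(1 for o, t in zip(outputs, targets) if t == 0 and o >= th)
        points.append((fp / negatives, tp / positives))
    return points
```

With outputs `[0.9, 0.8, 0.3, 0.1]` and targets `[1, 1, 0, 0]`, the threshold 0.5 gives the point (0.0, 1.0), which is the upper-left corner that a perfect classifier reaches.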

IV. CONCLUSION

We have proposed a method that is user friendly and uniform for all users, since it uses features that are common to all. Visually impaired people can make use of hand gestures for writing text in electronic documents such as MS Office, Notepad etc. The strengths of this approach include its simplicity and ease of implementation; it does not require any significant amount of training or post-processing, and it provides a high recognition rate with minimum computation time. The approach has a recognition rate of 99%, which is high compared to other approaches used in this direction. We have made a system that converts sign to text.

V. FUTURE SCOPE / CHALLENGES

The work presented in this project recognizes only ASL static signs, numbers and some control signals. It can be extended to recognize the dynamic signs of ASL. The system currently deals with images with a uniform background, but it can be made background independent. The network can also be trained on other types of images.

REFERENCES

[1] C. L. Lisetti and D. J. Schiano, “Automatic classification of single facial images,” Pragmatics Cogn., vol. 8, pp. 185-235, 2000.

[2] Mokhtar M. Hasan and Pramoud K. Misra, “Brightness Factor Matching For Gesture Recognition System Using Scaled Normalization”, International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3(2), 2011.

[3] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Machine Intell., vol. 8, no. 6, pp. 679–698, Nov. 1986.

[4] Jennifer Schlenzig, Edward Hunter, and Ramesh Jain, “Recursive spatio-temporal analysis: Understanding Gestures”, Technical report, Visual Computing Laboratory, University of San Diego, California, 1995.

[5] Meenakshi Panwar and Pawan Singh Mehra, “Hand Gesture Recognition for Human Computer Interaction”, in Proceedings of IEEE International Conference on Image Information Processing (ICIIP 2011), Waknaghat, India, November 2011.

[6] Rajeshree Rokade, Dharmpal Doye, and Manesh Kokare, “Hand Gesture Recognition by Thinning Method”, in Proceedings of IEEE International Conference on Digital Image Processing (ICDIP), Nanded, India, pages 284-287, March 2009.