In this paper, the term cloud describes the use of a collection of services, applications, information, and infrastructure, comprising services such as computation, networking, and information storage. The major areas of focus are information protection, virtual desktop security, network security, and virtual security. The need to protect such a key component of the organization cannot be overemphasized. Data Loss Prevention (DLP) has been found to be one of the most effective ways of preventing data loss. DLP solutions detect and prevent unauthorized attempts to copy or send sensitive data, whether intentional or unintentional, by people who have authorized access to the sensitive information. Data loss means a loss of data that occurs on any device that stores data. In this paper, we treat the terms data loss and data leakage together in analyzing how DLP technology helps minimize the data loss/leakage problem in the organization. Data Loss/Leakage Prevention (DLP) is a computer security technology used to find, monitor, and protect data in use, data in motion, and data at rest. DLP identifies sensitive content by using deep content analysis to peer inside files or network communications. DLP is designed to protect information assets with minimal interference in business processes, and it also enforces protective controls to prevent unwanted incidents. DLP can be used to reduce risk, improve data management practices, and even lower compliance costs. These systems are designed to detect and prevent the unauthorized use and transmission of confidential information.
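As a concrete illustration, a DLP content rule can be as simple as a set of regular expressions run over data in motion. The sketch below is a minimal, hypothetical example; the rule names and patterns are ours, not taken from any particular DLP product:

```python
import re

# Hypothetical DLP content rules: regular expressions for sensitive data.
# Real DLP products combine such rules with deep content analysis.
RULES = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan(text):
    """Return the names of the rules that the text violates."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(text))

print(scan("Please charge card 4111 1111 1111 1111 for the order."))
```

In a deployed system such a scanner would sit at an egress point (mail gateway, endpoint agent) and trigger blocking or alerting rather than just reporting matches.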
User interface development requires a sound understanding of the anatomical structure of the human hand in order to determine what kinds of postures and gestures are comfortable to make. Although hand postures and gestures are often considered identical, the distinctions between them need to be clarified. A hand posture is a static hand pose without any movement; for example, making a fist and holding it in a certain position is a hand posture. A hand gesture, in contrast, is defined as a dynamic movement: a sequence of hand postures connected by continuous motions over a short time span, such as waving good-bye. Given this composite property of hand gestures, the problem of gesture recognition can be decoupled into two levels: low-level hand posture detection and high-level hand gesture recognition. In a vision-based hand gesture recognition system, the movement of the hand is recorded by one or more video cameras. This input video is decomposed into a set of features, taking individual frames into account. Some form of filtering may also be performed on the frames to remove unnecessary data and highlight necessary components. One of the earliest model-based approaches to the problem of bare-hand tracking was proposed by Rehg and Kanade. In later work, a model-based method for capturing articulated hand motion was presented, in which the constraints on the joint configurations are learned from natural hand motions, using a data glove as the input device. A sequential Monte Carlo tracking algorithm based on importance sampling produces good results, but it is view-dependent and does not handle global hand motion.
d.) Modeling: Based on the data and the desired outcomes, a data mining algorithm or a combination of algorithms is selected for analysis. These algorithms include classical techniques such as statistics, nearest neighbours and clustering, as well as next-generation techniques such as decision trees, neural networks and rule-based algorithms. The specific algorithm is selected based on the particular objective to be achieved and the quality of the data to be analysed.
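The modeling step can be sketched with one of the simplest such algorithms, a one-level decision tree (a decision stump); the feature values and labels below are made up for illustration:

```python
# Minimal sketch of the modeling step: fitting a decision stump
# (a one-level decision tree) to toy data. Feature values and labels
# are hypothetical.
def fit_stump(xs, ys):
    """Find the threshold on a single numeric feature that minimises errors."""
    best = None
    for t in sorted(set(xs)):
        preds = [1 if x >= t else 0 for x in xs]
        errors = sum(p != y for p, y in zip(preds, ys))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best[0]

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]   # e.g. a measured attribute
ys = [0, 0, 0, 1, 1, 1]                # class labels
t = fit_stump(xs, ys)
print(t)  # threshold separating the two classes
```

A full decision-tree learner repeats this threshold search recursively on each split, which is why data quality (noise, missing values) so strongly affects the chosen model.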
When a sensor node collects data, it sends the sensed data to the cluster head; the cluster head applies the concept of data fusion, combines the required data, and sends it to the base station. Because of the traffic between the cluster head and the base station, the power consumed by the cluster head becomes higher. To avoid this, we introduce the concept of load balancing. To balance the load in the network there are different strategies, such as centralized load balancing, local or global load balancing, static or dynamic load balancing, and sender-initiated or receiver-initiated load balancing. In centralized load balancing, as the name implies, the amount of load to be transferred to other sensor nodes is decided by a central node. In local load balancing, a node decides upon a load transfer using only the local information of its neighbouring processors, thereby minimizing remote communication, whereas global balancing needs some global information to initiate the load balancing. In static load balancing, the load is assigned to all the nodes in the network at the start. In dynamic load balancing, the load can be changed at run time depending on the number of participating sensor nodes. The last strategy is sender-initiated and receiver-initiated balancing: when heavy traffic arises during data transfer, either the overloaded node initiates the transfer of its excess load (sender-initiated) or a lightly loaded node requests additional load (receiver-initiated).
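A minimal sketch of the sender- and receiver-initiated idea, with hypothetical node loads and thresholds (the names and numbers are ours, not from the paper):

```python
# Illustrative sketch of sender- vs receiver-initiated load balancing
# between sensor nodes. Node loads and thresholds are hypothetical.
HIGH, LOW = 80, 20  # assumed overload / underload thresholds

def balance(loads):
    """Move excess load from overloaded to underloaded nodes."""
    loads = dict(loads)
    senders = [n for n, l in loads.items() if l > HIGH]      # sender-initiated
    receivers = [n for n, l in loads.items() if l < LOW]     # receiver-initiated
    for s, r in zip(senders, receivers):
        excess = loads[s] - HIGH
        loads[s] -= excess
        loads[r] += excess
    return loads

print(balance({"n1": 95, "n2": 10, "n3": 50}))
```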
The subject of structural health monitoring has drawn increasing attention in recent years. Many vibration-based techniques aiming at detecting small structural changes or even damage have been developed or enhanced in successive studies. Lately, several studies have focused on the use of raw dynamic data to extract information about structural condition. Despite this trend and much skepticism, many methods still rely on modal parameters as the fundamental data for damage detection. It is therefore of utmost importance that modal identification procedures are performed with a sufficient level of precision and automation. To fulfill these requirements, this paper presents a novel automated time-domain methodology to identify modal parameters based on a two-step clustering analysis. The first step consists in clustering mode estimates from parametric models of different orders, usually presented in stabilization diagrams. In an automated manner, this first clustering analysis indicates which estimates correspond to physical modes. To circumvent the detection of spurious modes or the loss of physical ones, a second clustering step is then performed, consisting in the data mining of the information gathered in the first step. To attest the robustness and efficiency of the proposed methodology, numerically generated signals as well as experimental data obtained from a simply supported beam tested in the laboratory and from a railway bridge are utilized. The results appear to be more robust and accurate compared with those obtained from methods based on a one-step clustering analysis.
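The first clustering step can be sketched roughly as follows: frequency estimates pooled from models of different orders are grouped by proximity, and groups that recur across enough model orders are kept as physical modes. This is only a simplified illustration of the idea, not the paper's algorithm; the frequencies are synthetic:

```python
# Rough sketch of the first clustering step: grouping natural-frequency
# estimates from models of increasing order; estimates that recur across
# many orders are taken as physical modes. Frequencies (Hz) are synthetic.
def cluster_modes(estimates, tol=0.05, min_count=4):
    clusters = []  # each cluster: list of frequencies
    for f in sorted(estimates):
        for c in clusters:
            mean = sum(c) / len(c)
            if abs(f - mean) <= tol * mean:
                c.append(f)
                break
        else:
            clusters.append([f])
    # keep clusters populated by enough model orders -> physical modes
    return [sum(c) / len(c) for c in clusters if len(c) >= min_count]

# Estimates pooled from several model orders: two physical modes near
# 3.1 Hz and 7.4 Hz plus scattered spurious poles.
est = [3.10, 3.11, 3.09, 3.12, 7.40, 7.39, 7.41, 7.42, 5.5, 9.9]
print(cluster_modes(est))
```

The second step in the paper then mines these cluster statistics (e.g. population and spread) to separate physical from spurious groups, which this sketch approximates with the single `min_count` cutoff.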
The objective of this research is to study the web design factors that influence the interest of Generation Y respondents. This study utilized a qualitative methodology, using the focus group discussion method to gather related data. Five design techniques were studied: screen design, response format, logo type, progress indicator, and image display. Two screen displays were designed for each technique for participants to select between. Samples were divided into four groups, each possessing different characteristics. This study found that participants showed a higher preference regarding progress indicator, logo type, response format and image display. However, there was no significant difference in participants' preference on screen design.
Crime prediction is an attempt to identify and reduce future crime. Crime prediction uses past data and, after analyzing those data, predicts future crime together with its location and time. Nowadays, serial criminal cases occur rapidly, so it is a challenging task to predict future crime accurately and with good performance. Data mining techniques are very useful for solving crime detection problems. The aim of this paper is therefore to study various computational techniques used to predict future crime. This paper provides a comparative analysis of data mining techniques for the detection and prediction of future crime.
In 1947, the Harvard Mark II was being tested by Grace Murray Hopper and her associates when the machine suddenly stopped. Upon inspection, the error was traced to a dead moth that was trapped in a relay and had shorted out some of the circuits. The insect was removed and taped to the machine's logbook. This incident is believed to have coined the use of the terms “bug”, “debug” and “debugging” in the field of computer science. Since then, the term debugging has been associated with the process of detecting, locating and fixing faulty statements in computer programs. In software development, a large amount of resources is spent in the debugging phase. It is estimated that testing and debugging activities can easily range from 50 to 75 percent of the total development cost. This is due to the fact that the process of detecting, locating and fixing faults in the source code is not trivial and is error-prone. Even experienced developers are wrong almost 90% of the time in their initial guess while trying to identify the cause of a behavior that deviates from the intended one.
This section surveys earlier work on the different algorithms and techniques that have been used in the literature to cope with test case prioritization difficulties. For system-level prioritization, the test case prioritization algorithm considers two factors. The first factor is the average number of faults a test case detects per minute. The second factor is error severity: testing efficiency can be increased by focusing on the test cases likely to expose the most severe errors, since it is possible to have a very large number of severe errors. Therefore, every fault is assigned a severity value based on the impact of the error on the product. Another work presents a bee colony optimization algorithm, which mimics the natural food-foraging activities of worker bees, to prioritize the test cases of a regression test suite based on maximum fault coverage. Hybrid Particle Swarm Optimization is an artificial-intelligence-based technique that combines Genetic Algorithms and Particle Swarm Optimization. The population plays the main role in choosing which path the solution will take; the population consists of particles. For a given problem, velocities and positions are assigned on the basis of the standard Particle Swarm Optimization technique, and the initial population is randomly selected. In the test case prioritization operation, a particle's velocity is taken as the total execution time it needs to cover the problem, and a particle's position as the number of faults it covers. A termination criterion is chosen, and on this basis the Hybrid Particle Swarm Optimization algorithm concludes. A genetic algorithm has also been used to prioritize the regression test suite; it prioritizes the test cases based on the code coverage presented in , and its fitness function is based on the conditions found in each separate path.
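The "faults per minute" factor described above can be sketched as a simple severity-weighted ordering; the suite data below are invented for illustration:

```python
# Sketch of prioritizing test cases by average faults detected per minute,
# weighted by fault severity. The suite data are made up.
tests = [
    # (name, execution time in minutes, summed severity of faults found)
    ("t1", 2.0, 4),
    ("t2", 1.0, 3),
    ("t3", 4.0, 10),
]

def prioritize(suite):
    """Order tests by severity-weighted faults per minute, highest first."""
    return sorted(suite, key=lambda t: t[2] / t[1], reverse=True)

print([name for name, *_ in prioritize(tests)])
```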
In this method, original pieces from the source document are selected and concatenated to yield a shorter text. This technique is easy to adapt to large sources of data. A Conditional Random Field (CRF) framework was proposed by Shen et al. In their framework, the summarization problem is viewed as a sequence labeling problem in which a document is a sequence of sentences, each labeled 1 or 0 depending on the label assignment to the other sentences. Daumé and Marcu presented BAYESUM, a Bayesian summarization model for query expansion. This model was found to work well in purely extractive settings.
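For contrast with the learned models above, a purely frequency-based extractive baseline can be sketched in a few lines; this is not the CRF or BAYESUM approach, just a toy illustration of selecting and concatenating original sentences:

```python
# Minimal extractive summarization sketch: sentences containing the most
# frequent words score highest and are kept in document order.
from collections import Counter

def summarize(sentences, k=1):
    words = Counter(w.lower().strip(".,") for s in sentences for w in s.split())
    scores = {i: sum(words[w.lower().strip(".,")] for w in s.split())
              for i, s in enumerate(sentences)}
    top = sorted(sorted(scores, key=scores.get, reverse=True)[:k])
    return " ".join(sentences[i] for i in top)

doc = ["The cat sat on the mat.",
       "Dogs bark loudly.",
       "The cat chased the mouse on the mat."]
print(summarize(doc, k=1))
```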
dependence and conditional heteroscedasticity pose separate sources of test size distortions. Accordingly, Sansó et al. (2004) robustified the original ICSS method to handle leptokurticity and lack of independence, including GARCH effects. Kokoszka and Leipus (2000) provided a CUSUM-type consistent estimator of a single break point within a possibly non-Gaussian ARCH(∞) framework. Cheng (2009) proposed a more efficient and numerically economical algorithm for change-point detection, encompassing among others the volatility break case. This technique is explained more closely in Section 2.2 below. Alternatively, a more recent paper of Ross (2013) presented a nonparametric approach, based on the classic Mood's rank test dating back to 1954. The old tool was applied to specific financial time series, and its iterative version for multiple break detection was implemented, yielding a nonparametric change-point algorithm (NPCPM). We focus on this procedure in Section 2.3. Xu (2013) also presented a nonparametric approach, providing powerful CUSUM- and LM-type tests for both abrupt and smooth volatility break detection. In addition, the author provided a rich and versatile discussion and overview of the topic with ample references. Needless to say, numerous applied papers on break detection have emerged in the meantime. To list just a few, we mention Andreou and Ghysels (2002), who studied the topic in an ARCH and stochastic volatility context; Covarrubias, Ewing, Hein and Thomson (2006), who examined volatility changes in US 10-year Treasuries and dealt with modelling issues; and Rapach and Strauss (2008), who considered structural breaks in currency exchange rate volatility within a GARCH setup. The paper of Eckley, Killick, Evans and Jonathan (2010) deserves separate attention, as it tackled volatility break detection in oceanography.
To this purpose, they analyzed storm wave heights across the Gulf of Mexico throughout the 20th century, using a penalized likelihood change-point algorithm, but only within a Gaussian framework (including normalizing data preprocessing).
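The CUSUM idea underlying several of the tests above can be sketched as follows: the statistic measures how far the cumulative sum of squared returns deviates from a constant-variance path, and its maximizer estimates the break location. This is a didactic simplification, not any of the cited estimators; the series is synthetic:

```python
# Simplified CUSUM-type volatility break detection on squared returns.
# Didactic sketch only; the series is synthetic.
import random

def cusum_break(returns):
    """Return the index maximizing the CUSUM statistic on squared returns."""
    sq = [r * r for r in returns]
    total = sum(sq)
    n = len(sq)
    best_k, best_stat = 0, 0.0
    cum = 0.0
    for k in range(1, n):
        cum += sq[k - 1]
        stat = abs(cum - k * total / n)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k

random.seed(0)
# low volatility for 200 observations, then high volatility for 200
series = [random.gauss(0, 1) for _ in range(200)] + \
         [random.gauss(0, 5) for _ in range(200)]
print(cusum_break(series))  # estimated break location, near 200
```

The published procedures add the proper normalization and critical values so that the maximal statistic can be compared against an asymptotic distribution; this sketch only locates the candidate break.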
In this survey, we have discussed various data embedding techniques, which conceal data in a carrier for conveying secret messages but have the drawback of image distortion. To overcome this problem, we propose a simple and efficient data embedding technique, the Pixel Pair Matching (PPM) algorithm. The basic idea of PPM is to use the values of a pixel pair as a reference coordinate and to search, in the neighborhood set of this pixel pair, a coordinate corresponding to a given message digit; the pixel pair is then replaced by the searched coordinate to conceal the digit. Exploiting Modification Direction (EMD) and Diamond Encoding (DE) are two recently proposed data-hiding methods based on PPM. PPM allows users to select digits in any notational system for data embedding and thus achieves better image quality.
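The EMD variant of this idea can be sketched for a pair of pixels carrying one base-5 digit; the extraction function f(x, y) = (x + 2y) mod 5 follows the standard EMD formulation for two pixels, and the pixel values are hypothetical:

```python
# Sketch of the EMD idea underlying PPM: a pair of pixels (x, y) carries one
# base-5 digit via the extraction function f(x, y) = (x + 2*y) mod 5; at most
# one pixel is changed by +-1 to embed a digit. Pixel values are hypothetical.
def extract(x, y):
    return (x + 2 * y) % 5

def embed(x, y, digit):
    """Modify at most one of the two pixels so that extract() yields digit."""
    s = (digit - extract(x, y)) % 5
    if s == 0:
        return x, y
    if s == 1:
        return x + 1, y
    if s == 2:
        return x, y + 1
    if s == 3:
        return x, y - 1
    return x - 1, y          # s == 4

x, y = embed(100, 57, 3)
print((x, y), extract(x, y))  # the pair now carries the digit 3
```

Because at most one pixel changes by at most 1, the distortion per embedded digit is minimal, which is the source of the image-quality advantage mentioned above.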
In this application, BCO has been used to solve a transportation problem, the ride-matching problem. The paper presents the BCO concept, discussing the behavior of social insects and the flexibility of their colonies. It then describes natural bees, focusing on their self-organizing behaviors and patterns, before discussing the BCO metaheuristic, in which the agents, artificial bees, are introduced. The bees collaborate with each other to find solutions to different combinatorial optimization problems. A five-step algorithm called Bee Colony Optimization is also framed, comprising initialization, forward pass and backward pass steps; the forward and backward passes are performed until the stopping criterion is met. The authors then discuss the fuzzy bee system, which brings forward the decision-making strategy of artificial bees while searching for the optimal solution, and give a fuzzy set describing the distance parameter. They also discuss the calculation of a solution component and the mechanism for choosing the next solution component, and introduce the concepts of a bee's partial-solution badness, a bee's nest-mate recruiting decision, and the calculation of the path changes made by bees. Finally, they address the transportation problems faced by many congested countries, which lead to increased travelling time, more waits, unexpected travelling delays, increased travelling costs, inconvenience to drivers, travelers and passengers, increased air and noise pollution, and more traffic accidents. The ride-matching problem is solved with the fuzzy bee system using the first forward pass, first backward pass and second forward pass steps of the BCO algorithm explained earlier in the paper. The developed solution was tested on data collected from 97 travelers who requested ridesharing, of whom 96 obtained a best path.
classification process. One of the most straightforward instance-based learning algorithms is the nearest neighbor algorithm. Aha (1997) and De Mantaras and Armengol (1998) presented a review of instance-based learning classifiers. Thus, in this study, apart from a brief description of the nearest neighbor algorithm, we will refer to some more recent works. k-Nearest Neighbor (kNN) is based on the principle that the instances within a dataset will generally exist in close proximity to other instances that have similar properties (Cover and Hart, 1967). If the instances are tagged with a classification label, then the label of an unclassified instance can be determined by observing the class of its nearest neighbors. kNN locates the k nearest instances to the query instance and determines its class by identifying the single most frequent class label among them. In Figure 8, a pseudo-code example for instance-based learning methods is illustrated.
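The kNN procedure just described can be sketched in a few lines of Python; the 2-D points and labels are toy data:

```python
# A compact kNN sketch: the query is labeled with the most frequent class
# among its k nearest neighbors. The 2-D points and labels are toy data.
from collections import Counter
import math

def knn(train, query, k=3):
    """train: list of ((x, y), label); returns predicted label for query."""
    dist = lambda item: math.dist(item[0], query)
    nearest = sorted(train, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
print(knn(train, (0.5, 0.5)))  # → A
```

Choosing k odd avoids ties for two-class problems; larger k smooths the decision boundary at the cost of blurring small clusters.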
Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. These tools can include statistical models, mathematical algorithms and machine learning methods. Consequently, data mining consists of more than collecting and managing data; it also includes analysis and prediction. Classification techniques are capable of processing a wider variety of data than regression and are growing in popularity. There are several applications for Machine Learning (ML), the most significant of which is data mining. People are often prone to making mistakes during analyses or, possibly, when trying to establish relationships between multiple features, which makes it difficult for them to find solutions to certain problems. Machine learning can often be successfully applied to these problems, improving the efficiency of systems and the designs of machines. Numerous ML applications involve tasks that can be set up as supervised learning. In the present paper, we have concentrated on the techniques necessary to do this. In particular, this work is concerned with classification problems in which the output of instances admits only discrete, unordered values. The next section presents various classification methods, Section III describes how to evaluate the performance of a classifier, and the last section concludes this work.
The first part of the research deals with applying the wavelet transform as the technique to pre-process the large amount of time series medical data. Time series data are series of values of some feature of an event that are recorded over a period of time. These clinical data are inconsistent and contain noise; therefore, the wavelet transform is applied. It plays an important role in data mining due to its easy accessibility and practical use, and it provides efficient and effective solutions to many data mining problems. A wide variety of wavelet-related methods has been applied to a broad range of data mining problems. The wavelet transform is a tool that divides up data, functions, or operators into different frequency components with a resolution matched to each component's scale. It provides an economical and informative mathematical representation of many objects of interest, and it is used here in order to observe its ability to provide representations of the data that make the mining process more efficient and accurate. After pre-processing, the clinical data go through a data mining algorithm in order to obtain the desired representation of the information. There are a number of data mining algorithms, and applying the appropriate one to the prepared data is very important in the data mining process. The selection of the type of algorithm to use depends on the given data mining problem.
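A minimal sketch of this pre-processing idea, assuming a single-level Haar transform (the simplest wavelet) and a hand-picked threshold; the signal is synthetic:

```python
# Sketch of wavelet-based pre-processing: a single-level Haar transform splits
# the series into coarse (low-frequency) and detail (high-frequency) parts;
# thresholding small detail coefficients removes noise. The signal is synthetic.
import math

def haar(signal):
    """One level of the Haar wavelet transform (len(signal) must be even)."""
    avg = [(a + b) / math.sqrt(2) for a, b in zip(signal[::2], signal[1::2])]
    det = [(a - b) / math.sqrt(2) for a, b in zip(signal[::2], signal[1::2])]
    return avg, det

def inverse_haar(avg, det):
    out = []
    for a, d in zip(avg, det):
        out += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
    return out

signal = [4.0, 4.1, 4.0, 3.9, 8.0, 8.1, 8.0, 7.9]
avg, det = haar(signal)
det = [d if abs(d) > 0.5 else 0.0 for d in det]     # suppress small details
print([round(v, 2) for v in inverse_haar(avg, det)])
```

Real applications apply several decomposition levels and data-driven thresholds, but the scale-by-scale separation shown here is the property the text refers to.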
Increasing the threshold voltage of every transistor is not always possible; therefore, separate paths are established for high-threshold and low-threshold devices. High-threshold voltage devices are used on non-critical paths to reduce leakage power, while low-threshold devices are used on critical paths to maintain circuit performance. This dual-threshold CMOS technique uses an algorithm that searches for gates where high-threshold voltage devices can be used. In the related dynamic threshold MOS (DTMOS) technique, the body and the gate of each transistor are tied together such that leakage is low when the device is off, while drive current is high when the device is on.
brain. Some authors suggest that each neuron would only fire to a specific concept or stimulus (grandmother cells); therefore, cell assemblies encoding different ‘‘things’’ would not be expected to share neurons. However, a mounting body of work shows that neurons can be very selective (sparse coding) but are not grandmother cells [68,69]. The apparent grandmother cells in the human medial temporal lobe may actually respond to between 50 and 150 distinct concepts. Neurons participating in the representation of multiple concepts imply that the processing of information is distributed and occurs through a multiplexed code, in which concepts are represented by the activity of partially overlapping groups of neurons, as postulated by Hebb. Despite the worldwide acceptance of the cell assembly theory, there is still a paucity of evidence corroborating (or disproving) it. Hebb's hypotheses not only deal with the formation of assemblies and phase sequences, but also constitute a complete theory describing how learning, fear, hunger, and other complex behaviors emerge from the brain. Most of the difficulty in testing the theory resides in the fact that only a tiny fraction of the neurons in the brain can be simultaneously recorded at any given time. However, techniques for massive neuronal recordings are being developed at accelerating rates, and while we still lack proper tools for analyzing large quantities of neurons [57,72], much progress is being made to circumvent this limitation. We hope the work presented here constitutes a useful step in this direction.
Random projection is suitable for mapping even a very high dimensional data set into a reduced dimension that still retains the most important features of the original data but is small enough to be numerically tractable. The selection of the transformation matrix is of critical importance. If we decide by ad hoc reasoning that only certain lines in the mass spectrum are important, we actually choose a very limiting, simple form of projection. On the other hand, if we randomly select different features from our data set multiple times, we effectively ensure that we do not lose anything important. This can be proven mathematically in a strict manner (Candès, 2006). The reduction can be understood relatively easily by realizing that each projected dimension provides a view of the full original data set seen from some particular direction. Therefore, a random projection into 10 dimensions provides 10 complete views of the entire data. For our task we select each element of R from a standard normal distribution N(0, 1). Using an optional normalization N(0, 1)/√N makes the columns of R approximately unit length, and the average of X is approximately zero, fulfilling the statistical requirements imposed by PCA.
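A minimal sketch of this construction, with the 1/√N normalization mentioned above (N being the original dimension, so that the columns of R are approximately unit length); the data matrix is synthetic:

```python
# Minimal random projection sketch: each element of R is drawn from N(0, 1)
# and scaled by 1/sqrt(d) so each column of R is approximately unit length.
# The data matrix is a synthetic stand-in for e.g. mass spectra.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 1000, 10          # 100 samples, 1000 features -> 10 dims
X = rng.normal(size=(n, d))
R = rng.normal(size=(d, k)) / np.sqrt(d)
Y = X @ R                        # each column: one "view" of the full data
print(Y.shape)                   # (100, 10)
```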
The validation model is based on matching two sets of qualitative data, i.e. the testing data and the simulation result data. The simulation process starts by running grid #30 for 5 runs, to produce the simulation result data Gs1, Gs2, …, Gs5. Output investigation and error calculation are based on comparing the values of the simulation result data with the actual data used as the testing data. The testing data consist of the grids captured after grid #30, i.e. grids #31 to #35. Since both are of ordinal data type, differences in value cannot be treated uniformly; the error should be measured proportionally based on the difference in the value of each cell. The largest difference value is 3, where the difference is maximal: a simulation data value of 3 against a test data value of 0, or vice versa. A value difference of 1 (e.g. between extreme and high) is clearly smaller than a difference of 2 (extreme to moderate) or 3 (extreme to low). Visualizing these values can improve the user's understanding of the error rate and of the spatial and temporal distribution of the error.
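The proportional error measure described above can be sketched as follows, assuming cell values on the ordinal 0-3 scale and a maximum per-cell difference of 3 (the grids are tiny made-up examples):

```python
# Sketch of the proportional error measure for ordinal cell values (0-3):
# the per-cell error is the absolute difference scaled by the maximum
# possible difference of 3. Grids here are tiny made-up examples.
def grid_error(simulated, actual, max_diff=3):
    cells = list(zip(simulated, actual))
    return sum(abs(s - a) for s, a in cells) / (max_diff * len(cells))

sim    = [3, 2, 1, 0, 2, 2]   # e.g. one simulated grid, flattened
actual = [3, 1, 1, 0, 0, 2]   # corresponding testing-data grid
print(round(grid_error(sim, actual), 3))
```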