Semi-supervised learning

Activity recognition from smartphone sensing data

occurs, two files are created to which unlabeled data is written: one file containing only Dynamic unlabeled data and the other only Static unlabeled data. Because of these two files, a flag was created to indicate whether both files contain data. If so (which only happens under stress conditions in the tests), the second part of the hierarchical approach handles each in turn. It first creates a model for the Dynamic (Running, Walking) or Static (Standing Idle, Sitting) activities, then loads the file created by the first classification: the Dynamic unlabeled file when doing the Dynamic classification, or the Static unlabeled file when doing the Static classification. In the end, the instance is labeled by function 3. If semi-supervised learning is used with this hierarchical approach, the 70% can be found in both classifications (the 1st classification and the 2nd classification, for Dynamic and Static).
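The two-stage routing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nearest-centroid classifier, the two toy features, and all numeric values are assumptions made for the example, and in-memory buckets stand in for the two unlabeled files.

```python
from statistics import mean

def nearest_centroid(train):
    """train: dict label -> list of feature tuples. Returns a predict function."""
    centroids = {
        label: tuple(mean(col) for col in zip(*rows))
        for label, rows in train.items()
    }

    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )

    return predict

# Hypothetical labeled examples: (mean acceleration, variance) per activity.
LABELED = {
    "Running": [(9.0, 4.0), (8.5, 3.5)],
    "Walking": [(5.0, 2.0), (5.5, 2.2)],
    "Standing Idle": [(1.0, 0.1), (1.2, 0.2)],
    "Sitting": [(0.2, 0.05), (0.3, 0.1)],
}
GROUPS = {"Dynamic": ["Running", "Walking"], "Static": ["Standing Idle", "Sitting"]}

# First classification: Dynamic vs Static, trained on the pooled examples.
stage1 = nearest_centroid(
    {g: [x for a in acts for x in LABELED[a]] for g, acts in GROUPS.items()}
)
# Second classification: one model per group.
stage2 = {
    g: nearest_centroid({a: LABELED[a] for a in acts}) for g, acts in GROUPS.items()
}

def classify(unlabeled):
    # The first pass splits instances into two buckets (the "two files").
    buckets = {"Dynamic": [], "Static": []}
    for x in unlabeled:
        buckets[stage1(x)].append(x)
    # The second pass labels each bucket with the model for its group.
    results = {}
    for group, items in buckets.items():
        for x in items:
            results[x] = (group, stage2[group](x))
    return results
```

Calling `classify([(8.8, 3.8), (0.25, 0.07)])` returns a mapping from each instance to its group and fine-grained activity.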

A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes cannot adequately address this problem because they lack a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Notably, the design of the proposed algorithm effectively reduces the computational complexity. In addition, a detailed analysis is carried out for the case in which new labeled samples appear during the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of the unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment.
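A confidence-based data selection step of this kind can be sketched as below. This is only an illustration under stated assumptions: a one-dimensional nearest-mean classifier is swapped in for the SVM, and the confidence formula and the `threshold` value are hypothetical, not taken from the paper.

```python
def centroid_model(labeled):
    """labeled: list of (x, y) with y in {0, 1}. Returns the mean of each class."""
    means = {}
    for c in (0, 1):
        xs = [x for x, y in labeled if y == c]
        means[c] = sum(xs) / len(xs)
    return means

def predict_with_confidence(means, x):
    d0, d1 = abs(x - means[0]), abs(x - means[1])
    label = 0 if d0 <= d1 else 1
    # Assumed confidence: 0 at the midpoint between the class means,
    # 1 exactly on a class mean.
    confidence = abs(d0 - d1) / (d0 + d1) if d0 + d1 else 1.0
    return label, confidence

def self_train(labeled, batches, threshold=0.6):
    labeled = list(labeled)
    for batch in batches:                # unlabeled data arrives incrementally
        means = centroid_model(labeled)  # retrain on the current pool
        for x in batch:
            label, conf = predict_with_confidence(means, x)
            if conf >= threshold:        # data selection: keep confident samples
                labeled.append((x, label))
    return centroid_model(labeled), labeled
```

Samples near the decision boundary are never added to the pool, which is one simple way to limit the number of wrongly semi-labeled samples.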

Res. Biomed. Eng. vol. 33 no. 4

Department of Research & Scientific Affairs. Skateboarding safety [internet]. Rosemont: AAOS; 2013. [cited 2001 Oct 31]. Available from: http://orthoinfo.aaos.org/topic.cfm?topic=a00273.
Dong-Hyun L. Pseudo-Label: the simple and efficient semi-supervised learning method for deep neural networks. In: Goodfellow I, Erhan D, Bengio Y, editors. Proceedings of The ICML 2013 workshop: challenges in representation learning - WREPL [internet]; 2013 June 21; Atlanta. 2013 [cited 2013 Aug 26]. Available from: http://deeplearning.net/wp-content/uploads/2013/03/pseudo_label_final.pdf.

A Two Step Data Mining Approach for Amharic Text Classification

Abstract: Traditionally, text classifiers are built from labeled training examples (supervised). Labeling is usually done manually by human experts (or the users), which is a labor-intensive and time-consuming process. In the past few years, researchers have investigated various forms of semi-supervised learning to reduce the burden of manual labeling. This paper aims to show that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large quantities of unlabeled documents are readily available. The paper implements an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and two classifiers: Naive Bayes (NB) and locally weighted learning (LWL). NB first trains a classifier using the available labeled documents and probabilistically labels the unlabeled documents, while LWL uses a class of function approximation to build a model around the current point of interest. An experiment conducted on a mixture of labeled and unlabeled Amharic text documents showed that the new method achieved a significant performance improvement over supervised LWL and NB. The results also show that the use of unlabeled data with EM reduces the absolute classification error by 27.6%. In general, since unlabeled documents are much less expensive and easier to collect than labeled documents, this method will be useful for text categorization tasks involving online data sources such as web pages, e-mails and newsgroup postings. With this method, building text categorization systems will be significantly faster and less expensive than with the supervised learning approach.
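The EM loop with Naive Bayes can be illustrated with the small sketch below. It is a simplified reading of the abstract, not the paper's implementation: it uses a multinomial Naive Bayes with Laplace smoothing, leaves out the LWL component entirely, and all function names and the toy documents are hypothetical.

```python
import math

def train_nb(docs, weights, classes, vocab):
    """docs: list of token lists; weights: list of dicts class -> P(class | doc)."""
    prior = {c: 1.0 for c in classes}                       # Laplace smoothing
    counts = {c: {w: 1.0 for w in vocab} for c in classes}  # Laplace smoothing
    for doc, wts in zip(docs, weights):
        for c in classes:
            prior[c] += wts[c]
            for tok in doc:
                counts[c][tok] += wts[c]
    total_prior = sum(prior.values())
    model = {}
    for c in classes:
        total = sum(counts[c].values())
        model[c] = (
            math.log(prior[c] / total_prior),
            {w: math.log(n / total) for w, n in counts[c].items()},
        )
    return model

def posterior(model, doc):
    """Class posterior for a tokenized document; unknown tokens are ignored."""
    logp = {c: lp + sum(lw[t] for t in doc if t in lw) for c, (lp, lw) in model.items()}
    top = max(logp.values())
    expd = {c: math.exp(v - top) for c, v in logp.items()}
    z = sum(expd.values())
    return {c: v / z for c, v in expd.items()}

def em_nb(labeled, unlabeled, classes, iterations=5):
    vocab = {t for d, _ in labeled for t in d} | {t for d in unlabeled for t in d}
    l_docs = [d for d, _ in labeled]
    l_wts = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    model = train_nb(l_docs, l_wts, classes, vocab)  # initial model: labeled data only
    for _ in range(iterations):
        # E-step: probabilistically label the unlabeled documents.
        u_wts = [posterior(model, d) for d in unlabeled]
        # M-step: retrain on labeled plus probabilistically labeled documents.
        model = train_nb(l_docs + unlabeled, l_wts + u_wts, classes, vocab)
    return model
```

In this toy setting, a word that appears only in unlabeled documents can still acquire a class association through the documents it co-occurs with.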

Towards Multi Label Text Classification through Label Propagation

We have proposed a novel label-propagation-based approach for multi-label classification. It works in a semi-supervised learning setting by considering smoothness assumptions on data points and labels. The approach is evaluated using small-scale datasets (Enron, Slashdot) as well as a large-scale dataset (Bibtex). It is also verified against a traditional supervised method. Our approach shows a significant improvement in accuracy by incorporating unlabeled data along with labeled training data. But a significant amount of
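As a rough illustration of label propagation under a smoothness assumption, the sketch below spreads per-label scores over a weighted neighbor graph while keeping the labeled nodes fixed. The graph, the `threshold`, and the function names are assumptions for the example; the paper's actual propagation rule may differ.

```python
def propagate(neighbors, seeds, labels, iterations=20):
    """neighbors: node -> list of (node, weight); seeds: node -> set of labels."""
    scores = {
        n: {l: (1.0 if l in seeds.get(n, ()) else 0.0) for l in labels}
        for n in neighbors
    }
    for _ in range(iterations):
        new = {}
        for n, nbrs in neighbors.items():
            if n in seeds:  # labeled nodes stay clamped to their known labels
                new[n] = scores[n]
                continue
            total = sum(w for _, w in nbrs)
            # Each label score becomes the weighted average over the neighbors.
            new[n] = {l: sum(w * scores[m][l] for m, w in nbrs) / total for l in labels}
        scores = new
    return scores

def predict(scores, node, threshold=0.5):
    """Multi-label output: every label whose score reaches the threshold."""
    return {l for l, s in scores[node].items() if s >= threshold}
```

Because each label is scored independently, an unlabeled node can end up with several labels, one label, or none.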

Specific Land Cover Class Mapping by Semi-Supervised Weighted Support Vector Machines

Abstract: In many remote sensing projects on land cover mapping, the interest is often in a subset of the classes present in the study area. Conventional multi-class classification may lead to a considerable training effort and to the underestimation of the classes of interest. On the other hand, one-class classifiers require much less training, but may overestimate the real extent of the class of interest. This paper illustrates the combined use of cost-sensitive and semi-supervised learning to overcome these difficulties. This method utilises a manually collected set of pixels of the class of interest and a random sample of pixels, keeping the training effort low. Each data point is then weighted according to its distance to its nearest positive data point to inform the learning algorithm. The proposed approach was compared with a conventional multi-class classifier, a one-class classifier, and a semi-supervised classifier in the discrimination of high-mangrove in the Saloum estuary, Senegal, from Landsat imagery. The derived classification accuracies were high: 93.90% for the multi-class supervised classifier, 90.75% for the semi-supervised classifier, 88.75% for the one-class classifier, and 93.75% for the proposed method. The results show that the accuracy achieved with the proposed method is statistically non-inferior to that achieved with standard binary classification, while requiring much less training effort.
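The distance-based weighting can be sketched as follows. The abstract does not give the weighting function, so the exponential form and the `scale` parameter here are assumptions chosen only to show the idea: randomly sampled pixels far from every known positive pixel count more strongly as negatives.

```python
import math

def nearest_positive_distance(point, positives):
    """Euclidean distance from a point to the closest positive sample."""
    return min(math.dist(point, p) for p in positives)

def sample_weights(unlabeled, positives, scale=1.0):
    """Assumed weighting: w = 1 - exp(-d / scale), where d is the distance
    to the nearest positive sample. w is 0 on a positive and approaches 1 far away."""
    return [
        1.0 - math.exp(-nearest_positive_distance(x, positives) / scale)
        for x in unlabeled
    ]
```

Such weights could then be passed as per-sample costs to any learner that accepts them.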

Machine learning approach for the outcome prediction of temporal lobe epilepsy surgery.

The data were analyzed using supervised classification techniques. This design treats the feature defining the problem differently (as either full recovery from epilepsy or not). This variable is usually termed the class variable. Patient outcome was evaluated after surgery using Engel’s scale [24]: Class I, seizure free (n = 14); Class II, rare disabling seizures (almost seizure free; n = 2); Class III, worthwhile improvement (n = 3). Classes II and III reflect improvement in the disease but not complete recovery. For this reason, both categories were grouped together. Thus, the supervised class variable describes seizure-free patients (n = 14) and those exhibiting an improvement only (n = 5). Finally, the missing values mentioned in Table 2 were imputed using the mode of the variable with missing values, conditioned on the class variable.
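Mode imputation conditioned on the class variable can be sketched as below. The record layout, field names, and values are hypothetical and only illustrate the procedure.

```python
from collections import Counter, defaultdict

def impute_mode_by_class(rows, feature, class_key):
    """Fill None in `feature` with the most common observed value
    among rows that share the same class."""
    observed = defaultdict(Counter)
    for row in rows:
        if row[feature] is not None:
            observed[row[class_key]][row[feature]] += 1
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input is left untouched
        if row[feature] is None:
            # Raises IndexError if the class has no observed value at all.
            row[feature] = observed[row[class_key]].most_common(1)[0][0]
        filled.append(row)
    return filled
```

Conditioning on the class means a missing value in a seizure-free patient is filled from other seizure-free patients only.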

Exploiting entities for query expansion

Selection of expansion terms: Regarding an improved selection of expansion terms, Cao et al. [2008] found that a non-negligible fraction of expansion terms identified by traditional pseudo-relevance feedback approaches is either neutral or harmful to the effectiveness of the initial query. As a result, they proposed a supervised classification approach using support vector machines (SVM) to predict the usefulness of expansion terms. In a similar vein, Udupa et al. [2009] found that the usefulness of a term may vary drastically depending on the already selected terms. Hence, they proposed to take into account term interactions in order to identify a useful set of expansion terms. Their approach was based on a spectral partitioning of the weighted term-document matrix using singular value decomposition (SVD). Both approaches showed significant improvements compared to state-of-the-art pseudo-relevance feedback approaches, such as relevance models [Lavrenko and Croft, 2001] and model-based feedback [Zhai and Lafferty, 2001b]. Focusing on difficult queries, Kotov and Zhai [2012] conducted a study on methods leveraging the ConceptNet knowledge base to improve the search results for these poorly performing queries. They proposed a supervised approach using generalized linear regression to use concepts from ConceptNet to expand difficult queries.

Teaching history with blogs for student engagement and critical use of digital media

Richardson described blogs as a “way to communicate with students…, archive and publish student work, learn with far-flung collaborators, and ‘manage’ the knowledge that members of the school community create” (2003, p. 5). Richardson (2006) went on to assert that blogs can be used by participants to construct knowledge, share ideas, and collaborate equally. In addition to knowledge construction, Ferdig and Trammel (2004) stated that links to different types of resources could be provided on a blog, providing students with exposure to the content and diverse perspectives as well as with opportunities to verify that content. They also asserted that the process of verifying content and making their own contributions helps build student expertise in a subject. The public nature of blogs also encourages students to accept more responsibility and ownership for their contributions and their learning (FERDIG & TRAMMEL,

Semi-supervised Method of Multiple Object Segmentation with a Region Labeling and Flood Fill

Efficient and effective multiple object segmentation is an important task in computer vision and object recognition. In this work, we address a method to effectively discover a user’s concept when multiple objects of interest are involved in content-based image retrieval. The proposed method incorporates a framework for multiple object retrieval using a semi-supervised method of similar region merging and flood fill, which models the spatial and appearance relations among image pixels. To improve the effectiveness of similarity-based region merging, we propose a new similarity-based object retrieval. The users only need to roughly indicate the desired objects, after which the object contour is obtained during the automatic merging of similar regions. A novel similarity-based region merging mechanism is proposed to guide the merging process with the help of the mean shift technique and object detection using region labeling and flood fill. A region R is merged with an adjacent region Q if Q has its highest similarity with R (using the Bhattacharyya descriptor) among all of Q’s adjacent regions. The proposed method automatically merges the regions that are initially segmented through the mean shift technique, and then effectively extracts the object contour by merging all similar regions. Extensive experiments performed on 12 object classes (224 images in total) show promising results.
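The merging rule can be sketched as follows, reading it as "merge R with an adjacent region Q when R is the most similar of all of Q's neighbors". That reading, the use of normalized color histograms as region descriptors, and the function names are assumptions for this example.

```python
import math

def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient of two normalized histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(h1, h2))

def should_merge(r, q, hist, adjacency):
    """True if no other neighbor of Q is more similar to Q than R is."""
    best = max(bhattacharyya(hist[q], hist[s]) for s in adjacency[q])
    return bhattacharyya(hist[q], hist[r]) == best
```

Here `hist` maps a region id to its histogram and `adjacency` maps a region id to the ids of its neighboring regions.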

Educ. rev. vol. 32 no. 2

This form of professionalism is apparent in the official curricular texts, as they indicate the knowledges to be part of teachers’ qualification in education programs. The official curricular discourse privileges a competence-centered pedagogy, defined as the “know how”, actions and “action forms”, knowledges and abilities mobilized “in situations”, and problem solving in the realm of teaching and learning. In addition to competences, the CNE text also specifies the knowledges to be part of the professional development of future teachers during their initial qualification. They are, in fact, the so-called teaching knowledges, which include professional or education science and pedagogical ideology knowledges, disciplinary and curricular knowledges, and experience knowledges. The CNE text lists these knowledges, which include knowledge resulting from experience, as follows: “General and professional culture”; “Knowledge about children, youths and adults”; “Knowledge of the cultural, social, political and economic aspects of Education”; “Contents of the knowledge area that are object of teaching”; “Pedagogical knowledge”; “Knowledge resulting from experience” (BRASIL..., 2001a, p. 44-49).

Systematically differentiating functions for alternatively spliced isoforms through integrating RNA-seq data.

Integrating large-scale functional genomic data has significantly accelerated our understanding of gene functions. However, no algorithm has been developed to differentiate functions for isoforms of the same gene using high-throughput genomic data. This is because standard supervised learning requires ‘ground-truth’ functional annotations, which are lacking at the isoform level. To address this challenge, we developed a generic framework that interrogates public RNA-seq data at the transcript level to differentiate functions for alternatively spliced isoforms. For a specific function, our algorithm identifies the ‘responsible’ isoform(s) of a gene and generates classifying models at the isoform level instead of at the gene level. Through cross-validation, we demonstrated that our algorithm is effective in assigning functions to genes, especially the ones with multiple isoforms, and robust to gene expression levels and removal of homologous gene pairs. We identified genes in the mouse whose isoforms are predicted to have disparate functionalities and experimentally validated the ‘responsible’ isoforms using data from mammary tissue. With protein structure modeling and experimental evidence, we further validated the predicted isoform functional differences for the genes Cdkn2a and Anxa6. Our generic framework is the first to predict and differentiate functions for alternatively spliced isoforms, instead of genes, using genomic data. It is extendable to any base machine learner and other species with alternatively spliced isoforms, and shifts the current gene-centered function prediction to isoform-level predictions.

Clinics vol. 65 no. 12

The Orbscan II™ (Bausch & Lomb) is a hybrid system that acquires data through slit-scanning and Placido ring technology. This instrument is able to map multiple ocular surfaces beyond the anterior corneal surface [11]. A well-known theorem in prediction theory states that, when more variables describing an event can be measured, the model can predict the outcome more precisely [12]. Thus, we hypothesized that a high accuracy in the classification of keratoconus subjects can be reached when Orbscan II data are used to develop supervised learning methods.

Gene expression profiling of solitary fibrous tumors.

Interestingly, some genes overexpressed in SFTs encode therapeutic targets of drugs commercialized or under development (Table S2): BCL2, CD33, EGFR, ERBB2, FGFR1, FNTA, FYN and YES1, RARA and RARG, and TLR3. They also include several additional kinase genes such as CHEK1, DDR1, EPHA1, EPHB1, JAK2, JAK3, LCK, PIK3C2G, PRKCD, PTK7, STK3, and STK4, and 5 genes coding for histone deacetylases (HDAC1, 3, 4, 5 and 11). Of course, before any clinical testing, functional experiments are warranted to determine whether the overexpression of these “druggable” genes in SFTs represents a “passenger” or a “driver” alteration. Histone deacetylases may be interesting candidates. HDACs are transcriptional corepressors, whose mutations and/or overexpression have been reported in several cancers, making them new important therapeutic targets. Clinical trials with HDAC inhibitors are ongoing in sarcomas (NCT01112384: SB039; NCT00918489: vorinostat; NCT00878800: belinostat). On the basis of our observation, analysis of patients with SFT is awaited. Our other supervised analyses compared expression profiles of SFT subgroups defined upon anatomical location and mitotic index. Meningeal SFTs have long been considered as different from pleural or extra-pleural SFTs. Our data suggest that they are not significantly different at the transcriptional level on a whole-genome scale. However, a robust 573-gene signature was identified by supervised analysis, suggesting differences between both locations. This result is consistent with that reported in a series of 23 samples [22]. Mitotic index is a prognostic feature of SFT, but displays some limitations at the technical (reproducibility) and prognostic levels. We identified a robust 31-gene signature discriminating SFTs with “low” versus “high” mitotic count.
The analysis revealed many genes related to cell cycle and mitosis, including the classical Ki67 cell cycle marker and some kinases involved in G2 and M phases of the cell cycle: Aurora-A, a major kinase regulating mitosis, BUB1, BUB1B and TTK/MPS1 with known key roles in the various cell division checkpoints, and MELK, a regulator of the S/G2 and G2/M transitions. Interestingly, some of these genes such as AURKA, BUB1, TTK, and RRM2 code for therapeutic targets of drugs under development (Table S5). Using IHC, we could validate the overexpression of AURKA in SFTs with high mitotic index in a 51-sample series. In conclusion, we report the largest gene expression profiling study of SFTs. The robustness of our GES was confirmed using independent validation sets at the RNA level and for 2 genes at the protein level. The comparison between SFTs and STSs evidenced several differentially expressed genes, some of which could provide new diagnostic markers (ALDH1), as well as potential prognostic (AURKA) and/or therapeutic targets.

Clinical Relationships Extraction Techniques from Patient Narratives

engineering techniques, supervised learning techniques). In the case of rule-based engineering, writing extraction rules requires extensive effort from a rule engineering expert who is familiar with the target domain. In the case of supervised learning, annotating training data and setting features and model parameters require extensive effort from at least one annotator (an expert in the target domain) and from a natural language processing expert. In addition to MUC, annotated corpora and evaluation software exist (e.g. the ACE relation extraction challenges [11], the LLL genic interaction extraction challenge [12], the BioCreative-II protein-protein interaction task [13]). Many systems, such as the Linguistic String Project [14], use a syntactic parse with domain-specific grammar rules to fill template data structures corresponding to medical statements. Other systems, such as MedLEE [15] and BioMedLEE [16], use a semantic lexicon and a grammar of domain-specific semantic patterns to extract the relationships between entities. Other systems, such as MEDSYNDIKATE [17], use a dependency parse of texts to build a model of entities and their relationships. MENELAS [18] also uses a full parse. All these approaches are knowledge-engineering approaches. In addition, supervised machine learning has been applied to clinical text. There are many works on relation extraction from biomedical journal papers and abstracts. This work has been done within the hand-written rule base/knowledge engineering approaches.

Using the Comfortability-in-Learning Scale to Enhance Positive Classroom Learning Environments

study with multiple programs may reveal differences in how programs view learning environments as a means to increase learning. It would also be interesting to investigate how instructors could implement changes to courses based on data from each administration of the CLS. Adding open-ended questions to the CLS for students and instructors may shed additional insight on learning. Students could be asked to describe how their comfortability with classmates, instructor, and course content changed throughout the semester, and instructors could be asked to describe how they used the data from the CLS to inform their teaching.

EDUCATIONAL INOVATION AND CONSUMER BEHAVIOUR. A STUDY OF STUDENTS PERCEPTIONS ON THE USE OF E-LEARNING IN CLASS.

The emergence of e-learning platforms as a result of the growing importance of lifelong learning, and their integration into the traditional educational environment, was a crucial moment in the evolution of educational practices. Focusing on computers, the Internet and intranets, e-learning brings to education additional interactivity, interaction, responsibility and collaborative learning. Initially considered innovative solutions complementary to the classical teaching techniques, e-learning technologies gradually penetrate the traditional classroom learning environment. Introducing innovation in the educational environment causes changes in the behavior of all actors confronted with it. Hence, knowing the perceptions of the main consumers of knowledge is a key element in the process of implementing innovation and assessing its effectiveness. This paper aims to develop major behavioral theories on e-learning environments, seeking to establish and explain the attitude of students, the main consumers of educational services, in terms of their perceptions about the introduction and use of these technologies in the classroom. The results of the presented study may be the starting point for developing a complex behavioral pattern specific to the educational market, by integrating behavioral aspects of all actors involved in providing education and confronting them with the main factors of influence.

Minimal Feature Set for Unsupervised Classification of Knee MR Images

Real knee MRI data were collected from MRI centers. Segmentation is implemented using Active Contour without edges, which makes it easy to separate out the regions and to access the part containing cartilage thickness. In the next phase, a total of 46 features were calculated, and in pre-processing the 5 features that describe the patient’s personal data were removed. A database file consisting of 704 images with 41 attributes was prepared and used for the classification process in the next phase. Classification is implemented and the performance on different parameters is compared using five algorithms: ‘ID3’, ‘J48’, ‘FID3new’, ‘Naive Bayes’ and ‘Kstar’. ‘FID3new’ is a hybrid algorithm proposed in this work. In unsupervised classification, the learning rate of the different algorithms is calculated by increasing the training share from 1% to 99%, and it was concluded that a minimum of 50% training is required in the case of unsupervised classification as well. At this training rate, the minimal feature set was calculated by starting with a minimum of 2 features and then adding two in each iteration up to 42 features (one more feature that defines the cluster assignment). In the case of unsupervised classification, the minimal feature set consists of 20 features, and ‘slice thickness’ is the feature with the highest priority. Classification is done using different algorithms. It was concluded that ‘FID3new’ correctly classifies all instances and gives a TP rate of 1 and a Root Mean Square Error of 0. It classifies the database on the basis of the feature ‘Slice thickness’ and divides it into four classes A, B, C and D, where A = 0.9 mm, B = 3, C = 4 and D = 6. The images in classes B and D are classified as ‘Normal’ images and the images in classes A and D are classified as ‘Abnormal’ images.

Prediction of disease-related interactions between microRNAs and environmental factors based on a semi-supervised classifier.

Predicting novel disease-related miRNA-EF interactions is becoming an increasingly important problem in bioinformatics, which not only benefits the understanding of disease pathogenesis at the miRNA and EF levels, but also plays a significant role in the prognosis, diagnosis, treatment and prevention of disease [34]. In this work, after analyzing the human disease-related miRNA-EF interaction data, we first observed that miRNA (EF) pairs interacting with more similar EFs (miRNAs) are often more similar. Based on this finding, we then developed miREFScan to predict novel disease-related interactions between miRNAs and EFs based on a semi-supervised classifier in the framework of LapRLS. The results show that miREFScan has reliable prediction accuracy. miREFScan is the first computational tool that can predict ternary relationships among miRNAs, EFs, and diseases at the same time. It is anticipated that miREFScan will be a useful resource for research on the relationships among miRNAs, EFs, and human diseases.

Learning and testing stochastic discrete event

In the area of learning stochastic discrete event systems, we have established an inclusion between this model and a more abstract model (generalized semi-Markov processes, GSMP). We have implemented the first algorithm to learn GSMPs. This learning algorithm allows us to construct models given data from realistic sensor networks or even samples from real environments. With this we can provide a set of features such as the possibility of statistically verifying these models using statistical model checking, testing deterministic models, or even stochastically creating a suite of tests. We have ensured that the model produced by our learning algorithm is, in the limit, equal or similar to the one that was used for learning. We demonstrate the proposition that merging two equivalent states is correct when the sample executions grow infinitely. We also show that the convergence of the Kolmogorov-Smirnov test is reachable (i.e., an exponential convergence). We have also proposed an algorithm to estimate the scheduler of events of a GSMP. This allows us to estimate the original clock values and the parameters of the probability distributions coupled to each event. A potential benefit of discrete event systems is that they tend to be highly amenable to parallelization in comparison to other common systems. We have exemplified one real case study that is amenable to analysis and simulation. We can use this model to simulate the availability of a high-speed train to the satellite and to test new land-satellite communication protocols. We have exemplified an analysis of scheduling algorithms for real-time systems when uncertainty governs the execution time of tasks.
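A two-sample Kolmogorov-Smirnov comparison of the kind that could decide whether two states have similar timing samples can be sketched as below. How the authors apply the test is not stated in this excerpt, so its role here, the function names, and the use of the standard asymptotic critical value are assumptions.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def same_distribution(a, b, alpha=0.05):
    """Accept similarity if the statistic is below the asymptotic critical value."""
    n, m = len(a), len(b)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) <= critical
```

With more sample executions the critical value shrinks, so the comparison becomes stricter as data accumulate.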
