+ Data Mining, Integration and Analysis
Karin Becker
+ Data Mining, Integration and Analysis
Research Areas
• Knowledge Discovery
• Web and Text Mining
• Data Science
• Recommendation Systems
• Scalability and Performance
• Reproducibility
Faculty
• Ana Lucia Cetertich Bazzan
• Joao Luiz Dihl Comba
• Karin Becker
• Leandro Krug Wives
• Lucas Mello Schnorr
• Mara Abel
• Renata De Matos Galante
• Viviane Pereira Moreira
+ Knowledge Discovery
What do we do?
+ Knowledge Discovery
• Data Collection
• Data Integration
• Data Preprocessing
• Data Mining
• Data Analysis
+
Karin Becker
+ Extract Knowledge from Social Media
Semantic enrichment framework for event-related tweet identification (Simone Romero)
No assumptions about event properties
Contextual knowledge from semantic web and external documents
Main improvement: recall
Simone Romero, Karin Becker. A framework for event classification in tweets based on hybrid semantic enrichment. Expert Systems with Applications 118: 522-538 (2019)
+ Extract Knowledge from Social Media
Identification of stance in tweets (Marcelo Dias)
No argumentation threads
Unsupervised and weakly supervised* frameworks (runner-up)
Target and stance expression depend on the domain
Marcelo Dias, Karin Becker. An Heuristics-based, Weakly-Supervised Approach for Classification of Stance in Tweets. Proc. of Web Intelligence, 2016.
+ Extract Knowledge from Social Media
Identification of stance in tweets
Unsupervised framework
Excellent performance on straightforward targets (Hillary, Clinton)
+ Extracting Knowledge from Social Media
Analyze the emotions people express on Twitter about terrorism events, using demographics (Jonathas Harb)
Automatic emotion classification (4 terrorism events)
Tested deep learning with different seeding strategies
Demographic analysis (Face++, Profile Location)
Jonathas Harb, Karin Becker. Emotion Analysis of Reaction to Terrorism on Twitter. Proc. of Workshop on Big Social Data and Urban Computing, 2018.
[Figure: network architecture. Word-level input, GloVe embeddings, three parallel branches Conv (5), Conv (4) and Conv (3), each followed by global pooling, then concatenation, dropout and output.]
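The architecture in the figure can be illustrated with a minimal forward pass in plain Python. This is only a sketch under stated assumptions: the numbers in Conv (5), Conv (4) and Conv (3) are read as filter widths, "Global Pool" is taken to be max pooling, each branch has a single filter, and the embeddings and weights are toy values. Dropout and the output layer are left out.

```python
from typing import List

Vector = List[float]


def conv1d_global_max(embeddings: List[Vector], kernel: List[Vector]) -> float:
    """Slide one filter over the token sequence and keep the maximum response.

    `kernel` holds one weight vector per covered token, so len(kernel) is the
    filter width (assumed to be the 5, 4 or 3 shown in the figure).
    """
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sequence shorter than filter width")
    responses = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the window and the filter, followed by ReLU.
        total = sum(
            w * x
            for k_vec, e_vec in zip(kernel, window)
            for w, x in zip(k_vec, e_vec)
        )
        responses.append(max(0.0, total))
    return max(responses)  # global max pooling (assumed pooling type)


def multi_width_features(embeddings: List[Vector], kernels: List[List[Vector]]) -> List[float]:
    """Concatenate the pooled output of each parallel convolution branch."""
    return [conv1d_global_max(embeddings, k) for k in kernels]


# Toy 2-dimensional "embeddings" for a 6-token tweet (hypothetical values).
tweet = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# One all-ones filter per width, purely for illustration.
kernels = [[[1.0, 1.0]] * width for width in (5, 4, 3)]
features = multi_width_features(tweet, kernels)
print(features)
```

In a real model each branch would hold many learned filters, and the concatenated vector would pass through dropout and a classification layer.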
Analysis
Q2: Do different terrorism events raise the same emotional reaction?
NO
Gender? Age? Location?
Our hypothesis: it depends on how people relate to the event
+ Extracting Knowledge from Social Media
Compare engagement of Twitter users in Pink October and Blue November campaigns (Roberto Walter)
5 different countries
Demographic analysis (Face++, Profile Location)
Tweet topic categorization
Roberto Walter, Karin Becker. Caracterização e Comparação das Campanhas do Outubro Rosa e Novembro Azul no Twitter. SBBD 2018: 133-144
+ Extracting Knowledge from Social Media
Topic discovery and drift analysis
+ Extracting Knowledge from Social Interaction
Relating conversational topics and toxic behavior effects in a MOBA game (Joaquim Mesquita)
MOBA Games (LoL)
Effects of toxic behavior on other players
Behavioral patterns based on on-line chats
Joaquim A. M. Neto, Karin Becker: Relating conversational topics and toxic behavior effects in a MOBA game. Entertainment Computing 26: 10-29 (2018)
+ Extracting Knowledge from Medical Data
Machine translation for biomedical texts, parallel corpus (Felipe Soares)
Hierarchical classifier for non-invasive colorectal cancer screening
Plasma fluorescence data
Cancer, No findings, Further investigation
Felipe Soares, Karin Becker, Michel J. Anzanello: A hierarchical classifier based on human blood plasma fluorescence for non-invasive colorectal cancer screening. Artificial Intelligence in Medicine 82: 1-10 (2017)
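The idea of a hierarchical classifier over the three outcomes named on the slide (Cancer, No findings, Further investigation) can be sketched as two chained binary decisions. The order of the splits, the sample format and the threshold rules below are all assumptions made for illustration. They are not taken from the paper, and the toy rules stand in for trained classifiers.

```python
from typing import Callable, Sequence

Sample = Sequence[float]  # hypothetical format, e.g. fluorescence intensities


def hierarchical_predict(
    sample: Sample,
    has_findings: Callable[[Sample], bool],
    is_cancer: Callable[[Sample], bool],
) -> str:
    """Route a sample through two binary decisions instead of one 3-way decision."""
    # Level 1 (assumed): separate samples with no findings from all others.
    if not has_findings(sample):
        return "No findings"
    # Level 2 (assumed): among the rest, separate cancer from cases that
    # need further investigation.
    return "Cancer" if is_cancer(sample) else "Further investigation"


def toy_has_findings(sample: Sample) -> bool:
    """Placeholder rule standing in for a trained level-1 classifier."""
    return max(sample) > 0.5


def toy_is_cancer(sample: Sample) -> bool:
    """Placeholder rule standing in for a trained level-2 classifier."""
    return sum(sample) / len(sample) > 0.7


labels = [
    hierarchical_predict(s, toy_has_findings, toy_is_cancer)
    for s in ([0.1, 0.2], [0.6, 0.4], [0.9, 0.8])
]
print(labels)
```

Splitting the problem this way lets each level use its own features and decision threshold, which is one common motivation for hierarchical designs.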
+ Extracting Knowledge from Medical Data
Relating mental states using social media (Vanessa Borba)
Characterization of mental states (verbal cues, emotions and sentiments, behavioral and social patterns)
Analysis of the temporal evolution of mental states (e.g. anxiety – depression – suicide)
Detecting Anomalies in Health Provision Records (Cristiano Sulzbach)
Lack of parameters of “normality”
Discovery of groups of data
Analysis of closeness
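When no reference for "normality" exists, one option is to judge each record against the group it belongs to. The sketch below assumes the groups have already been discovered (for example by clustering) and flags records that lie much farther from their group centroid than is typical for that group. The record format, the Euclidean distance and the `factor` threshold are illustrative assumptions, not the method used in the project.

```python
import math
from collections import defaultdict
from statistics import median


def flag_far_records(records, factor=3.0):
    """Return ids of records unusually far from their group's centroid.

    `records` is a list of (record_id, group, feature_vector) tuples.
    """
    by_group = defaultdict(list)
    for rec_id, group, vec in records:
        by_group[group].append((rec_id, vec))

    flagged = []
    for members in by_group.values():
        dims = len(members[0][1])
        centroid = [
            sum(vec[d] for _, vec in members) / len(members) for d in range(dims)
        ]
        dists = {rec_id: math.dist(vec, centroid) for rec_id, vec in members}
        typical = median(dists.values())
        # "Closeness" here: distance relative to the group's typical distance.
        flagged += [
            rec_id
            for rec_id, d in dists.items()
            if typical > 0 and d > factor * typical
        ]
    return sorted(flagged)


# Hypothetical records: two groups, one record far away from its group.
records = [
    ("r1", "A", (1.0, 1.0)),
    ("r2", "A", (1.0, 2.0)),
    ("r3", "A", (2.0, 1.0)),
    ("r4", "A", (2.0, 2.0)),
    ("r5", "A", (10.0, 10.0)),
    ("r6", "B", (5.0, 5.0)),
    ("r7", "B", (5.0, 6.0)),
    ("r8", "B", (6.0, 5.0)),
]
anomalies = flag_far_records(records)
print(anomalies)
```

Using the median distance as the reference keeps a single extreme record from inflating the threshold for its own group.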
+ A final word on Software Engineering
Strong background on software engineering
Industry experience
Agile Methods
Sentiment analysis on software artifacts
Satisfaction of IT users (Sentiment analysis on IT Tickets, Blaz, 2016)
Analysis of assertiveness of user stories and development productivity and quality metrics (Guilherme Dias, 2018)
Using gamification in SCRUM for self-improvement (Camilla Schmidt, on-going)
Renata Galante
galante@inf.ufrgs.br
Data Integration
Data Analysis
Raul Barth (master)
Passenger density and flow analysis and city zones and bus stops classification for public bus service management
Framework
• DMBSM – Data Mining Framework for Bus Service Management
• Input: GPS, bus stop and smart card data
• Extracting passengers’ density and flow information
• Bus stops segmentation based on travel purposes
• Finding the real bus service demand
• Enabling decision-making.
• Based on Lambda Architecture, using Big Data for parallel processing
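To make the density and flow extraction step concrete, here is a small sketch over smart card data only. The tap format, the stop and card identifiers, and both proxies (boardings per stop and hour for density, consecutive taps of the same card for flow) are assumptions for illustration. The actual DMBSM framework also uses GPS and bus stop data and runs on a Big Data stack.

```python
from collections import Counter
from datetime import datetime

# Hypothetical smart card taps: (card_id, stop_id, ISO timestamp).
taps = [
    ("c1", "S1", "2018-03-05T07:10:00"),
    ("c2", "S1", "2018-03-05T07:40:00"),
    ("c1", "S2", "2018-03-05T17:30:00"),
    ("c3", "S2", "2018-03-05T07:55:00"),
    ("c2", "S3", "2018-03-05T18:05:00"),
]


def boardings_per_stop_hour(taps):
    """Density proxy: number of boardings per (stop, hour of day)."""
    counts = Counter()
    for _, stop, ts in taps:
        counts[(stop, datetime.fromisoformat(ts).hour)] += 1
    return counts


def flows_between_stops(taps):
    """Flow proxy: consecutive taps of one card form an origin/destination pair."""
    last_stop = {}
    flows = Counter()
    # Sort by card, then by timestamp (ISO strings sort chronologically).
    for card, stop, _ in sorted(taps, key=lambda t: (t[0], t[2])):
        if card in last_stop and last_stop[card] != stop:
            flows[(last_stop[card], stop)] += 1
        last_stop[card] = stop
    return flows


density = boardings_per_stop_hour(taps)
flows = flows_between_stops(taps)
print(density.most_common(1))
print(dict(flows))
```

Counts such as these could then feed the bus stop segmentation and demand estimation steps listed above.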
Framework – Architecture and Results
Drunk Text Identification
Marcos Grzeça, Karin Becker, Renata Galante (UFRGS)
Detection of texts written by people under the influence of alcohol
Romero & Becker (2019)