LICAP

History

Founded in 1999, the Applied Computational Intelligence Laboratory – LICAP was created from research funding received from different foundations and institutions such as PUC Minas, CNPQ and FAPEMIG. Since then, it has become an indispensable laboratory for the development of new research projects within PUC Minas. Today LICAP has in its history the participation of more than a hundred members, among undergraduate and graduate students being scenery and support of new projects in the area of applied computational intelligence, having as consequence several published works.

Presentation

Nowadays there is a lot of data being available, and many more are being created all the time through different data sources such as the internet and its services, sensors (fixed and mobile), media-collection devices, telecommunications, consumption, financial, and meteorological, among many others. Any sector, whether industrial, agro-business, health, services, transportation, logistics, urban planning, energy consumption, climatology, meteorology, etc., are interested in how to exploit this huge volume of data. The knowledge obtained from this data will be higher by paying attention to all available information, structured data sources, and unstructured available around a problem domain. It is also known that manipulating, managing, and extracting relevant knowledge of this huge volume of data already exceeded the storage capacity and computer processing. This new scenario has provided the arise of the concept, Data Science involving the main foundations that support a process of knowledge discovery and generating useful information for decision-making from several sources in large volumes of data.

This new concept has a strong relationship with Data Mining, since the latter focuses on techniques and algorithms to identify patterns and useful knowledge extraction. The details and validity of such procedures and methods characterize and establish the concept of science data.

When a process of knowledge discovery is added a huge amount of data, with its various types, generating various data sources, and the need to manipulate them for online knowledge extraction or real-time, comes a new concept, Big Date. This is a term used to refer to the handling, storage, and analysis of massive data volumes generated and captured at high speed from sources containing Various types of data, structured and unstructured, but complementary; and when analyzed can generate credible information of great value to society. The 5Vs are the principles that underlie the concept of Big Data. Within this new paradigm, it is possible to identify their key challenges: theoretical foundations, infrastructure, capture, storage, manipulation, search, transfer, mining, analysis, visualization, privacy, and data security.

In order to extract knowledge and patterns from massive amounts of data, there are the first challenges: 1st) the hardware infrastructure and software become unconventional requiring new architectures and solutions such as Hadoop technology; 2nd) data mining, which allows the discovery of patterns is to be made with acceptable processing times; and 3rd) visualization of massive data with results to support decision-making.

We are interested in the extraction of knowledge from data. These data can be structured, semi-structured, and unstructured. We use theories and techniques drawn from fields such as computation, statistics, mathematics, and information theory proposing new approaches and/or algorithms for methodologies for knowledge discovery, conceptual modeling, pre-processing of data, machine learning, statistical learning, formal verification of algorithms, computer programming for the explosive problem and high-performance computing. Nowadays we are interested in adapting the conventional solutions of Data Mining for Big Data Problems. We are interested in several domains, such as Bioinformatics and Health, Education, Social Networks, Weather forecasting, Prediction of stock, the Siderurgical Industry, Energy, Marketing optimization, Fraud detection, Security, and Public policy.

Bibliographic production of LICAP

AREAS OF INTEREST

We are interested in the following topics

01
Methodology for Knowledge Discovery; Data mining for complex data; Data mining for sequence; Data mining for longitudinal data; Data mining for temporal/time series data; Data mining for extreme events; Pre-processing of data (outlier analysis, missing value, noisy data).
02
Causal Discovery; Interpretability and Explainability of models; Counterfactual analysis; Representativeness of the data set.
03
Supervised learning; Unsupervised learning; Semi-supervised learning; Transductive learning; Classification Multi-label;
04
Clustering, bi-clustering, tri-clustering; Categorical data processing; Similarity measures; Dimensionality reduction.
05
Feature selection: Bio-inspired mechanisms – Genetic algorithms.
06
Neural networks – Deep Learning; Knowledge extraction from trained neural networks; Support vector machine; Bayesian inference; Association rules; Quantitative methods for data analysis.
07
Statistic techniques: Factor analysis; Simple and multivariate linear regression; Non-linear Regression; Descriptive Statistic; Inferential Statistic; Hypothesis test; Analysis of variance.

08
Formal concept analysis for data mining; Diadic and Triadic analysis; Minimal set of implication rule; Handling of context with high dimensionality; Optimization of FCA algorithms: Parallelism of FCA algorithms; FCA to represent and analyze social networks.
09
Formal verification of systems and project of algorithms; Algorithm for combinatorial optimization; Optimization of algorithm for data mining.
10
Big Data: Big-data infrastructure; Distributed computing MPI, GPU, Map-reduce High-performance computing; Sampling, balancing; Scalable methods.
11
Applications: Finance; Health Informatics; Public health; Bioinformatics; Forecasting weather; Extreme events; Data mining for social good; Methodologies for Data Mining; Education data mining; Digital library; Social networks; Community detection; User modeling; Industry 4.0.