Fuzzy rough sets (FRSs) are considered a powerful model for analyzing uncertainty in data. This model encapsulates two types of uncertainty: 1) fuzziness arising from the vagueness in human concept formation and 2) roughness rooted in the granulation inherent in human cognition. Rough set theory has been widely applied to feature selection, attribute reduction, and classification. However, the classical FRS model is reported to be sensitive to noisy information. To address this problem, several robust models have been developed in recent years. Nevertheless, these models do not consider the statistical distribution of the data, which is an important type of uncertainty. Data distribution serves as crucial information for designing an optimal classification or regression model. We therefore propose a data-distribution-aware FRS model that incorporates distribution information into the computation of the lower and upper fuzzy approximations. The proposed model considers not only the similarity between samples but also the probability density of each class. To demonstrate the effectiveness of the proposed model, we design a new sample evaluation index for prototype-based classification and develop a prototype selection algorithm based on this index. Furthermore, a robust classification algorithm is constructed by combining prototype covering with nearest-neighbor classification. Experimental results confirm the robustness and effectiveness of the proposed model.
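To make the lower and upper fuzzy approximations concrete, the following minimal sketch computes the classical approximations with a Gaussian similarity and attaches a crude k-nearest-neighbor density estimate as a stand-in for the distribution information; the kernel, the crisp memberships, and the density term are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def fuzzy_approximations(X, y, x_query, target, sigma=1.0, k=5):
    """Classical FRS approximations of class `target` at x_query:
        lower = min_y max(1 - R(x, y), A(y)),  upper = max_y min(R(x, y), A(y)),
    with Gaussian similarity R and crisp membership A. The kNN density of the
    class at x_query stands in for the paper's distribution information."""
    sim = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2 * sigma ** 2))
    member = (y == target).astype(float)
    lower = np.min(np.maximum(1.0 - sim, member))   # certainty of membership
    upper = np.max(np.minimum(sim, member))         # possibility of membership
    d = np.sort(np.linalg.norm(X[y == target] - x_query, axis=1))
    density = k / (len(d) * d[min(k, len(d)) - 1] + 1e-12)  # crude kNN density
    return lower, upper, density
```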
Almost all existing representation-based classifiers represent a query sample as a linear combination of training samples, so their time and memory costs increase rapidly with the number of training samples. In this paper we investigate the representation-based classification problem from a rather different perspective: we learn how each feature (i.e., each element) of a sample can be represented by the features of the sample itself. Such a self-representation property of sample features can be readily employed for pattern classification, and a novel self-representation induced classifier (SRIC) is proposed. SRIC learns a self-representation matrix for each class. Given a query sample, its self-representation residual is computed with each of the learned self-representation matrices, and classification is performed by comparing these residuals. In light of the principle of SRIC, a discriminative SRIC (DSRIC) method is developed. For each class, a discriminative self-representation matrix is trained to minimize the self-representation residual of that class while representing the features of other classes as little as possible. Experimental results on different pattern recognition tasks show that DSRIC achieves recognition rates comparable or superior to state-of-the-art representation-based classifiers while being much more efficient and requiring much less storage.
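A minimal sketch of the SRIC idea under a ridge-regularized objective (the Frobenius regularizer is an assumption for illustration; DSRIC additionally penalizes representing the features of other classes): for each class c with feature matrix X_c (features as rows, samples as columns), min_P ||X_c - P X_c||_F^2 + lam ||P||_F^2 has the closed-form solution P_c = X_c X_c^T (X_c X_c^T + lam I)^{-1}, and a query is assigned to the class with the smallest residual.

```python
import numpy as np

def train_sric(class_data, lam=0.1):
    """Learn one self-representation matrix per class.
    class_data: dict label -> (d, n_c) matrix with samples as columns."""
    mats = {}
    for c, Xc in class_data.items():
        G = Xc @ Xc.T                                  # (d, d) feature Gram matrix
        mats[c] = G @ np.linalg.inv(G + lam * np.eye(G.shape[0]))
    return mats

def classify_sric(mats, query):
    """Assign the label whose residual ||q - P_c q|| is smallest."""
    return min(mats, key=lambda c: np.linalg.norm(query - mats[c] @ query))
```

Note that the matrices have size d x d (feature dimension), so classification cost does not grow with the number of training samples, which is the efficiency argument made in the abstract.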
Unsupervised feature selection (UFS) aims to reduce time complexity and storage burden while improving generalization performance. Most existing methods convert UFS into a supervised learning problem by generating labels with specific techniques (e.g., spectral analysis, matrix factorization, and linear predictors). Instead, we propose a novel coupled analysis-synthesis dictionary learning method that does not require generating labels. The representation coefficients are used to model the cluster structure and data distribution. Specifically, the synthesis dictionary is used to reconstruct samples, while the analysis dictionary analytically codes the samples and assigns probabilities to them. Afterwards, the analysis dictionary is used to select features that preserve the data distribution well. An L2,p-norm regularization is imposed on the analysis dictionary to obtain a sparser solution that is more effective for feature selection. We propose an iteratively reweighted least squares (IRLS) algorithm to solve the L2,p-norm optimization problem and prove that it converges to a fixed point. Experiments on benchmark datasets validate the effectiveness of the proposed method.
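The IRLS scheme can be sketched generically: with a smoothed L2,p row regularizer, each iteration solves a reweighted ridge problem in closed form. The plain regression objective below is a simplified stand-in for the paper's coupled dictionary objective, used only to show the reweighting mechanics.

```python
import numpy as np

def irls_l2p(X, Y, lam=1.0, p=0.5, n_iter=30, eps=1e-8):
    """IRLS sketch for  min_W ||X W - Y||_F^2 + lam * sum_i ||w_i||_2^p,
    where w_i is the i-th row of W (smoothed L2,p penalty, 0 < p <= 1).
    Rows of W driven toward zero mark features that can be discarded."""
    n, d = X.shape
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)  # ridge warm start
    for _ in range(n_iter):
        row_norms_sq = np.sum(W ** 2, axis=1) + eps
        # derivative weights of the smoothed penalty (||w_i||^2 + eps)^(p/2)
        D = np.diag((p / 2.0) * row_norms_sq ** ((p - 2.0) / 2.0))
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
    return W  # rank features by np.linalg.norm(W, axis=1)
```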
By removing irrelevant and redundant features, feature selection aims to find a compact representation of the original features with good generalization ability. With the prevalence of unlabeled data, unsupervised feature selection has been shown to be effective in alleviating the curse of dimensionality and is essential for the comprehensive analysis and understanding of myriads of unlabeled high-dimensional data. Motivated by the success of low-rank representation in subspace clustering, we propose a regularized self-representation (RSR) model for unsupervised feature selection, where each feature can be represented as a linear combination of its relevant features. By using the L2,1-norm to characterize both the representation coefficient matrix and the representation residual matrix, RSR effectively selects representative features and is robust to outliers: if a feature is important, it will participate in the representation of most other features, yielding a significant row of representation coefficients, and vice versa. Experimental analysis on synthetic and real-world data demonstrates that the proposed method can effectively identify representative features, outperforming many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy, redundancy reduction, and classification accuracy.
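The row-norm scoring rule can be illustrated compactly. For brevity this sketch replaces the paper's L2,1 terms with Frobenius surrogates, min_W ||X - X W||_F^2 + lam ||W||_F^2, which has a closed form; the L2,1 objective would be handled by an IRLS-type solver such as the one sketched above.

```python
import numpy as np

def rsr_feature_scores(X, lam=1.0):
    """Self-representation of features: represent each column (feature) of X
    by the other features and score features by the row norms of W.
    X: (n_samples, n_features)."""
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ X)  # (d, d) coefficients
    return np.linalg.norm(W, axis=1)  # large row norm => representative feature

# usage: keep the k highest-scoring features
# idx = np.argsort(-rsr_feature_scores(X))[:k]
```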
With the rapid development of digital imaging and communication technologies, image set-based face recognition (ISFR) is becoming increasingly important. One key issue in ISFR is how to effectively and efficiently represent the query face image set using the gallery face image sets. Set-to-set distance-based methods ignore the relationship between gallery sets, whereas representing the query set images individually over the gallery sets ignores the correlation between query set images. In this paper, we propose a novel image set-based collaborative representation and classification method for ISFR. By modeling the query set as a convex or regularized hull, we represent this hull collaboratively over all the gallery sets. With the solved representation coefficients, the distance between the query set and each gallery set can then be calculated for classification. The proposed model naturally and effectively extends image-based collaborative representation to an image set-based one, and our extensive experiments on benchmark ISFR databases show the superiority of the proposed method over state-of-the-art ISFR methods under different set sizes in terms of both recognition rate and efficiency.
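A simplified sketch of the hull-based collaborative representation (the specific regularizers, the sum-to-one hull constraint, and the alternating scheme are assumptions for illustration; the paper gives the exact models and solvers): we alternate between the hull coefficients a of the query set Q and the collaborative coefficients b over the concatenated gallery sets G.

```python
import numpy as np

def iscrc_distance(Q, G, lam1=1e-3, lam2=1e-3, n_iter=10):
    """Q: (d, m) query set, G: (d, n) concatenated gallery sets.
    Alternately solve for b (ridge) and a (equality-constrained ridge,
    sum(a) = 1 so the hull representation stays non-trivial)."""
    d, m = Q.shape
    a = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        b = np.linalg.solve(G.T @ G + lam2 * np.eye(G.shape[1]), G.T @ (Q @ a))
        # min_a ||Q a - G b||^2 + lam1 ||a||^2  subject to  1'a = 1
        M_inv = np.linalg.inv(Q.T @ Q + lam1 * np.eye(m))
        v = M_inv @ (Q.T @ (G @ b))
        w = M_inv @ np.ones(m)
        a = v - ((np.sum(v) - 1.0) / np.sum(w)) * w   # enforce the constraint
    # for classification, compare ||Q a - G_c b_c|| using each gallery
    # set's block of b and assign the query set to the nearest gallery set
    return np.linalg.norm(Q @ a - G @ b), a, b
```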
Most current metric learning methods are proposed for point-to-point distance (PPD)-based classification. In many computer vision tasks, however, we need to measure the point-to-set distance (PSD) and even the set-to-set distance (SSD) for classification. In this paper, we extend PPD-based Mahalanobis distance metric learning to PSD- and SSD-based settings, namely point-to-set distance metric learning (PSDML) and set-to-set distance metric learning (SSDML), and solve them under a unified optimization framework. First, we generate positive and negative sample pairs by computing the PSD and SSD between training samples. Then, we characterize each sample pair by its covariance matrix and propose a covariance kernel-based discriminative function. Finally, we tackle the PSDML and SSDML problems using standard support vector machine solvers, making metric learning very efficient for multiclass visual classification tasks. Experiments on gender classification, digit recognition, object categorization, and face recognition show that the proposed metric learning methods can effectively enhance the performance of PSD- and SSD-based classification.
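The pair-generation step relies on a point-to-set distance; a compact sketch of one common form, a regularized affine-hull distance measured under a Mahalanobis matrix M, is given below. This specific formula is an illustrative assumption; the paper learns M through its covariance-kernel SVM formulation.

```python
import numpy as np

def point_to_set_distance(x, S, M=None, lam=1e-3):
    """Squared regularized hull distance  min_b ||x - S b||_M^2 + lam ||b||^2,
    where ||e||_M^2 = e' M e.  x: (d,) point, S: (d, k) set of samples."""
    dim = x.shape[0]
    M = np.eye(dim) if M is None else M
    b = np.linalg.solve(S.T @ M @ S + lam * np.eye(S.shape[1]), S.T @ M @ x)
    e = x - S @ b                      # residual from the nearest hull point
    return float(e @ M @ e)
```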
Small sample size is one of the most challenging problems in face recognition, owing to the difficulty of sample collection in many real-world applications. By representing the query sample as a linear combination of training samples from all classes, the so-called collaborative representation based classification (CRC) achieves very effective face recognition at low computational cost. However, the recognition rate of CRC drops dramatically when the available training samples per subject are very limited. One intuitive solution to this problem is to operate CRC on patches and combine the recognition outputs of all patches; nonetheless, the setting of the patch size is a non-trivial task. Considering that patches at different scales can carry complementary information for classification, we propose a multi-scale patch-based CRC method, in which the multi-scale outputs are ensembled via regularized margin distribution optimization. Our extensive experiments validate that the proposed method outperforms many state-of-the-art patch-based face recognition algorithms.
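A sketch of the patch-level CRC building block follows, with uniform patch votes as a simplification: each query patch is coded over the corresponding gallery patch dictionary via ridge regression, per-class residuals are compared, and per-patch decisions are aggregated. The paper instead learns the multi-scale ensemble weights by regularized margin distribution optimization.

```python
import numpy as np

def patch_crc(patches_query, patch_dicts, labels, lam=1e-3):
    """patches_query: list of (d_s,) query patch vectors (any scales);
    patch_dicts: matching list of (d_s, n) gallery patch dictionaries;
    labels: (n,) class label of each gallery column."""
    classes = np.unique(labels)
    votes = np.zeros(len(classes))
    for yq, A in zip(patches_query, patch_dicts):
        # collaborative coding over all classes (ridge-regularized)
        alpha = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ yq)
        resid = [np.linalg.norm(yq - A[:, labels == c] @ alpha[labels == c])
                 for c in classes]
        votes[np.argmin(resid)] += 1.0   # uniform weight per patch (simplified)
    return classes[np.argmax(votes)]
```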
Linear subspace learning (LSL) is a popular approach to image recognition: it aims to reveal the essential features of high-dimensional data, e.g., facial images, in a lower-dimensional space by linear projection. Most LSL methods directly compute the statistics of the original training samples to learn the subspace. However, these methods do not effectively exploit the different contributions of different image components to image recognition. We propose a novel LSL approach based on sparse coding and feature grouping. A dictionary is learned from the training dataset and is used to sparsely decompose the training samples. The decomposed image components are grouped into a more discriminative part (MDP) and a less discriminative part (LDP). An unsupervised criterion and a supervised criterion are then proposed to learn the desired subspace, in which the MDP is preserved and the LDP is suppressed simultaneously. Experimental results on benchmark face image databases validate that the proposed methods outperform many state-of-the-art LSL schemes.
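One way to make the MDP/LDP grouping concrete is a Fisher-style discriminability score on the sparse coefficients of each dictionary atom; both this scoring rule and the median split below are illustrative assumptions, not the paper's criterion.

```python
import numpy as np

def split_components(codes, y):
    """Group dictionary atoms by a Fisher-style ratio of their coefficients
    (between-class over within-class variance, computed per atom).
    codes: (n_samples, n_atoms) sparse coefficients; y: (n_samples,) labels."""
    classes = np.unique(y)
    overall = codes.mean(axis=0)
    between = sum(np.sum(y == c) * (codes[y == c].mean(0) - overall) ** 2
                  for c in classes)
    within = sum(((codes[y == c] - codes[y == c].mean(0)) ** 2).sum(0)
                 for c in classes)
    score = between / (within + 1e-12)
    mdp = score >= np.median(score)     # boolean mask of discriminative atoms
    return mdp, ~mdp                    # (MDP mask, LDP mask)
```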
Attribute reduction is one of the most meaningful research topics in fuzzy rough set theory, and the discernibility matrix approach is the mathematical foundation for computing reducts. When computing reducts with a discernibility matrix, we find that only the minimal elements of the matrix are necessary and sufficient. This fact motivates us to develop a novel algorithm for finding reducts based on the minimal elements of the discernibility matrix. Relative discernibility relations of conditional attributes are defined, and the minimal elements in the fuzzy discernibility matrix are characterized by these relative discernibility relations. Algorithms to compute the minimal elements and the reducts are then developed within the framework of fuzzy rough sets. Experimental comparison shows that the proposed algorithms are effective.
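The core observation can be illustrated in the crisp case: each discernibility-matrix entry is the set of attributes separating two samples of different classes, and only the inclusion-minimal entries matter for reduct computation. The sketch below shows that crisp case only; the paper develops the fuzzy counterpart via relative discernibility relations.

```python
import numpy as np
from itertools import combinations

def minimal_discernibility_elements(X, y):
    """X: (n, d) array of categorical attribute values, y: (n,) class labels.
    Returns the inclusion-minimal entries of the discernibility matrix."""
    n, d = X.shape
    entries = set()
    for i, j in combinations(range(n), 2):
        if y[i] != y[j]:
            e = frozenset(np.flatnonzero(X[i] != X[j]).tolist())
            if e:
                entries.add(e)   # attributes distinguishing samples i and j
    # keep only entries with no proper subset among the other entries
    return {e for e in entries if not any(f < e for f in entries)}
```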