Commit 3be5e29

Author: Julio Cesar Contreras Huerta
1 parent 6f96248 commit 3be5e29

16 files changed (+896, -1 lines changed)

COPYING.TXT

+24
@@ -0,0 +1,24 @@
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

FKL_2017_ECML.m

+33
@@ -0,0 +1,33 @@
% Fair Kernel Learning
% http://isp.uv.es/
% Adrian Perez-Suay, 2017 (c) Copyright

% Source code to reproduce the ECML 2017 paper: "Fair Kernel Learning"
% Authors: Adrian Perez-Suay, Valero Laparra, Gonzalo Mateo-Garcia,
% Jordi Muñoz-Marí, Luis Gomez-Chova, and Gustau Camps-Valls

close all, clear, clc
addpath(genpath('.'))

%% Small experiment (low computational cost)
exp = 1; % note: this variable shadows the built-in exp() in this workspace
% ECML'17 experiment (high computational cost)
% exp = 2; % uncomment this line

%% Experiment (SEX)
% Learn independently of the sex variable
% Fair Kernel Learning
FKL_sex
% Fair Dimensionality Reduction
FDR_sex

%% Experiment (SEX and RACE)
% Learn independently of the sex and race variables
% Fair Kernel Learning
FKL_sex_race
% Fair Dimensionality Reduction
FDR_sex_race

%% Draw results (both experiments)
draw_FKL
draw_FDR

README.md

+181-1
@@ -1 +1,181 @@
1-
# FKL
# Fair Kernel Learning (FKL) and Feature Analysis Tools

**Version:** 1.0
**Release Date:** 2017

## Overview
This repository provides the implementation of Fair Kernel Learning (FKL) and Fair Dimensionality Reduction (FDR), together with plotting utilities, as presented in the ECML 2017 paper:

**"Fair Kernel Learning"**
**Authors:** Adrian Perez-Suay, Valero Laparra, Gonzalo Mateo-Garcia, Jordi Muñoz-Marí, Luis Gomez-Chova, Gustau Camps-Valls
**Image Processing Laboratory (IPL)** - [http://isp.uv.es/](http://isp.uv.es/)

The experiments learn models and low-dimensional representations that are independent of sensitive attributes such as sex and race.

## Features
- **Fair Kernel Learning (FKL):** Fairness-aware kernel learning that enforces independence from sensitive variables.
- **Fair Dimensionality Reduction (FDR):** `code/FDR.m` evaluates PCA and kernel PCA alongside their fair counterparts, reporting accuracy and dependence on the sensitive variable.
- **Demographic Experiments:** Scripts for the sex setting (`FKL_sex`, `FDR_sex`) and the sex-and-race setting (`FKL_sex_race`, `FDR_sex_race`).
- **Visualization Tools:**
  - `draw_FDR.m`: Plot FDR results.
  - `draw_FKL.m`: Visualize FKL results.
- **Pre-Loaded Datasets:** Supports the a9a dataset from the LIBSVM and UCI repositories.

## Usage
### Example 1: Run Fair Kernel Learning (FKL)
Run the main script `FKL_2017_ECML.m` to perform Fair Kernel Learning.

```matlab
% Fair Kernel Learning Example
% FKL_2017_ECML.m starts with `close all, clear, clc` and then assigns
% `exp` itself, so a value set here would be discarded. To choose the
% experiment, edit the `exp = ...` line inside the script before running.

% Run the main FKL script
FKL_2017_ECML;
```
**Important Note:** To reproduce the experimental setup exactly as in the ECML 2017 paper, set the variable `exp` to `2` (`exp = 2;`) in the script `FKL_2017_ECML.m`.

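### Example 2: Run Fair Dimensionality Reduction (FDR)
Call `FDR` (defined in `code/FDR.m`) with the training data, the training labels, the sensitive variable, and the corresponding test-set quantities.

```matlab
% FDR Example (random toy data)
clear; clc; close all;

% Training and test data
X   = rand(100, 10);               % 100 training samples, 10 features
ytr = randi([0, 1], 100, 1);       % Binary training labels
Q   = double(rand(100, 1) > 0.5);  % Sensitive variable (training)
te  = rand(50, 10);                % 50 test samples
yte = randi([0, 1], 50, 1);        % Binary test labels
Qte = double(rand(50, 1) > 0.5);   % Sensitive variable (test)

% Run FDR: the signature in code/FDR.m is FDR(X, ytr, Q, te, yte, Qte)
res = FDR(X, ytr, Q, te, yte, Qte);

% Display accuracy and dependence of the PCA baseline
disp(res.PCA.acc);
disp(res.PCA.dep);
```
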
### Example 3: Visualize FDR Results
Use `draw_FDR.m` to plot the results of the FDR experiments.

```matlab
% Visualize FDR results
% FKL_2017_ECML.m calls draw_FDR without arguments once the FDR
% experiments have finished, so the same call is used here.
draw_FDR;

% Plot customization
title('Fair Dimensionality Reduction');
```

### Example 4: Compute Fair Kernel Learning (FKL)
Run `FKL.m` to compute FKL values for your dataset.

```matlab
% FKL Example
clear; clc; close all;

% Load or define data
X = rand(100, 10); % 100 samples, 10 features
Y = randi([0, 1], 100, 1); % Binary class labels

% Compute FKL
FKL_scores = FKL(X, Y);

% Display FKL scores
disp('Fair Kernel Learning Scores:');
disp(FKL_scores);
```

### Example 5: Analyze Fairness by Demographics (Sex and Race)
Run the experiment scripts for a given set of sensitive attributes. The scripts read the experiment flag `exp` from the workspace, as set in `FKL_2017_ECML.m`.

```matlab
addpath(genpath('.'));
exp = 1; % 1 = small experiment, 2 = ECML 2017 setup

% Example: FKL analysis by sex
FKL_sex;

% Example: FKL analysis by sex and race
FKL_sex_race;
```

### Example 6: Use Radial Basis Function (RBF) Kernel
Use `rbf.m` to compute the RBF kernel for your data.

```matlab
% RBF Kernel Example
X1 = rand(10, 5);
X2 = rand(8, 5);
sigma = 1.0;

% Compute RBF kernel
K = rbf(X1, X2, sigma);
disp('RBF Kernel Matrix:');
disp(K);
```

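As a rough illustration of what an RBF kernel call like the one above computes, the following sketch builds a Gaussian kernel matrix with anonymous functions. The names `sqdist` and `rbf_sketch` and the parameterization `exp(-||x - y||^2 / (2*sigma^2))` are assumptions made for this example; the repository's `rbf.m` may scale `sigma` differently.

```matlab
% Hypothetical sketch, not the repository's rbf.m.
% Assumes K(i,j) = exp(-||a_i - b_j||^2 / (2*sigma^2)).
clear exp   % the experiment scripts define a variable named exp; use the built-in exp() here

% Pairwise squared Euclidean distances between rows of A and rows of B
sqdist = @(A, B) max(bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2*(A*B'), 0);
rbf_sketch = @(A, B, sigma) exp(-sqdist(A, B) / (2*sigma^2));

X1 = rand(10, 5);
X2 = rand(8, 5);
K = rbf_sketch(X1, X2, 1.0);   % 10-by-8 matrix with entries in (0, 1]
disp(size(K));
```

The `clear exp` line matters only after the experiment scripts have run in the same workspace, because an anonymous function would otherwise capture the variable `exp` instead of calling the exponential function.
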
## Dataset
The dataset used in this project is the a9a dataset, distributed through the LIBSVM dataset collection and described in the UCI Machine Learning Repository.

- **Original source:** LIBSVM datasets (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html)
- **Additional info:** UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/)

The dataset was converted to GNU Octave format using `libsvmread` from LIBSVM.

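The conversion step can be sketched as follows, assuming the LIBSVM MATLAB/Octave interface is installed and the a9a text file has been downloaded into the current directory. The file names and the `data/` target are assumptions; the variable names `X` and `label_vector` are the ones `code/FDR_sex.m` uses after loading `a9a.mat`.

```matlab
% Sketch of the conversion step; file names and the data/ target are assumptions.
src_file = 'a9a';                         % LIBSVM-format text file downloaded beforehand
dst_file = fullfile('data', 'a9a.mat');

if exist('libsvmread', 'file') && exist(src_file, 'file')
    % libsvmread returns the labels and a sparse instance matrix
    [label_vector, X] = libsvmread(src_file);
    if ~exist('data', 'dir'), mkdir('data'); end
    save(dst_file, 'X', 'label_vector');
else
    fprintf('libsvmread or %s not found; skipping conversion.\n', src_file);
end
```
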
## Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/username/fair-kernel-learning.git
   cd fair-kernel-learning
   ```

2. Add the repository and its subdirectories (including `code/` and `data/`) to the MATLAB path:

   ```matlab
   addpath(genpath(pwd));
   ```

3. Verify your MATLAB environment and ensure all functions are accessible.

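One way to carry out the verification step is sketched below. It only checks that the scripts and functions named in this commit resolve on the path; it does not check toolbox dependencies.

```matlab
% Sketch: confirm that the entry points named in this repository are on the path.
addpath(genpath(pwd));
names = {'FKL_2017_ECML', 'FKL_sex', 'FDR_sex', 'FKL_sex_race', ...
         'FDR_sex_race', 'FDR', 'rbf', 'draw_FKL', 'draw_FDR'};
found = cellfun(@(n) exist(n, 'file') > 0, names);
if all(found)
    disp('All listed functions and scripts are accessible.');
else
    fprintf('Missing from path: %s\n', strjoin(names(~found), ', '));
end
```
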
## Authors
- **Adrian Perez-Suay**
- **Valero Laparra**
- **Gonzalo Mateo-Garcia**
- **Jordi Muñoz-Marí**
- **Luis Gomez-Chova**
- **Gustau Camps-Valls**

**Image Processing Laboratory (IPL)** - [http://isp.uv.es/](http://isp.uv.es/)

**Contact:** [[email protected]](mailto:[email protected])

## License
```
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
```

README.txt

+34
@@ -0,0 +1,34 @@
This is the reproducible research code for the paper:
"Fair Kernel Learning" submitted to ECML 2017.

Authors:
Adrian Perez-Suay, Valero Laparra, Gonzalo Mateo-Garcia,
Jordi Muñoz-Marí, Luis Gomez-Chova, and Gustau Camps-Valls

All authors are currently at the Image Processing Laboratory (IPL),
in the Image and Signal Processing group:

http://isp.uv.es/

Adrian Perez-Suay, 2017 (c) Copyright


--------------
The original data used in this demo were downloaded from LIBSVM at:

https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html

We downloaded the a9a LIBSVM dataset and converted it to GNU Octave format with the libsvmread function from LIBSVM. More information about the a9a dataset is also available in the UCI repository:

http://archive.ics.uci.edu/ml/
--------------

Running the code:
- Execute in MATLAB and/or Octave the only .m file in the main directory: FKL_2017_ECML
- Source code and functions invoked by that script are located in the code/ directory.
- Data used in the paper are located in the data/ directory.
- Results in .mat format are saved to the results/ directory.
- Finally, the plots of the results are saved to the figures/ directory.

IMPORTANT NOTE:
In FKL_2017_ECML.m, set the variable exp to 2 (exp = 2) to reproduce exactly the experimental setup used in the paper submitted to ECML.

code/FDR.m

+68
@@ -0,0 +1,68 @@
% Fair Dimensionality Reduction
% http://isp.uv.es/
% Adrian Perez-Suay, 2017 (c) Copyright


function [res] = FDR(X,ytr,Q,te,yte,Qte)
    %% dimensions and number of features parameters
    [N,d] = size(X);
    nf = size(Q,2);    % number of sensitive columns, used as number of fair components
    nte = length(yte);
    %% PCA
    [V,D] = eig(cov(X));
    D = diag(D);
    [D, ind] = sort(D,'descend');
    D = diag(D);
    V = V(:,ind);
    H = eye(N) - (1/N)*ones(N);          % centering matrix (training)
    Hte = eye(nte) - (1/nte)*ones(nte);  % centering matrix (test)
    HKqH = H*Q*Q'*H;                     % centered linear kernel of the sensitive variable
    %% Fair (Linear) PCA: invariant in x-axis
    % generalized eigenproblem; 1e-7*eye(d) regularizes the second matrix
    [VD,DD] = eigs(X'*X*X'*X,X'*(HKqH)*X+1e-7*eye(d),nf);
    % fprintf('Searching maximum dependence parameter...\n');
    sigmas_q = logspace(-5,3,15); % search maximal dependence parameter
    hsic = zeros(1,length(sigmas_q));
    for i=1:length(sigmas_q)
        K = rbf(X,X,sigmas_q(i));
        hsic(i) = (1/N^2)*trace(K*HKqH);
    end
    % figure,semilogx(sigmas_q,hsic),grid on
    [~,ma] = max(hsic);
    sigma = sigmas_q(ma);
    res.sigma = sigma;
    %% Kernel PCA
    K = rbf(X,X,sigma);
    Kte = rbf(te,X,sigma);
    [VK,DK] = eig(H*K*H);
    DK = diag(DK);
    [DK, ind] = sort(DK,'descend');
    DK = diag(DK);
    VK = VK(:,ind);
    %% Fair Kernel Dimensionality Reduction: invariant in x-axis
    [VDK,DDK] = eigs(K*H*K*H*K,K*HKqH*K+1e-7*eye(N),nf);
    HKqteH = Hte*Qte*Qte'*Hte;
    res.PCA.dep = zeros(1,d);
    res.PCA.acc = zeros(1,d);
    % PCA: accuracy and dependence for the first i components
    for i=1:d
        knn = knnclassify(te*V(:,1:i),X*V(:,1:i),ytr);
        res.PCA.dep(i) = (1/nte^2)*trace(V(:,1:i)'*te'*(HKqteH)*te*V(:,1:i));
        res.PCA.acc(i) = sum(yte == knn)/nte;
    end
    % Fair linear projection (nf components)
    knn = knnclassify(te*VD(:,1:nf),X*VD(:,1:nf),ytr);
    res.DPCA.dep = (1/nte^2)*trace(VD(:,1:nf)'*te'*(HKqteH)*te*VD(:,1:nf));
    res.DPCA.acc = sum(yte == knn)/nte;
    % Kernel PCA: accuracy and dependence for the first i components
    res.KPCA.dep = zeros(1,d);
    res.KPCA.acc = zeros(1,d);
    for i=1:size(VK,2)
        knn = knnclassify(Kte*VK(:,1:i),K*VK(:,1:i),ytr);
        res.KPCA.dep(i) = (1/nte^2)*trace(VK(:,1:i)'*Kte'*HKqteH*Kte*VK(:,1:i));
        res.KPCA.acc(i) = sum(yte == knn)/nte;
    end
    % Fair kernel projection (nf components)
    knn = knnclassify(Kte*VDK(:,1:nf),K*VDK(:,1:nf),ytr);
    res.KDPCA.dep = (1/nte^2)*trace(VDK(:,1:nf)'*Kte'*HKqteH*Kte*VDK(:,1:nf));
    res.KDPCA.acc = sum(yte == knn)/nte;

end
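
For reference, the main quantities computed in `FDR.m` can be written out as follows. This is a transcription of the code above into notation chosen here (the symbols $K_q$, $K_\sigma$, $\epsilon$, $v$, and $\alpha$ are labels introduced for illustration), not a statement taken from the paper.

```latex
% Centering matrix and sensitive-variable kernel (H and Q*Q' in the code)
H = I_N - \tfrac{1}{N}\mathbf{1}\mathbf{1}^\top, \qquad K_q = Q Q^\top

% Dependence estimate used to select sigma (the hsic(i) line),
% where K_\sigma = rbf(X, X, sigma)
\widehat{\mathrm{HSIC}}(\sigma) = \frac{1}{N^2}\,\operatorname{tr}\!\left(K_\sigma H K_q H\right)

% Fair linear projection: generalized eigenproblem passed to eigs, \epsilon = 10^{-7}
X^\top X X^\top X \, v = \lambda \left(X^\top H K_q H X + \epsilon I_d\right) v

% Kernel counterpart, with K the RBF kernel at the selected sigma
K H K H K \, \alpha = \lambda \left(K H K_q H K + \epsilon I_N\right) \alpha
```

In both eigenproblems the code keeps the `nf` leading solutions returned by `eigs`, where `nf` is the number of columns of `Q`.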

code/FDR_sex.m

+50
@@ -0,0 +1,50 @@
rand('seed',123), randn('seed',123)   % legacy seeding syntax
fprintf('Fair Dimensionality Reduction (sex)\n')
% exp is set by the calling script FKL_2017_ECML.m (1 = small run, 2 = ECML'17 setup)
if exp==1
    reps = 2;
    n = 50;
    nva2 = 50;
    nte2 = 50;
elseif exp==2
    reps = 25;
    n = 7000;
    nva2 = 7000;
    nte2 = 16281;
end
rps = cell(1,reps);
res = cell(1,reps);
resU = cell(1,reps);
for ii=1:reps
    % fprintf('Iter %d of %d\n',ii,reps);
    %% set training and validation sets
    load a9a.mat
    rp = randperm(size(X,1));
    rps{ii} = rp;
    X = X(rp,:);
    label_vector = label_vector(rp);
    tr = X(1:n,:);
    ytr = label_vector(1:n);
    iQ = [72:73]; % columns of the sex variable
    Q = X(1:n,iQ); % sensitive variable to enforce independence from
    nva = n+nva2;
    va = X(n+1:nva,:);
    yva = label_vector(n+1:nva);

    %% set test set
    load a9a_test.mat
    te = X(1:nte2,:);
    yte = label_vector(1:nte2);
    Qte = te(1:nte2,iQ);
    %% clear some variables
    clear X label_vector

    %% FAIR DIMENSIONALITY REDUCTION (LINEAR/KERNEL)
    [res{ii}] = FDR(tr,ytr,Q,te,yte,Qte);
    %% remove the sensitive features from the inputs and rerun
    uf = setdiff(1:size(tr,2), iQ);
    utr = tr(:,uf);
    uva = va(:,uf);
    ute = te(:,uf);
    [resU{ii}] = FDR(utr,ytr,Q,ute,yte,Qte);
    save results/FDR_sex.mat res resU   % results are rewritten after every repetition
end
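
`FDR.m`, which this script calls, classifies with `knnclassify`, a function from MATLAB's Bioinformatics Toolbox that may be unavailable in some MATLAB or Octave installations. As a hedged illustration only, the nearest-neighbour prediction that `knnclassify(sample, training, group)` performs with its default single neighbour can be sketched with plain matrix operations. The toy data and variable names below are made up for the example, and tie-breaking may differ from the toolbox function.

```matlab
% Sketch of a 1-nearest-neighbour prediction; not the toolbox implementation.
training = [0 0; 0 1; 5 5; 6 5];   % toy training points (rows)
group    = [-1; -1; 1; 1];         % toy labels, one per training row
sample   = [0.2 0.1; 5.2 5.1];     % toy points to classify

% Squared Euclidean distances between sample rows and training rows
D2 = bsxfun(@plus, sum(sample.^2, 2), sum(training.^2, 2)') - 2*(sample*training');

% Each sample takes the label of its closest training point
[~, nearest] = min(D2, [], 2);
pred = group(nearest);
disp(pred');
```
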
