Attachment 'sheet04.m'
function sheet04
% Generate data
centers = [0, 0; 7, 3; -2, 4; 0, 10; -5, -5];

[X, Y] = generate_data(centers, 50);

% Plot the result, colored by true cluster index
% (gscatter requires the Statistics and Machine Learning Toolbox)
figure(1)
gscatter(X(:,1), X(:,2), Y);

% Cluster the data set with different choices of K
figure(2)
[Y2, M2] = k_means_clustering(2, X);
plot_clustering(X, Y2, M2)

figure(3)
[Y5, M5] = k_means_clustering(5, X);
plot_clustering(X, Y5, M5)

figure(4)
[Y8, M8] = k_means_clustering(8, X);
plot_clustering(X, Y8, M8)


% 1. Generate data with normally distributed clusters
% whose centers are given by the rows of C. Generate N
% points from each cluster, and also return a vector K
% that contains the cluster index of each point.
function [X, K] = generate_data(C, N)
% ...

% 2. Compute K-means clustering. Randomly select K points
% as initial means. Iterate while the Frobenius norm of the
% difference between the old and the new means matrices
% (norm(..., 'fro')) is larger than 1e-10.
%
% Return the cluster indices and the matrix of means (means
% are rows).
function [Y, MEANS] = k_means_clustering(K, X)
% ...

% Plot the clustering and the centers.
function plot_clustering(X, Y, MEANS)
gscatter(X(:, 1), X(:, 2), Y);
hold on;
plot(MEANS(:, 1), MEANS(:, 2), '+', 'MarkerSize', 10, 'LineWidth', 3);
hold off;

% Compute all pairwise squared Euclidean distances quickly.
% Rows of X and Y are points; D(i, j) = ||X(i,:) - Y(j,:)||^2,
% expanded as ||x||^2 + ||y||^2 - 2*x*y'.
function D = pwdist(X, Y)
N = size(X, 1);
M = size(Y, 1);

XX = sum(X.*X, 2);   % squared norms of the rows of X (N-by-1)
YY = sum(Y.*Y, 2);   % squared norms of the rows of Y (M-by-1)
D = repmat(XX, 1, M) + repmat(YY', N, 1) - 2*X*Y';
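
The two stubs above are left open by the sheet. As one possible way to approach them, here is a minimal sketch in a separate file. It is not the intended solution: every name ending in _sketch is hypothetical, the unit-variance isotropic noise in generate_data_sketch is an assumption (the sheet only says "normally distributed"), and keeping the old mean for an empty cluster is likewise an assumed convention.

```matlab
function sheet04_sketch
% Hypothetical driver; none of the *_sketch names come from the sheet.
rng(0);                                  % fixed seed so runs are repeatable
centers = [0, 0; 7, 3; -2, 4; 0, 10; -5, -5];
[X, Y] = generate_data_sketch(centers, 50);
[Yk, M] = k_means_sketch(5, X);
fprintf('%d points, %d labels, %d means\n', size(X, 1), numel(Yk), size(M, 1));

% Assumption: unit-variance isotropic Gaussian noise around each center.
function [X, K] = generate_data_sketch(C, N)
[numC, dim] = size(C);
X = kron(C, ones(N, 1)) + randn(numC * N, dim);  % each row of C repeated N times, plus noise
K = kron((1:numC)', ones(N, 1));                 % matching cluster index per point

% K-means with the stopping rule described on the sheet.
function [Y, MEANS] = k_means_sketch(K, X)
N = size(X, 1);
MEANS = X(randperm(N, K), :);            % K distinct data points as initial means
delta = Inf;
while delta > 1e-10
    D = pwdist_sketch(X, MEANS);         % N-by-K squared distances
    [~, Y] = min(D, [], 2);              % index of the nearest mean for every point
    NEW = MEANS;                         % assumption: an empty cluster keeps its old mean
    for k = 1:K
        members = (Y == k);
        if any(members)
            NEW(k, :) = mean(X(members, :), 1);
        end
    end
    delta = norm(NEW - MEANS, 'fro');    % Frobenius norm of the change in means
    MEANS = NEW;
end

% Same computation as pwdist above: squared Euclidean distances.
function D = pwdist_sketch(X, Y)
XX = sum(X.*X, 2);
YY = sum(Y.*Y, 2);
D = repmat(XX, 1, size(Y, 1)) + repmat(YY', size(X, 1), 1) - 2*X*Y';
```

Because pwdist returns squared distances, taking the row-wise minimum gives the same nearest mean as the unsquared distance would, so no square root is needed in the assignment step.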