k-nearest neighbors - Codeforces


ologn_13's blog

k-nearest neighbors

By ologn_13, 12 years ago, in English

Hi all! This problem is about finding the k nearest neighbors of a point in a 2-D coordinate system. The input is given as (x, y) coordinates, and we have to find the k nearest neighbors of a point. I searched the internet thoroughly, but found only purely theoretical algorithms, described in terms of searching for points in a hypothetical sphere. I also read that it can be solved using k-d trees, which I have learned about, but I don't understand how to implement the algorithm. Please help if you have implemented it before or have seen an implementation. Thanks!


Comments (15)


RodionGork

12 years ago

What about the limits? There could be different approaches depending on the number of points, etc.

ologn_13

12 years ago

Constraints: number of input points N ≤ 10^6; k ≤ 10^5; time limit: 5 sec.

RodionGork

12 years ago

Never mind. I suddenly realized that k-nearest means the one that is k-th when sorted by distance. I thought instead that we were building something like a chain of the k closest points. Quite stupid of me :)

SkyHawk

12 years ago

We can find the distance to each point and sort them. Or do we have many queries?
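
For a single query, the idea above could look roughly like the sketch below. The names Pt, sqDist and kNearestBrute are made up for illustration. It uses std::nth_element instead of a full sort, so one query costs about O(n + k log k), and it compares squared Euclidean distances to avoid sqrt.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };  // hypothetical point type for this sketch

// Squared Euclidean distance; comparing squares avoids sqrt.
inline double sqDist(const Pt& a, const Pt& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns the indices of the k points nearest to q, closest first.
// If k exceeds the number of points, all indices are returned.
std::vector<int> kNearestBrute(const std::vector<Pt>& pts, Pt q, int k) {
    std::vector<int> idx(pts.size());
    for (std::size_t i = 0; i < idx.size(); ++i) idx[i] = (int)i;
    k = std::max(0, std::min<int>(k, (int)idx.size()));
    auto closer = [&](int a, int b) {
        return sqDist(pts[a], q) < sqDist(pts[b], q);
    };
    // Move the k nearest indices to the front in O(n) on average ...
    std::nth_element(idx.begin(), idx.begin() + k, idx.end(), closer);
    idx.resize(k);
    // ... then order just those k by distance.
    std::sort(idx.begin(), idx.end(), closer);
    return idx;
}
```

With many queries this costs O(n) per query, which is where a grid or a KD-tree starts to pay off.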

ilyaraz

12 years ago

If k is small (say, constant), then we can build a higher-order Voronoi diagram in time $\text{[math]}$ and then reduce our problem to the planar point location problem. So we get a data structure with preprocessing time $\text{[math]}$, space O(k(n - k)), and query time $\text{[math]}$.

ologn_13

12 years ago

For k = 100, n = 10^6, and 100 queries, I think the Voronoi diagram will prove too costly.

RodionGork

12 years ago

Then how about splitting the space into sectors (at least a rectangular grid)? You can then look for the k-th neighbor only among points of the same sector and a few neighboring ones.

Surely the hard part would be analysing the set of points to decide which dissection splits them as uniformly as possible...
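
A rough sketch of the rectangular-grid idea, under assumptions of my own: a fixed cell size chosen by the caller (no attempt at the "uniform dissection" analysis), a non-empty point set, and made-up names GPoint and GridKnn. The search scans square rings of cells around the query's cell and stops once the k-th best distance is no larger than the smallest distance any unscanned ring could contain.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <queue>
#include <utility>
#include <vector>

struct GPoint { double x, y; };  // hypothetical point type

struct GridKnn {
    std::vector<GPoint> pts;
    double minX = 0, minY = 0, cell;
    int cols = 1, rows = 1;
    std::vector<std::vector<int>> cells;  // point indices stored per cell

    // cellSize is a tuning parameter (assumed > 0); p must be non-empty.
    GridKnn(std::vector<GPoint> p, double cellSize) : pts(std::move(p)), cell(cellSize) {
        double maxX = pts[0].x, maxY = pts[0].y;
        minX = maxX; minY = maxY;
        for (const GPoint& a : pts) {
            minX = std::min(minX, a.x); maxX = std::max(maxX, a.x);
            minY = std::min(minY, a.y); maxY = std::max(maxY, a.y);
        }
        cols = (int)((maxX - minX) / cell) + 1;
        rows = (int)((maxY - minY) / cell) + 1;
        cells.resize((std::size_t)cols * rows);
        for (int i = 0; i < (int)pts.size(); ++i)
            cells[(std::size_t)row(pts[i].y) * cols + col(pts[i].x)].push_back(i);
    }
    // Cell coordinate of a value, clamped into the grid.
    static int clampCell(double v, int n) {
        return (int)std::max(0.0, std::min(std::floor(v), double(n - 1)));
    }
    int col(double x) const { return clampCell((x - minX) / cell, cols); }
    int row(double y) const { return clampCell((y - minY) / cell, rows); }

    // Indices of the k nearest points to (x, y), closest first.
    std::vector<int> query(double x, double y, int k) const {
        if (k <= 0) return {};
        std::priority_queue<std::pair<double, int>> best;  // max-heap on squared distance
        int cx = col(x), cy = row(y);
        int maxRing = std::max(std::max(cx, cols - 1 - cx), std::max(cy, rows - 1 - cy));
        for (int r = 0; r <= maxRing; ++r) {
            // Every point in ring r or beyond is at least (r - 1) * cell away.
            double lower = (r - 1) * cell;
            if (r >= 1 && (int)best.size() == k && best.top().first <= lower * lower) break;
            for (int gy = std::max(0, cy - r); gy <= std::min(rows - 1, cy + r); ++gy) {
                // Top and bottom rows of the ring are scanned fully,
                // the rows in between only at their two end cells.
                int step = (std::abs(gy - cy) == r) ? 1 : 2 * r;
                for (int gx = cx - r; gx <= cx + r; gx += step) {
                    if (gx < 0 || gx >= cols) continue;
                    for (int i : cells[(std::size_t)gy * cols + gx]) {
                        double dx = pts[i].x - x, dy = pts[i].y - y;
                        best.push({dx * dx + dy * dy, i});
                        if ((int)best.size() > k) best.pop();  // keep only k candidates
                    }
                }
            }
        }
        std::vector<int> out(best.size());
        for (int j = (int)out.size() - 1; j >= 0; --j) { out[j] = best.top().second; best.pop(); }
        return out;
    }
};
```

How well this works depends heavily on the cell size and on how evenly the points are spread; with strongly clustered input most points land in a few cells and a query degrades towards the brute-force scan.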

Burunduk1

12 years ago

Interesting :-) Do you mean something like this: cd.duke.edu?

ilyaraz

12 years ago

Exactly! Though I'm not an expert in 2D nearest neighbors, and don't know all the details.

Burunduk1

12 years ago

k-nearest neighbor of a point in 2-D co-ordinate system.

By the way, which distance between points do you mean? :-)

$\sqrt{x^2 + y^2}$ or $|x| + |y|$?

Burunduk1

12 years ago

The answer to one query is "the chain of the K nearest neighbors of point number i", right?

You say n, k, q ≤ 10⁵. The size of the answer is k × q, which is 10¹⁰. Maybe it's just q × k ≤ 10⁵?

UPD

If the constraints really are k × q ≤ 10⁵, a KD-tree works well for random sets of points.

About KD trees: wiki, some implementation, my implementation for acm.timus.ru task.1369

ologn_13

12 years ago

Burunduk1, I meant the Euclidean distance between the query point and the given points. Does your implementation work for quite large constraints, like k*q = 10^10?

Also, the Timus problem only cares about the single nearest point: there we just have to print all points at the same distance as the nearest one. Here the problem is to find the k nearest points, including those at the same closest distance as well as farther ones, so as to complete k points.

Burunduk1

12 years ago

I also read that it can be solved using k-d trees, which I learn, but I am not getting how to implement the algorithm.

Start with an easier task. Learn how to find the 1-nearest point using a KD-tree. If you already know how to find the 1-nearest point (that is, you have implemented it yourself), you should be able to see how to find the k nearest points. Just use if (sqr(dx) + sqr(dy) > res[k] + eps) return; instead of if (sqr(dx) + sqr(dy) > res + eps) return; (that's a line from my code), where res[k] is the k-th element in sorted order. You can maintain it using a treap.
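
To make the res[k] bound concrete, here is a minimal sketch that swaps the treap for a bounded max-heap, which is usually enough when only the current k-th smallest value is needed. KBest, offer and bound are hypothetical names, not taken from the code referred to above.

```cpp
#include <cstddef>
#include <limits>
#include <queue>

// Keeps the k smallest squared distances seen so far in a max-heap,
// so the k-th smallest is always available at the top.
struct KBest {
    std::size_t k;
    std::priority_queue<double> heap;  // largest kept distance on top

    explicit KBest(std::size_t kk) : k(kk) {}

    // Current pruning bound (plays the role of res[k]): the k-th smallest
    // distance, or +infinity while fewer than k candidates are stored.
    double bound() const {
        if (heap.size() < k || heap.empty())
            return std::numeric_limits<double>::infinity();
        return heap.top();
    }

    // Offer a candidate squared distance; it is kept only if it is
    // among the k smallest seen so far.
    void offer(double d2) {
        if (k == 0) return;
        if (heap.size() < k) {
            heap.push(d2);
        } else if (d2 < heap.top()) {
            heap.pop();
            heap.push(d2);
        }
    }
};
```

Inside the tree traversal the pruning test would then read something like if (sqr(dx) + sqr(dy) > best.bound() + eps) return;, where best is a KBest instance for the current query.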

Does your implementation work for quite large constraints, like k*q=10^10?

Hm... of course, it takes some time to process such a query :-D If you mean memory, it's O(N). If you mean asymptotics, for a random set of points it's about O(K*logN*logK) per query.

R_R_

12 years ago

Hi, do you know any problems on online judges that can be solved using a KD-tree? It doesn't look like Timus 1369 can be solved that way :(

Hikari9

9 years ago

You can do this using a KD tree. Basically, if you know the KD tree nearest neighbor query, to transform it into k-nearest, just keep the best k points in a priority queue while traversing.

It's O(k log(k) log(n)) on average, O(k log(k) sqrt(n)) in the worst case. For proofs, look at the references cited on Wikipedia.

Short implementation of 2D KNN in C++ using KD tree below, tested.

EDIT: build is O(n log n), not O(2n)

EDIT 2: I tested this on Kattis, with k=2. Unfortunately, I can't find online judges with KNN to test with. If you know some problems, I would appreciate a reply.

#include <algorithm>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// 2D point object
struct point {
	double x, y;
	point(double x = 0, double y = 0): x(x), y(y) {}	
};

// the "hyperplane split", use comparators for all dimensions
bool cmpx(const point& a, const point& b) {return a.x < b.x;}
bool cmpy(const point& a, const point& b) {return a.y < b.y;}

struct kdtree {
	point *tree;
	int n;
	// constructor
	kdtree(point p[], int n): tree(new point[n]), n(n) {
		copy(p, p + n, tree);
		build(0, n, false);
	}
	// destructor
	~kdtree() {delete[] tree;}
	// k-nearest neighbor query, O(k log(k) log(n)) on average
	vector<point> query(double x, double y, int k = 1) {
		perform_query(x, y, k, 0, n, false); // recurse
		vector<point> points;
		while (!pq.empty()) { // collect points
			points.push_back(*pq.top().second);
			pq.pop();
		}
		reverse(points.begin(), points.end());
		return points;
	}
private:
	// build is O(n log n) using divide and conquer
	void build(int L, int R, bool dvx) {
		if (L >= R) return;
		int M = (L + R) / 2;
		// get median in O(n), split x-coordinate if dvx is true
		nth_element(tree+L, tree+M, tree+R, dvx?cmpx:cmpy);
		build(L, M, !dvx); build(M+1, R, !dvx);
	}

	// priority queue for KNN, keep the K nearest
	priority_queue<pair<double, point*> > pq;
	void perform_query(double x, double y, int k, int L, int R, bool dvx) {
		if (L >= R) return;
		int M = (L + R) / 2;
		double dx = x - tree[M].x;
		double dy = y - tree[M].y;
		double delta = dvx ? dx : dy;
		double dist = dx * dx + dy * dy;
		// if the point is nearer than the current k-th nearest, put it in the queue
		if ((int)pq.size() < k || dist < pq.top().first) {
			pq.push(make_pair(dist, &tree[M]));
			if ((int)pq.size() > k) pq.pop(); // keep k elements only
		}
		int nearL = L, nearR = M, farL = M + 1, farR = R;
		if (delta > 0) { // right is nearer
			swap(nearL, farL);
			swap(nearR, farR);
		}
		// query the nearer child
		perform_query(x, y, k, nearL, nearR, !dvx);

		// query the farther child only if it might still hold candidates
		if ((int)pq.size() < k || delta * delta < pq.top().first)
			perform_query(x, y, k, farL, farR, !dvx);
	}
};
