Sunday, December 10, 2017

A yosupo week

Miss me?

The Aug 21 - Aug 27 week started of with TopCoder SRM 720 on Thursday (problems — unavailable now, but hopefully the TopCoder website will be fixed soon, results, top 5 on the left). There was a tense challenge phase battle at the top, but yosupo has made sure it was only for the second place by solving the 1000-pointer — congratulations on the victory!

The said 1000-pointer was concerned with somewhat theoretical isotonic regression in L1 metric. On the outside, it looked like this: you are given an undirected weighted graph with at most 50 vertices and at most 1000 edges, and two spanning trees in it. You have to modify the weights of the edges in such a way that the first given spanning tree is minimal, and the second given spanning tree is maximal, while minimizing the total change of weights.

A few solutions for this problem are described and linked in this comment thread.

Later on the same day we had AIM Tech Round 4 on Codeforces (problems, results, top 5 on the left, analysis). yosupo has continued his great day, this time winning without an extra problem but thanks to solving everything fast enough. Well done!

Finally, Saturday presented us with AtCoder Grand Contest 019 (problems, results, top 5 on the left, my screencastanalysis). This time yosupo was only 6th, and the victory instead went to Um_nik who has dominated the round by finishing all problems with half an hour to spare. Congratulations!

In my previous summary, I have mentioned a Codeforces problem: you are given an array of 300000 numbers, and 300000 queries of the following form: a segment [l;r] and a number k, for which you need to find the smallest number that appears on at least 1/k of all positions in the given segment of the array. Note that k is at most 5, so the number must be quite frequent within the segment.

The first idea is to use Mo's algorithm to handle the segments; we'll keep count of how many of each number are there in the current segment in a simple array, and thus can handle the counting for all segments in O(n*sqrt(n)). In order to find the answer for each segment, we'll need to find the counts that are above 1/5 of their sum. One way to do that would be to maintain a priority queue, but that adds an extra log factor both for queries and for the Mo's part, and some coding complexity.

A nicer way to find the high counts is to use randomization: let's just pick 120 random numbers within our segment, and hope that all numbers that occupy at least 1/5 of the segment will be found. Indeed, the probability of that not occurring for a number is less than (1-1/5)120 , or about 2.3e-12. We have at most 5 interesting numbers, 300000 queries, and around 100 testcases, so the overall probability of an incorrect answer does not exceed 5*100*300000*2.3e-12=3.45e-4, which is good enough.

Thanks for reading, and check back soon as I crawl towards the present :)

No comments:

Post a Comment