Sunday, July 23, 2017

An unsportsmanlike week

Yandex.Algorithm 2017 Final Round took place both in Moscow and online on Tuesday (problems, results, top 5 on the left). My contest started like a nightmare: I was unable to solve any of the given problems during the first hour, modulo a quick incorrect heuristic submission on problem F. As more and more people submitted problems D (turned out to be a straightforward dynamic programming) and E (turned out to be about finding n-th Catalan number), I grew increasingly frustrated, but still couldn't solve them. After my wits finally came back to me after an hour, all problems suddenly seemed easy, and I did my best to catch up with the leaders, submitting 5 problems. Unfortunately, one of them turned out to have a very small bug, and I was denied that improbable victory :)

Congratulations to tourist, W4yneb0t and rng.58 on winning the prizes!

Here's the Catalan number problem that got me stumped. Two people are dividing a heap of n stones. They take stones in turns until none are left. At their turn, the first person can take any non-zero amount of stones. The second person can then also take any non-zero amount of stones, but with an additional restriction that the total amount of stones he has after this turn must not exceed the total amount of stones the first person has. How many ways are there for this process to go to completion, if we also want the total amount of stones of each person to be equal in the end?

TopCoder Open 2017 Round 2C was the first event of the weekend (problems, results, top 5 on the left, parallel round resultsmy screencast). The final 40 contestants for the Round 3 were chosen, and kraskevich advanced with the most confidence of all, as he was the only contestant to solve all problems. Congratulations!

This round contained a very cute easy problem. Two teams with n players each are playing a game. Each player has an integer strength. The game lasts n rounds. In the first round, the first player of the first team plays the first player of the second team, and the stronger player wins. In the second round, the first two players of the first team play the first two players of the second team, and the pair with the higher total strength wins. In the i-th round the strengths of the first i players in each team are added up to determine the winner. You know the strengths of each player, and the ordering of the players in the second team. You need to pick the ordering for the players in the first team in such a way that the first team wins exactly k rounds. Do you see the idea?

And finally, AtCoder Grand Contest 018 took place on Sunday (problems, results, top 5 on the left, analysis, my screencast with commentary). tourist has adapted his strategy to the occasion, submitting the first four problems before starting work on the last two, but in the end that wasn't even necessary as he also solved the hardest problem and won with a 15 minute margin. Well done!

As you can see in the screencast, I was also trying out this late submission strategy this time, and when I solved the first four and saw that Gennady has already submitted them, I was quite surprised. There was more than an hour left, so surely I'd be able to solve one more problem? And off I went, trying to crack the hardest problem because it gave more points and seemed much more interesting to solve than the previous one.

I have made quite a bit of progress, correctly simplifying the problem to the moment where the main idea mentioned in the analysis can be applied, but unfortunately could not come up with that idea. That left me increasingly anxious as the time ran out: should I still submit the four problems I have (which turned out to be all correct in upsolving), earning something like 40th place instead of 10th or so that I would've got had I submitted them right after solving? Or should I avoid submitting anything, thus not appearing in the scoreboard at all and not losing rating, but showing some possibly unsportsmanlike behavior?

I have to tell you, this is not a good choice to have. Now I admire people who can pull this strategy off without using the escape hatch even more :) To remind, the benefits of this strategy, as I see them (from comments in a previous post), are:
1) Not giving information to other competitors on the difficulty of the problems.
2) Not allowing other competitors to make easy risk/reward tradeoffs, as if they know the true scoreboard, they might submit their solutions with less testing when appropriate.

I ended up using the escape hatch, which left me feeling a bit guilty, but probably more uncertain than guilty. Do you think this is against the spirit of competition, as PavelKunyavskiy suggests? Please share your opinion in comments, and also let's have a vote:

Is it OK to bail out of a contest after a poor performance with late submit strategy at AtCoder?

Here's the hardest problem that led to all this confusion: You are given two rooted trees on the same set of 105 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible. Can you do that?

In my previous summary, I have mentioned a hard VK Cup problem. You are given an undirected graph with 105 vertices and edges. You need to assign non-negative integer numbers not exceeding 106 to the vertices of the graph. The weight of each edge is then defined as the product of the numbers at its ends, while the weight of each vertex is equal to the square of its number. You need to find an assignment such that the total weight of all edges is greater than or equal to the total weight of all vertices, and at least one number is nonzero.

The first idea is: as soon as the graph has any cycle, we can assign number 1 to all vertices in the cycle, and 0 to all other vertices, and we'll get what we want. So now we can assume the graph is a forest, and even a tree.

Now, consider a leaf in that forest, and assume we know the number x assigned to the only vertex this leaf is connected to. If we now assign number y to this leaf, then the difference between the edge weight and the vertex weight will increase by x*y-y*y. We need to pick y to maximize this difference, which is finding the maximum of a quadratic function, which is achieved when y=x/2, and the difference increases by x2/4.

Now suppose we have a vertex that is connected to several leaves, and to one other non-leaf vertex for which we know the assigned number x. After we determine the number y for this vertex, all adjacent leaves must be assigned y/2, so we can again compute the difference as the function of y and find its maximum. It will be a quadratic function again, and the solution will look like x*const again.

Having rooted the tree in some way, this approach allows us to actually determine the optimal number for all vertices going from leaves to the root as multiples of the root number, and we can check if the overall delta is non-negative or not.

There's an additional complication caused by the fact that the resulting weights must be not very big integers. We can do the above computation in rational numbers to make all numbers integer, but they might become quite big. However, if we look at the weight differences that some small trees get, we can notice that almost all trees can get a non-negative difference, and come up with a case study that finds a subtree in a given tree which still has a non-negative difference but can be solved with small numbers. I will leave this last part as an exercise for the readers.

Wow, it feels good to have caught up with the times :) Thanks for reading, and check back next week!

A red-black week

Last week, Codeforces presented the problems from the VK Cup finals as two regular contests. First off, Codeforces Round 423 took place on Tuesday (problems, results, top 5 on the left, analysis). W4yneb0t has continued his string of excellent Codeforces performances with the best possible result - a victory, which has also catapulted him to the second place in the overall rating list. Well done!

In between the Codeforces rounds, TopCoder held its SRM 718 very early on Thursday (problems, results, top 5 on the left, anlaysis). The round seems to have been developing in a very exciting manner: three people submitted all three problems during the coding phase, then two of them lost points during the challenge phase, and the last remaining person with three problems failed the system test on the easiest problem! When the dust settled, snuke still remained in the first place thanks for his solution to the hard problem holding. Congratulations on the second SRM victory!

Codeforces Round 424 rounded up the week's contests (problems, results, top 5 on the left, analysis). TakanashiRikka was the fastest to solve the first four problems, and (probably not a sheer coincidence :)) the only contestant to have enough time to consider all cases in the hardest problem correctly while solving at least one other problem. Congratulations on the well-deserved first place!

Here's that tricky problem. You are given an undirected graph with 105 vertices and edges. You need to assign non-negative integer numbers not exceeding 106 to the vertices of the graph. The weight of each edge is then defined as the product of the numbers at its ends, while the weight of each vertex is equal to the square of its number. You need to find an assignment such that the total weight of all edges is greater than or equal to the total weight of all vertices, and at least one number is nonzero.

In my previous summary, I have mentioned a difficult IPSC problem: we start with a deck of 26 red and 26 black cards, and a number k (1<=k<=26). The first player takes any k cards from the deck, and arranges them in any order they choose. The second player takes any k cards from the remaining deck, and arranges them in any order they choose, but such that their sequence is different from the sequence of the first player. The remaining 52-2k cards are shuffled and dealt one by one. As soon as the last k dealt cards exactly match one of the player's sequences, that player wins. In case no match happens after the cards run out, we toss a coin, and each player wins with probability 50%. What is the probability of the first player winning, assuming both play optimally?

Solving this problem required almost all important programming contest skills: abstract mathematical reasoning, knowledge of standard algorithms, coming up with new ideas, good intuition about heuristics, and of course the programming skill itself.

We start off by noticing that the second player has a very simple way to achieve 50% winrate: he can just choose a sequence that is a complement of the first player's sequence (replace red cards by black and vice versa), and then everything is completely symmetric.

How can the second player achieve more? He has two resources: first, he can choose a string that is more likely to appear in the sequence of the remaining cards. Second, he can choose a string that, when it appears together with the string of the first player, tends to appear earlier.

The strings that are more likely to appear are those that leave an equal proportion of reds and blacks (after taking out the string of the first player once and the string of the second player twice), and have no borders (prefixes that are equal to suffixes). This is because we can count the number of ways a given string can appear by multiplying the number of positions it can appear in by the number of ways to place the remaining characters after the matching part is fixed. The number of ways to place the remaining characters is maximized then the remaining characters have equal numbers of blacks and reds. This slightly overcounts the number of ways because in some cases the string can appear more than once; the lack of borders minimizes the number of such occurrences.

The strings that tend to appear earlier when both appear are those which have a suffix which matches a prefix of the first player's string. At best, if the first player string is s+c, where s is a string of length k-1 and c is a character, the second player should pick his string from 'r'+s and 'b'+s. In this case as soon as there's a match of the first player's string not in the first position, we can have a >50% chance to have a match of our string one position before.

Now we can already make the first attempt at a solution: let's try likely candidates for the first player's best move - it should likely be among the strings that have the most appearances; the second player should then choose either another string with lots of appearances, or a string that counter-plays the first player's string in the manner described above. However, this is not enough to solve the problem - we will get a wrong answer.

As part of implementing the above solution, we had to also implement the function to count the sought probability for the given pair of strings. It's also not entirely trivial, and can be done by using dynamic programming where the state is the number of remaining red and black cards, and the state in the Aho-Corasick automaton of the two strings.

So, where do we go from there? Since we already have the function that computes the probability, we can now run it on all pairs of strings for small values of k and try to notice a pattern. We get something like this:

1 0.5 r b
2 0.5 rb br
3 0.3444170488792196 rbr rrb
4 0.35992624362382514 rrbb rrrb
5 0.3777939526283981 rrbrb brrbr
6 0.413011479190688 rbrbrr brbrbr
7 0.45319632265323256 rrbrrbb brrbrrb
8 0.4782049196004824 rrbbrrbb brrbbrrb

No obvious pattern seems to appear. However, we can notice that for large values of k, more precisely when 3k>52, the answer will be 0.5 simply because there is not enough remaining cards for either string to appear. So we only need to research the values of k between 9 and 17 now.

And here comes another key idea: we need to believe that by cutting enough branches early, our exhaustive search solution can run in a few minutes for all those values. At first, this seems improbable. For example, for k=16 we have 65536 candidates for each string, and four billion combinations in total, not to mention the Aho-Corasick on the inside. However, from our previous attempts at a solution we have some leads. More precisely, we know which strings of the second player are the most likely good answers for each string of the first player.

This allows us to get a good upper bound on the first player's score for each particular string reasonably quickly, which leads to the following optimization idea: let's run the search for all strings of the first player at the same time, and at each point we will take the "most promising" string - the one with the highest upper bound so far - and make one more step of the search for it, in other words try one more candidate for the second player's string, which may lower its upper bound. We continue this process until we arrive at the state where the most promising candidate does not have anything else to try, because we already ran through all possible second player strings for it - and this candidate then gives us the answer.

This search runs relatively fast because for most first player strings, our heuristics will give us an upper bound that is lower than the ultimate answer very quickly, and we will stop considering those strings further. It is still quite slow for larger values of k, so we need a second optimization on top: we can skip the Aho-Corasick part in the simple case where there's simply not enough cards of some color for the second player's string to appear. With those two optimizations, we can finally get all the answers in a few minutes.

Thanks for reading, and check back soon for this week's summary!

Sunday, July 16, 2017

A postcard week

The July 3 - July 9 week had a pretty busy weekend. On Saturday, IPSC 2017 has gathered a lot of current contestants together with veterans that come out of retirement just for this contest every year (problems, results, top 5 on the left, analysis). The (relatively) current contestants prevailed this time, with team Past Glory winning with a solid 2 point gap. Well done!

They were one of only two teams who has managed to solve the hardest problem L2. It went like this: we start with a deck of 26 red and 26 black cards, and a number k (1<=k<=26). The first player takes any k cards from the deck, and arranges them in any order they choose. The second player takes any k cards from the remaining deck, and arranges them in any order they choose, but such that their sequence is different from the sequence of the first player. The remaining 52-2k cards are shuffled and dealt one by one. As soon as the last k dealt cards exactly match one of the player's sequences, that player wins. In case no match happens after the cards run out, we toss a coin, and each player wins with probability 50%. What is the probability of the first player winning, assuming both play optimally?

Just an hour later, TopCoder Open 2017 Round 2B selected another 40 lucky advancers (problems, results, top 5 on the left, analysis, parallel round results, my screencast). dotory1158 had a solid point margin from his solutions, and did not throw it all away during the challenge phase, although the contest became much closer :) Congratulations on the win!

The final round of VK Cup 2017 took place in St Petersburg on Sunday (problems, results, top 5 on the left). Continuing the Snackdown trend from the last week, two-person teams were competing. The xray team emerged on top thanks to a very fast start - in fact, they already got enough points for the first place at 1 hour and 38 minutes into the contest, out of 3 hours. Very impressive!

And finally, AtCoder hosted its Grand Contest 017 on Sunday as well (problems, results, top 5 on the left, analysis). Once again the delayed submit strategy has worked very well for tourist, but this time the gap was so huge that the strategy choice didn't really matter. Way to go, Gennady!

Problem D in this round was about the well-known game of Hackenbush, more precisely green Hackenbush: you are given a rooted tree. Two players alternate turns, in each turn a player removes an edge together with the subtree hanging on this edge. When a player can not make a move (only the root remains), he loses. Who will win when both players play optimally?

If you haven't seen this game before, then I encourage you to try solving the problem before searching for the optimal strategies in the Internet (which has them). I think the solution is quite beautiful!

Thanks for reading, and check back for this week's summary.

A hammer week

TopCoder has hosted two SRMs during the June 26 - July 2 week. First off, SRM 716 took place on Tuesday (problems, results, top 5 on the left, analysis). ACRush came back to the SRMs after more than a year, presumably to practice for the upcoming TCO rounds. He scored more points than anybody else from problem solving - but it was only good enough for the third place, as dotory1158 and especially K.A.D.R shined in the challenge phase. Well done!

Later on the same day, Codeforces held its Round 421 (problems, results, top 5 on the left, analysis). There was a certain amount of unfortunate controversy with regard to problem A, so the results should be taken with a grain of salt - nevertheless, TakanashiRikka was the best on the remaining four problems and got the first place. Congratulations!

The second TopCoder round of the week, SRM 717, happened on Friday (problems, results, top 5 on the left, analysis, my screencast). I had my solution for the medium problem fail, but even without that setback I would finish behind Deretin - great job on the convincing win!

Here's what that problem was about. You are given two numbers n (up to 109) and m (up to 105). For each i between 1 and m, you need to find the number of permutations of n+i objects such that the first i objects are not left untouched by the permutation. As an example, when n=0 we're counting derangements of each size between 1 and m.

My solution for this problem involved Fast Fourier Transformation because, well, I had a hammer, and the problem was not dissimilar enough to a nail :) And it failed because I've reused FFT code from my library, which I've optimized heavily to squeeze under the time limit in a previous Open Cup round, and to which I've introduced a small bug during that optimization :(

And on the weekend, two-person teams competed for the fame and monetary prizes at the onsite finals of CodeChef Snackdown 2017 (problems, results, top 5 on the left, analysis). The last year's winner "Messages compress" were in the lead for long periods of time, but in the last hour they tried to get three problems accepted but got zero, which gave other teams a chance. Team Dandelion have seized that chance and won solving 9 problems out of 10. Way to go!

Thanks for reading, and check back for the next week's summary.

FBHC2017 Finals

There was one quite important competition that I forgot to mention two summaries back: Facebook Hacker Cup 2017 onsite finals took place on June 14 (problems, results, top 5 on the left, analysis, my screencast). In this round I have managed to make three different bugs in my solution to the second problem and the way those bugs combined led to my program passing the samples almost by accident, but of course it did not pass the final testing. On the bright side, not spending more time on this problem allowed me enough time to solve all other problems, so maybe the three bugs were actually a crucial component of the victory :)

Thanks for reading, and check back for the hopefully more complete next week's summary!

Saturday, July 15, 2017

A dynamic nimber week

The June 19 - June 25 week did not actually have any rounds that I'd like to mention, so let me turn back to the problem from the previous week's summary.

You are given an acyclic directed graph with n<=15 vertices and m arcs. Vertices 1 and 2 each contain a chip. Two players take alternating turns, each turn consisting of moving one of the chips along an arc of the graph. The player who can't make a valid move loses. We want to know which player wins if both play optimally. Now consider all 2m subsets of the graph's arcs; for how many of them does the first player win if we keep only the arcs from this subset?

Without the subset part, the problem is pretty standard. We need to compute the nimbers for all vertices of the graph, and the first player wins if and only if the nimbers for the first two vertices are different.

However, we do not have the time to even iterate over all subsets, and thus we can not apply this naive algorithm. Dynamic programming comes to the rescue, allowing to reuse some computations. The dynamic programming idea that is closest to the surface is: go in a topological ordering, and keep computing the nimbers. The nimber for a vertex depends on which arcs going from this vertex are included in the subset, and on the values of nimbers for the vertices reachable from this one, so our dynamic programming state should include only the values of nimbers for the already processed vertices, reducing the running time from 2m to something around the n-th Bell number. That is still not too little, but it turns out this approach could be squeezed to pass.

However, it turns out there is another beautiful dynamic programming idea that helps us move to the running time on the order of 3n. Instead of going vertex-by-vertex, we will now go nimber-by-nimber. For each consecutive nimber, we will try all possibilities for a subset a of vertices having this nimber. What are the requirements on such a subset? From the definition of nimbers we get:
  1. For each vertex in this subset, there must be an arc to at least one vertex with each smaller nimber.
  2. There must be no arcs within this subset.
Condition 2 is pretty easy to check, but condition 1 is not. However, we can notice that we can instead check:
  1. For each vertex that still has no assigned nimber (and thus will eventually have a higher nimber), there must be an arc to at least one vertex in this subset.
  2. There must be no arcs within this subset.
The new condition 1 guarantees that all higher nimbers will have arcs to all lower nimbers, so by the time we reach a certain nimber, we don't need to care about arcs to lower nimbers anymore. Now we can see that the state of our dynamic programming can simply be the last placed nimber and the set of vertices that do not yet have a nimber.

Finally, we can notice that the last placed nimber does not actually take part in the computations at all, so we can reduce the number of states n times more to just remembering the set of vertices that do not yet have a nimber, yielding overall complexity of O(n*3n).

Thanks for reading, and check back for the next week's summary!

An all-at-once week

Codeforces came back during the June 12 - June 18 week with its Round 419 (problems, results, top 5 on the left, analysis). Only Radewoosh and yutaka1999 could get all problems right, and Radewoosh has booked his (semi-) permanent place on the front page of Codeforces with this victory: he is now in top 10 by rating. Well done!

AtCoder Grand Contest 016 took place on the next day (problems, results, top 5 on the left, analysis, my screencast). It's not as if tourist needs an unusual strategy to win, but he successfully demonstrated that the AtCoder rules actually make it reasonable to withhold submissions until one has a solution for the last problem they intend to submit. As a theory, here's how this strategy might have helped here: maybe Gennady already had all solutions implemented by the 68-th minute, but he saw that I have an incorrect attempt for one of the problems, and he was not sure in his solution for problem D. So he submitted everything else, and started testing the solution for D more thoroughly, as he knew that he'd have five minutes after I solve my last problem to still get the first place. Gennady, is this theory at least remotely close to reality? :)

The hardest problem of the round presented a peculiar combination of dynamic programming and nimbers, one that I don't recall seeing before. You are given an acyclic directed graph with n<=15 vertices and m arcs. Vertices 1 and 2 each contain a chip. Two players take alternating turns, each turn consisting of moving one of the chips along an arc of the graph. The player who can't make a valid move loses. We want to know which player wins if both play optimally. The problem so far would be very simple, of course, so here comes the twist: consider all 2m subsets of the graph's arcs; for how many of them does the first player win if we keep only the arcs from this subset?

Thanks for reading, and check back soon for the next week's summary!

Sunday, July 9, 2017

A Dublin week

The June 5 - June 11 week was dominated by the main Google Code Jam elimination rounds.

First off, Code Jam Round 3 took place on Saturday (problems, results, top 5 on the left, analysis). The top 26 have qualified for the finals in Dublin, and kevinsogo was the only contestant to solve the very tricky last problem and still have time left for two more - congratulations on the first place!

I have contributed to the constructive problem trend with problem B: you are given a directed graph with at most n<=1000 vertices, and need to output any nowhere-zero flow in it with edge flows not exceeding n2 by absolute value. Seymour's theorem shows that we can actually make do with values between -6 and 6, but such frugality was not required :)

One day later, not just two, but 21 more tickets to Dublin were up for grabs in Distributed Code Jam Round 2 (problems, results, top 5 on the left, analysis). Solving everything was not required to qualify, but it was certainly required to get into the screenshot on the left. Congratulations to fagu on being the fastest!

Thanks for reading, and check back for the next week's summary!

Monday, July 3, 2017

A week**7

TopCoder SRM 715 was the first round of May 29 - June 4 week (problems, results, top 5 on the left, my screencast). It was nice to reduce the amount of 3am rounds thanks to my United States trip :)

The medium problem continued the "constructive" trend on TopCoder. You are given four numbers: k, n1, n2, n3. You need to construct a valid Towers of Hanoi configuration that requires exactly k moves to be solved, has n1 disks on the first rod, n2 on the second one, and n3 on the third one.

Yandex.Algorithm 2017 Round 3 wrapped up the week, and also completed the selection of the 25 finalists (problems, results, top 5 on the left, analysis, overall standings). Despite the addition of a marathon round which should theoretically be less correlated with the algorithm rounds, the finalist cutoff just increased more or less proportionally, from 32 points from 3 rounds last year to 40 points from 4 rounds this year. Congratulations to all finalists!

In my previous summary, I have mentioned a problem from Round 2 of the same competition: consider all sequences of balls of k<=15 colors, with exactly ai<=15 balls of i-th color, and no two adjacent balls of the same color. Let's arrange them in lexicographical order. What is the number of the given sequence s in this order?

Finding the number of s is equivalent to counting the number of sequences coming before s in lexicographical order. Coming before in lexicographical order, in turn, means that some prefix of the such sequence would be equal to the corresponding prefix of s, and the next number will be smaller than the corresponding number of s. That allows us to split our problem into 15 simpler subproblems, each looking like: how many sequences of balls of k colors exist, with exactly bi<=ai balls of i-th color, no two adjacent balls of the same color, and the first ball has color less than c, and not equal to d?

Here comes the main idea that I keep forgetting. Let's add balls into our sequence color-by-color. In order to not have adjacent balls of the same color in the end, it suffices to simply remember how many pair of adjacent balls of the same color we have. In other words, having placed some amount of colors, for a total of t balls, we have t+1 positions where the balls of the next color can be placed, and some of those positions are special: we must place at least one ball in that position eventually, to avoid having two adjacent balls of the same color in the final position. We need to remember just the number of special positions, and do not need to remember which ones exactly are special.

When placing a new color which has ai balls, we iterate over the number m of blocks of consecutive balls of this color we're going to have, and the number p of those blocks that will be inserted into special positions. Now we need to multiply several combination numbers (to choose p special positions, to choose m-p non-special positions, and to split ai balls into m non-empty blocks), and we also know the new number of special positions which changes by -p+(ai-m).

Finally, in order to deal with the requirements on the color of the first ball, we can start by processing the colors the first ball can be, and continue with the colors it can't be, and disallow placing balls into the first position on the second stage, which just reduces the number of available non-special positions by one.

Assuming we have k colors and at most a balls of each color, the running time of this approach is a product of:
  • k*a for iterating over the position of the first difference,
  • k for iterating over colors,
  • k*a for iterating over the number of special positions so far,
  • a for iterating over the number of blocks we form with the new color,
  • a for iterating over the amount of said blocks that go into special positions,
for a total of O(k3a4).

Thanks for reading, and check back for the next week's summary!

Tuesday, May 30, 2017

A 7-time week

ACM ICPC 2017 World Finals headlined the last week (problems, results, top 12 on the left, our stream, text analysis, video analysis). The ITMO team was leading for quite some time, but they did not manage to solve problem J in time, which gave a chance to the other teams. They did not take advantage of that chance, however, and ITMO became 7-time world champions. Congratulations!

Problem D, while being a bit on the "professional" side, was quite cute. You are given 500000 top-left corners and 500000 bottom-right corners on the plane, and need to pick one of each to obtain a valid rectangle with maximum possible area.

Here's its analysis video, in case you give up :)

AtCoder Grand Contest 015 took place on Saturday, when most World Finals contestants should have already got back home (problems, results, top 5 on the left, my screencast with commentary, analysis). When just two last problems remained, I went for the harder one, and almost got it (got accepted in upsolving after about 5 more minutes) - but not quite. sky58, on the other hand, chose the right strategy and won - congratulations!

Problem C allowed multiple different solutions, each with a non-trivial observation and thus quite exciting to get. You are given a set of blue cells on the 2000x2000 grid that forms a forest with regard to 4-connectivity, and 200000 queries. Each query asks: if we take a certain sub-rectangle of our grid, how many connected components of blue cells are there if we look just at that sub-rectangle?

A few hours later, Yandex.Algorithm 2017 Round 2 provided another chance to score place points towards qualification for the final round (problems, results, top 5 on the left, my screencast, analysis). Tourist threw all strategy considerations out of the window by solving all problems with 15 minutes remaining, while others have barely managed solve solve 5 out of 6. Amazing performance!

The solution to problem E relied on a standard idea which I forgot, so maybe explaining the solution in my blog will help me remember :) Here's what it was about: consider all sequences of balls of k<=15 colors, with exactly ai<=15 balls of i-th color, and no two adjacent balls of the same color. Let's arrange them in lexicographical order. What is the number of the given sequence in this order?

Finally, Codeforces held the online mirror of Helvetic Coding Contest 2017 which I have mentioned earlier (problems, onsite results, online results, online top 5 on the left, analysis). Congratulations to the sweet team on the victory (and their penalty time is better than ours from the onsite contest, too)!

In my previous summary, I have mentioned a TopCoder problem: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

Consider an arbitrary pair of vertices. If their distances to vertex 1 differ by at least 2, then we can't have an edge between them. The same is true for their distances to vertex 2. This is relatively easy to spot, but here comes the hard part: if neither of the above is true, in other words if both pairs of distances differ by at most 1, then we can assume to always have this edge in our graph. Because if we don't have it, we can add it and distances to vertices 1 and 2 will not be affected for the vertices we just connected, and thus for all other vertices as well.

Now, since for each edge we can determine whether we have it in our graph or not, all that remains is to construct the graph, and check if the distances to vertices 1 and 2 come out as expected.

Thanks for reading, and check back next week!

Wednesday, May 24, 2017

ACM ICPC 2017 World Finals stream

ACM ICPC 2017 World Finals start tomorrow at 9:00 local time. There will be quite a few ways to follow the event, the most prominent being ICPC Live. Together with tourist and Endagorion, we have decided to provide another perspective on this competition: we'll try to solve the problems as soon as we get the statements (the rumor has it, we'll be able to submit on Kattis as well), and will stream our screen and our discussions on Youtube! Tune in tomorrow around the time World Finals starts, although we might start a bit later.

We're not sure how this will work out, and would appreciate any advice in comments! One thing that we're not yet sure about is whether we should talk in English or in Russian. The latter should be more productive and thus more realistic, but will naturally be harder to follow for most of the audience. On the other side, the sound quality might be so bad that it won't even matter :)

Monday, May 22, 2017

An almost Rapid week

Last week was relatively calm, with just two competitions that I want to mention, both on Saturday. First, TopCoder Open 2017 Round 2A has significantly raised the stakes compared to Round 1, with just 40 top finishers qualifying (problems, results, top 5 on the left, my screencast). I have enjoyed the medium problem of this round the most, as it is quite rewarding to come up with an easy-to-code beautiful solution after wasting some time coding a very tricky one that comes to one's mind first. Especially rewarding step is removing all code written for the tricky solution (screencast position) :)

Here's what the problem was about: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

With just a few minutes break, Codeforces hosted its Round 415 (problems, results, top 5 on the left). With the only successful solution to problem E coming from a contestant with no other solved problems, it was the speed that decided the winner, and Radewoosh was almost half an hour faster than the rest. Congratulations!

In my previous summary, I have mentioned a Codeforces problem: you are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

First of all, shifting all labels by a constant does not change the answer, so let's pick a vertex A and say that its label is 0. Now, the labels for all other vertices are almost uniquely determined: it's not hard to see that for all vertices labeled not 0, the absolute value of the label is equal to the shortest distance from A. So, we just need to determine which sign will each label have, and which vertices (out of those at distance 1) will have label 0.

Here we can see that vertices at distance 1 from A are the most tricky part, so let's concentrate on them. We can assign three labels to them: let's say those with label 1 are set X, with label 0 are set Y, and with label -1 are set Z. By the problem statement, all those sets must be cliques, and additionally we must have all edges between X and Y, and between Y and Z, but no edges between X and Z.

Let's assume we have a representative B from X, and a representative C from Z. Then the label of each vertex can be determined trivially: if it's connected only to B, it's 1, only to C, then -1, to both, then 0.

It doesn't matter which representatives we pick - in fact, it's not hard to see that we need to pick any two vertices B and C that are connected to A but not between themselves. If we remember that A can also be picked freely, our goal now is to find a chain of two edges such that its endpoints are not connected.

And this, in turn, can be done like this: first, let's find any missing edge. The graph is connected, so there's a path between its ends. If this path is of length 2, we're found what we need. If it's longer, consider its next-to-last vertex. If it's connected to its first vertex, we've found what we need. If not, then we can remove the last vertex and obtain a shorter path such that its ends are not connected. By repeating the process, we will eventually find the required path of length 2.

Now we have solved the problem for vertices with labels -1, 0 and 1, but how do we determine the sign of the label for the remaining vertices? Well, for vertices with label 2/-2, we can use connectivity to any vertex with label 1 as the differentiating factor, and so on.

Finally, having determined all labels, we need to check if our graph does in fact correspond to those labels. The simplest way to do that seems to be: let's check that for all edges in our graph the difference between the labels of the ends is at most one, and also check that the total number of edges in our graph matches the total number of pairs of vertices with labels differing by at most 1. After the first check, the only way we can have an incorrect graph would be having not all required edges, and the second check takes care of that.

Thanks for reading, and check back here and in my Twitter for news from ACM ICPC World Finals this week!

Another speaking week

Just like the previous week, the fun of May 8 - May 14 week started on Thursday with a Codeforces round, this time with Playrix Codescapes Cup (problems, results, top 5 on the left, analysis). Even an incorrect submission for E could not stop tourist, as he still won thanks to solving problem G and having much more points than everybody else on F. Well done!

The next round of the week was also a named Codeforces round - this time Tinkoff Challenge Final Round (problems, results, top 5 on the left, analysis, my screencast with commentary). This time explaining everything aloud did not lead to a disastrous performance for me (finally!). Maybe the quality of explanations suffered :) V--o_o--V was still significantly faster, so congratulations on the victory!

Problem D was a nice exercise in discovering a reliable way to detect a relatively simple pattern. You are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

Later on Saturday, Google Code Jam Round 2 has narrowed the field to just 500 competitors (problems, results, top 5 on the left, analysis). Congratulations to jsannemo on the victory - quite impressive form for the KTH team before the upcoming World Finals, with this win and simonlindholm's win a week earlier.

Yandex.Algorithm 2017 Round 1 took place early on Sunday (problems, results, top 5 on the left, analysis, my screencast). Um_nik was flawless on the five easier problems and correctly noticed the fact that problem E was also, in fact, not very hard. Well done!

Just 80 minutes later, Russian Code Cup 2017 Elimination Round has revealed the 55 finalists (problems, results, top 5 on the left, analysis, my screencast). LHiC did not make any mistakes, and that turned out to be the key to getting the first place. Congratulations!

And finally, Distributed Google Code Jam Round 1 wrapped up the week (problems, results, top 5 on the left, analysis). mk.al13n was ten minutes faster than the rest of the pack in this still quite unusual format with parallel computations. Great job on the victory!

In my previous summary, I have mentioned an AtCoder problem: you are given two trees on the same set of vertices, one blue and one red. You want to convert the blue tree into the red one, step-by-step. At each step, you must take any path consisting of blue edges, add a red edge connecting its endpoints, but remove one of the edges of the path. After n-1 steps all blue edges will be removed, and n-1 red edges will be added, and you want those edges to form the required red tree.

The key idea in this problem is: let's look at the process from the end. Before the last step, we have just one blue edge connecting vertices, say, A and B, so our only option is to remove that edge and add a red edge connecting A and B. Now in the next-to-last step, we must either do the same, or make use of the last blue edge: for example, we can remove blue edge A-C, and add red edge B-C. After some staring at the paper, one can figure out what does this mean: first, we need to find an edge that is both blue and red for the last step, and then we need to contract it - unit its ends together into one vertex. Then, we need to find an edge that is both blue and red in the resulting graph (it might either be blue and red in the beginning, or be a result of a merge of different blue and red edges during the contraction), and contract it again, and so on until the graph is just one vertex.

Now it becomes clear that it doesn't really matter in which order we do the contractions, as they never make things worse. So we should just repeatedly perform any available contraction. There's some technical mastery involved in making the process run in O(n*polylog(n)) time, but that part is relatively standard and I don't want to focus on it too much. You can check the analysis for more details.

Thanks for reading, and check back soon for the last week's summary!

Sunday, May 21, 2017

A plus-minus week

The first contest in May was the Codeforces Round 411 (problems, results, top 5 on the left, analysis). This was one of those rare rounds where finding bugs in the solutions of others turned out to be a better strategy than solving more problems - but just barely. Congratulations to AlexDmitriev for finding 18 challenges and finishing less than one challenge above the second place!

And then, the weekend. First off, AtCoder Grand Contest 014 (problems, results, top 5 on the left, analysis, my screencast). The round seemed to have been decided in the first 45 minutes thanks to the amazing performance by w4yneb0t, but with just 20 seconds left simonlindholm overcame all tricks in the last problem and claimed the victory. Huge congratulations!

Problem E looked tedious at first, but coming up with the right idea helped make the implementation really easy. You are given two trees on the same set of vertices, one blue and one red. You want to convert the blue tree into the red one, step-by-step. At each step, you must take any path consisting of blue edges, add a red edge connecting its endpoints, but remove one of the edges of the path. After n-1 steps all blue edges will be removed, and n-1 red edges will be added, and you want those edges to form the required red tree.

In a few ours after the end of AGC 014, TopCoder Open 2017 Round 1C provided the last chance to qualify into Round 2 (problems, results, top 5 on the left). The results were not very surprising, with all contestants with a "red" rating who took part advancing to the next round. Nevertheless, the fight for the first place was extremely heated with several changes of the leader. In the end Baz93 has emerged on top thanks to 9 successful challenges, including one made 7 seconds before the end of the challenge phase. Congratulations!

On Sunday, TopCoder hosted a TopCoder Open 2017 Regional event in St Petersburg (problems, results, top 5 on the left, SRM 714 results with 2 same problems out of 3, my screencast). The medium problem (easy in SRM) was another example of extremely easy implementation after coming up with the right idea. You are given a string of at most 2500 parentheses which is a valid parentheses sequence. You repeatedly do the following: remove the first character of the string (which is always an opening parenthesis), and remove any closing parenthesis from the string. The remaining string must also be a valid parentheses sequence. You repeat this step until everything has been removed. How many ways are there to complete the entire process, modulo 109+7?

Finally, Codeforces hosted VK Cup 2017 Round 3 (problems, results, top 5 on the left, parallel round results, analysis, my screencast). This time it was another team of Moscow students who dominated the proceedings, with over 500 points margin. Congratulations to the sinful team!

In my previous summary, I have mentioned an interactive Open Cup problem. You are given six six-sided dice, each having some digits and mathematical symbols written on it, as follows:

Die 1: = < > != <= >=
Die 2: 4 + - ( ( )
Die 3: 0 / / / 8 +
Die 4: 2 3 4 5 - )
Die 5: + - * / 1 9
Die 6: 6 7 + - ( )

You need to pick a die, then roll it (by interacting with the judge program), and record the symbol that appears. Then you can do it again, using either the same or a different die, and so on, doing at most 1000 rolls. At some point you need to stop rolling, and form a correct mathematical expression (equality or inequality) by concatenating all recorded symbols in some order. No need to optimize anything - just build any correct expression after at most 1000 rolls.

There must be a ton of different approaches in this problem, as any valid expression suffices. The first idea lies on the surface: since the first die always gives us a comparison, and there must be exactly one comparison in each expression, we can start by rolling the first die once. This will give us the type of comparison we're building. In the remainder of the explanation I will assume we get "=", as that is the most tricky case - the rest can be handled in the same manner.

But then, things get not so easy. First, we should get enough symbols to form any syntactically valid expression: we should get the same amount of opening and closing parentheses, and the amount of operations we get should be two less (one for each side) than the amount of numbers we have (after the contest I've noticed that the numbers can be joined together to form multiple-digit numbers, but I did not notice this in the heat of the moment). Then, of course, we need to form not just syntactically valid expression but a correct equality.

The next idea is: the "correct" part is actually much easier than "syntactically valid" part. Assuming we have only digits and +/- signs, then we just need to split the digits into two groups with the same sum, and with enough different digits this is always possible.

Now, we need to find a way to get the right left-right parenthesis balance, and the right operation-digit balance. The two key dice that help accomplish that are, for example, die 3 and die 2. Let's assume we already rolled some dice, and we have more closing parentheses than opening parentheses. Then we can repeatedly roll die 2 until we get the right parentheses balance. After that, let's assume we have more digits than operations. Then we can repeatedly roll die 3 until we get the right balance (and this won't affect the parentheses).

So now we only need to prepare the ground for rolling dies 2 and 3 - we need to roll some dice in such a way that we have more closing than opening parentheses,  and much more digits than operations (so that even rolling die 2 a few times does not cancel that), and enough different digits and +/- operations to build our equality in the end. Die 3 also gives us "/" operations, but to avoid complications we'll just divide 0 by some numbers to get rid of those.

It's easy to see now that die 4 is exactly what we need. Let's roll it a few (say, 100) times. We will have a few (~16) closing parentheses, and a lot more digits than operations, and those digits will be from a wide variety, so we're guaranteed to get two equal sums. Now we also need a 0 to handle the divisions, so let's roll die 3 until we get it. After this we can switch to "die 2, then die 3" strategy described above to get the right balances, and finally form our equation.

How did your team solve this problem? Maybe there's a simpler approach?

Saturday, May 13, 2017

A week modulo 3

TopCoder SRM 713 opened the last week of April (problems, results, top 5 on the left). Once again, nobody has managed to solve the hard problem. Endagorion was the fastest on the first two, and defended his lead during the challenge phase with a +50. Well done!

Yandex.Algorithm 2017 Qualification Round was available as a virtual contest for the whole Saturday (problems, results, top 5 on the left, analysis). Egor has dominated the round thanks to very fast problem solving, and the most appropriate strategy: get one problem accepted in the open to ensure qualification, and then submit the rest blindly to minimize the penalty time. Very well executed!

Russian Code Cup 2017 Qualification Round 3 also took place on Saturday (problems, results, top 5 on the left, analysis). -XraY- fought back after missing the top 200 in the first qualification round and solved all problems cleanly and so fast that his penalty time is more than 2x smaller than that of the second place - amazing!

The final Open Cup round of the 2016-17 season took place on Sunday, the Grand Prix of Ural (results, top 5 on the left, overall Open Cup results, top 5 on the left). Team Past Glory did not really leave much doubt as to who would win this round, solving 12 problems 1.5 hours before the end of the contest, and having all the time in the world to solve the tricky geometric problem F. Congratulations on the victory!

Problem E was a great example of a non-obvious problem that still requires almost pure creativity to solve, instead of mathematical theorems, algorithms or data structure knowledge. It is an interactive problem. You are given six six-sided dice, each having some digits and mathematical symbols written on it, as follows:

Die 1: = < > != <= >=
Die 2: 4 + - ( ( )
Die 3: 0 / / / 8 +
Die 4: 2 3 4 5 - )
Die 5: + - * / 1 9
Die 6: 6 7 + - ( )

You need to pick a die, then roll it (by interacting with the judge program), and record the symbol that appears. Then you can do it again, using either the same or a different die, and so on, doing at most 1000 rolls. At some point you need to stop rolling, and form a correct mathematical expression (equality or inequality) by concatenating all recorded symbols in some order. No need to optimize anything - just build any correct expression after at most 1000 rolls.

How would you approach this problem?

In keeping with the new trend of having multiple competitions at the same time, Google Code Jam 2017 Round 1C took place in the middle of the Open Cup (problems, results, top 5 on the left, analysis). It was EgorKulikov who followed Eryx's strategy from Round 1A this time, submitting the super tricky hardest problem much earlier than everybody else. Congratulations on the victory!

In my previous summary, I have mentioned another Open Cup problem: construct a completely multiplicative function f(x) such that f(x)=1 or -1, f(a*b)=f(a)*f(b), and that for each n between 1 and 1000000 the prefix sum f(1)+f(2)+...+f(n) does not exceed 20 by absolute value.

When trying to solve it, I was approaching it in the way that many recent "constructive" problems were meant to be solved: try a few things on the computer, maybe do some local optimizations, and it should work. However, I did not manage to find success on this path.

It turns out that there is a solution that can be easily done on paper without using any computer time. Let's define f(3k+1)=1, f(3k+2)=-1, f(3k)=f(k). The multiplicativity follows from the fact that powers of 3 do not impact the value of the function, and numbers 1 and 2 modulo 3 are the same as 1 and -1 modulo 3. Finally, almost all numbers in the prefix sum cancel out: if we look at positions not divisible by 3, 1 and -1 alternate, so the prefix sum of those positions is always 1 or 0. For positions divisible by 3, but not by 9, the same argument applies, and so on; thus each prefix sum will never exceed log31000000 (one for each power of 3), which is less than 20.

Thanks for reading, and check back soon for the last week's summary!