Gassa's blog

By Gassa, history, 6 years ago, In Russian

Hi everyone!

The main phase of the VRt Contest 2019 marathon is over. In the preliminary results, five contestants managed to score more than 60 000 points. The battle went on until the very last day. Who will win on the final tests? We will find out once the final testing is complete. (Update: the results are in a separate comment.)

I invite the participants to describe the ideas behind their solutions in the comments below, or in separate posts if a comment would turn out too long.


»
6 years ago, # |
Rev. 2   +47

Thanks for the contest!

I think a lot of people came up with the same solution as mine, because it is simple. But I'll briefly describe it anyway.

The basic idea: create groups of workers, and at each step choose as the group's next location one that requires exactly as many people as there are in the group.

Let's write down all locations (except the base) in one vector. In a loop from 1 to 7 (the number of workers in the group), we build an array of strings: the sequence of actions of this group. We start from the base. While there are unfinished locations, we run the function that searches for the next location (more about it a bit later), go there, and update all values (the group's coordinates, time, answer, etc.). At the same time, we compute the profit that this group has brought. If at some point it has nowhere to go, we proceed to create a new group with the same number of workers. We also look at the computed profit: if it is negative, we do not output the actions of that group.

Selecting the next location. The most obvious approach is to act greedily: for every possible job, compute the time when the group would arrive at that location, and choose the smallest such time. Result: about 55k points.

This can be improved by taking into account more than just the arrival time when choosing the next job. I also considered the position offset relative to the base (the distance from the new location to the base minus the distance from the previous location to the base), the length of the path already covered by the group, and the time of completion of work at this location (h in the statement). All of this went into one formula together with the arrival time. As a result, the task was reduced to choosing the coefficients in front of these variables in the final formula. Since the best coefficients differed between inputs, I iterated over them in a loop and looked at where the algorithm gives the best answer.

My final formula: arrive_time - (delta * (len - magic2) - one[i].h * magic3 / 3) / magic

  • delta: the position offset relative to the base;
  • len: length of the path;
  • one[i].h: time of completion of work at this location;
  • magic, magic2, magic3: the coefficients ^_^
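A minimal sketch of this scoring rule in C++; the Job struct, the helper names, and the coefficient values used in main are placeholders for illustration, not the ones from the actual submission:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical job description; the field names are illustrative.
struct Job {
    int x, y;   // location
    int h;      // h from the statement: time of completion of work at this location
};

int manhattan(int x1, int y1, int x2, int y2) {
    return abs(x1 - x2) + abs(y1 - y2);
}

// Score of sending the group to this job next, following the formula above.
// Lower is better; magic, magic2, magic3 are the tuned coefficients.
double score(const Job& job, int groupX, int groupY, int baseX, int baseY,
             double curTime, double len, double magic, double magic2, double magic3) {
    double arriveTime = curTime + manhattan(groupX, groupY, job.x, job.y);
    // delta: distance from the new location to the base
    //        minus distance from the current location to the base
    double delta = manhattan(job.x, job.y, baseX, baseY)
                 - manhattan(groupX, groupY, baseX, baseY);
    return arriveTime - (delta * (len - magic2) - job.h * magic3 / 3) / magic;
}

int main() {
    vector<Job> jobs = {{10, 20, 200}, {30, 5, 150}, {7, 7, 300}};
    int gx = 0, gy = 0, bx = 0, by = 0;     // group and base positions
    double curTime = 0, len = 0;            // current time and path length so far
    int best = -1;
    double bestScore = 1e18;
    for (int i = 0; i < (int)jobs.size(); i++) {
        double s = score(jobs[i], gx, gy, bx, by, curTime, len, 50, 10, 5);
        if (s < bestScore) { bestScore = s; best = i; }
    }
    cout << "next job: " << best << "\n";
}
```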

For more details: My submission. I commented everything in great detail there!

»
6 years ago, # |
  +25

First of all, thanks for the contest, I really enjoyed it.

I used a very simple greedy solution to solve the problem.

The idea is to divide the workers into groups of size $$$7$$$. I find the path for the first worker; while choosing the next location, I always take the location that can start first, taking into consideration the start time of the job and the current time plus the time to reach that location. After the first worker of the group is done, there will be some gaps left at locations that are still not finished; I try to fill these gaps with jobs that need at most $$$6$$$ workers, and so on until all these jobs are done for this group. This way I guarantee that every job is worked on by all of its workers simultaneously.

Every time I use a new worker, I calculate the profit to check whether using this worker gives extra profit or not. The idea could be improved a lot, but unfortunately I was busy all week with other stuff.
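A minimal sketch of the "take the location that can start first" rule described above, assuming each job carries its location and the earliest allowed start l[i] from the input (names are illustrative):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical job record; field names are assumptions.
struct Job {
    int x, y;     // location
    int l;        // earliest allowed start of the work (l[i] from the input)
    bool done;    // already handled by this group?
};

// Among the remaining jobs, pick the one the group can start working on first:
// start time = max(arrival time, l[i]). Returns -1 if nothing is left.
int pickNext(const vector<Job>& jobs, int curX, int curY, long long curTime) {
    int best = -1;
    long long bestStart = LLONG_MAX;
    for (int i = 0; i < (int)jobs.size(); i++) {
        if (jobs[i].done) continue;
        long long arrive = curTime + abs(curX - jobs[i].x) + abs(curY - jobs[i].y);
        long long start = max(arrive, (long long)jobs[i].l);  // may have to wait until l[i]
        if (start < bestStart) { bestStart = start; best = i; }
    }
    return best;
}

int main() {
    vector<Job> jobs = {{5, 5, 300, false}, {2, 9, 250, false}, {8, 1, 500, false}};
    int idx = pickNext(jobs, 0, 0, 0);
    cout << "first job for this group: " << idx << "\n";
}
```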

You can check the code

»
6 years ago, # |
  +35

Ok, probably this comment will be a bit rude, but I want to share my feelings about this task and ask if some of you feel the same.

I saw an announcement about an optimization task on Codeforces, so I was happy because I usually like them. Let's read the first line of the statement.

Consider several locations in integer points on the plane. One of the locations is the base where workers wait. Every other location contains one job.

"WOW! Such an original and new idea for an optimization task! I've never ever seen any task about exactly the same thing!"

Reading the next few lines convinced me that the organizers just wanted to make the task look bad in my eyes. I know that the company wanted the task to resemble the problems they solve every day, but still, it's a really bad idea to give yet another instance of the task everybody thinks of when they hear the phrase "optimization task". Seriously, the statement looks like it could have been invented in one minute. Also, it would probably have been possible to make some other task that would still relate to their work.

If I'm mistaken and there are any nice plot twists in this task, please, tell me about them.

Sorry for the rude comment and offensive words, but I hope it's rude enough to make tasks about moving workers and doing jobs on a plane stop appearing.

Best regards, usually calm Radewoosh.

  • »
    »
    6 years ago, # ^ |
    Rev. 4   +34

    If somebody is not convinced yet, here are the examples that come to my mind:

    yandex.algorithm (OK, this one has a plot twist, so it isn't so bad)

    marathon match

    hashcode (worst possible)

    bubblecup (a little bit different, but still delivering packages on a plane)

    more hashcode (how not to hate this contest?)

    I'm sure that there are many more such tasks.

    • »
      »
      »
      6 years ago, # ^ |
        +64

      Well, although I agree that the general idea of the problem is similar to the other problems you listed, IMHO there were details, as pointed out by Gassa, that made it very interesting.

      In fact, I really liked this contest, found the problem statement very clear and parameters seemed wisely chosen.

      Thanks Gassa and everyone else involved! Looking forward to the next marathon!

  • »
    »
    6 years ago, # ^ |
      +97

    Thanks for the comment.

    I agree the setting is one of the standard ones. However, there's a twist which also made its way into the problem title: the workers have to collaborate at exactly the same time to complete the jobs. It may not seem really important, right until you try some classic approach which involves local changes. Then, each local change you try to make creates a chain reaction of changes. To avoid that, you can either restrict your solution (e.g., have $$$7$$$ groups of workers, where workers from the $$$k$$$-th group only collaborate in groups of $$$k$$$), or restrict local changes (and get stuck in a local optimum more often), or try an approach that avoids them altogether (here, I'll wait for the very top performers to share their thoughts).

    Other parts of the problem are simplified, like the graph being the plane with Manhattan distances, or the workers being exactly the same (fungible). In retrospect, I see how this actually makes the problem feel more standard. But it also helps to focus on the twist, and definitely helps to get the first solution running.

    You are of course within your right to dislike yet another problem in this setting, and not take part. I'm certain there will be more contests with optimization problems, some of which will surely interest you more.

»
6 years ago, # |
  +27

What does the t-shirt look like?

»
6 years ago, # |
  +8

How is it possible that my solution passed a hundred pretests and later got WA on test 1? This is strange, and my score didn't change at all.

  • »
    »
    6 years ago, # ^ |
      +8

    Yeah, I saw your solution getting 0 points! This happened to me in the last marathon round because my last submission to the problem was a solution with a much lower score, but they still considered it.

    They only consider the last solution you submitted that got a positive score (it's written in the blog). But it shows WA on test 1, which is definitely weird!

  • »
    »
    6 years ago, # ^ |
      0

    Sorry, this will be fixed!

    The reason is as follows. A couple days before the deadline, a user reported a bug in the checker. After a rejudge, yours was the only solution that was affected. I had an impression you resubmitted a working one after that, but apparently this is not the case. My bad, and sorry again!

    As per the rules, your last solution which gets a positive score will be run on the final test set.

    • »
      »
      »
      6 years ago, # ^ |
        +8

      Okay. Thanks.

    • »
      »
      »
      6 years ago, # ^ |
      Rev. 2   0

      It seems that my last solution without the bug got somewhere around 5000 points in the contest. I had spent so much time optimizing my solution. Sad as it is. Still, thanks for a nice contest, I love participating in marathons. UPD: it seems 53000 was the pretest score; now my solution gets around 530000.

»
6 years ago, # |
Rev. 2   +31

The contest was amazing. I couldn't give it as much time as I wanted due to university exams, but I really enjoyed it!

My approach:

I used a greedy approach in which I would take the task with the minimum possible start time (l[i] in the input) and then greedily select the next best possible job from there (the parameter used to find the next best job was the time taken to reach the new location from the previous one, together with the allowed starting time). When there are no more jobs left to take, I would take the currently selected jobs and assign them to a new worker.

So I used a priority queue to decide which is the best job to start with; I would take the first one as the starting job and continue normally. This solution only takes 30 ms, and since I was not using the available time, it struck me that at any point there may be more than one best job to select. So I ran the complete process around 100 times, each time selecting a random best job, and finally took the solution with the maximum profit.
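A rough sketch of this restart loop with a fixed seed. Here buildPlan is only a stand-in for the full greedy construction with random tie-breaking among the best candidate jobs; the Plan contents and profit computation are placeholders:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Placeholder for one greedy construction run, as described above:
// pick a random "best" starting job, then extend greedily.
struct Plan {
    long long profit = LLONG_MIN;
    // the actual schedule (worker routes, start times) would live here
};

Plan buildPlan(mt19937& rng) {
    Plan p;
    // ... greedy construction with random choices among equally good jobs ...
    p.profit = uniform_int_distribution<long long>(0, 1000)(rng);  // stand-in value
    return p;
}

int main() {
    mt19937 rng(690069);                       // fixed seed, as in the comment
    Plan best;
    for (int iter = 0; iter < 100; iter++) {   // ~100 restarts fit in the time limit
        Plan cur = buildPlan(rng);
        if (cur.profit > best.profit) best = cur;
    }
    cout << "best profit: " << best.profit << "\n";
}
```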

Some tricks that helped me were:

1) Since using only one parameter for job selection was not giving good results, I made multiple scoring functions, each using a different parameter, because for different test cases different parameters can be best. I then called each of them an equal number of times, with randomness.

2) I was initially taking the time as the seed for my random values, but then I figured it was better to use a fixed seed, so I took my all-time favorite seed, 690069.

3) I found this trick in the early stages of the contest: it is always better to first finish all the jobs whose worker requirement is greater than 3 (4, 5, 6, 7). Each time we select a worker for a job, the required number decreases by 1; once it becomes less than 4, I do not touch the job anymore, and only at the end do I complete all the jobs whose worker requirement is still greater than 0.

One thing I really wanted to know: did anyone skip jobs in their algorithm? I was never able to come up with a strategy that involved skipping jobs!

»
6 years ago, # |
Rev. 4   +44

Thanks for the contest, it's very interesting!

Here is my solution:

  1. a greedy constructor to give an initial solution
  2. fix the start times of the locations and optimize the worker routes directly on the objective function; model this sub-problem as a minimum-cost flow.
  3. fix the worker routes and optimize the start times of the locations; this can be solved nearly optimally using just shortest paths.
  4. iterate 2 and 3

By start time I mean the time when the workers start to work at each location.

This scores 586064.895 using my own constructor. I tried C137's constructor, and it would probably give a boost of about 30000 according to my local tests.

=============

I think the idea of step 2 is interesting and I would like to briefly describe it:

Consider a digraph G whose vertices are all the locations, where an arc <u,v> in G means that a worker goes to location v after location u. Then each location's collaboration constraint turns into a constraint on the indegree of its vertex.

Since the start times of the locations are fixed, we can construct G with all legal arcs. The problem is then to use some paths to satisfy all indegree constraints, where each path costs 240, and each vertex used as the start or end of a path also costs something.

This can be transformed into a minimum-cost flow problem on a bipartite graph. Some pruning of arcs is needed, otherwise any MCF algorithm will be too slow.
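A small sketch of how the legal arcs of G could be built once the start times are fixed; the Job fields are assumptions, and the MCF solver that would run on top of this bipartite graph is not shown:

```cpp
#include <bits/stdc++.h>
using namespace std;

// With fixed start times, an arc u -> v is legal if a worker who finishes job u
// can reach job v no later than v's start. Field names are assumptions.
struct Job {
    int x, y;       // location
    int start;      // fixed start time of the work at this location
    int duration;   // how long the work takes
};

int main() {
    vector<Job> jobs = {{0, 0, 100, 30}, {5, 5, 200, 40}, {1, 9, 400, 20}};
    int n = jobs.size();
    vector<pair<int, int>> arcs;   // legal arcs of the digraph G
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++) {
            if (u == v) continue;
            int finish = jobs[u].start + jobs[u].duration;
            int travel = abs(jobs[u].x - jobs[v].x) + abs(jobs[u].y - jobs[v].y);
            if (finish + travel <= jobs[v].start) arcs.push_back({u, v});
        }
    // These arcs form the bipartite graph (jobs as predecessors vs. jobs as successors)
    // on which the min-cost-flow / assignment step would then be run; every indegree
    // slot not covered by an arc costs a fresh worker coming from the base.
    for (auto& a : arcs) cout << a.first << " -> " << a.second << "\n";
}
```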

  • »
    »
    6 years ago, # ^ |
      0

    Phronesis, jetblack20, when you say that the score will increase by 30000 and 7000, what exactly do you mean? The displayed score is divided by 1000. So does it mean it will actually increase by 30 and 7, or really by 30000 and 7000? Because if it actually increased that much, your scores would be around 88000.000 and 66000.000, which is better than anyone's score!

    • »
      »
      »
      6 years ago, # ^ |
        +3

      In my case, I expect the final score to be 598962.869 + 7000, which equals 605962.869. There are ~1000 datasets in the final grading round, so the division by 1000 gives the average for a single dataset.

    • »
      »
      »
      6 years ago, # ^ |
        +3

      I would expect ~586k + 30k, since that estimate refers to the final score.

»
6 years ago, # |
Rev. 3   +22

Thanks, it was very fun to participate in the contest!

I used an approach based on local search with simulated annealing.

My approach is pretty simple:

1) Initialize each location with a set of neighbours (~40). Each neighbour should not be far away from the current location.

2) Assign a fixed initial starting time to each location.

3) Find an initial feasible solution with a simple greedy algorithm.

4) Perform one of two local modifications for a randomly chosen location:

a) With probability 0.2, change the starting time of the location by an amount randomly chosen from the interval [-20, 20]. If some constraints are violated after that change, fix them by introducing new workers.

b) With probability 0.8, push some workers from the current location to a randomly chosen neighbour. If constraints are violated, try to fix them by reconnecting the modified workers. If this fix fails, satisfy the constraints by introducing new workers.

Repeat step 4 using simulated annealing.
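A minimal sketch of such a simulated-annealing driver. The two mutation routines and the scoring inside them are placeholders for the modifications described above, and the cooling schedule values are assumptions:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Solution {
    double score = 0;
    // locations, their start times, and worker assignments would live here
};

// Placeholder for modification (a): shift one location's start time by [-20, 20],
// then repair violated constraints by introducing new workers.
Solution shiftStartTime(const Solution& s, mt19937& rng) {
    Solution t = s;
    t.score += uniform_real_distribution<double>(-1, 1)(rng);  // stand-in effect
    return t;
}

// Placeholder for modification (b): push workers to a random precomputed neighbour,
// reconnect the modified workers, or add new workers if that fails.
Solution pushWorkers(const Solution& s, mt19937& rng) {
    Solution t = s;
    t.score += uniform_real_distribution<double>(-1, 1)(rng);  // stand-in effect
    return t;
}

int main() {
    mt19937 rng(12345);
    Solution cur;                        // imagine the greedy initial solution here
    double T = 1.0, cooling = 0.99999;   // schedule values are assumptions
    uniform_real_distribution<double> coin(0.0, 1.0);
    for (int iter = 0; iter < 1000000; iter++, T *= cooling) {
        Solution next = (coin(rng) < 0.2) ? shiftStartTime(cur, rng)
                                          : pushWorkers(cur, rng);
        double delta = next.score - cur.score;
        // accept improvements always, worsenings with the usual SA probability
        if (delta > 0 || coin(rng) < exp(delta / T)) cur = next;
    }
    cout << "final score: " << cur.score << "\n";
}
```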

There was a simple improvement that could have given an additional ~7k points to my final solution, but unfortunately I figured it out too late: simply increase the probability of picking the "bad" modification that involves the time change.

»
6 years ago, # |
Rev. 2   +63

First of all, thanks for the contest; the wisely chosen parameters made it very challenging for me.

Initially I solved the task with the following restrictions:

  • workers are assigned to groups of size K
  • all workers in a group move simultaneously
  • a group of size K does jobs which require K workers.

I stored only the list of jobs (the path) for each worker; the optimal start time of every job can be calculated easily given the restrictions above.

So, my first solution was very classic: greedy construction + optimization with simulated annealing. The optimization part worked well; the results were almost the same when I started from a very stupid initial construction, which showed that the greedy construction part doesn't matter much.

The optimization consisted of 3 kinds of moves:

Let the string abcdefghijk represent a path, each letter being a job (a code sketch of these moves follows the list).

  • move a job from one path to another
    abcdefghijk -> abcdeBfghijk
    ABCDEFGHIJK -> ACDEFGHIJK
  • swap tails
    abcdefghijk -> abcdFGHIJK
    ABCDEFGHIJK -> ABCDEefghijk
  • swap two jobs
    abcdefghijk -> aBcdefghijk
    ABCDEFGHIJK -> AbCDEFGHIJK
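A small sketch of these three moves on paths stored as vectors of job ids; feasibility checks and rescoring are omitted:

```cpp
#include <bits/stdc++.h>
using namespace std;

using Path = vector<int>;   // a group's path = ordered list of job ids

// Move the job at position i of path a to position j of path b.
void moveJob(Path& a, Path& b, int i, int j) {
    int job = a[i];
    a.erase(a.begin() + i);
    b.insert(b.begin() + j, job);
}

// Swap the tails of two paths, starting at positions i and j.
void swapTails(Path& a, Path& b, int i, int j) {
    Path tailA(a.begin() + i, a.end()), tailB(b.begin() + j, b.end());
    a.erase(a.begin() + i, a.end());
    b.erase(b.begin() + j, b.end());
    a.insert(a.end(), tailB.begin(), tailB.end());
    b.insert(b.end(), tailA.begin(), tailA.end());
}

// Swap the single jobs a[i] and b[j].
void swapJobs(Path& a, Path& b, int i, int j) { swap(a[i], b[j]); }

int main() {
    Path a = {0, 1, 2, 3}, b = {10, 11, 12, 13};
    swapTails(a, b, 2, 1);   // a = {0, 1, 11, 12, 13}, b = {10, 2, 3}
    for (int x : a) cout << x << ' ';
    cout << "| ";
    for (int x : b) cout << x << ' ';
    cout << "\n";
}
```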

Up to this point it looks like a standard solution for a standard optimization problem, but... this solution had an upper bound of about 60xxx points. It seemed reasonably good until solutions with 61k+ and 62k+ scores started to appear.

For a long time I did not find an elegant way to handle constructions with fewer restrictions.

The turning point.

In the second week I found that if I deceive the workers and pretend that all jobs with p=6 have p=7, then surprisingly the scores begin to improve (p = the number of workers required for a job). At first glance it's a waste of resources, because we are using 7 workers even for 6-worker jobs, but on the other hand we have more possibilities for constructing paths, and the gain compensates for the wasted resources. Besides, we obtain some free capacity (while a 7-worker group is doing a p=6 job, one of the workers can do some p=1 jobs in the meantime).

In my final solution I did all p>=4 jobs with groups of size 7, then tried to insert p<=3 jobs into the free capacity. The remaining 2<=p<=3 jobs were done similarly with groups of size 3, with p=1 jobs inserted into them.

That's all.

UPD.

The code.

  • »
    »
    6 years ago, # ^ |
      0

    Hey T1024, about the three optimizations you mentioned,

    move a job from one path to another, swap tails, and swap two jobs:

    on what basis did you perform any of the above moves?

    Thanks

    • »
      »
      »
      6 years ago, # ^ |
        0

      Not sure that I understand your question correctly.

      I chose the type of move randomly: the first two with probability 0.4 each and the last with 0.2.

      • »
        »
        »
        »
        6 years ago, # ^ |
          0

        Sorry if the question was unclear, but I got the answer! So were those probabilities found by trial and error, or was there a specific way you arrived at them?

»
6 years ago, # |
Rev. 2   +14

I finished 5th, but the top was way out of my league.

The code is here (more for completeness than for reading, it's a mess).

I start by generating an initial solution: each job is transformed into a JobPackage. Such a package contains a list of jobs (initially just one single job) and an earliest and latest possible starting time. Then I merge job packages: take two packages with a small distance (from the last job of package A to the first job of package B) and the same or a similar number of workers, and append one to the other if possible. This again results in an earliest and latest possible starting time, but not a fixed time in most cases, leaving some flexibility. I complete all jobs, not skipping any (I even tried to remove some jobs from the input by hand; it didn't get any better by skipping). In a first run I take all jobs with 4+ workers and merge those only (giving a small penalty for a different number of workers). In a second run I also merge the jobs with fewer than 4 workers, with a higher penalty for different worker counts.
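A rough sketch of what such a job package and the append check might look like. The struct fields are guesses, and the sketch keeps the merged schedule rigid (the second package starts exactly when the workers arrive), ignoring possible waiting:

```cpp
#include <bits/stdc++.h>
using namespace std;

// A package of consecutive jobs with a flexible starting window; names are guesses.
struct JobPackage {
    vector<int> jobs;        // job ids, in order
    int workers;             // workers required
    int earliest, latest;    // earliest / latest possible start of the first job
    int duration;            // time from starting the first job to finishing the last
    int lastX, lastY;        // position after the last job
    int firstX, firstY;      // position of the first job
};

// Try to append package b after package a; returns false if no common schedule exists.
bool tryAppend(JobPackage& a, const JobPackage& b) {
    int travel = abs(a.lastX - b.firstX) + abs(a.lastY - b.firstY);
    int offset = a.duration + travel;               // start of b relative to start of a
    int lo = max(a.earliest, b.earliest - offset);  // feasible window for the start of a
    int hi = min(a.latest, b.latest - offset);
    if (lo > hi) return false;                      // windows do not overlap
    a.earliest = lo;
    a.latest = hi;
    a.duration = offset + b.duration;
    a.jobs.insert(a.jobs.end(), b.jobs.begin(), b.jobs.end());
    a.lastX = b.lastX;
    a.lastY = b.lastY;
    return true;
}

int main() {
    JobPackage a{{1}, 3, 100, 300, 50, 0, 0, 0, 0};
    JobPackage b{{2}, 3, 200, 500, 60, 5, 5, 5, 5};
    cout << (tryAppend(a, b) ? "merged" : "infeasible") << "\n";
}
```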

I planned to split each task with more than 4 workers in two: one with 4 and one with the rest. That would double the amount of matching partners with 1-3 worker jobs and focus more on optimizing the jobs that require a lot of workers. Somehow it didn't work out, maybe I screwed up the implementation. Keeping track of the first and last possible starting time was a pain.

This initial solution gives a score around 907 for the 9th testcase and takes about 1 second to finish.

Then I keep mutating my initial solution randomly as long as I have time remaining. First of all I generate the independent workers from the job packages created above. I do several minor mutations with these workers:

  • Combining workers by appending the tasks of one work to those of another.
  • Take workers with very few jobs and try to assign all their jobs to different workers (possibly changing the starting time in that process), reducing the overall worker count
  • Take a job, which is the first or last for some workers at least. Find other workers which can also do the job for the same or less cost (also changing the starting time here)
  • Set all starting times of jobs which aren't the last job for any worker to the latest time possible. Then do the same in the opposite direction (set an early start). This helps to reduce the time workers spend waiting when they could actually start a job.
  • Passing jobs from one worker to another and doing a crossover (one worker keeps its first few jobs, but then continues with the last jobs of another worker instead, and the other worker gets the remaining jobs). Here a waiting worker is preferred over a travelling one (if both give the same score), as waiting periods can be filled with other jobs in later mutations.

These mutations improve the score of test 9 to about 965 (983 offline when running for more than 15 s; there is not much improvement after 15 s for smaller testcases such as test 1).

My biggest issue is that I didn't find a clever way to change the order in which jobs have to be completed. It would lead to a huge chain of changes.

I liked the contest and saw it as a challenge to climb the ladder. The problem didn't seem similar to others I've tried so far, but I'm more into bot programming and fighting other bots than into optimization. The leaderboard was very stable and kept its ordering after the rerun. Thumbs up for this.

Plots of a solution to test 9: graph, map.

»
6 years ago, # |
  +8

Gassa, hey, when will the system tests finish, so that we can see the final results and know whether at least a T-shirt is confirmed! xD

»
6 years ago, # |
  +11

Gassa, will you also make the submissions from the contest public, so that we can read others' solutions? (Especially those of the top contestants who won't describe their approach in this thread or anywhere else.)

»
6 years ago, # |
  +35

Final results (top 25):

Place Contestant Score Language
1 T1024 639881.469 GNU C++14
2 Rafbill 636292.029 GNU C++17
3 wleite 623724.206 Java 8
4 tamionv 615478.737 GNU C++17
5 eulerscheZahl 608706.631 Mono C#
6 AnlJ8vIySM8j8Nfq 606391.908 Java 8
7 katana_handler 604072.087 GNU C++17
8 NighTurs 603652.452 GNU C++17
9 ckr 601043.658 GNU C++17
10 Nagrarok 600503.406 GNU C++14
11 MiriTheRing 599614.750 GNU C++17
12 jetblack20 598962.869 Java 8
13 karaketir16 596502.335 GNU C++17
14 iehn 595796.071 GNU C++11
15 C137 590415.267 GNU C++14
16 seirg 587567.206 GNU C++17
17 Artmat 587504.727 GNU C++17
18 Phronesis 586064.895 GNU C++14
19 Guty 584270.633 GNU C++14
20 reeWorld 582740.133 GNU C++17
21 IdeaSeeker 582346.050 GNU C++17
22 Lucerne 581810.559 GNU C++11
23 Osama_Alkhodairy 579074.751 GNU C++17
24 nishantrai18 578640.706 GNU C++11
25 Degalat57 576980.039 GNU C++17

Congratulations to the winners!

»
6 years ago, # |
  +82

My solution, differently from T1024's, didn't rely on groups of workers. It also didn't use SA, just a greedy allocation, repeated several times with different (random) starting jobs and other parameters.

I had a heuristic formula to estimate the number of workers to be used (mainly based on the total number of jobs). It allocates workers to jobs, picking the earliest available jobs, but randomly skipping a few of them. For example, if jobs are sorted by the time they are allowed to start, it could pick 1, 5, 7, 13, 20..., until the number of workers matches the number estimated before. After that initial allocation, the solution keeps track of "resources" (workers available at a given position and time, when their last task was completed). Then it finds the next job to be executed, testing all pending jobs. Several heuristics were used here, but the main idea was to pick the job that would be able to start earliest and use the workers that would "lose" the least time (which includes the travel distance and possibly a "wait time", as workers can come from different locations and we need all of them to start). If, on the way to the chosen next job, workers can complete another job without delaying the original one, they will do that.
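A small sketch of the "resources" bookkeeping and of choosing workers for a job so that little time is wasted; the struct and the selection rule are simplified assumptions, not the exact heuristics used:

```cpp
#include <bits/stdc++.h>
using namespace std;

// A "resource": a worker idle at (x, y) since time freeAt. Names are assumptions.
struct Resource { int x, y, freeAt; };

// For a job at (jx, jy) that may start at l and needs `need` workers, pick the
// workers with the smallest arrival times and return the resulting start time.
// Returns {start time, indices of chosen workers}; start = INT_MAX if impossible.
pair<int, vector<int>> tryJob(const vector<Resource>& res, int jx, int jy, int l, int need) {
    vector<pair<int, int>> arrivals;   // (arrival time, worker index)
    for (int i = 0; i < (int)res.size(); i++) {
        int arrive = res[i].freeAt + abs(res[i].x - jx) + abs(res[i].y - jy);
        arrivals.push_back({arrive, i});
    }
    if ((int)arrivals.size() < need) return {INT_MAX, {}};
    sort(arrivals.begin(), arrivals.end());
    vector<int> chosen;
    int start = l;
    for (int k = 0; k < need; k++) {            // the job starts when the last of the
        start = max(start, arrivals[k].first);  // chosen workers arrives, and not before l
        chosen.push_back(arrivals[k].second);
    }
    return {start, chosen};
}

int main() {
    vector<Resource> res = {{0, 0, 10}, {3, 4, 0}, {9, 9, 5}};
    auto [start, who] = tryJob(res, 2, 2, 20, 2);
    cout << "job can start at " << start << " using " << who.size() << " workers\n";
}
```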

Repeat the whole process, picking different (random) starting jobs and slightly changing the number of workers, until the time is almost over. On top of the best solution found, in the last 200 ms, repeat the process only for the jobs left undone, trying to add more workers to take care of these jobs, and check whether that would provide a positive gain.

For large cases, it runs only 10 iterations, and 50-100 in average cases. Increasing the available running time (10x) would improve my score by only 0.5%, which wouldn't be enough to reach the top two spots.

In conclusion, I would say that allowing workers to move independently is, in theory, a good option to minimize their idle / moving time. On the other hand, it creates a chain of dependencies which I couldn't figure out how to change without rebuilding a large part of the solution, and that prevented me from using SA to improve the solution. Watching a simple visualizer that I built, it was clear that the paths taken by workers weren't very smart, as the greedy selection only sees the next move. Starting from scratch, even with "everything tuned", was good but not enough to match SA, even with a restriction such as the groups of workers used by the winner.

By the way, thanks again for the nice competition, and congratulations to T1024 and Rafbill on the impressive performance!!!

»
6 years ago, # |
  +35

Thanks for the contest, I also found it quite interesting. And congratulations to T1024 for the win! As Gassa said, the challenging part is that it is hard to make a solution based on local changes work well.

I represented my solution by a directed acyclic graph, whose vertices are jobs, and where an edge $$$a \rightarrow b$$$ represents a worker moving to job b after completing job a. If you are given such a graph, it is easy to check if it can lead to a valid solution, by computing for each job the minimum possible start time (after doing a topological sort of the vertices). It is also possible to find the best way (minimizing the total time) to schedule jobs in polynomial time, using the simplex algorithm. I used that as a postprocessing step in my code.
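A minimal sketch of that feasibility check: propagate the minimum possible start times through the DAG in topological order. The job fields l, r, d stand for the earliest start, latest start, and duration, which is an assumed simplification of the actual time windows:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Job { int x, y, l, r, d; };   // position, earliest/latest start, duration (assumed)

// adj[u] lists the jobs some worker does right after job u.
// Returns true if every job can start within its window under this DAG.
bool feasible(const vector<Job>& jobs, const vector<vector<int>>& adj) {
    int n = jobs.size();
    vector<int> indeg(n, 0);
    for (int u = 0; u < n; u++)
        for (int v : adj[u]) indeg[v]++;
    vector<long long> minStart(n);
    queue<int> q;
    for (int i = 0; i < n; i++) {
        minStart[i] = jobs[i].l;
        if (indeg[i] == 0) q.push(i);
    }
    int seen = 0;
    while (!q.empty()) {
        int u = q.front(); q.pop(); seen++;
        if (minStart[u] > jobs[u].r) return false;   // job u cannot start in time
        long long finish = minStart[u] + jobs[u].d;
        for (int v : adj[u]) {
            long long travel = abs(jobs[u].x - jobs[v].x) + abs(jobs[u].y - jobs[v].y);
            minStart[v] = max(minStart[v], finish + travel);  // wait for the latest arrival
            if (--indeg[v] == 0) q.push(v);
        }
    }
    return seen == n;   // seen < n would mean a cycle, which cannot be scheduled
}

int main() {
    vector<Job> jobs = {{0, 0, 0, 100, 10}, {5, 5, 0, 100, 10}};
    vector<vector<int>> adj = {{1}, {}};   // one worker goes from job 0 to job 1
    cout << (feasible(jobs, adj) ? "feasible" : "infeasible") << "\n";
}
```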

Another subproblem that can be solved in polynomial time is that of finding an optimal graph assuming that the start time for each job is fixed. As Phronesis said, it can be encoded as a minimum-cost bipartite matching. I didn't use that in my solution, but I used a maximum bipartite matching (which minimises the number of workers) to compute a good initial solution, and to regularly reset the solution (so as to avoid local optima).

My solution proceeds by repeating the following steps (~20000 times for the larger test cases, and ~150000 times for the smaller ones):

  1. Compute a random cut of the current DAG. It separates the graph into a left part L, and a right part R.
  2. Compute for each job $$$a$$$ in L, the minimum possible end time $$$Tl(a)$$$, and for each job $$$b$$$ in R the maximum possible start time $$$Tr(b)$$$.
  3. Forget all edges in the graph between L and R.
  4. Compute a maximum bipartite matching between L and R. An edge $$$a \rightarrow b$$$ can be used if $$$Tl(a) + dist(a,b) \le Tr(b)$$$.
  5. Add the edges of the maximum matching to the graph.

Each time a larger matching is found, the number of workers used in the solution decreases. Because a new matching is computed each time, it also changes the graph, even when the solution doesn't improve (this is important to avoid getting stuck in a local optimum).

In a second phase, I stop trying to reduce the number of workers and compute minimum-cost maximum matchings using the Hungarian algorithm, so as to decrease the total time used by the solution.

I always complete all available jobs.

My code : https://pastebin.com/0vwWyrVF

Scores