[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80776":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":11,"stars7d":11,"stars30d":11,"stars90d":13,"forks30d":13,"starsTrendScore":14,"compositeScore":15,"rankGlobal":8,"rankLanguage":8,"license":16,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":19,"hasPages":17,"topics":20,"createdAt":8,"pushedAt":8,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":13,"starSnapshotCount":13,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},80776,"imc-prosperity-4","rmtf1111\u002Fimc-prosperity-4","rmtf1111",null,"Python",43,3,40,0,9,46.61,"MIT License",false,"main",true,[],"2026-06-12 04:01:30","# IMC Prosperity 4\n\nOur team, `rat_hunters`, finished **#2** in IMC Prosperity 4, with a cumulative Phase-2 PnL of 1,459,764 XIRECs (Algo 1,220,042 + Manual 239,722). On the algorithmic challenge alone, we finished #3 globally, 500 XIRECs behind second place on algo.\n\n## Round 3 — Mean reversion\n\n**Algo PnL: +297,716** • **Algo rank: #4**\n\nThe first instinct for this round was to attempt some IV-based strategies. However, after realizing that the fluctuations in the IV accounted for only ±2 moves in the price, we dropped the whole options business.\n\nAfter this, Maxime started making positive PnL on both products with mean reversion, and then it hit me - this is Prosperity, mean reversion MUST be the answer to all your problems. A quick analysis of `VELVET_FRUIT` and `HYDROGEL_PACKS` showed that they have negative autocorrelation, which is consistent with mean-reversion. 
If it looks like MR, walks like MR and it's Prosperity, it probably is MR.\n\n***Strategy:*** Our strategy for the two products was exactly the same - find a fair value (for `VELVET_FRUIT`: `5250`, for `HYDROGEL_PACKS`: `9990`); find a symmetric threshold for the deviation from fair value at which to enter a position (for `VELVET_FRUIT`: `28`, for `HYDROGEL_PACKS`: `40`); when the price crosses `fair ± threshold`, send a signal to fill up your position in the corresponding direction (this could take a few ticks). We did not liquidate upon reversion; we just bought at lows and sold at highs (and of course, for `VELVET_FRUIT`, did the same for its respective options). 100 lines of code. \n\n## Round 4 — Mean reversion\n\n**Algo PnL: +221,170** • **Algo rank: #3** • **Rank for this round only: #20**\n\nRound 4 de-anonymized the trade tape: every fill carried a counterparty ID. We did not find anything interesting there, so we did not use any bot behaviour.\n\nHowever, we noticed that for `HYDROGEL_PACKS`, downward excursions had a median peak of 20, while upward excursions had a median peak of 40. This meant that symmetric thresholds were suboptimal.\n\n***Strategy:*** Same as round 3, except we introduced asymmetric thresholds for `HYDROGEL_PACKS` - we used a buy threshold at `-8` from fair and a sell threshold at `+40` from fair. \n\n**P.S: Despite being ranked #4 and #20 for Algo Rounds 3 and 4, we still ended up at #3 for Algo when combining the two rounds. I wonder why ;)**\n\n## Round 5 — The New Prosperity!\n\n**Algo PnL: +701,157** • **Algo rank: #3** • **Rank for this round only: #8**\n\nAfter 5 hours of sleep I woke up at 7 am and the first thing I saw was our teammates on the East Coast saying we were in 2nd place. There were also 50 (fifty!) new products, split across 10 sectors of 5 products each. Needless to say, despite my best efforts, I couldn't go back to sleep. 
This wasn't just for fun anymore :) \n\n### Finding Nemo (Alpha).\nFirst things first - we plotted first-order difference correlations within groups. This gave away almost all the alphas, and with them the products we worked on during the first day of Round 5. Below are the heatmaps that showed anything significant.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"images\u002Fheatmap_pebbles.png\" width=\"46%\" alt=\"Purification Pebbles diff-correlation\"\u002F>\n  \u003Cimg src=\"images\u002Fheatmap_robots.png\" width=\"46%\" alt=\"Domestic Robots diff-correlation\"\u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cimg src=\"images\u002Fheatmap_oxygen_shakes.png\" width=\"46%\" alt=\"Liquid Breath Oxygen Shakes diff-correlation\"\u002F>\n  \u003Cimg src=\"images\u002Fheatmap_snackpacks.png\" width=\"46%\" alt=\"Protein Snack Packs diff-correlation\"\u002F>\n\u003C\u002Fp>\n\nAnother interesting correlation finding was that Snackpacks as a sector was correlated with the rest of the market (and it was quite significant, at 0.22 correlation for first-order differences). Unfortunately, we did not have the time to explore this direction.\n\n### Pebbles — basket arbitrage\n\nWe noticed that Pebbles' prices summed up to 50,000 consistently, with the exception of some steps where the sum deviated by ±15 and reverted immediately on the next tick. We found that it was rarely profitable to take a position on these deviations due to the spread. We realized that market making is a risk-free strategy here because the bots always trade the same quantities at the same timestamps for all the pebbles simultaneously. We netted around 18k\u002Fday with market making and taking at the deviations when it was profitable, accounting for the spread.\n\n### Snackpacks - pairs trading\n\nThe very high negative correlation between `SNACKPACK_VANILLA`\u002F`SNACKPACK_CHOCOLATE` might signal cointegration. However, if you ran ADF, the reported p-value was quite large, so the test gave no evidence of cointegration. 
In particular, while high correlation can signal cointegration, it does not necessarily imply it. In fact, very high correlation probably rules out pairs trading - think about a stock that always copies or reverses the move of another one. Not too tradeable IMHO.\n\nWe found that the `SNACKPACK_VANILLA − SNACKPACK_RASPBERRY` spread is the cleanest mean-reverting signal in the family - it has the second-best ADF p-value and a median of almost 0, and it also ties together all the products. When the spread crossed `±100`, we sent a signal to fill up our positions. Since `SNACKPACK_CHOCOLATE` was so negatively correlated with `SNACKPACK_VANILLA` and `SNACKPACK_STRAWBERRY` with `SNACKPACK_RASPBERRY`, we used the `SNACKPACK_VANILLA`-`SNACKPACK_RASPBERRY` signal to also take the corresponding `SNACKPACK_STRAWBERRY`-`SNACKPACK_CHOCOLATE` position. `SNACKPACK_PISTACHIO` was treated as an \"excluded\" product that we used market making on.\n\n### Lattice movements - the bread-winner\n\n`ROBOT_DISHES`, `ROBOT_IRONING`, `OXYGEN_SHAKE_EVENING_BREATH`, and `OXYGEN_SHAKE_CHOCOLATE` exhibited a discrete-grid micro-structure: the mid mostly walked in small ticks, but occasionally it started moving in jumps of ±100. This could be explained by the mid price being rounded to a point on a 100-wide lattice. By standard martingale arguments this would imply that after a 100 swing one way the next swing would most likely be in the opposite direction. A quick empirical check confirmed this - after the price moved by ±100, the next move was ∓100, respectively, with 85% probability. The strategy at this point is trivial - whenever the price moves by +100, sell full inventory, and whenever it moves by -100, buy full inventory.\n\n### Microchips — within-family lead-lag\n\nThe Microchip family is the only Round 5 group where a clean integer-lag signal exists between products. 
`MICROCHIP_OVAL`, `MICROCHIP_SQUARE`, `MICROCHIP_RECTANGLE` and `MICROCHIP_TRIANGLE` all followed `MICROCHIP_CIRCLE` at lags of 50, 100, 150 and 200, respectively. The correlation was rather weak - around 0.05. However, when aggregated over multiple steps it could've been a tradeable signal. To our surprise, when aggregating multiple differences, i.e. looking at `MICROCHIP_RECTANGLE_{t+300} - MICROCHIP_RECTANGLE_{t+150}` and `MICROCHIP_CIRCLE_{t+150} - MICROCHIP_CIRCLE_t`, the correlation jumped to 0.15. This should not happen unless there is some other hidden structure in this family. We searched hard for that other systematic pattern, but failed to find it. For unfounded reasons we resorted to overfitting - we aggregated the price difference over larger windows than what was logical (i.e., looked at `MICROCHIP_CIRCLE_{t+200} - MICROCHIP_CIRCLE_t` to predict `MICROCHIP_RECTANGLE_{t+400} - MICROCHIP_RECTANGLE_{t+200}`) and swept for thresholds. Needless to say, we netted an embarrassing -25k on microchips as a group. Nonetheless, this was the most fun asset (class) of all of Prosperity - it really made us think and write down math. I would come back next year just for a round where we get another chance to trade such products.\n\n### General market making\n\nEvery Round 5 product that isn't claimed by the four strategies above gets a basic two-sided passive MM: post `(best_bid + 1, best_ask − 1)`. The exclusion set in `PROB_MM_EXCLUDED` ensures the MM layer never fights another strategy when the logic gets too messy. One thing to mention about MM is that all products got traded at the same time, in the same quantities and in the same directions. Hence, we exposed ourselves to the overall movements of the market. 
However, we found the market to be overall stable, and we were as exposed to gaining from directionality as we were to losing from it, so it seemed sensible to keep market making.\n\n## Overfitting\n\n### Rounds 3 and 4\nI want to mention that the amount of overfitting reported in Discord was actually insane - z-scores, Bollinger (?), EMA, blah blah. Before implementing any of these you should have a solid reason. For example, if you take a rolling mean as your \"fair price\" and plot the residuals you will find that they are also mean-reverting. However, this holds true for almost any time series under the sun :). You would need a more rigorous analysis to claim that local mean-reversion would be more profitable than global mean-reversion, and that analysis would necessarily have to cover the stability of the rolling mean. \n\nYou should really think about what you are trading here - you are betting that the current price is too high for whatever happened in the past 100 ticks, and that it is going to revert, in say 1000 ticks. Then you are selling now, and then in 1000 ticks you would want to buy back because the rolling mean in 900 ticks will be lower than the price in 1000 ticks? Take a step back to think about what exactly you are doing. The rolling mean could've drifted as well - what if the rolling mean started dropping itself and the 1000th tick price was higher with respect to the rolling mean, but substantially lower than when you bought? If there are no fundamental statistics to confirm that you do not expose yourself to the rolling mean deviations, then you should just assume overfitting if the PnL on backtest is good. \n\nNeedless to say, I am not claiming that local mean-reversion is bad; all I am trying to communicate is that there needs to be concrete reasoning and logic to back this up. 
A better backtest result is not logic, it's just a number :) \n\nAs for thresholds - when you do large sweeps searching for the best thresholds, try to aim for those that are in stable regions of PnL. I.e., if you pick say 100 as a threshold, make sure that 99 and 101 could be decent thresholds too. You do not want to pick a threshold that was lucky in-sample but sits in a region where PnL is low overall. If you see a major spike in PnL at 100, but thresholds 95-99 and 101-105 give much lower PnL, then probably the only special thing about 100 is just noise.\n\n### Round 5 \nWhen I opened Discord and saw the reported backtest PnLs I was pleasantly surprised. People were reporting 1.8 million backtests, or even upwards of 2.2 million. In total we had 50 products, sure, but position limits were 10 for all of them. That is significantly smaller than the positions we could've taken in rounds 3 and 4. It was clear that everyone was pulling mean-reversion and cointegration out of thin air. And to be fair, if you looked at the graph of some random UV product and then some sleeping-pod, maybe, it looked like you could pairs trade them. And that's totally normal: in the training sample, I am sure that tens, if not hundreds, of pairs exhibited a cointegration pattern. In fact, if you simulate 50 simple random walks over 30,000 steps and check the pairwise ADF p-values you will get something like this:\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"images\u002Fsrw_adf_simulation.png\" width=\"78%\" alt=\"ADF p-values across 1225 pairs of independent simple random walks\"\u002F>\n\u003C\u002Fp>\n\nQuite astonishing, over 100 pairs that you could arbitrage. Yet, if you take a step back and remember that these are simple random walks, you will realize that there is no way you can trade them profitably. This is the multiple testing problem - when you have so many random variables, some will exhibit certain structures that are there just by chance. 
And in the little experiment we ran, as well as in Round 5, these structures held true only on training data, and not on the evaluation data. \n\n***Take a step back and think about what you are doing. Think about your evidence for a strategy - is it just \"this strategy does well for these products\" or do you have some fundamentals to back it up? Or is there any logic to back this up? Trust your logic and reasoning before you trust your backtester.***\n\n## Manual\nTo be fair, our manual play was not the brightest, so we will keep it short.\n\n### Round 3 - Crowd +70,684\nNathan simulated a bunch of bots placing their own bids for this round to see what the best choice would be. \n\n### Round 4 - Options +65,024\nWe did Markowitz and looked at the PnL graph against the std for various values of the regularizer lambda. Then, we picked the lambda where the derivative started to get smaller - i.e. where we were trading off more PnL for less std. We did not spend too much time on this, hence the rather straightforward and naive strategy. \n\n***Interesting stuff:*** We realized afterwards that this round is amazing. Due to the extreme variance, around 30% of the seeds would've resulted in the most popular (max-EV) solution either gaining or losing 500k. No one who puts this much effort into making a competition as IMC does would let this kind of noise simply randomize the top ranks. Hence, you definitely could've expected the seed to be changed once, twice or a few times until the admins saw an acceptable distribution of results. This radically changes the outcome of any strategy, especially once you know which \"randomness\" is the \"bad randomness\". And I think this aspect is great, because IRL you should never believe someone who tells you that a stock price is a geometric Brownian motion. 
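Humans move the price with their actions, just as the admins have the final say on which seed gets used :)\n\nAlso, it's amazing that Manual could finally have mattered for the overall challenge. \n\n### Round 5 - News +104,014\nFor this round just look up the news bulletins and final movements from previous years that you can find on the internet and try to pattern match. Then assume everyone is already doing that and don't overthink too much - the price moves based on what other contestants do too (don't forget to read the wiki).\n\n","IMC Prosperity 4 is a trading-algorithm project written in Python that aims to profit in a virtual market through mean-reversion strategies. Its core functionality is to compute a \"fair value\" for each product from historical price data and to set buy and sell thresholds for executing automated trades. Technical characteristics include a concise implementation (about 100 lines of code) and symmetric or asymmetric threshold strategies tuned per product. It suits scenarios that call for automated participation in financial simulation competitions or for exploring simple quantitative trading strategies.",2,"2026-06-11 04:02:18","CREATED_QUERY"]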