<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://rifaki.me/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>weekly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/portrait.jpg</image:loc>
      <image:title>Mouhssine Rifaki</image:title>
      <image:caption>Reinforcement learning researcher and PhD student at Imperial College London. Bachelor's in mathematics from the Sorbonne; master's from the MVA program at ENS Paris-Saclay.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <url>
    <loc>https://rifaki.me/publications/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <url>
    <loc>https://rifaki.me/projects/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <url>
    <loc>https://rifaki.me/posts/adversarial-examples/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/adversarial-fgsm.png</image:loc>
      <image:title>Adversarial examples and the robustness story that survived</image:title>
      <image:caption>Adversarial examples were not just a security bug. They exposed a mismatch between predictive features and human-aligned robust features.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/diffusion-models/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/diffusion-forward-reverse.png</image:loc>
      <image:title>Score matching, diffusion models, and why noise became a generator</image:title>
      <image:caption>Diffusion models work because denoising estimates the score of noisy data distributions, turning generation into reverse-time dynamics.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/double-descent/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/belkin-fig1.png</image:loc>
      <image:title>Double descent: the U that became a W and what held up at scale</image:title>
      <image:caption>A reading of what double descent actually is, what the label-noise caveat does to the story, and what held up at scale.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/edge-of-stability/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/cohen-fig1.png</image:loc>
      <image:title>The edge of stability and what it does to the flat-minima story</image:title>
      <image:caption>A reading of Cohen's edge-of-stability result, Arora's theory, and what the phenomenon does to older narratives about SGD and flat minima.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/flat-minima/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/keskar-fig1.png</image:loc>
      <image:title>On flat minima and the argument that won't die</image:title>
      <image:caption>Why the flat minima debate keeps restarting.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/grokking/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/grokking-fig1.png</image:loc>
      <image:title>Grokking: a puzzle and four explanations that might be the same thing</image:title>
      <image:caption>A reading of the grokking phenomenon in deep learning.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/implicit-bias/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/implicit-bias-margin.png</image:loc>
      <image:title>Implicit bias in gradient descent and the max-margin ghost</image:title>
      <image:caption>Why unregularized gradient descent can still select a particular classifier, and why the clean linear result only partially transfers to deep networks.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/information-bottleneck/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/tishby-fig2.png</image:loc>
      <image:title>Reading the information bottleneck paper with the benefit of hindsight</image:title>
      <image:caption>A reading of Tishby's information bottleneck paper after the pushback.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/lottery-tickets/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/frankle-fig3.png</image:loc>
      <image:title>The lottery ticket hypothesis: what it claimed and what remained after replication</image:title>
      <image:caption>A reading of the original claim, the rewinding fix, and what remained after the replication critique.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/mode-connectivity/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/mode-connectivity-paths.png</image:loc>
      <image:title>Mode connectivity and the case for one wide basin</image:title>
      <image:caption>Low-loss paths, permutation symmetries, linear mode connectivity, and what the single-basin picture does and does not prove.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/neural-scaling-laws/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/kaplan-fig1.png</image:loc>
      <image:title>Scaling laws from Kaplan to Chinchilla to broken laws and what still holds</image:title>
      <image:caption>A reading of the scaling laws arc, what each paper claimed, and what still holds at current scale.</image:caption>
    </image:image>
  </url>
  <url>
    <loc>https://rifaki.me/posts/neural-tangent-kernel/</loc>
    <lastmod>2026-04-26</lastmod>
    <changefreq>yearly</changefreq>
    <image:image>
      <image:loc>https://rifaki.me/posts/img/ntk-linearization.png</image:loc>
      <image:title>The neural tangent kernel and the cost of freezing features</image:title>
      <image:caption>The NTK made neural-network training analyzable by freezing the features. That was the insight, and also the limitation.</image:caption>
    </image:image>
  </url>
</urlset>
