Here's a recent paper by Arun Gautham Chandrasekhar & Juan Pablo Xandri on how to preserve stage-game payoffs in an experiment implementing a repeated game that ends with a fixed probability after each stage. They show that paying participants for all periods, which works for risk-neutral players, isn't as robust as paying for the (randomly determined) last period only.
A note on payments in the lab for infinite horizon dynamic games with discounting. Economic Theory (2022). https://doi.org/10.1007/s00199-021-01409-x
Abstract: "It is common for researchers studying infinite horizon dynamic games in a lab experiment to pay participants in a variety of ways, including but not limited to outcomes in all rounds or for a randomly chosen round. We argue that these payment schemes typically induce different preferences over outcomes than those of the target game, which in turn would typically implement different outcomes for a large class of solution concepts (e.g., subgame perfect equilibria, Markov equilibria, renegotiation-proof equilibria, rationalizability, and non-equilibrium behavior). For instance, paying subjects for all rounds generates strong incentives to behave differently in early periods as these returns are locked in. Relatedly, a compensation scheme that pays subjects for a randomly chosen round induces a time-dependent discounting function. Future periods are discounted more heavily than the discount rate in a way that can change the theoretical predictions both quantitatively and qualitatively. We rigorously characterize the mechanics of the problems induced by these payment methods, developing measures to describe the extent and shape of the distortions. Finally, we prove a uniqueness result: paying participants for the last (randomly occurring) round is the unique scheme that robustly implements the predicted outcomes for any infinite horizon dynamic game with time separable utility, exponential discounting, and a payoff-invariant solution concept."
"Infinite horizon dynamic games are typically implemented in the lab using the random termination method and paying for all rounds or a random round. Typically, a participant plays a round of a game which then continues to the subsequent round with a given probability (Roth and Murnighan 1978). To incentivize behavior, the experimenter pays the participant as a function of the history of play. The central problem is that in the lab payments are made after the experiment and therefore not consumed between stages of the game (as they would be in the realm of the model). Experiments in the literature, following Murnighan and Roth (1983), usually pay subjects for all rounds. More recently Azrieli et al. (2018) systematically catalogue work in a collection of top journals and show that 56% pay for all rounds and 37.5% pay for one or several randomly chosen rounds.
"Paying for all rounds is only valid when agents are assumed to be risk neutral (Murnighan and Roth 1983). While a section of the literature is interested in worlds respecting risk-neutrality, paying individuals for all rounds was (and often is) standard even when the models being tested explicitly deviated from risk neutrality. This lead to a dissonance between the theoretical ambitions and experimental implementation of such work. Indeed, Azrieli et al. (2018) document that 48% of the papers they examine do not justify their payment scheme in the given experiment whatsoever/
...
"In Sect. 3, we introduce and study the scheme wherein individuals are paid for the last round in a random-termination dynamic game. It implements all such models.2 The reason why this works is due to the (standard) observation that when we have exponential discounting, myopia with respect to the future is isomorphic to random termination of the game with some probability.3 Further, we show that in fact last round payment is the unique payment scheme that implements the game robustly. What this means is that for a given environment, one can find some agent and some history where for some preference (e.g., small amounts of curvature in some cases) the scheme fails to implement the target game. That is, the preference ordering for the agent in that situation is muddled and does not reflect that within the target game.
"Next, in Sect. 4, we provide a rigorous analysis of two payment schemes used in the literature – either payment for a randomly chosen round or all rounds – and show how these may induce games different from the target game."