This dissertation explores three stochastic models: additive functionals of reflected jump-diffusion processes, two-time-scale dynamical systems forced by α-stable Lévy noise, and a variant of the Optimistic Policy Iteration algorithm in Reinforcement Learning. The thread connecting the three projects is the proof of convergence results for these objects, results with direct applied implications.

A large deviation principle is established for general additive processes of reflected jump-diffusions on a bounded domain, under both normal and oblique reflection. A characterization of the large deviation rate function, which quantifies the exponential decay rate of the rare-event probabilities of the additive processes, is provided. This characterization relies on the solution of a partial integro-differential equation with boundary constraints, which is solved numerically, and an implementation is provided. The theory is then applied to several practical examples, in particular a reflected jump-diffusion arising from applications to biochemical reactions.

We derive a functional central limit theorem, in the sense of weak convergence, for a fast-slow dynamical system driven by two independent, symmetric, multiplicative α-stable noise processes. To this end, a strong averaging principle is established by solving an auxiliary Poisson equation, whose regularity properties are essential to the proof. These properties yield convergence to the averaged process of order 1 - 1/α, which is subsequently used to show weak convergence of the scaled deviations of the slow process from its average. The theory is then illustrated with a Monte Carlo simulation of an example.

In the Optimistic Policy Iteration algorithm, Monte Carlo simulations of trajectories in a known environment are used to evaluate a value function and to update the policy greedily; we show that the value function converges to its optimal value almost surely. This is done for undiscounted costs and without restricting which states are used for updating. We employ the greedy lookahead policies used in previous results, thereby extending the existing research to discount factor α = 1. The first-visit variation of this algorithm follows as a corollary, and we further extend previously known results when the first state is picked for updating.
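For orientation, one common way such a rate-function characterization is organized is sketched below; the symbols A_t (the additive functional), θ, and λ(θ) are illustrative notation introduced here, not the dissertation's, and the precise operator and boundary conditions are those of the partial integro-differential equation mentioned above.

\[
\mathbb{P}\Big(\tfrac{1}{t} A_t \in \Gamma\Big) \approx \exp\Big(-t \inf_{a \in \Gamma} I(a)\Big),
\qquad
I(a) = \sup_{\theta}\big(\langle \theta, a \rangle - \lambda(\theta)\big),
\]

where, schematically, λ(θ) is obtained from the solution of the associated partial integro-differential equation subject to the (normal or oblique) reflecting boundary constraints.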
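As a schematic of the fast-slow setting (the coefficient names b, f, σ, g, the scale parameter ε, and this exact form of the scaling are assumptions made here for illustration, not the dissertation's precise formulation):

\[
dX^{\varepsilon}_t = b\big(X^{\varepsilon}_t, Y^{\varepsilon}_t\big)\,dt + \sigma\big(X^{\varepsilon}_t\big)\,dL^{1}_t,
\qquad
dY^{\varepsilon}_t = \frac{1}{\varepsilon}\, f\big(X^{\varepsilon}_t, Y^{\varepsilon}_t\big)\,dt + \frac{1}{\varepsilon^{1/\alpha}}\, g\big(Y^{\varepsilon}_t\big)\,dL^{2}_t,
\]

with L^1 and L^2 independent symmetric α-stable Lévy processes. In this notation, the averaging principle states that X^ε converges to an averaged process \bar{X} at order ε^{1 - 1/α}, and the functional central limit theorem concerns the weak limit of the rescaled deviations ε^{-(1 - 1/α)} (X^ε - \bar{X}).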
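The following is a minimal sketch of the flavor of Optimistic Policy Iteration described above: every-visit Monte Carlo evaluation of undiscounted costs, a partial (optimistic) value update, and a greedy improvement step. The tabular environment interface assumed here (env.n_states, env.n_actions, env.reset, env.sample_transition, env.expected_q) is hypothetical, and the lookahead policies, step sizes, and state-selection rules analyzed in the dissertation are more general.

    import numpy as np

    def optimistic_policy_iteration(env, n_iters, n_rollouts,
                                    step_size=lambda k: 1.0 / (k + 1)):
        # Sketch only: undiscounted, every-visit Monte Carlo evaluation with an
        # optimistic (partial) value update and a one-step greedy improvement.
        # The environment interface used here is a hypothetical stand-in for a
        # known, episodic cost model.
        V = np.zeros(env.n_states)
        policy = np.zeros(env.n_states, dtype=int)
        for k in range(n_iters):
            returns = np.zeros(env.n_states)
            counts = np.zeros(env.n_states)
            for _ in range(n_rollouts):
                s, done = env.reset(), False
                states, costs = [], []
                while not done:
                    s_next, c, done = env.sample_transition(s, policy[s])
                    states.append(s)
                    costs.append(c)
                    s = s_next
                G = 0.0  # undiscounted cost-to-go, accumulated backwards
                for s_t, c_t in zip(reversed(states), reversed(costs)):
                    G += c_t
                    returns[s_t] += G
                    counts[s_t] += 1
            visited = counts > 0
            # Optimistic update: move V part of the way toward the Monte Carlo
            # estimate at every visited state (no restriction on which states).
            V[visited] += step_size(k) * (returns[visited] / counts[visited] - V[visited])
            # Greedy improvement via one-step lookahead on the known model:
            # env.expected_q(V)[s, a] = E[cost(s, a) + V(next state)].
            policy = env.expected_q(V).argmin(axis=1)
        return V, policy

A first-visit variant would credit each state only the first time it appears in a trajectory; restricting the update to the initial state of each trajectory corresponds to the final case discussed above.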