With the rapid emergence of the Internet of Things (IoT), we are increasingly surrounded by smart connected devices (agents) with integrated sensing, processing, and communication capabilities. In particular, IoT-based positioning has become of primary importance for providing advanced Location-Based Services (LBSs) in indoor environments. Several LBSs have been developed recently, such as navigation assistance in hospitals, localization/tracking in smart buildings, and assistive services delivered by autonomous agents that collectively act as an Internet of Robotic Things (IoRT). The focus of the thesis is on the following two research topics concerning autonomous agents that provide LBSs in indoor environments: (i) Self-Localization, which is the autonomous agent's ability to obtain knowledge of its own location; and (ii) Localized Decision Support Systems, which refers to an autonomous agent's ability to perform optimal actions towards achieving pre-defined objectives.

With regard to Item (i), the thesis develops innovative localization solutions based on Bluetooth Low Energy (BLE), also referred to as Bluetooth Smart. Given the unavailability of the Global Positioning System (GPS) in indoor environments, BLE has attracted considerable attention due to its low cost, low energy consumption, and widespread availability in smart hand-held devices. Because of multipath fading and signal fluctuations in indoor environments, however, BLE-based localization approaches fail to achieve high accuracy. To address these challenges, different linear and non-linear Bayesian estimation frameworks are proposed in this thesis. In particular, the thesis proposes a novel multiple-model, BLE-based tracking framework referred to as STUPEFY. The proposed STUPEFY framework uses set-valued information and is designed by coupling a non-linear Bayesian estimation model (the Box Particle Filter) with fingerprinting-based methodologies to improve the overall localization accuracy.

With regard to Item (ii), there has been a surge of interest in the development of advanced Reinforcement Learning (RL) systems. The objective is to develop intelligent approaches that learn optimal control policies directly from smart agents' interactions with the environment. In this regard, Deep Neural Networks (DNNs) provide an attractive modeling mechanism for approximating the value function from sample transitions. DNN-based solutions, however, suffer from high sensitivity to parameter selection, are prone to overfitting, and are not sample efficient. As a remedy to the aforementioned problems, the thesis proposes an innovative Multiple-Model Kalman Temporal Difference (MM-KTD) framework, which adapts the parameters of the filter using the observed states and rewards. Moreover, an active learning method is proposed to enhance the sampling efficiency of the overall system. The proposed MM-KTD framework can learn the optimal policy with a significantly reduced number of samples compared to its DNN-based counterparts.
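To give a concrete sense of the Bayesian filtering idea underlying the BLE-based tracking frameworks summarized above, the following is a minimal sketch of a standard particle filter driven by RSSI measurements under a log-distance path-loss model. It is an illustration only, not the STUPEFY framework itself (which couples a Box Particle Filter operating on set-valued information with fingerprinting); the beacon locations, path-loss parameters, and noise levels are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical BLE beacon locations (metres) and log-distance path-loss parameters.
BEACONS = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
RSSI_AT_1M, PATH_LOSS_EXP, RSSI_STD = -60.0, 2.0, 4.0  # assumed values

def expected_rssi(positions):
    """Predicted RSSI at each beacon for an array of candidate positions."""
    d = np.linalg.norm(positions[:, None, :] - BEACONS[None, :, :], axis=2)
    return RSSI_AT_1M - 10.0 * PATH_LOSS_EXP * np.log10(np.maximum(d, 0.1))

def particle_filter_step(particles, weights, rssi_meas, motion_std=0.3):
    """One predict/update/resample cycle of a basic particle filter."""
    # Predict: random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: Gaussian likelihood of the measured RSSI vector.
    err = expected_rssi(particles) - rssi_meas
    weights = weights * np.exp(-0.5 * np.sum((err / RSSI_STD) ** 2, axis=1))
    weights /= weights.sum()
    # Resample (multinomial resampling keeps the particle count fixed).
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Usage: track a hypothetical static target at (3, 4) from noisy RSSI readings.
particles = rng.uniform(0.0, 10.0, size=(500, 2))
weights = np.full(500, 1.0 / 500)
true_pos = np.array([[3.0, 4.0]])
for _ in range(20):
    rssi = expected_rssi(true_pos)[0] + rng.normal(0.0, RSSI_STD, len(BEACONS))
    particles, weights = particle_filter_step(particles, weights, rssi)
print("estimated position:", particles.mean(axis=0))
```

In the full framework described in the thesis, the point particles above are replaced by box (interval) particles and the likelihood step is informed by fingerprinting, but the predict/update/resample structure is the same.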
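Similarly, the sketch below illustrates the filtering view of temporal-difference learning that Kalman Temporal Difference methods build on: the weights of a linear value-function approximation are treated as the hidden state of a Kalman filter, and each observed reward acts as a measurement. This is a simplified, single-model illustration rather than the proposed MM-KTD framework (which runs multiple models and adapts the filter parameters from observed states and rewards); the discount factor, noise covariances, feature map, and the toy chain environment are assumptions made for the example.

```python
import numpy as np

GAMMA = 0.95  # discount factor (assumed)

class LinearKTD:
    """Kalman filter over the weights theta of V(s) = theta^T phi(s)."""

    def __init__(self, n_features, process_var=1e-4, obs_var=1.0):
        self.theta = np.zeros(n_features)          # weight estimate (state mean)
        self.P = np.eye(n_features)                # weight covariance
        self.Q = process_var * np.eye(n_features)  # process-noise covariance
        self.R = obs_var                           # observation-noise variance

    def update(self, phi_s, phi_next, reward, done):
        # Prediction step: weights follow a random walk.
        self.P = self.P + self.Q
        # Observation model: r is approximately theta^T (phi(s) - gamma * phi(s')).
        h = phi_s - (0.0 if done else GAMMA) * phi_next
        innovation = reward - h @ self.theta       # temporal-difference error
        s = h @ self.P @ h + self.R                # innovation variance (scalar)
        k = self.P @ h / s                         # Kalman gain
        self.theta = self.theta + k * innovation
        self.P = self.P - np.outer(k, h @ self.P)

    def value(self, phi_s):
        return phi_s @ self.theta

# Usage: evaluate a random walk on a 5-state chain with one-hot features.
def one_hot(i, n=5):
    v = np.zeros(n)
    v[i] = 1.0
    return v

rng = np.random.default_rng(1)
ktd, state = LinearKTD(5), 2
for _ in range(2000):
    nxt = min(max(state + rng.choice([-1, 1]), 0), 4)
    reward, done = (1.0, True) if nxt == 4 else (0.0, False)
    ktd.update(one_hot(state), one_hot(nxt), reward, done)
    state = 2 if done else nxt
print("estimated state values:", np.round(ktd.value(np.eye(5)), 2))
```

Because each transition updates both the weight estimate and its covariance, this filtering formulation extracts more information per sample than gradient-based TD updates, which is the sample-efficiency property that motivates the MM-KTD framework.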