Radio access technologies for mobile communications are characterized by multiple access (MA) strategies. Orthogonal MA techniques were a reasonable choice for achieving good performance with single-user detection. With the tremendous growth in the number of mobile users and the paradigm shift brought by the Internet of Things (IoT), it is expected that monthly mobile data traffic worldwide will exceed 24.3 exabytes by 2019, that there will be over 100 billion IoT connections by 2025, and that the financial impact of IoT on the global economy will range from 3.9 to 11.1 trillion dollars by 2025. In light of this envisaged exponential growth and these new trends, one promising way to further enhance data rates without increasing the bandwidth is to increase the spectral efficiency of the channel. Non-orthogonal MA techniques are potential candidates for future wireless communications. The two corner points on the boundary of the capacity region of the MA channel are known to be achievable by single-user decoding followed by successive decoding (SD). Other points can also be achieved using time sharing or rate splitting. On the other hand, machine-to-machine (M2M) communication, an enabling technology for the IoT, allows a massive number of multipurpose networked devices to exchange information among themselves with little or no human intervention. This thesis consists of three main parts. In the first part, we propose new practical encoding and joint belief propagation (BP) decoding techniques for the 2-user MA erasure channel (MAEC) that achieve any rate pair close to the boundary of the capacity region without using time sharing or rate splitting. At the encoders, the corresponding parity-check matrices are randomly built from a half-rate low-density parity-check (LDPC) matrix, while the joint BP decoder employs the associated Tanner graphs of the parity-check matrices to iteratively recover the erasures in the received combined codewords.
Specifically, the joint decoder performs two steps in each decoding iteration: 1) it runs the BP decoding process simultaneously and independently on each constituent sub-graph to recover some of the common erasures, and 2) it updates each sub-graph with the erasures newly recovered by the other. When the number of erasures in the received combined codewords is less than or equal to the number of parity-check constraints, the decoder may successfully decode both codewords; otherwise, it declares a decoding failure. Furthermore, we calculate the probability of decoding failure and the outage capacity. Additionally, we show how the erasure probability evolves with the number of decoding iterations and the maximum tolerable loss. Simulations show that any rate pair close to the capacity boundary is achievable without using time sharing. In the second part, we propose a new cooperative joint network and rateless coding strategy for machine-type communication (MTC) devices in multicast settings, where three or more MTC devices dynamically form a cluster to disseminate messages among themselves. Specifically, in the basic cluster, three MTC devices transmit their respective messages simultaneously to the relay in the first phase. In the second phase, the relay broadcasts the combined messages back to all MTC devices within the basic cluster. Because each MTC device can remove its own message, the signal received in the second phase reduces to the combined messages of the other two MTC devices. This exploits the interference caused by one message on the other and thereby improves bandwidth efficiency. Furthermore, any group of three MTC devices in the vicinity of one another can form a basic cluster for exchanging messages, and the basic scheme extends to N MTC devices. We also propose an efficient algorithm to disseminate messages among a large number of MTC devices.
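As a rough illustration of the two-phase exchange in the basic cluster, the following sketch models each message as a list of bits and the relay's combination as a bitwise XOR. This is an assumption made purely for illustration: the scheme described here works on simultaneously transmitted signals together with rateless coding, and the function names `relay_combine` and `remove_own` are hypothetical.

```python
import random
from functools import reduce

def relay_combine(messages):
    """Relay, second phase: combine all received messages bit by bit.

    The combination is modelled as XOR (an assumption for this sketch).
    """
    return [reduce(lambda a, b: a ^ b, bits) for bits in zip(*messages)]

def remove_own(combined, own):
    """MTC device: cancel its own known message from the relay broadcast."""
    return [c ^ o for c, o in zip(combined, own)]

# Three devices in a basic cluster, each holding an 8-bit message.
random.seed(0)
m1, m2, m3 = ([random.randint(0, 1) for _ in range(8)] for _ in range(3))

broadcast = relay_combine([m1, m2, m3])
residual_at_device_1 = remove_own(broadcast, m1)
# What remains at device 1 is the combination of the other two messages.
```

Under this simplified model, device 1 is left with the XOR of the messages from devices 2 and 3. Separating the two still requires additional coded transmissions, which the sketch does not cover.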
Moreover, we implement the proposed scheme using practical Raptor codes with two relaying schemes, namely amplify-and-forward (AF) and de-noise-and-forward (DNF). We show that, with very little processing at the relay, the DNF relaying scheme further enhances performance. We also show that the proposed scheme achieves near-optimal sum-rate performance. In the third part, we present a comparative study of joint channel estimation and decoding of factor graph-based codes over flat fading channels and propose a simple channel approximation scheme that performs close to the optimal technique. Specifically, when channel state information (CSI) is not available at the receiver, a simpler approach is to estimate the channel state over a group of received symbols and then use this approximate channel value together with the received signal to compute the log-likelihood ratio. Simulation results show that the proposed scheme exhibits a loss of about 0.4 dB compared to the optimal solution when perfect CSI is available at the receiver.
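The idea of estimating the channel over a group of symbols and reusing that value in the log-likelihood ratio can be sketched as follows. Everything specific here is an assumption for illustration only: BPSK signalling, a real non-negative fading gain that stays constant over the block, a known noise variance, and a simple second-moment estimator standing in for whatever estimator is actually used. The function names `estimate_gain` and `block_llrs` are hypothetical.

```python
import math
import random

def estimate_gain(block, noise_var):
    """Estimate the fading gain h over one block of received symbols.

    Assumes y = h*x + n with x in {+1, -1}, so E[y^2] = h^2 + noise_var.
    """
    power = sum(y * y for y in block) / len(block)
    return math.sqrt(max(power - noise_var, 0.0))

def block_llrs(block, noise_var):
    """LLRs log P(x=+1|y) / P(x=-1|y), using one gain estimate for the whole block."""
    h_hat = estimate_gain(block, noise_var)
    return [2.0 * h_hat * y / noise_var for y in block]

# Toy usage: one block of 200 BPSK symbols over a flat fading gain h.
random.seed(1)
h, noise_var = 0.8, 0.1
bits = [random.randint(0, 1) for _ in range(200)]
symbols = [1.0 - 2.0 * b for b in bits]  # bit 0 -> +1, bit 1 -> -1
received = [h * x + random.gauss(0.0, math.sqrt(noise_var)) for x in symbols]
llrs = block_llrs(received, noise_var)
```

The resulting `llrs` would then be passed to the decoder in place of LLRs computed with perfect CSI.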