Introduction.- Learning the Optimal Network with Handoff Constraint: MAB RL Based Network Selection.- Learning the Optimal Network with Context Awareness: Transfer RL Based Network Selection.- Meeting Dynamic User Demand with Transmission Cost Awareness: CT-MAB RL Based Network Selection.- Meeting Dynamic User Demand with Handoff Cost Awareness: MDP RL Based Network Handoff.- Matching Heterogeneous User Demands: Localized Cooperation Game and MARL Based Network Selection.- Exploiting User Demand Diversity: QoE Game and MARL Based Network Selection.- Future Work.