In recent years, deep learning has driven tremendous progress across many fields of AI, including visual perception, speech recognition, language understanding, and robotics. However, many of these methods require large amounts of supervision and generalize poorly to unseen, complex real-world tasks. By overcoming these challenges, we aim to develop a more general-purpose artificial intelligence agent that can perform many useful tasks for humans with high sample efficiency and strong generalization to previously unseen tasks. We will present ongoing research at LG AI Research that tackles some of these challenges.

First, we present methods for training AI agents to perform complex tasks by sequentially combining simpler low-level subtasks. Specifically, our method learns the dependencies between subtasks from experience, which allows the agent to efficiently plan and execute complex tasks in unseen contexts and scenarios. We further advance this framework by investigating how to extend existing knowledge to new objects and entities, how to efficiently adapt from previous related tasks to new ones, and how to incorporate large language models and multimodal learning to improve generalization and learning efficiency. We demonstrate these methods on learning agents in 3D simulated environments, game playing, and web navigation.

Second, we will describe our EXAONE (EXpert Ai for everyONE) project, which aims to integrate large-scale language models and multimodal generative models to develop expert-level AI for various vertical applications with high learning efficiency. More concretely, we will present ongoing projects powered by EXAONE for real-world applications such as AI customer centers, creative collaboration in the arts, and deep document understanding.
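To give a flavor of planning over learned subtask dependencies, the sketch below assumes the dependencies have already been learned and are represented as a precondition graph (a DAG mapping each subtask to the subtasks that must be completed first). The subtask names and the example graph are illustrative, not taken from the talk; the actual learning and planning methods are more sophisticated.

```python
# Illustrative sketch: executing a goal subtask given a learned
# precondition graph. All names here are hypothetical examples.

def plan(dependencies, goal):
    """Return an execution order that satisfies all preconditions of `goal`.

    `dependencies` maps each subtask to the subtasks that must be
    completed first (assumed acyclic). Depth-first traversal yields a
    topological order, visiting each subtask at most once.
    """
    order, done = [], set()

    def visit(task):
        if task in done:
            return
        for pre in dependencies.get(task, []):
            visit(pre)
        done.add(task)
        order.append(task)

    visit(goal)
    return order

# Example precondition graph (illustrative crafting-style task):
deps = {
    "make_pickaxe": ["get_stick", "get_stone"],
    "get_stick": ["chop_wood"],
    "get_stone": ["mine"],
    "mine": ["chop_wood"],  # a wooden tool is needed before mining
}
print(plan(deps, "make_pickaxe"))
```

Note that shared preconditions (here, `chop_wood`) are executed only once, which is one reason an explicit dependency graph makes long-horizon tasks more sample-efficient than flat trial-and-error.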