Dis-Co

Seamless collaboration between humans and robots is vital in the Architecture, Engineering, and Construction (AEC) industry, especially for complex tasks like selective disassembly. Traditional robot control methods require specialized expertise, hindering natural interaction between workers and robots. This research explores the use of Large Language Models (LLMs) to enhance human-robot collaboration, aiming to develop a methodology that enables construction workers to work alongside robots as colleagues.

The project focuses on equipping robots with artificial reasoning and natural language communication abilities. Because the robots can understand tasks, provide advice, and physically participate, workers can interact with them without needing coding knowledge. The construction industry faces several hurdles to automation that set it apart from other sectors. Environmental variability on construction sites creates dynamic conditions that challenge robot operation (Huang et al., 2022). Workers often resist new technologies due to concerns about job displacement and the complexity of operating advanced machines. The industry also relies heavily on tacit knowledge and human expertise, which are difficult to automate fully.

The disassembly process is the primary focus because it is complex, underexplored, and more chaotic and nuanced than assembly. Effective disassembly requires careful planning to maintain structural stability throughout the process, to preserve materials, and to manage sequential dependencies. Disassembly is not merely the reverse of assembly; it involves unique challenges arising from the varying value of building elements, the need to keep the structure stable during deconstruction, unknown materials, missing data, and complex sequencing decisions (Circular Construction Lab, AAP Labs, 2024).

Human-Robot Collaboration (HRC) emerges as a strategic solution. By leveraging the complementary capabilities of humans and robots, HRC combines human adaptability and decision-making with robotic precision and consistency (WIRED UK, 2022). Robots handle repetitive or physically demanding tasks, while humans perform complex operations requiring dexterity and cognitive flexibility. This approach enhances productivity and safety, aligning with the human-centric vision of Industry 5.0 (Zizic et al., 2022).

The methodology involves creating a multi-agent system that integrates LLMs for natural language processing, allowing robots to interpret and act on human instructions. The system works much like a human team: multiple specialists communicate and build on each other's ideas, only here the team is digital. It includes the following agents (a minimal coordination sketch follows the list):

Manager Agent: Coordinates tasks among specialized agents and communicates with human operators.
Foreman Agent: Validates user requests against up-to-date disassembly manuals using Retrieval-Augmented Generation (RAG) systems.
Stability Agent: Assesses structural stability using physics simulations to ensure safe disassembly sequences.
Planning Agent: Translates high-level plans into executable robotic actions, handling motion planning and collision checking.
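The sketch below illustrates, under stated assumptions, how such a manager-and-specialists pipeline could be wired together. All class and function names are illustrative, not the project's actual code; the RAG lookup, physics simulation, and motion planner are reduced to placeholders so the coordination logic stays visible.

```python
# Minimal sketch of the four-agent pipeline described above (assumed structure,
# not the project's implementation). LLM calls, retrieval, and physics checks
# are stubbed out as placeholders.
from dataclasses import dataclass


@dataclass
class Task:
    instruction: str               # natural-language request from the worker
    element_id: str | None = None  # target element, if one was named


class ForemanAgent:
    """Validates a request against the disassembly manual (RAG-backed in the project)."""
    def validate(self, task: Task) -> bool:
        # Placeholder: the real agent would query an index of up-to-date manuals.
        return "remove" in task.instruction.lower()


class StabilityAgent:
    """Checks whether removing an element keeps the structure stable."""
    def is_safe(self, element_id: str) -> bool:
        # Placeholder: the project runs a physics simulation of the removal.
        critical_elements = {"column_A1"}
        return element_id not in critical_elements


class PlanningAgent:
    """Turns an approved step into executable robot actions."""
    def plan(self, element_id: str) -> list[str]:
        # Placeholder motion plan; the real system adds collision checking.
        return [f"move_to({element_id})", f"pick({element_id})", "place(staging_area)"]


class ManagerAgent:
    """Coordinates the specialist agents and reports back to the human operator."""
    def __init__(self) -> None:
        self.foreman = ForemanAgent()
        self.stability = StabilityAgent()
        self.planner = PlanningAgent()

    def handle(self, task: Task) -> str:
        if not self.foreman.validate(task):
            return "Request is not covered by the disassembly manual."
        if task.element_id and not self.stability.is_safe(task.element_id):
            return f"Removing {task.element_id} could compromise stability; suggest another sequence."
        return "Plan: " + " -> ".join(self.planner.plan(task.element_id or "unknown"))


if __name__ == "__main__":
    manager = ManagerAgent()
    print(manager.handle(Task("Please remove the beam", element_id="beam_B2")))
```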

Because the LLM agent system uses RAG, the robot can access current information without constant retraining, reducing costs and improving accuracy (Gao et al., 2023). The physics simulation component allows the robot to predict and mitigate risks by simulating potential actions before execution, so that each step in the disassembly process is safe and effective.
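As a rough illustration of the retrieval step, the sketch below assumes the disassembly manual is a list of text passages and scores them by word overlap with the query; a production RAG system would use embedding-based search, and the actual LLM call is left out.

```python
# Minimal retrieval-augmented generation (RAG) sketch. The overlap scoring and
# the example manual passages are assumptions made for illustration only.
def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the query."""
    q_tokens = set(query.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(q_tokens & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the LLM's answer in the retrieved manual excerpts."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages))
    return f"Use only the manual excerpts below to answer.\n{context}\n\nQuestion: {query}"


manual = [
    "Timber beams must be supported before the connecting screws are removed.",
    "Reclaimed elements are stored in the staging area after inspection.",
    "Columns carrying roof load may only be removed after load transfer.",
]
print(build_prompt("How do I remove a timber beam safely?", manual))
```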

Robot development focuses on creating a function library for core actions like moving, picking, placing, and holding. The robot uses motion planning capabilities to execute tasks safely, adjusting for factors like collision avoidance and reachability. Target detection methods, such as QR codes and camera systems, enable the robot to identify and interact with specific elements based on human instructions. This allows for flexible adjustments and adaptability in dynamic environments.
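A minimal sketch of such a core-action library is shown below. The function names, pose representation, and QR tag map are assumptions for illustration; in the project, target poses would come from camera-based QR detection and the motions from a motion planner with collision and reachability checks.

```python
# Illustrative core-action library with QR-tag target lookup (assumed names,
# not the project's code). Motions print their intent instead of driving hardware.
from dataclasses import dataclass


@dataclass
class Pose:
    x: float
    y: float
    z: float


# Hypothetical mapping from detected QR payloads to grasp poses in the workspace.
TAG_POSES = {"beam_B2": Pose(0.4, 0.1, 0.9), "panel_P7": Pose(0.2, 0.5, 1.2)}


def locate(tag: str) -> Pose:
    """Resolve a detected QR payload to a grasp pose."""
    return TAG_POSES[tag]


def move_to(pose: Pose) -> None:
    # Real system: planned, collision-checked motion to the pose.
    print(f"moving to ({pose.x}, {pose.y}, {pose.z})")


def pick(tag: str) -> None:
    move_to(locate(tag))
    print(f"gripper closed on {tag}")


def place(pose: Pose) -> None:
    move_to(pose)
    print("gripper opened")


def hold(tag: str, seconds: float) -> None:
    # e.g. supporting an element while the human releases its connections
    print(f"holding {tag} in place for {seconds} s")


# Example sequence: support a beam during disconnection, then set it down.
pick("beam_B2")
hold("beam_B2", 30.0)
place(Pose(1.0, 0.0, 0.0))
```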

Deployment involves testing the system with physical demonstrators, progressively increasing complexity to refine the workflow. Scenarios include replacing elements, collaborative decision-making in disassembly, and handling repetitive components. In each case, the robot and human collaborate closely, with the robot providing advice, adjusting actions based on structural assessments, and learning from each interaction to improve future performance.

In one demonstrator, the robot assists in replacing a structural element. The robot analyzes the environment, detects potential collisions, and adjusts its approach accordingly (Circular Construction Lab, AAP Labs, 2024). It communicates these adjustments to the human operator, ensuring a safe and efficient process. This demonstrates the robot's ability to adapt to real-world limitations while maintaining smooth interaction with human collaborators.

Another scenario involves the robot advising against removing a particular element due to stability concerns. The robot explains that removing the element could compromise the structure and suggests an alternative sequence. This interaction highlights the robot's ability to reason and contribute to decision-making processes, enhancing safety and efficiency.

The system evaluation reveals significant improvements in collaboration, adaptability, and safety. The robot can understand and execute complex instructions, provide valuable feedback, and work seamlessly with human operators. The use of natural language allows for intuitive communication, reducing the need for workers to have technical expertise in programming (Mitterberger et al., 2022).

Conclusions highlight the potential of multi-agent systems and LLMs to transform human-robot collaboration in the AEC industry. By enabling robots to understand context, reason logically, and communicate naturally, the research contributes to making robotic systems more accessible and effective. The methodology addresses key challenges in automation, offering solutions that enhance productivity while maintaining the crucial role of human expertise.

The discussion acknowledges limitations, such as the reliance on language-based understanding and the need for improved speed and multimodal sensing for safety. The robot's understanding is based solely on language, which can lead to misinterpretations if commands are unclear. Additionally, the system's speed in processing and executing tasks needs improvement for practical implementation.

Challenges include adapting to noisy construction environments, where voice commands may be less effective (Mitterberger et al., 2022). Integrating additional sensors, such as vision systems and pressure sensors, can enhance real-time environmental understanding and safety. These multimodal sensing capabilities are essential for the robot to operate safely and effectively in dynamic and unpredictable settings.

The short-term outlook focuses on validating the system through industry trials and expanding the robot's knowledge base using simulation-based training. By testing the system in real-world conditions, feedback can be gathered to refine its effectiveness and address issues related to worker acceptance and practical implementation. Simulation-based training allows the system to learn from a wide range of scenarios, enhancing its adaptability and performance.

In the long term, the research aims to extend the methodology beyond disassembly to assembly processes, integrating multimodal perception systems like vision-language processing. This would enable the robot to develop a deeper comprehension of construction environments, leading to more precise and contextually appropriate responses. The integration of visual data with language understanding allows for nuanced interpretation of complex situations, enhancing the robot's decision-making capabilities.

The research contributes to the broader field by demonstrating how LLMs can be integrated into robotics to handle complex tasks in dynamic environments. It shows that robots can be more than tools; they can be collaborative partners that enhance human capabilities. By addressing both technical and human factors, the methodology points the way toward more intuitive and effective human-robot interactions.

Future work includes improving the system's speed and integrating multimodal sensing to enhance safety and real-time understanding. Developing methods to handle the challenges of noisy construction environments will be critical. The potential for the robot to learn and adapt over time, creating an open-source database of knowledge that can be shared across users, offers exciting possibilities for continuous improvement and broader adoption.

Moreover, the system's ability to update its knowledge through Retrieval-Augmented Generation means it can stay current with the latest protocols and safety guidelines without costly retraining (Gao et al., 2023). This makes the system scalable and adaptable to changes in construction practices and regulations.

In conclusion, the integration of LLMs and multi-agent systems in human-robot collaboration holds significant promise for the AEC industry. By making robots more accessible and capable of understanding complex tasks, adapting to them, and communicating in natural language, this research contributes to advancing automation while maintaining the essential role of human expertise and decision-making. The approach offers practical solutions to current challenges and sets the stage for future innovations in construction robotics.

The research demonstrates that robots equipped with artificial reasoning and natural language communication can work alongside humans more effectively. By focusing on the disassembly process, the methodology addresses a critical but underexplored area in construction. The collaboration between humans and robots leads to enhanced safety, efficiency, and adaptability.

This work lays the foundation for future studies to build upon, with the potential to introduce automation into construction projects in a more intuitive and approachable way. The integration of advanced technologies like LLMs and multi-agent systems represents a significant step toward more intelligent and collaborative robotic systems in construction, ultimately contributing to a more sustainable and resilient built environment.

Stuttgart, Germany

2024

Graduation Thesis - ITECH, University of Stuttgart

In collaboration with Shirin Shevidi & Zahra Shakeri