New technology achieves world-leading accuracy while significantly extending the length of video that can be processed
Fujitsu announced the development of a video analytics AI agent for frontline workplaces. The AI agent uses spatial video and image data from workplace camera footage, as well as written information, to draft reports and make recommendations for workplace improvements. The AI agent will be positioned as a core technology of Fujitsu’s AI service “Fujitsu Kozuchi”. Fujitsu will provide a trial environment for the AI agent in fiscal year 2024 and commence in-house implementation from January 2025.
The AI agent is based on a multimodal large language model (LLM). The AI agent trains itself to recognize 3D images of the workplace using information from written documentation (e.g., safety rules). Context memory technology uses written information to selectively retain only the relevant data, enabling the analysis of long-duration video content with world-leading accuracy.
The AI agent will be evaluated by FieldWorkArena, an evaluation environment newly developed by Fujitsu, under the supervision of Carnegie Mellon University. FieldWorkArena will be made available to the research community from December 2024, when its tasks will be added to GitHub and the Fujitsu Research Portal.
Training to operate in the frontline workplace based on written documentation
This technology augments the AI agent’s comprehension of video data with information from written documentation, helping the LLM understand what it cannot infer from video content alone.
Efficiently retaining context data from video content
This technology allows the user to provide a prompt specifying a type of behavior to focus on in a video, e.g., “safe behavior in humans.”
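As a rough illustration of prompt-guided retention, the Python sketch below keeps only the frames whose text descriptions overlap with the user's prompt, up to a fixed budget. It is a toy under stated assumptions, not Fujitsu's context memory implementation: the `Frame` type, the per-frame captions, the word-overlap `relevance` score, and the `budget` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for one sampled video frame."""
    timestamp_s: float
    caption: str  # assumed text description of what the frame shows


def relevance(prompt: str, caption: str) -> float:
    """Toy score: fraction of prompt words that also appear in the caption."""
    prompt_words = set(prompt.lower().split())
    caption_words = set(caption.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & caption_words) / len(prompt_words)


def retain_relevant(frames: list[Frame], prompt: str, budget: int) -> list[Frame]:
    """Keep at most `budget` frames with nonzero relevance, in time order."""
    scored = [(relevance(prompt, f.caption), f) for f in frames]
    # Drop frames unrelated to the prompt, then keep the highest-scoring ones.
    related = [pair for pair in scored if pair[0] > 0]
    top = sorted(related, key=lambda pair: pair[0], reverse=True)[:budget]
    # Restore chronological order so downstream analysis sees a timeline.
    return sorted((f for _, f in top), key=lambda f: f.timestamp_s)
```

Because unrelated frames are discarded before analysis, the amount of retained context depends on `budget` rather than on the length of the video. A real system would presumably score relevance with learned features rather than word overlap.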