This fall, the UW Scientific Software Engineering Center (SSEC) hosted a community-driven meetup during Seattle AI Week focused on how AI is reshaping research workflows in academia, industry, and startups. The meetup included a panel moderated by SSEC Head of Engineering Vani Mandava and featuring Bodhisattwa Majumder from AI2, Shamsi Iqbal from Microsoft, Luke Kim from Spice AI, and Carlos Garcia Jurado Suarez from SSEC. Around sixty audience members listened as panelists discussed the exciting potential and challenges surrounding the application of artificial intelligence to research.
When asked about how AI is transforming research software, panelists highlighted that AI agents are becoming game-changers for software engineers, enabling them to write large amounts of code and rapidly prototype new features. AI also excels at gathering and summarizing information from diverse sources, making complex research findings more accessible to various audiences. However, the panelists noted that despite its strengths, AI has yet to match human creativity in envisioning future directions in research.
The discussion explored which aspects of research are most compatible with AI integration, examining key issues around fine-tuning and benchmarking, barriers to adoption, and strategies for managing expectations. The group expressed concern that AI might hinder the development of core intellectual capabilities and foundational skills necessary for future learning and work, especially for those who are new to a field.
The conversation also addressed crucial issues like risks and safety in deploying AI, as well as hidden costs that extend beyond operational expenses, including effects on human intellect and job skills. Ethical implications involving access, open source, and accountability were considered, in addition to the tools and processes used for building and refining prototypes.
Majumder highlighted that while models can provide answers quickly, they often loop back to similar ideas after a few rounds, pointing to a significant “blue sky research area” for developing AI that can truly explore and generate novel concepts for scientific discovery.
Iqbal pointed out that doing basic, foundational tasks helps humans build the context needed to tackle more complex work, and that handing these tasks off could hinder that crucial development, suggesting that over-reliance on AI carries a hidden cost to human intellect. Noting the shift in the human role from content creation to content consumption, she questioned how coders can effectively review or debug code if they have never done the foundational creation work themselves. Garcia Jurado Suarez emphasized that while AI is changing his coding workflows significantly, there is a "fine line" where you can really leverage it, but "you can't fully trust it," highlighting the need for human oversight and critical evaluation even when using AI in research software development.
Kim recounted how his team heard about new technology at a conference and by the next morning had productized and shipped it into their actual product, something they “wouldn’t have even attempted before” without AI, highlighting the incredible speed and efficiency AI brings to development.
On evaluating agentic AI systems, the group discussed how benchmarking remains essential but difficult. Agents operate in dynamic environments and often take varied, multi-step paths to reach a solution, making it important to assess both the final output and the process, or "trajectory," leading to it. Performance must be weighed against operational costs, with attention to finding efficient agents that deliver strong results without excessive resource consumption. When direct observation is not possible, telemetry signals and inferred metrics, such as time savings or patterns of use, can provide valuable insights, though these may be limited by privacy considerations. Importantly, measuring "lived value" through sustained user retention remains fundamental to effective assessment.
Ultimately, the discussion revolved around not just what AI can do, but how it is being integrated, the challenges faced, the ethical considerations, and its broader implications for individuals, organizations, and society.
When asked about their advice for people looking to enter the evolving tech industry in Seattle, panelists unanimously agreed to “just start building!”
Click below to watch the full panel on the eScience YouTube channel.

