
NAIRR Award
SSEC won a National AI Research Resource (NAIRR) award to build LLMaven, a tool library that takes a generative AI approach. We will use Retrieval-Augmented Generation (RAG) techniques to extend LLMs with privacy-sensitive data in a way that is safe and cost-effective for individual researchers who lack the resources to develop their own models or purchase expensive equipment. LLMaven will leverage diverse publicly available datasets and disparate academic knowledge bases.
RAG Office Hours
As part of the eScience Institute’s Office Hours program, SSEC offers office hours every Tuesday from 10 to 11 AM at the eScience Institute’s Data Science Studio on the UW campus to support the UW community with Retrieval-Augmented Generation (RAG) based workflows for generative AI. Researchers who are curious about leveraging generative AI tools with private or pre-publication data are welcome to sign up here and stop by with their questions.


Projects
AutoDoc: SSEC worked with researchers from Brown University and the University of Osnabrück to build a pipeline and train a freely available large language model (LLM) that translates research processes implemented in AutoRA (a collection of Python packages that together form a framework for closed-loop empirical research) into natural-language descriptions. Such descriptions provide the basis for automated, transparent documentation of the empirical research process. More details are available here.
Tutorials
SciPy2024 tutorial: The SSEC team presented a tutorial at the annual SciPy conference in Tacoma, WA on July 9, 2024, covering (1) the basics of language models, (2) setting up an environment for using open-source LLMs without the expensive compute resources needed for training or fine-tuning, (3) applying Retrieval-Augmented Generation (RAG) to improve LLM output, and (4) building an app that demonstrates how researchers can turn disparate knowledge bases into special-purpose AI-powered tools.
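The RAG workflow in steps (3) and (4) can be sketched in a few lines. This is a minimal illustration only, not the tutorial's actual code: it uses a toy bag-of-words cosine similarity for retrieval, where a real pipeline would use dense embeddings and a vector store, and it stops at prompt construction rather than calling an LLM. The corpus and query below are made up for the example.

```python
# Minimal RAG sketch: retrieve the passage most relevant to a question,
# then prepend it as context to the prompt sent to an LLM.
# Illustrative only: real pipelines use dense embeddings + a vector store.
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k=1):
    """Return the k passages most similar to the query."""
    qv = vectorize(query)
    return sorted(corpus, key=lambda p: cosine(qv, vectorize(p)),
                  reverse=True)[:k]

def build_prompt(query, corpus):
    """Augment the query with retrieved context (the 'A' in RAG)."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Toy corpus standing in for a researcher's knowledge base.
corpus = [
    "RAG grounds model answers in retrieved documents.",
    "Fine-tuning updates model weights on new data.",
]
print(build_prompt("How does RAG ground answers?", corpus))
```

The augmented prompt, rather than the bare question, is what gets passed to the LLM, so the model can answer from the retrieved context instead of relying only on its training data.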
