AutoDoc: Automated Documentation of Empirical Research 

Partners

Sebastian Musslick, Director, Autonomous Empirical Research Group 

Younes Strittmatter, Research Assistant, Autonomous Empirical Research Group 

SSEC Engineers

Carlos Garcia Jurado Suarez, Senior Principal Software Engineer

AutoRA is a collection of Python packages that together form a framework for closed-loop empirical research. The packages let users specify the variables, weights, and actions needed to run closed-loop empirical research studies.

Reproducibility is a foundational pillar of the scientific process. However, many empirical studies are difficult to replicate because their research steps are documented inadequately or opaquely. SSEC is working with researchers from Brown University and the University of Osnabrück to build a pipeline and train a freely available large language model (LLM) that translates research processes implemented in AutoRA and other tools into academic, non-computational descriptions. Such descriptions provide the basis for automated and transparent documentation of the empirical research process.

AutoDoc will elucidate crucial steps of the research process in an automated fashion. By doing so, the project aims to standardize documentation of the experimental pipeline and drive greater accessibility and reproducibility. This fusion of LLMs and declarative programming will enable the specification of problems without the need to implement bespoke workflows for the corresponding solutions. SSEC is developing an end-to-end framework: sourcing models, setting up MLOps, fine-tuning, and delivering a platform for inference.

The final version of the software is a translator tool that lets users turn their entire research code (spread across multiple code files and written with scientific computing packages such as AutoRA) into an automatically generated methods description of the research process. Researchers will also be able to upload the generated documentation to the Open Science Framework, where it will be available for everyone to review.
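
As a rough illustration of the input side of such a translator, the sketch below gathers the Python files of a research project into a single text prompt that could be handed to an LLM. It is only a sketch under stated assumptions, not the AutoDoc implementation: the function name build_prompt, the instruction wording, and the "### File:" separator are hypothetical, and the model call itself is left out.

```python
from pathlib import Path

# Hypothetical instruction text; the prompt AutoDoc actually uses is not
# described in this overview.
INSTRUCTION = (
    "Describe the research process implemented in the following code "
    "as a methods section for a scientific paper."
)


def build_prompt(project_dir: str, pattern: str = "*.py") -> str:
    """Concatenate all matching source files under project_dir into one prompt."""
    root = Path(project_dir)
    sections = []
    # Sort the paths so the file order is the same on every run.
    for path in sorted(root.rglob(pattern)):
        relative = path.relative_to(root).as_posix()
        code = path.read_text(encoding="utf-8")
        # Label each file so the model can tell where one ends and the next begins.
        sections.append(f"### File: {relative}\n{code.rstrip()}\n")
    return INSTRUCTION + "\n\n" + "\n".join(sections)
```

Calling build_prompt("my_study") on a hypothetical project folder would return one string that starts with the instruction and then lists each file under its relative path. A real pipeline would also have to deal with projects that are too large for a model's context window.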