Methods: Software
Fields: Computer Science

Joseph L. Hellerstein (eScience Institute, Computer Science & Engineering)


“Spaghetti” Spreadsheet – An Undesirable Result of Current Spreadsheet Systems

According to the Bureau of Labor Statistics, the ratio of spreadsheet users to programmers is approximately 5-to-1. Despite many efforts to train scientists in programming, a large number of scientists use only spreadsheets for their computational work. However, as depicted by the above graphic, spreadsheets are notorious for their lack of readability (a problem for sharing scientific results and reproducibility), inability to represent complex relationships between scientific data, poor reuse, and poor scalability.

The SciSheets project is creating a new spreadsheet system that lets scientists develop sophisticated, scalable, robust calculations with reusable components without knowledge of programming. SciSheets addresses shortcomings in current spreadsheet systems through a number of innovations:

  • Poor readability. The computational chaos of cell-level formulas is addressed by not allowing cell formulas, only column formulas. Many existing spreadsheet systems do this, but they lack computational power because they cannot do column-level iteration (e.g., recurrence relationships such as computing compound interest). A SciSheets innovation is the ITERATE function with addressing for the current and previous rows in referenced columns.
  • Reuse and scalability. SciSheets provides for exporting spreadsheet calculations as a program (initially Python) and using external programs in spreadsheet formulas.
  • Handling complex scientific data. SciSheets structures data as nested tables, thereby permitting the expression of more complex data relationships, such as n-to-m relationships.
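
To make the idea of column-level iteration concrete, the following is a minimal Python sketch of a column formula defined by a recurrence, using the compound interest example mentioned above. It is not the SciSheets ITERATE function or its syntax; the helper name iterate_column, its parameters, and the sample rates are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# A table is modeled here as a mapping from column name to a list of row values.
# This representation is an assumption for the sketch, not the SciSheets data model.
Table = Dict[str, List[float]]


def iterate_column(
    table: Table,
    n_rows: int,
    initial: float,
    formula: Callable[[Table, int, float], float],
) -> List[float]:
    """Build a column row by row (hypothetical helper).

    The formula receives the table, the current row index, and the value
    computed for the previous row, so it can express a recurrence.
    """
    column = [initial]
    for row in range(1, n_rows):
        column.append(formula(table, row, column[row - 1]))
    return column


# Illustrative data: the interest rate applied in each period (row 0 is the start).
table: Table = {"rate": [0.0, 0.05, 0.05, 0.10]}

# One column formula for the whole "balance" column:
# balance[row] = balance[row - 1] * (1 + rate[row])
table["balance"] = iterate_column(
    table,
    n_rows=4,
    initial=1000.0,
    formula=lambda t, row, prev: prev * (1.0 + t["rate"][row]),
)

for rate, balance in zip(table["rate"], table["balance"]):
    print(f"rate={rate:.2f}  balance={balance:.2f}")
```

The point of the sketch is that a single formula describes the entire column while still referring to the previous row, which a plain column formula without iteration cannot express.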

SciSheets is implemented using web technologies, an approach similar to IPython.