Date of Award:

5-2026

Document Type:

Thesis

Degree Name:

Master of Science (MS)

Department:

Computer Science

Committee Chair(s)

Curtis Dyreson

Committee

Curtis Dyreson

Committee

John Edwards

Committee

Steve Petruzza

Abstract

JavaScript Object Notation (JSON) is a common format for representing and exchanging data on the web. Most systems only keep the current version of a JSON document, even though, in many situations, it is also important to know how that document changed over time. For example, an application might need to answer questions such as “What did this record look like last week?” or “How has this list grown over the past year?” A simple way to keep this history is to store a full copy of the document every time it changes, but this quickly becomes wasteful: most of the information is repeated, and searching through many versions can be slow.

In this research, we studied how to represent the history of a JSON document in a way that uses less space and still supports fast queries. We built on earlier work that treated the history as a temporal document with time information attached to the data. Then we designed an alternative representation that groups together the different versions of each part of a document and records when each version was valid. In this design, large parts of the document that do not change, such as long lists or stable subtrees, are stored once and reused across many versions, while only the parts that actually change grow over time. We implemented this representation in Python, created a tool to generate histories from ordinary JSON files, and ran experiments on real datasets. We also developed a simple Python interface that lets programmers ask time-based questions using natural code. The results showed that the new representation can reduce storage needs and still answer temporal queries efficiently.
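The idea of grouping the versions of each part of a document and recording when each version was valid can be illustrated with a minimal Python sketch. This is not the implementation described in the thesis: the names `Version`, `TemporalNode`, `TemporalDocument`, `commit`, and `snapshot` are hypothetical, timestamps are assumed to be increasing integers, and only top-level keys are versioned to keep the example short.

```python
import copy
from dataclasses import dataclass, field
from typing import Any

FOREVER = float("inf")   # marks a version that is still current
_MISSING = object()      # sentinel: key did not exist at the requested time


@dataclass
class Version:
    """One value of a document part, valid during [start, end)."""
    value: Any
    start: int
    end: float = FOREVER


@dataclass
class TemporalNode:
    """All versions of a single document part, oldest first (hypothetical)."""
    versions: list = field(default_factory=list)

    def set(self, value: Any, time: int) -> None:
        last = self.versions[-1] if self.versions else None
        if last is not None and last.end == FOREVER:
            if last.value == value:
                return            # unchanged: reuse the stored version
            last.end = time       # close the old version
        self.versions.append(Version(copy.deepcopy(value), time))

    def close(self, time: int) -> None:
        # The part was removed from the document at `time`.
        if self.versions and self.versions[-1].end == FOREVER:
            self.versions[-1].end = time

    def at(self, time: int) -> Any:
        for v in self.versions:
            if v.start <= time < v.end:
                return v.value
        return _MISSING


class TemporalDocument:
    """History of a flat JSON object, versioned per top-level key."""

    def __init__(self) -> None:
        self.fields: dict = {}

    def commit(self, doc: dict, time: int) -> None:
        # Record a new snapshot; only changed keys gain a version.
        for key, value in doc.items():
            self.fields.setdefault(key, TemporalNode()).set(value, time)
        for key, node in self.fields.items():
            if key not in doc:
                node.close(time)

    def snapshot(self, time: int) -> dict:
        # Rebuild the document as it looked at `time`.
        out = {}
        for key, node in self.fields.items():
            value = node.at(time)
            if value is not _MISSING:
                out[key] = value
        return out
```

In this sketch, committing two snapshots that differ only in one key stores the unchanged keys once, and `snapshot(t)` answers a question such as “What did this record look like at time t?” by selecting, for each key, the version whose validity interval contains `t`. A fuller design would apply the same grouping recursively to nested objects and arrays.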
