Date of Award

5-2011

Degree Type

Report

Degree Name

Master of Science (MS)

Department

Computer Science

Committee Chair(s)

Curtis Dyreson

Committee

Curtis Dyreson

Committee

Stephen Clyde

Committee

Stephen Allan

Abstract

This report describes proposed templates for supporting sequenced temporal semantics in Pig Latin, a dataflow language used primarily for the analysis of very large data sets. Sequence semantics says that if we take a relation and divide it into smaller relations based on timestamps, while still carrying out the regular Pig Latin program over it, the result should be the same as when carrying out the temporal Pig Latin program over the original relation. In real time, the relations can be enormous, and dividing such relations into smaller ones based on every possible timestamp creates an extremely large number of smaller relations. Hence, we create temporal programs, which eliminates the need to divide a relation into smaller relations and carry out additional operations over those smaller relations. We look at each of the templates and discuss their functionality. One example of such a template is temporal grouping, which provides an ability to group a set of tuples or a whole relation based on timestamps. Using temporal grouping, a user can find the number of tuples that exist at a given point of time. Another example is temporal coalescing, which allows a user to project multiple tuples and the timestamps of their existence in the database. We compare the complexity of the templates with the existing operations.

Comments

This work made publicly available electronically on June 13, 2011

Share

COinS