MigCast in Monte Carlo: The Impact of Data Model Evolution in NoSQL Databases

by Andrea Hillenbrand, et al.

During the development of NoSQL-backed software, the data model evolves naturally alongside the application code. Especially in agile development, new application releases are deployed frequently, causing schema changes. Eventually, decisions have to be made regarding the migration of versioned legacy data that is persisted in the cloud-hosted production database. We solve this schema evolution problem and present the results of near-exhaustive calculations with which software project stakeholders can manage the operative costs of data model evolution and adapt their software release strategy accordingly, in order to comply with service-level agreements regarding the competing metrics of migration costs and latency. We clarify conclusively how data model evolution in NoSQL databases impacts these metrics while taking all relevant characteristics of migration scenarios into account. Because calculating every combination in the search space of migration scenarios would far exceed available computational means, we used a probabilistic Monte Carlo method of repeated sampling, a well-established means of bringing the complexity of data model evolution under control. Our experiments show the qualitative and quantitative impact on the performance of migration strategies with respect to the intensity and distribution of data entity accesses, the kinds of schema changes, and the characteristics of the underlying data model.
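To make the idea of repeated sampling concrete, the following is a minimal sketch and not the MigCast implementation. It assumes a deliberately simplified cost model in which migration cost is counted as entity writes, every release changes the schema of all entities, and accesses are uniformly distributed. The names `simulate_scenario` and `monte_carlo` and all parameter values are hypothetical. The sketch compares an eager strategy (rewrite everything at each release) with a lazy one (rewrite an entity only when it is accessed while outdated) and averages the cost over many randomly drawn scenarios.

```python
import random
from statistics import mean


def simulate_scenario(rng, n_entities=1000, n_releases=12, accesses_per_release=200):
    """Return (eager_writes, lazy_writes) for one randomly drawn scenario.

    Assumed cost model (illustrative only): one write per migrated entity.
    """
    version = [0] * n_entities  # schema version each entity is stored in (lazy case)
    eager_writes = 0
    lazy_writes = 0
    for release in range(1, n_releases + 1):
        # Eager: every entity is rewritten as soon as the release is deployed.
        eager_writes += n_entities
        # Lazy: an entity is rewritten only when it is accessed while outdated.
        for _ in range(accesses_per_release):
            entity = rng.randrange(n_entities)  # uniform access distribution (assumption)
            if version[entity] < release:
                version[entity] = release
                lazy_writes += 1
    return eager_writes, lazy_writes


def monte_carlo(n_samples=500, seed=42):
    """Average the write counts of both strategies over repeated random samples."""
    rng = random.Random(seed)
    results = [simulate_scenario(rng) for _ in range(n_samples)]
    eager = mean(r[0] for r in results)
    lazy = mean(r[1] for r in results)
    return eager, lazy


if __name__ == "__main__":
    eager, lazy = monte_carlo()
    print(f"mean eager writes: {eager}, mean lazy writes: {lazy}")
```

In this toy model the lazy strategy saves writes at the price of migrating on the access path, which is where the latency side of the trade-off would show up. A skewed access distribution or a different mix of schema changes could be substituted for the uniform draw to explore other scenario characteristics.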




