Any application that handles data (any application?) has to define data models to operate on.
On this page, a data model is a data type, or equivalent language feature,
that represents data across the application.
In Rust this would be a struct.
Applications can have multiple models for the same data to support multiple contexts:
Having multiple ways of viewing the same data can have pros and cons. Having too many models can cause complexity, inconsistencies and errors. Having only one model forces it to be responsible for everything.
Some of the cons are:
On the other hand, when models have well-defined roles they provide value too:
View models:
Internal models:
Persistence models:
If models for different layers happen to coincide it is of course possible to share them, as long as they are split into separate models as soon as the needs of different layers start to diverge.
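The split between view, internal and persistence models can be sketched as three separate types with explicit conversions at the layer boundaries. This is a minimal illustration and not Replicante Core's actual code: the `Node`, `NodeView` and `NodeRecord` types and all of their fields are hypothetical names chosen for the example.

```rust
/// Internal model (hypothetical): what the application logic operates on.
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub cluster_id: String,
    pub node_id: String,
    pub healthy: bool,
}

/// View model (hypothetical): shaped only for presentation to users.
#[derive(Clone, Debug, PartialEq)]
pub struct NodeView {
    pub name: String,
    pub status: String,
}

/// Persistence model (hypothetical): shaped for the store, with a lookup key.
#[derive(Clone, Debug, PartialEq)]
pub struct NodeRecord {
    pub id: String,
    pub cluster_id: String,
    pub node_id: String,
    pub healthy: bool,
}

// Conversions live at the boundaries, so each layer can evolve on its own.
impl From<&Node> for NodeView {
    fn from(node: &Node) -> Self {
        let status = if node.healthy { "healthy" } else { "unhealthy" };
        NodeView {
            name: format!("{}/{}", node.cluster_id, node.node_id),
            status: status.to_string(),
        }
    }
}

impl From<&Node> for NodeRecord {
    fn from(node: &Node) -> Self {
        NodeRecord {
            id: format!("{}.{}", node.cluster_id, node.node_id),
            cluster_id: node.cluster_id.clone(),
            node_id: node.node_id.clone(),
            healthy: node.healthy,
        }
    }
}

impl From<NodeRecord> for Node {
    fn from(record: NodeRecord) -> Self {
        Node {
            cluster_id: record.cluster_id,
            node_id: record.node_id,
            healthy: record.healthy,
        }
    }
}
```

With this layout, a change to how the view is rendered or how the record is keyed touches one conversion and leaves the internal model alone.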
These are some examples of where the distinction above is useful.
Replicante Core provides a report to users about the latest cluster orchestration task. This report is a view model, built incrementally during orchestration and with no relation to how data is stored or processed internally.
This means that cluster orchestration can change how it does things: steps can move around and the logic can even be replaced entirely without changing how the results are presented.
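A report built incrementally during orchestration could look like the sketch below. The `OrchestrateReport` and `OrchestrateReportBuilder` names, their fields and their methods are assumptions made for illustration and do not reflect the real report format.

```rust
/// Hypothetical view model describing the latest orchestration task.
#[derive(Debug, PartialEq)]
pub struct OrchestrateReport {
    pub cluster_id: String,
    pub success: bool,
    pub notes: Vec<String>,
}

/// Hypothetical builder that orchestration steps append to as they run.
#[derive(Debug)]
pub struct OrchestrateReportBuilder {
    cluster_id: String,
    failed: bool,
    notes: Vec<String>,
}

impl OrchestrateReportBuilder {
    pub fn new(cluster_id: &str) -> Self {
        Self {
            cluster_id: cluster_id.to_string(),
            failed: false,
            notes: Vec::new(),
        }
    }

    /// Record an informational note from any orchestration step.
    pub fn note(&mut self, message: &str) -> &mut Self {
        self.notes.push(message.to_string());
        self
    }

    /// Record a failure; the final report is then marked unsuccessful.
    pub fn fail(&mut self, message: &str) -> &mut Self {
        self.failed = true;
        self.notes.push(format!("error: {}", message));
        self
    }

    /// Finish the report once orchestration ends.
    pub fn build(self) -> OrchestrateReport {
        OrchestrateReport {
            cluster_id: self.cluster_id,
            success: !self.failed,
            notes: self.notes,
        }
    }
}
```

Steps only depend on `note` and `fail`, so they can be reordered or rewritten without the report type changing.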
Internal models are what Replicante operates on. Keeping them private means that changes to the logic do not unexpectedly leak into the API or break the storage layer.
For example the cluster view internal model can be used to represent a cluster with no impact on how data is stored or presented to the user.
Persistence models allow the structure of data in the DB to be more efficient for storage and search operations.
For example, some models are stored with additional timestamp information so that periodic loops can efficiently find only the records they need to look at instead of scanning all records. This is not exposed to the application, which has no use for this information outside of querying.
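One way to picture this is a persistence model that wraps the internal model and adds a storage-only timestamp. The sketch below is an assumption-laden illustration: `ClusterSettings`, `ClusterSettingsRecord`, `next_orchestrate` and `due_for_orchestration` are hypothetical names, and the in-memory filter stands in for what would be an indexed query in a real store.

```rust
/// Hypothetical internal model: carries no scheduling details.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterSettings {
    pub cluster_id: String,
    pub interval_secs: u64,
}

/// Hypothetical persistence model: the internal model plus a storage-only
/// timestamp (seconds since the Unix epoch) used to find due records.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterSettingsRecord {
    pub settings: ClusterSettings,
    pub next_orchestrate: u64,
}

impl ClusterSettingsRecord {
    pub fn new(settings: ClusterSettings, next_orchestrate: u64) -> Self {
        Self { settings, next_orchestrate }
    }

    /// Push the record forward by its interval after it has been processed.
    pub fn reschedule(&mut self, now: u64) {
        self.next_orchestrate = now + self.settings.interval_secs;
    }
}

/// Stand-in for a store query: return only the clusters that are due.
/// A real store would answer this from an index on the timestamp field
/// rather than by iterating over every record as this sketch does.
pub fn due_for_orchestration(
    records: &[ClusterSettingsRecord],
    now: u64,
) -> Vec<ClusterSettings> {
    records
        .iter()
        .filter(|record| record.next_orchestrate <= now)
        .map(|record| record.settings.clone())
        .collect()
}
```

Callers receive plain `ClusterSettings` values, so the timestamp never leaves the storage layer.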
Replicante Core stores data in a document store.
Care must be taken to allow for a zero downtime upgrade path. This means that changing the data format must be incremental:
Each data item is stored and updated atomically. The encoding details vary from store to store but Replicante Core uses a general interface to interact with the store, regardless of the functionality it exposes.
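As a rough sketch of such a general interface, the application can code against a trait while each backend decides how items are encoded. The `Store` trait, the `ClusterMeta` item and the `MemoryStore` backend below are hypothetical and are not the interface Replicante Core actually defines.

```rust
use std::collections::HashMap;

/// Hypothetical data item, always read and written as a whole.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterMeta {
    pub cluster_id: String,
    pub display_name: String,
}

/// Hypothetical general interface: callers never see encoding details.
pub trait Store {
    fn cluster_meta(&self, cluster_id: &str) -> Option<ClusterMeta>;
    /// Replace the whole item in a single operation.
    fn persist_cluster_meta(&mut self, meta: ClusterMeta);
}

/// In-memory backend used here for illustration only. A real backend
/// would encode the item into its own document format instead.
#[derive(Default)]
pub struct MemoryStore {
    items: HashMap<String, ClusterMeta>,
}

impl Store for MemoryStore {
    fn cluster_meta(&self, cluster_id: &str) -> Option<ClusterMeta> {
        self.items.get(cluster_id).cloned()
    }

    fn persist_cluster_meta(&mut self, meta: ClusterMeta) {
        self.items.insert(meta.cluster_id.clone(), meta);
    }
}

/// Application logic written against the trait, not a specific backend.
/// Returns false when the cluster is not known to the store.
pub fn rename_cluster(store: &mut dyn Store, cluster_id: &str, name: &str) -> bool {
    match store.cluster_meta(cluster_id) {
        Some(mut meta) => {
            meta.display_name = name.to_string();
            store.persist_cluster_meta(meta);
            true
        }
        None => false,
    }
}
```

Only single-item reads and writes are required of a backend here, which matches a design that does not rely on multi-item transactions.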
The main reasons for choosing MongoDB as the store are:
The main reason store transactions are not in use is that the preferred database (MongoDB) does not support them. The reasons MongoDB was the preferred store to begin with are listed above.
Although the current implementation is not making use of transactions, things may change in the future as the project changes and matures.
Finally, the value of transactions is questionable in many Replicante Core use cases since the data source has no atomicity guarantee to begin with. When refreshing the state of each cluster node, all we can say is that the data returned by a single call to an agent endpoint is consistent in itself. There is no guarantee that two calls to the same agent return results that are consistent with each other, even during the same orchestration cycle.