One of the highest priorities for a SaaS architect is creating a SaaS data model that is both robust and secure enough to satisfy tenants or customers who are concerned about surrendering control of vital business data to a third party, while also being efficient and cost-effective to administer and maintain.
In this article, we’ll look at the three popular SaaS Database Design Models / approaches and explore some of the technical and business factors to consider when deciding which approach to use.
Three Approaches to Managing Multi-Tenant Data
1. Isolated Databases – One database per tenant
This approach assumes creating dedicated database instances for each customer. It basically means sharding the database by tenants.
Note: Sharding is the process of storing data records across multiple machines
We have to keep some type of a centralized tenants’ metadata dictionary with information that allows connections to the tenant’s database.
The simplest way is to have a master database with a tenants table where we keep associations of tenants and their database connection data
tenant1 -> Tenant1 db connection string; auth data tenant2 -> Tenant2 db connection string; auth data
Based on the login credentials we retrieve ID/name of the tenant from the master database and, using retrieved data, connect to his tenant specific database.
Pros:
• Scalability – better performance under expected high number of resources added by each tenant.
• Security – distribution of tenants gives us physical isolation of their data. The risk of mixing customers records in database queries is avoided.
• Custom schema for tenant – it’s possible to add some custom entity properties dedicated for tenant.
• Ease of migration (old applications to multi-tenant approach) – if we already have application designed in a single-tenant approach, such model will facilitate the migration of such platform to multi-tenancy.
Cons:
• Uneven distribution of data – there can be situations when some of tenants will use many more resources than others. It’ll be strongly visible when having account types with different limitations (e.g. trial, basic or enterprise accounts).
• More resources and Cost – creating new database for every tenant will lead to higher usage of server’s resources (memory, disk) and hence higher cost. Also you should remember about limitations of the technology e.g. maximum number of databases in a MySQL instance.
• Need of holding second database connection – when storing tenants’ metadata in a master database, we have to make another connection to a tenant’s specific database. It likely won’t be possible to maintain a connections pool for every tenant’s databases.
• Difficult deployment and maintenance – adding new tenant will lead to creating a new database, which usually isn’t a fast operation. Also maintaining multiple databases is much harder in comparison to a single one – operations of updating schema must be done on every tenant’s database.
2. Shared Database, Separate Schemas
This approach involves housing multiple tenants in the same database, with each tenant having its own set of tables that are grouped into a schema created specifically for the tenant.
The separate schema approach is appropriate for applications that use a relatively small number of database tables, on the order of about 100 tables per tenant or fewer. This approach can typically accommodate more tenants per server than the separate-database approach can, so you can offer the application at a lower cost.
Pros:
• Cost Effective – Because of single instance, the cost is less when compared to isolated instances.
• Custom schema for tenant – it’s possible to add some custom entity properties dedicated for tenant, though not as independent as the isolated model.
Cons:
• Data Backup & restoration – Tenant data is harder to backup / restore in the event of a failure.
3. Shared Database, Shared Schema
In a Single Database Model, we keep all tenants’ data in a single database. Entities are marked with some Tenant ID attribute which is just a foreign key to the Tenants table.
Pros:
• Ease of implementation – on an application side there aren’t needed any difficult to implement databases switch mechanisms. We’re using single database connection for a HTTP request.
• Fast deployment – adding new tenant is as simple as inserting the new Tenant entity to the database. We don’t have to make any schema changes for new tenants nor creating new databases.
Cons:
• Security – access for resources must be strongly controlled on the application side to prevent unauthorized access between tenants and mixing different tenants’ data. To achieve that, some type of an Access Control List mechanism must be implemented.
• Scalability – lack of partitioning can lead to scalability problems – as the customers increase, switching to a sharded (grid) model may be required to ensure good performance and stability of the data storage.
Choosing Appropriate Pattern for your SaaS Application
Each of the three approaches described above offers its own set of benefits and tradeoffs that make it an appropriate model to follow in some cases and not in others, are determined by a number of business and technical considerations.
Leave a Reply