Fighting a growing Django project
I have been working with Python and Django for a decade, and while it has been a joy for the most part, it is starting to let me down, and I want to go into why that is.
When the decision to use Django was made, the reasoning may seem very familiar:
- We already knew how to work with Python
- We didn't want the fuss of dealing with I/O like HTTP and DB access, etc.
- We knew SQL well enough, but would rather work with an ORM, to compose our queries more elegantly
Django was the ideal match! It comes with batteries included, and scaffolding lets you get started quickly. In addition, its documentation is fantastic, and the tutorial teaches you all you need to get up and running.
In essence, it allows for rapid prototyping, and time-to-market is blazingly fast!
However, software is so much more than making a thing and deploying a thing. Afterwards comes the maintenance phase, where the product evolves with new features as the scope expands, and before long, the design and architecture of the solution start to matter.
Symptoms of bad design
When following along with Django’s tutorials and documentation, you’ll see examples of both function- and class-based views that allow for taking shortcuts by leaning on the structure of your data models. This is awesome, because you can do a lot with very little code. Actually, almost every cool feature in Django takes advantage of the Django ORM to do more with less code.
The problem arises when the data models no longer map one-to-one to your domain models. If you want CRUD views that need to act on several data models to do their job, then the advantage of all the shortcuts falls away, because you need extra code to cover all the model relationships.
Taking this a bit further: once domain rules start spanning multiple data models, you'll see many posts online from novices asking “where to put domain logic?!”. Most answers reasonably suggest putting logic on the models (the “Fat models” pattern) or making a separate service layer.
That helps avoid duplication and gives the developers a lot of freedom to implement the right cuts and seams between models. Unfortunately, as mentioned earlier, you’ll lose the cool shortcuts from Django, but that is fine. Eventually, we have to write actual code to do what we need and not just lean on the framework.
Right, what does a service layer look like? Well, some would say that the service layer should encapsulate both queries and data mutations to keep things DRY. Others say that querying should live elsewhere, separating Queries from Commands (CQRS pattern). While both work, there is still a problem: Testing!
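To make the command/query split a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: the `Task` record and the `_TASKS` dictionary stand in for a Django model and its table, and the function names are made up for illustration. It only shows the shape of such a layer, not how it must be done.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical record; stands in for a Django model instance."""
    id: int
    title: str
    done: bool = False


# Hypothetical in-memory storage standing in for the database.
_TASKS: dict[int, Task] = {}


# Commands: change state and return nothing.
def create_task(task_id: int, title: str) -> None:
    if task_id in _TASKS:
        raise ValueError(f"task {task_id} already exists")
    _TASKS[task_id] = Task(id=task_id, title=title)


def complete_task(task_id: int) -> None:
    _TASKS[task_id].done = True


# Queries: read state and never modify it.
def open_task_titles() -> list[str]:
    return [task.title for task in _TASKS.values() if not task.done]
```

Views would then call `create_task` or `open_task_titles` instead of touching the storage directly, which keeps the rules in one place. In a real Django project the function bodies would use the ORM, and that is exactly where the testing problem comes back.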
Ideally, tests can be written independently of any I/O and independently of the framework, so that domain logic has no external dependencies. The problem is, with fat models, your domain model IS the data model, which is defined by Django. So there is no way of separating the two.
Also, fetching from and saving to the database is baked right into the models (Active Record pattern), so testing models without hitting the database has to be done very carefully once the logic spans model relationships. You can create an in-memory model instance without saving it, but then you would need to mock the relationships, which is not ideal to work with.
Ultimately, in order to keep tests simple, we may just accept that the tests will hit the database. The built-in testing tools even come with functionality that manages a test database for us, so that part is easy. But then the tests will be slow, due to the overhead of dealing with a database, and they cannot run without a database service present. It works, but it is messy and frustrating once the test suite grows.
If I wanted to model a "Heat map" or "Matrix" in Django, I would want to break that up into multiple tables, i.e. models:
    from django.db import models


    class Matrix(models.Model):
        name = models.CharField(...)
        row_count = models.IntegerField()
        column_count = models.IntegerField()


    class RowHeader(models.Model):
        # on_delete is required since Django 2.0; CASCADE is an assumption here
        matrix = models.ForeignKey(Matrix, on_delete=models.CASCADE)
        name = models.CharField(...)
        row = models.IntegerField()


    class ColumnHeader(models.Model):
        matrix = models.ForeignKey(Matrix, on_delete=models.CASCADE)
        name = models.CharField(...)
        column = models.IntegerField()


    class MatrixCell(models.Model):
        matrix = models.ForeignKey(Matrix, on_delete=models.CASCADE)
        value = models.FloatField()
        row = models.IntegerField()
        column = models.IntegerField()
Conceptually, we just wanted a domain model called Matrix with appropriate business logic, but in order to model it in a normalized fashion, we've broken it up into multiple tables with foreign key relationships.
Now, all of Django's nifty shortcuts are essentially useless. The CRUD class-based views are little help, because they don't address the conceptual Matrix, but just the matrix table, and likewise all of its components. We can certainly make it work, sure, but any attempt at encapsulation and defining Root Entities will fail, because it doesn't fit with the framework.
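For contrast, this is roughly what I mean by the conceptual Matrix: one plain Python object that owns its headers and cells and guards its own rules. It is only a sketch under my own assumptions. The method names, the sparse dictionary storage and the choice to read missing cells as 0.0 are all made up for illustration, and nothing here touches Django.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Matrix:
    """Root entity: headers and cells are only reachable through it."""
    name: str
    row_headers: list[str]
    column_headers: list[str]
    # Sparse storage: (row, column) -> value. Missing cells read as 0.0 (assumption).
    _cells: dict[tuple[int, int], float] = field(default_factory=dict)

    def set_value(self, row: int, column: int, value: float) -> None:
        self._check_bounds(row, column)
        self._cells[(row, column)] = value

    def value_at(self, row: int, column: int) -> float:
        self._check_bounds(row, column)
        return self._cells.get((row, column), 0.0)

    def row_total(self, row: int) -> float:
        return sum(self.value_at(row, c) for c in range(len(self.column_headers)))

    def _check_bounds(self, row: int, column: int) -> None:
        # The invariant lives in one place instead of being spread over four tables.
        in_rows = 0 <= row < len(self.row_headers)
        in_columns = 0 <= column < len(self.column_headers)
        if not (in_rows and in_columns):
            raise IndexError(f"cell ({row}, {column}) is outside matrix {self.name!r}")
```

An object like this can be tested without a database at all, but none of Django's generic views, forms or admin pages know what to do with it.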
That was a loaded paragraph. I'm sure many will raise an eyebrow, and I'm sure it will spark a lot of discussion. If we were to discuss it for a while, we would probably end with the conclusion that I am using Django wrong!
Django is not an MVC (Model-View-Controller) system, but an MVT (Model-View-Template). In MVC, the "Model" is the domain model (or a view model, or DTO in some cases), but in MVT, the model is literally the data model. This means that what I am trying to do is not really supported by Django's design, and that is where the pain comes from.
What I am working towards is "Enterprise Software" and "Domain Driven Design". The scope of the projects that I am working on dictates that these concepts be followed. But that is a very bad fit for the Django framework.
Let’s apply the right design patterns then?!
From the SOLID principles and software design best practices from the past decades, we know that we should separate our software into layers to keep them manageable and stable. A notable pattern is the repository pattern which abstracts away data storage and defines a clear interface for storing and retrieving domain entities.
This pattern turns out to work very poorly with the Active Record pattern, which is what the Django framework is designed around. Either we accept that repositories return instances and querysets of our data models (exposing the data layer, which we were supposed to encapsulate), or we create separate domain entity classes and then have the repositories map those to the data models. This would be a great approach, but you’ll soon find that Django gets in your way. Remember that almost everything that is useful in Django is useful because it can reach for the ORM and do work based on the data model. Generic class-based views are gone, ModelForms are gone, the admin site is gone. All of it. You’re left with raw views and handmade forms, just because we want to abstract away the data layer for better testability.
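The second option, separate domain entities with a repository doing the mapping, could be sketched like this. The names (`MatrixRepository`, `InMemoryMatrixRepository`, `scale_matrix`) are hypothetical, and the in-memory class is a test double of my own; a Django-backed repository would implement the same two methods by mapping to and from model rows.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Matrix:
    """Domain entity with no knowledge of how it is stored."""
    name: str
    cells: dict[tuple[int, int], float] = field(default_factory=dict)


class MatrixRepository(Protocol):
    """The interface that domain code depends on."""

    def get(self, name: str) -> Matrix: ...

    def save(self, matrix: Matrix) -> None: ...


class InMemoryMatrixRepository:
    """Test double that keeps row-like tuples instead of database rows."""

    def __init__(self) -> None:
        self._rows: dict[str, list[tuple[int, int, float]]] = {}

    def save(self, matrix: Matrix) -> None:
        # Flatten the entity into (row, column, value) tuples, like a cell table.
        self._rows[matrix.name] = [(r, c, v) for (r, c), v in matrix.cells.items()]

    def get(self, name: str) -> Matrix:
        rows = self._rows[name]  # raises KeyError for an unknown matrix
        return Matrix(name=name, cells={(r, c): v for r, c, v in rows})


def scale_matrix(repo: MatrixRepository, name: str, factor: float) -> None:
    """Domain operation that only sees the repository interface."""
    matrix = repo.get(name)
    matrix.cells = {key: value * factor for key, value in matrix.cells.items()}
    repo.save(matrix)
```

Tests for `scale_matrix` can now run against the in-memory repository without any database. The price is the one described above: the mapping code has to be written by hand, and the parts of Django that expect model classes can no longer help.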
The authors of Cosmic Python (https://www.cosmicpython.com/) write at great length about how to apply these patterns with Python and Flask, with SQLAlchemy as the ORM. They succeed in laying it all out because SQLAlchemy doesn’t need to do Active Record, but can also do the Data Mapper approach, so it doesn’t get in the way. When they do mention Django, they reveal the same problems I mention here, and give examples of how it could be done, but with the trade-off that big parts of the Django framework can then no longer be used.
So what then?
Personally, my conclusion is that while Django is fantastic for rapid prototyping, it doesn’t really grow with the application and ends up getting in the way once the application grows to a certain size.
For enterprise-level software, I would consider looking into alternatives. I am now looking into tactics for migrating away from Django. What steps could be taken to move a Django application over to something else iteratively? That will be the topic of a later blog entry.