Building a Chat App: Step-by-Step Guide — Database Modeling — Part3
Learn how to build a real-time chat application from scratch. This step-by-step guide covers all the key features of a chat app, with working code.

Database modeling is a critical part of software development because it largely determines how well an application performs and scales. A well-designed database stores and retrieves data efficiently, which directly affects the user experience. Data is now generated and collected at an unprecedented scale, so organizations need a robust and scalable database infrastructure. Cloud storage has made raw capacity far less of a constraint, since large amounts of data can be stored in the cloud with ease. However, designing a database that is efficient, secure, and scalable is still not trivial and requires a solid understanding of data modeling principles and techniques. Plan the structure and schema carefully, taking into account the application's requirements as well as the expected growth and evolution of the data over time.
When I was working on a client project that involved database modeling, I searched resources such as Stack Overflow and various articles for guidance, but the information I found was not sufficient for my needs. As a result, I developed my own approach to database modeling. Let's dive in.
Database Modeling

I understand that this image may appear intimidating at first, but I will break down every aspect of it and explain how I modeled it to make it easier to understand.
As you can see in the image, I have added sub-schemas or sub-collections, such as 'Inbox' and 'UserChannel,' which I will go over in more detail in their respective sections.
For now, we need three MongoDB collections, or in SQL terms, three tables:
User
Channel
Message
User Modeling
import mongoose, { Schema } from "mongoose";
import { MODEL_NAMES } from "./types/model-names.type";
import { InboxModel, UserModel } from "./types/user.type";

const userSchema = new Schema<UserModel>({
  name: { type: String, required: true },
  email: { type: String, required: true },
  password: { type: String, required: true },
  profileImage: { type: String, required: false },
  // One entry per channel the user belongs to
  inbox: [
    new Schema<InboxModel>(
      {
        channel: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.channel },
        // Opponent user, only set for one-to-one chats
        user: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.user },
      },
      // Inbox entries do not need their own _id
      { _id: false }
    ),
  ],
});

const User = mongoose.model<UserModel>(MODEL_NAMES.user, userSchema);

export default User;
As you can see in the code snippet, the first four fields are straightforward. The last one, inbox, needs some explanation. In a relational database, a many-to-many relationship requires a junction (pivot) table. In a NoSQL database, we can model the same relationship by storing references in an array. This gives us the best of both worlds: we can still add a separate collection, as we would in SQL, or keep all the references inside the same document, as the inbox array does.
The advantage of keeping the references in the same document is that the user's inbox, which lists all of the user's channels, comes back together with the user. If we stored the inbox in a separate collection (in SQL terms, a pivot table), we would have to query that collection and join its records with the user and the channel to obtain the actual channel name. That would not make good use of a NoSQL database, whose document model is exactly what we should take advantage of.
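To make the trade-off concrete, here is a minimal in-memory sketch in plain TypeScript, with no database involved. The names `UserDoc`, `PivotRow`, `channelIdsEmbedded`, and `channelIdsFromPivot` are hypothetical and exist only for this illustration, and plain strings stand in for ObjectIds.

```typescript
// Hypothetical in-memory shapes; string ids stand in for MongoDB ObjectIds.
interface InboxEntry {
  channel: string; // id of a channel
  user?: string; // id of the opponent user (one-to-one chats only)
}

interface UserDoc {
  id: string;
  name: string;
  inbox: InboxEntry[]; // embedded references, like the inbox array in userSchema
}

// SQL-style alternative: one row per (user, channel) pair in a separate table.
interface PivotRow {
  userId: string;
  channelId: string;
}

// Embedded array: the channel ids are already on the user document.
function channelIdsEmbedded(user: UserDoc): string[] {
  return user.inbox.map((entry) => entry.channel);
}

// Pivot table: a second data set has to be searched for this user's rows.
function channelIdsFromPivot(userId: string, pivot: PivotRow[]): string[] {
  return pivot
    .filter((row) => row.userId === userId)
    .map((row) => row.channelId);
}

const alice: UserDoc = {
  id: "u1",
  name: "Alice",
  inbox: [{ channel: "c1", user: "u2" }, { channel: "c2" }],
};

const pivotRows: PivotRow[] = [
  { userId: "u1", channelId: "c1" },
  { userId: "u2", channelId: "c1" },
  { userId: "u1", channelId: "c2" },
];

const embeddedIds = channelIdsEmbedded(alice);
const pivotIds = channelIdsFromPivot("u1", pivotRows);
```

Both helpers return the same channel ids. The difference is that the embedded version only needs the user document it was given, while the pivot version has to search a second data set.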
As you can see in the code snippet, I am also storing a user in the Inbox model. The reason is straightforward. When we retrieve a user's channels, we need each channel's name and image, and in a one-to-one chat those are the opponent user's name and image. To get them, we resolve this user reference with one additional query when fetching the user's channels. In a group chat, the name and image are stored in the Channel collection instead, and the user entry in the Inbox model is left empty.
Now the question arises: if we are already storing the users in the Channel collection, why do we also store the opponent user in the Inbox model? The answer is simple: it makes our queries clearer and easier to understand. Instead of searching through the channel's map of users to find the opponent and then retrieving their name and image, each user's inbox entry holds both the channel and the opponent user.
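The lookup described above can be sketched in plain TypeScript. This is an illustration under assumptions, not the code from the repository: `toPreview`, `ChannelPreview`, and the in-memory `users` and `channels` records are hypothetical, and strings stand in for ObjectIds.

```typescript
// Hypothetical in-memory shapes; string ids stand in for ObjectIds.
interface UserInfo {
  id: string;
  name: string;
  profileImage?: string;
}

interface ChannelInfo {
  id: string;
  name?: string;
  image?: string;
}

interface InboxEntry {
  channel: string;
  user?: string; // opponent user, present only for one-to-one chats
}

interface ChannelPreview {
  channelId: string;
  title: string;
  image: string | undefined;
}

function toPreview(
  entry: InboxEntry,
  users: Record<string, UserInfo>,
  channels: Record<string, ChannelInfo>
): ChannelPreview {
  // One-to-one chat: the title and image come from the opponent user.
  if (entry.user !== undefined) {
    const opponent: UserInfo | undefined = users[entry.user];
    if (!opponent) throw new Error(`Unknown user ${entry.user}`);
    return { channelId: entry.channel, title: opponent.name, image: opponent.profileImage };
  }
  // Group chat: the title and image are stored on the channel itself.
  const channel: ChannelInfo | undefined = channels[entry.channel];
  if (!channel) throw new Error(`Unknown channel ${entry.channel}`);
  return { channelId: entry.channel, title: channel.name ?? "Unnamed group", image: channel.image };
}

const users: Record<string, UserInfo> = {
  u2: { id: "u2", name: "Bob", profileImage: "bob.png" },
};

const channels: Record<string, ChannelInfo> = {
  c1: { id: "c1" }, // one-to-one chat: no name or image on the channel
  c2: { id: "c2", name: "Weekend Hike", image: "hike.png" },
};

const inbox: InboxEntry[] = [{ channel: "c1", user: "u2" }, { channel: "c2" }];
const previews = inbox.map((entry) => toPreview(entry, users, channels));
```

Because the opponent is stored on the inbox entry, the one-to-one branch never has to inspect the channel's own list of users.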
In the code snippet, you'll notice that I added _id: false. Inbox is itself a schema (a sub-schema), so by default each Inbox record would get its own unique identifier (_id). We do not need this identifier for anything, so I have turned it off. You can keep it if you'd like; it will not affect how the code works.
You can also add a unique constraint on the email field to ensure that each email is unique (in Mongoose, unique: true creates a unique index rather than a validator). I did not add this constraint because the uniqueness requirement may vary depending on your application's needs. For example, you may want the username to be unique instead.
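The user schema imports UserModel, InboxModel, and MODEL_NAMES from type files that are not shown in this article. The following is only a guess at what they might contain, derived from the fields of the schema. `ObjectIdLike` is a hypothetical string stand-in for Mongoose's ObjectId type so that the sketch runs without Mongoose, and the string values in `MODEL_NAMES` are assumed.

```typescript
// Hypothetical stand-in for mongoose.Types.ObjectId so this runs without Mongoose.
type ObjectIdLike = string;

// Assumed contents of ./types/model-names.type: one shared name per collection,
// so every `ref` and every mongoose.model() call uses the same string.
const MODEL_NAMES = {
  user: "User",
  channel: "Channel",
  message: "Message",
} as const;

// Assumed contents of ./types/user.type, mirroring the fields of userSchema.
interface InboxModel {
  channel: ObjectIdLike;
  user?: ObjectIdLike; // set for one-to-one chats, absent for group chats
}

interface UserModel {
  name: string;
  email: string;
  password: string;
  profileImage?: string;
  inbox: InboxModel[];
}

// A document that satisfies the assumed UserModel type.
const exampleUser: UserModel = {
  name: "Alice",
  email: "alice@example.com",
  password: "<hashed password>",
  inbox: [{ channel: "channel-1", user: "user-2" }, { channel: "channel-2" }],
};
```

Keeping the collection names in one constant means a renamed collection only has to be changed in one place.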
Channel Modeling
import mongoose, { Schema } from "mongoose";
import { ChannelModel } from "./types/channel.type";
import { MODEL_NAMES } from "./types/model-names.type";

const channelSchema = new Schema<ChannelModel>({
  // name, image, and createdBy are only set for group chats
  name: { type: String, required: false },
  image: { type: String, required: false },
  autoId: { type: Number, default: 0 },
  createdBy: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.user },
  // Reference to the most recent message in this channel
  lastMessage: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.message },
  // Per-user state for this channel, stored as a Map
  users: {
    type: Map,
    of: {
      userId: Schema.Types.ObjectId,
      messageStatus: String,
      autoId: Number,
      deliveredAutoId: Number,
    },
  },
});

const Channel = mongoose.model<ChannelModel>(MODEL_NAMES.channel, channelSchema);

export default Channel;
In a one-to-one chat, the name, image, and createdBy fields are left empty, because it doesn't make sense to store a name and image on the channel. These details should come from the opponent user. If we stored them on the channel, we would need a separate channel entry for each user and would have to update the opponent's name and image in that entry every time the opponent changes them. Since users can change their name and image at any time, that would be a costly operation.
In a group chat, however, the name and image are stored in the Channel collection.
Let's talk about the lastMessage field. A chat app usually shows the last message of each channel in the channel list. Looking it up in the Message collection on every request would be costly, given how many messages a chat application accumulates. To avoid this, I store a reference to the last message on the channel itself. When we query the channels, each one already points to its last message, so that message can be loaded directly by its ID (for example with Mongoose's populate) instead of searching through the channel's messages.
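Here is a small in-memory sketch of that idea in plain TypeScript. It is an illustration rather than the repository's code: `sendMessage`, `lastMessageBody`, and the two in-memory records are hypothetical, and strings stand in for ObjectIds.

```typescript
// Hypothetical in-memory shapes; string ids stand in for ObjectIds.
interface Message {
  id: string;
  channel: string;
  body: string;
}

interface Channel {
  id: string;
  lastMessage?: string; // id of the most recent message, if any
}

const messagesById: Record<string, Message> = {};
const channelsById: Record<string, Channel> = {
  c1: { id: "c1" },
  c2: { id: "c2" },
};
let nextMessageNumber = 1;

// Write path: store the message, then point the channel at it.
function sendMessage(channelId: string, body: string): Message {
  const channel: Channel | undefined = channelsById[channelId];
  if (!channel) throw new Error(`Unknown channel ${channelId}`);
  const message: Message = { id: `m${nextMessageNumber++}`, channel: channelId, body };
  messagesById[message.id] = message;
  channel.lastMessage = message.id; // updated on every send
  return message;
}

// Read path: one lookup by id instead of scanning the channel's messages.
function lastMessageBody(channelId: string): string | undefined {
  const channel: Channel | undefined = channelsById[channelId];
  if (!channel || channel.lastMessage === undefined) return undefined;
  const last: Message | undefined = messagesById[channel.lastMessage];
  return last ? last.body : undefined;
}

sendMessage("c1", "Hi there");
sendMessage("c1", "Are you free tonight?");

const previewForC1 = lastMessageBody("c1");
const previewForC2 = lastMessageBody("c2");
```

The cost moves to the write path: every send has to update the channel as well as store the message.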
I will explain the purpose of the autoId and deliveredAutoId fields in future articles when we cover how to track the number of unread, delivered, or read messages. These two fields will play an important role in that process.
Lastly, I am storing the users as a Map. Mongoose supports a Map schema type, which works like a JavaScript Map of key-value pairs (the keys must be strings). As I mentioned earlier, we need to track the status of messages for each user: whether they are sent, delivered, or read. I'll explain this further in future articles, where it will make more sense.
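As a rough illustration of why a map is convenient here, the sketch below uses a plain JavaScript Map with the same value shape as the schema. Several things are assumptions on my part, because the article leaves the details for later parts: that the map is keyed by the user's id, that messageStatus holds values such as "sent", "delivered", and "read", and that deliveredAutoId is the autoId of the latest message delivered to that user. `markDelivered` is a hypothetical helper.

```typescript
// Mirrors the value shape declared under `users.of` in channelSchema.
interface ChannelUserState {
  userId: string; // string stand-in for an ObjectId
  messageStatus: string; // assumed values: "sent", "delivered", "read"
  autoId: number;
  deliveredAutoId: number; // assumed: autoId of the latest delivered message
}

// Assumption: the map is keyed by the user's id as a string.
const members = new Map<string, ChannelUserState>();
members.set("u1", { userId: "u1", messageStatus: "read", autoId: 5, deliveredAutoId: 5 });
members.set("u2", { userId: "u2", messageStatus: "sent", autoId: 3, deliveredAutoId: 3 });

// Update one member's entry directly by key, without scanning a list of members.
function markDelivered(
  channelUsers: Map<string, ChannelUserState>,
  userId: string,
  deliveredAutoId: number
): boolean {
  const state = channelUsers.get(userId);
  if (!state) return false; // not a member of this channel
  state.deliveredAutoId = deliveredAutoId;
  state.messageStatus = "delivered";
  return true;
}

const updatedMember = markDelivered(members, "u2", 5);
const unknownMember = markDelivered(members, "u9", 5);
```

With an array of users, the same update would first have to find the matching element. With a map, the user id is the key.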
Message Modeling
import mongoose, { Schema } from "mongoose";
import { MessageModel } from "./types/message.type";
import { MODEL_NAMES } from "./types/model-names.type";

const messageSchema = new Schema<MessageModel>({
  autoId: { type: Number, default: 0 },
  body: { type: String, required: true },
  // Channel this message belongs to
  channel: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.channel },
  // User who sent the message
  sentBy: { type: Schema.Types.ObjectId, ref: MODEL_NAMES.user },
  date: { type: Date, default: Date.now },
});

// Typed with MessageModel for consistency with the User and Channel models
const Message = mongoose.model<MessageModel>(MODEL_NAMES.message, messageSchema);

export default Message;
The Message model is straightforward, as you can see in the code snippet. Each message belongs to a specific channel, so that we can query all the messages for that channel. The sentBy field refers to the user who sent the message.
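The per-channel query can be pictured with a small in-memory sketch. `latestMessages` is a hypothetical helper that works on a plain array. With Mongoose, the same idea would typically be a find on the channel field combined with sort and limit, which is not shown here.

```typescript
// In-memory shape mirroring messageSchema; string ids stand in for ObjectIds.
interface Message {
  autoId: number;
  body: string;
  channel: string;
  sentBy: string;
  date: Date;
}

// Return the newest `limit` messages of one channel, newest first.
function latestMessages(all: Message[], channelId: string, limit: number): Message[] {
  return all
    .filter((m) => m.channel === channelId) // filter() returns a new array
    .sort((a, b) => b.date.getTime() - a.date.getTime()) // so sorting does not mutate `all`
    .slice(0, limit);
}

const allMessages: Message[] = [
  { autoId: 1, body: "Hi", channel: "c1", sentBy: "u1", date: new Date("2024-01-01T10:00:00Z") },
  { autoId: 1, body: "Standup at 9", channel: "c2", sentBy: "u3", date: new Date("2024-01-01T10:01:00Z") },
  { autoId: 2, body: "Hello!", channel: "c1", sentBy: "u2", date: new Date("2024-01-01T10:02:00Z") },
  { autoId: 3, body: "How are you?", channel: "c1", sentBy: "u1", date: new Date("2024-01-01T10:03:00Z") },
];

const recent = latestMessages(allMessages, "c1", 2);
```

Limiting the result matters in practice, because a channel's history can grow without bound.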
Concept behind AutoID
The autoId serves the same purpose as an auto-incrementing ID in a relational database. It helps keep track of the number of messages stored in the database for a specific channel and allows us to monitor the status of messages, such as whether they have been read, delivered, or sent. This information will become more relevant when we implement the "read receipt" functionality in future articles.
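One way such a counter could work is sketched below. This is my own assumption about the mechanics, not the author's implementation, which is left for later articles: the channel's autoId holds the last number handed out, each new message takes the next number, and a member's autoId records the last message that member has read. `appendMessage` and `unreadCount` are hypothetical helpers.

```typescript
interface Channel {
  id: string;
  autoId: number; // assumed: last autoId handed out in this channel
}

interface Message {
  channel: string;
  autoId: number;
  body: string;
}

interface MemberState {
  autoId: number; // assumed: autoId of the last message this member has read
}

// Give the new message the next number in its channel.
function appendMessage(channel: Channel, body: string): Message {
  channel.autoId += 1;
  return { channel: channel.id, autoId: channel.autoId, body };
}

// Unread messages are the ones numbered after the member's last-read autoId.
function unreadCount(channel: Channel, member: MemberState): number {
  return Math.max(0, channel.autoId - member.autoId);
}

const general: Channel = { id: "c1", autoId: 0 };
const bob: MemberState = { autoId: 0 };

const first = appendMessage(general, "Hello");
appendMessage(general, "Anyone here?");
const third = appendMessage(general, "Ping");

bob.autoId = first.autoId; // Bob has read only the first message
const unreadForBob = unreadCount(general, bob);
```

In a real database the increment would need to be atomic so that two simultaneous senders cannot receive the same number. MongoDB's $inc update operator is one way to do that.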
Source Code
For this article, you can check all the source code in my GitHub Repository here.
Conclusion
I understand that this is a lot of information to take in, but don't worry: in the upcoming articles I will explain everything in detail, with code and diagrams, so that you have a clear understanding.
In conclusion, we have discussed the models used to implement a chat application: the User, Channel, and Message models, along with their fields and relationships. The User model contains information about each user, including their name, email, profile image, and an inbox of channels. The Channel model contains information about a channel, such as its name, image, last message, and users. The Message model contains the messages that are sent and received in a channel, including the sentBy and autoId fields.
It is important to note that the attributes and relationships between these models can be customized to fit the specific needs of the chat application. We also discussed the importance of using an autoId in the message model to keep track of the messages in the database.
Please follow me on GitHub and Twitter to help me grow my community. If you'd like, you can also subscribe to my newsletter to be notified whenever I publish new articles.
If you have any questions or concerns about the information above, or if you would like me to cover a specific topic in the future, please let me know in the comments section. If you want to support me, buy me a coffee here or sponsor me on Hashnode. Thanks for your time and support. Happy Coding.


