A while back, I built Notify, a notification service for University College of Engineering, Kariavattom (my college) that emails 1000+ users about university notifications, results, and exam timetable updates. This article is about how I built a reliable and scalable system that delivers content updates in a cutely packed email, FREE forever. I used a few workarounds to keep things free and reliable; read to the end to find them.
Context: I am a B.Tech student who loves trying new tech and innovating with it. Also, 'KU' refers to Kerala University in this post.
The Need.
Kerala University has a very user-friendly website for its notifications, results, and timetables, which users clearly have no issue navigating (/s ;). I also have a genuine urge to know about exam notifications right after they are posted, and checking the site by hand gets in the way. To solve both problems, I devised a plan for a mailing service that mails me (and everyone in my college) whenever there is an update. Hence 'Notify' was created.
The Design.
We can divide the design into four parts:
- Checking for updates (web scraping in our case)
- Compiling the data into an email template
- Delivering the email reliably to 1000+ users of the college
- Deployment (for FREE)
1. Checking for updates
While inspecting KU's site, I could not find any structured source such as an RSS feed or any other form of compiled content, so we have to stick to good old web scraping. Thankfully, the site serves plain HTML and does not use any JS frameworks, which is ideal for web scraping.
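As a rough illustration of the scraping step, here is a minimal sketch that pulls (title, link) pairs out of a page using only Python's standard `html.parser`. The markup in `SAMPLE_HTML` is invented for this example, and `LinkCollector` and `extract_entries` are hypothetical names; KU's real pages and the actual Notify code may look quite different.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (title, href) pairs for every <a href=...> in a page."""

    def __init__(self):
        super().__init__()
        self.entries = []   # finished (title, href) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and repeated spaces inside the link text.
            title = " ".join("".join(self._text).split())
            if title:
                self.entries.append((title, self._href))
            self._href = None


def extract_entries(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.entries


# Invented sample markup, only loosely shaped like an announcements table.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/docs/a.pdf">B.Tech Sixth Semester
      Exam Timetable</a></td></tr>
  <tr><td><a href="/docs/b.pdf">M.Sc Results</a></td></tr>
</table>
"""

print(extract_entries(SAMPLE_HTML))
```

Because the parser only cares about anchors, the same function could plausibly be reused across pages that share a similar structure.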
To check whether any content has changed, the HTML content of the site is stored in Supabase. The next time we crawl, we cross-check the content in the DB against the latest content. A simple algorithm for detecting the changes is included (refer to the codebase).
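The real comparison lives in the codebase, so the following is only a sketch of one way such a check could work. It assumes the previous crawl has already been reduced to a list of (title, link) pairs, and it uses a plain Python list in place of the Supabase row; `find_new_entries` is a hypothetical name.

```python
def find_new_entries(previous, latest):
    """Return entries in `latest` that are absent from `previous`, in page order.

    Each entry is a (title, href) tuple. Duplicates within `latest` are
    reported only once.
    """
    seen = set(previous)
    fresh = []
    for entry in latest:
        if entry not in seen:
            seen.add(entry)
            fresh.append(entry)
    return fresh


stored = [("B.Tech S6 Timetable", "/docs/a.pdf")]
scraped = [
    ("B.Tech S4 Results", "/docs/c.pdf"),    # newly published
    ("B.Tech S6 Timetable", "/docs/a.pdf"),  # already stored
]

new_entries = find_new_entries(stored, scraped)
print(new_entries)
```

Comparing extracted entries rather than raw HTML is a design choice of this sketch: it ignores changes that touch only the surrounding markup.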
We check for updates in three different pages whose HTML structure is (thankfully) similar.
2. Compiling the data
KU has a single announcements page for all its courses, with no option to filter or search. I did not want to confuse students with irrelevant updates for other courses in their mail. A simple solution is to search for the keyword "B.Tech" in the title using a regex.
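A title filter along these lines could look like the sketch below. The exact pattern used in Notify is not shown in this post, so the regex here is an assumption; it is deliberately tolerant of spelling variants such as "B. Tech" or "BTech".

```python
import re

# Tolerates "B.Tech", "B. Tech", "B Tech", "BTech" in any letter case.
BTECH = re.compile(r"\bB\.?\s*Tech\b", re.IGNORECASE)


def is_btech(title):
    """True when the announcement title mentions the B.Tech course."""
    return BTECH.search(title) is not None


print(is_btech("Sixth Semester B.Tech Degree Examination"))
print(is_btech("M.Tech Timetable"))
```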
But then, the problem is that some B.Tech entries are for supplementary exams for past students, which is clearly not useful for students currently enrolled. To resolve this, we can also check for the keyword "2020 scheme" in the title.
That solves the issue, or so I thought :-)
Then KU drops this banger: an entry with no "2020 scheme" or any other information about who it is intended for. It might contain important updates, but we can't be sure unless we download its attachment. And yes, we are checking the attachments.
We now download the attachment and check it for the same keywords we looked for in the title. The attachment is downloaded only when the title has the "B.Tech" keyword but not "2020 scheme". This method worked every time until, very recently, they forgot to put the space in and wrote "2020scheme" :-)
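The decision logic described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `fetch_attachment_text` is a hypothetical callback that downloads the attachment and returns its text, the sketch looks only for the scheme keyword inside the attachment, and the `\s*` in the scheme pattern is one possible way to survive the missing space.

```python
import re

BTECH = re.compile(r"\bB\.?\s*Tech\b", re.IGNORECASE)
# \s* makes the space optional, so "2020scheme" matches as well as "2020 scheme".
SCHEME_2020 = re.compile(r"\b2020\s*scheme\b", re.IGNORECASE)


def is_relevant(title, fetch_attachment_text):
    """Decide whether an announcement should be mailed to current students.

    `fetch_attachment_text` (hypothetical) is called only when the title
    alone cannot settle the question, so most entries need no download.
    """
    if not BTECH.search(title):
        return False  # some other course: skip
    if SCHEME_2020.search(title):
        return True   # the title already names the scheme
    # B.Tech title without a scheme: look inside the attachment.
    return SCHEME_2020.search(fetch_attachment_text()) is not None


downloads = []


def fake_fetch():
    """Stand-in for a real download; records that it was called."""
    downloads.append(1)
    return "Regulations: 2020scheme, all branches"


print(is_relevant("B.Tech Exam (2020 Scheme)", fake_fetch))  # no download needed
print(is_relevant("B.Tech Exam Notification", fake_fetch))   # falls back to the attachment
```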
The information is then extracted using an HTML parser and is compiled into an email template.
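To make the templating step concrete, here is a small sketch using `string.Template` from the standard library. The layout, the `EMAIL_TEMPLATE` name, and the `render_email` helper are all assumptions for illustration; the real Notify template is surely prettier.

```python
from html import escape
from string import Template

# A deliberately bare template; $heading and $items are filled in per email.
EMAIL_TEMPLATE = Template("""\
<html><body>
<h2>$heading</h2>
<ul>
$items
</ul>
</body></html>""")


def render_email(heading, entries):
    """Render (title, url) pairs as an HTML list, escaping all inserted text."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in entries
    )
    return EMAIL_TEMPLATE.substitute(heading=escape(heading), items=items)


body = render_email(
    "New updates",
    [("B.Tech <S6> Timetable", "https://example.com/a.pdf?x=1&y=2")],
)
print(body)
```

Escaping matters here because the titles come straight from a scraped page and may contain characters that would otherwise break the email's markup.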
3. Sending this to 1000+ users via mail.
The next step is to set up the notification system. Our objective here is to be reliable and scalable at the same time, so as to cover the increase in students every year. Those who have sent emails using an SMTP server and Node.js know how long it takes to send emails to a list of users, only for them to end up in spam. I had to come up with a solution that is reliable, scalable, and, importantly, FREE of cost.
Our college uses Google Workspace for Education (which I could discuss in an upcoming article), and it includes Google Groups. So I created a Google Group with all members of the college and allowed only managers and admins to post in it. I created an email account for Notify on our college's domain and added it to the group as a Manager. The group also has its own email address, and whatever is sent to that address is forwarded to everyone in the group. I used this feature to make Notify deliver the email to everyone's inbox.
This way, I leave scalability and reliability to Google's infrastructure while I take care of the content.
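The appeal of this setup is that the script sends exactly one email, whatever the size of the group. The post does not say how Notify actually sends it, so the sketch below assumes Gmail's SMTP endpoint with an app password; the addresses and the `build_message` and `send` helpers are hypothetical.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical addresses; substitute the real Notify account and group address.
SENDER = "notify@college.example"
GROUP = "all-students@college.example"


def build_message(subject, html_body):
    """Build one message addressed to the group; Google Groups does the fan-out."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = GROUP
    msg["Subject"] = subject
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("This update is best viewed in an HTML-capable mail client.")
    msg.add_alternative(html_body, subtype="html")
    return msg


def send(msg, password):
    """Send via Gmail SMTP over SSL (assumes an app password or equivalent)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(SENDER, password)
        server.send_message(msg)


message = build_message("KU update", "<p>New B.Tech timetable published.</p>")
print(message["To"])
```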
4. Deployment (For FREE)
There are a lot of free SaaS platforms that could host this small script. But to keep things simple and reliable, I chose GitHub Actions, where I set up a cron job to run the script.
The cron schedule is aligned with KU's usual update times: office hours and 6-7 PM.
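One detail worth keeping in mind is that GitHub Actions evaluates `schedule` cron expressions in UTC, while KU's update times are in IST (UTC+5:30). The helper below is a small sketch for converting an IST clock time into a UTC cron line; `ist_to_utc_cron` is a hypothetical name, and the hours in the loop are example values, not the schedule Notify actually uses.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))


def ist_to_utc_cron(hour, minute=0):
    """Return a daily 5-field cron expression in UTC for a given IST clock time."""
    # The date is arbitrary; only the time-of-day conversion matters.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=IST)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


# Example run times: hourly through assumed office hours, plus 18:00 and 19:00 IST.
for hour in [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]:
    print(ist_to_utc_cron(hour))
```

Each printed line could go under the `cron:` key of a workflow's `schedule` trigger. Since the day-of-week field is left as `*`, times that cross midnight during conversion need no extra handling.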
Wrapping it up
This was an interesting project to work on, especially with the budget constraints. Building things at minimal cost, with workarounds on top of existing frameworks, has started to become my niche. I also feel that the environment in our college is gradually getting better thanks to student innovations such as this.
Here is the GitHub repo link for your reference: . Contributions are always welcome, and feel free to customise the code to deploy it to your college.
I have some more design journeys like this to share. Until then, keep innovating, folks. And of course, thank you for reading.