Digitising the diary entries
I need to make digital copies of the diary entries. Typing up years of handwritten entries would be too tedious and time-consuming.
So instead, I scan each diary entry. An example (blurred to hide details) is shown below.
Next, I crop each image and name it after the date of the entry, for example `20210624.jpg`. If a handwritten entry spans multiple pages, I append a letter to each page (`20210624A.jpg`, `20210624B.jpg`, etc.).
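The naming scheme above can be sketched in a few lines of Python. This is only an illustration, not the code I actually use: it assumes the digits are in year-month-day order (so `20210624` is 24 June 2021), and the helper names `entry_filename` and `parse_entry_filename` are made up for this example.

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Assumed pattern: eight digits (YYYYMMDD), an optional page letter, then ".jpg".
_NAME_RE = re.compile(r"^(\d{8})([A-Z]?)\.jpg$")


def entry_filename(entry_date: date, page: Optional[int] = None) -> str:
    """Build a file name; page=None means a single-page entry, 0 -> 'A', 1 -> 'B', ..."""
    if page is None:
        suffix = ""
    elif 0 <= page < 26:
        suffix = chr(ord("A") + page)
    else:
        raise ValueError("this sketch only supports pages A to Z")
    return f"{entry_date:%Y%m%d}{suffix}.jpg"


def parse_entry_filename(name: str) -> Optional[Tuple[date, str]]:
    """Return (date, page letter) for a valid name, or None if it does not fit the scheme."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    try:
        parsed = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that are not a real calendar date, e.g. month 13.
        return None
    return parsed, match.group(2)
```

A check like `parse_entry_filename` is handy to run over the cropped images before uploading them, since a mistyped name would otherwise never be matched to a date.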
Deploying to cloud
The solution is deployed on Google Cloud Platform, with the infrastructure managed using Terraform.
- Create a bucket in Google Cloud Storage (GCS).
- Manually upload the diary entries to the bucket.
- Deploy a Google Cloud Function (GCF) that finds the diary entry in the bucket that matches today's date.
- Create a Cloud Scheduler job that publishes a Pub/Sub message once a day. The Pub/Sub message then triggers the GCF.
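The lookup step in the function could look roughly like the sketch below. Several things here are assumptions rather than a description of the deployed code: the bucket name `my-diary-bucket` is a placeholder, the entry point is written in the style of a Pub/Sub-triggered background function, and `any_year=True` assumes that "matches today's date" means the same day and month in any year. Set `any_year=False` for an exact date match.

```python
import re
from datetime import date
from typing import Iterable, List


def entries_for_day(names: Iterable[str], today: date, any_year: bool = True) -> List[str]:
    """Return the object names whose date part matches `today`, sorted by name."""
    year = r"\d{4}" if any_year else f"{today:%Y}"
    # Names follow the scheme <YYYYMMDD><optional page letter>.jpg
    pattern = re.compile(rf"^{year}{today:%m%d}[A-Z]?\.jpg$")
    return sorted(name for name in names if pattern.match(name))


def main(event, context):
    """Hypothetical entry point, triggered by the daily Pub/Sub message."""
    # Imported here so the pure helper above can be tested without the GCS client.
    from google.cloud import storage

    client = storage.Client()
    # "my-diary-bucket" is a placeholder bucket name.
    names = [blob.name for blob in client.list_blobs("my-diary-bucket")]
    # date.today() uses the function's local clock, which may not be my time zone.
    matches = entries_for_day(names, date.today())
    print(f"Found {len(matches)} matching entries: {matches}")
    return matches
```

Keeping the matching logic in a plain function means it can be checked locally with a list of file names, without touching the bucket.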
Discord Notifications
I have set up a Discord channel in a private server to receive the notifications. I created a webhook for this channel and stored the webhook URL in Google Secret Manager.
The GCF has access to this secret and sends the diary image to the channel. For example:
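In code, the sending step might look something like the following. Treat it as a sketch under assumptions: the function names are invented for this example, the message text is arbitrary, and the secret is referenced by a full resource name of the form `projects/<project>/secrets/<secret>/versions/latest`. It relies on the `requests` and `google-cloud-secret-manager` packages being available to the function.

```python
from typing import Dict, Tuple


def build_discord_upload(filename: str, image_bytes: bytes) -> Tuple[Dict[str, str], Dict[str, tuple]]:
    """Prepare the form fields and file part for a Discord webhook upload."""
    data = {"content": f"Diary entry: {filename}"}  # message text, chosen for illustration
    files = {"file": (filename, image_bytes, "image/jpeg")}
    return data, files


def get_webhook_url(secret_version: str) -> str:
    """Read the webhook URL from Secret Manager (secret_version is the full resource name)."""
    from google.cloud import secretmanager  # imported lazily; only needed in the cloud

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(request={"name": secret_version})
    return response.payload.data.decode("utf-8")


def send_to_discord(webhook_url: str, filename: str, image_bytes: bytes) -> None:
    """Post the image to the channel as a multipart/form-data request."""
    import requests  # third-party HTTP client, imported lazily

    data, files = build_discord_upload(filename, image_bytes)
    response = requests.post(webhook_url, data=data, files=files, timeout=30)
    response.raise_for_status()  # surface HTTP errors in the function logs
```

Storing the URL in Secret Manager rather than in the function's source or environment keeps it out of the repository, which matters because anyone holding a webhook URL can post to the channel.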
Part 2: Machine Learning
Receiving the images daily is fine for now, but there is so much more I want to do with this project:
• Handwritten text recognition.
Here I would train an OCR model to learn my handwriting and then convert the scanned images to text.
• Analysis of text.
Which words and phrases do I tend to use? Can I run a sentiment analysis or find trends?
• Train an NLP model on the text.
Can I create a model that talks like me?