Jezewski's Nest

My ML Report project journey

At one of the companies I recruited for, I was given a homework assignment to create a small project using Django. My findings are described in this post.

The requirements of the application were specified as follows:

The final code for the project can be found on my github , along with instructions for firing it up. In this post, I'd like to share my insights from the process of creating this little project and the things I learned while doing it. I got a sample model with data from the company I recruited for, but as far as I can see these are simply artifacts from a free Machine Learning tutorial from the internet.

Learning Machine Learning based only on programming knowledge (Python) is difficult and does not seem to me to be an optimal solution.

Machine Learning is a huge science, and as far as I can see, it requires completely different skills than with other "everyday" programming tasks. Practically every now and then while reading, I hit my head against the wall of a concept I don't understand and slowly fell down the rabbit hole of googling. The documentation of the Python framework sci-kit learn is very crude and practically does not explain the implemented concepts (functions, instructions, etc.). The authors expect you, when wanting to use their tools, to already have all the ML knowledge you need, and you check the documentation to know HOW to use the tools in question, not WHY.

The amount of life-saving abstractions in Django is huge, but it can be a problem.

Doing a small project in Django, implementing "typical" usecases is easy and fast. Compared to my experience creating a microservice in Golang, Django has a huge amount of functionality already implemented, and it was more of a problem for me to figure out what syntax developers expect me to use (e.g. what needs to be implemented using generic views like ListView).

If you want to pass some data in context to a template in Django in JSON/dictionary form, avoid as much as possible special characters in keys.

More than an hour/two of work was added by the small detail that the dictionary I was passing to the template had the "f1-score" key, which blew up the whole template, and I needed to create a special function-adapter that removed the "-" character from all keys in the dictionary.

Dynamically creating .pdf files using Python can be overwhelming and difficult to understand at first, but once you catch on to the concepts it becomes simple and convenient.

I used the reportlab library to generate .pdf files and the first use cases are unnecessarily complicated. The documentation describes the whole process of how to create a Canvas and then draw elements on them, indicating the individual pixel coordinates, which is a misunderstanding to me, because simply creating a file with text this way is hell. For most cases, a platypus module with auto-scaled, abstract objects like Paragraph, Table, etc., which are relatively well described and easy to use, will probably suffice.

Thanks a lot for your time, if you are interested in the code for this project take a look at my GitHub, and if you want to know about future posts- follow my Twitter and Mastodon.

Best regards and enjoy coding!