Data (Format) Minimisation

Article 5 of the EU General Data Protection Regulation defines the principles that anyone processing personal data should follow. In particular, there is principle 5(1)(c):

Personal data shall be: […] adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed

This is referred to as ‘data minimisation’. For example, if a processor wants to know whether you’re over 18 or not, but doesn’t verify it, then it shouldn’t need your full date of birth (looking at you, Steam).

I have recently realised, however, that this principle can be applied outside of data processing too. One area where I wish it were applied is data formats.

Let’s say I have written a poem. It’s 20 lines long, each line containing between 3 and 7 English words with punctuation. Now, what format do I choose to save it on my computer? I could pick DOCX (god forbid) or RTF, but their containers are bigger than the content itself. HTML and Markdown are of no use to me, as I do not need any formatting. So, plain text it is: it’s lightweight, not prone to breaking and can be viewed on any device out of the box.
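To put a rough number on that container overhead, here is a minimal Python sketch. It is an illustration under assumptions, not a measurement of real Word output: the poem is a hypothetical stand-in, and the "DOCX-style" archive contains only three bare-bones parts that I wrote by hand. Files saved by an actual word processor typically carry more parts (styles, settings, themes), so treat the result as a lower bound.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Hypothetical stand-in for the 20-line poem; any short text works here.
POEM = "\n".join(["The quick brown fox jumps"] * 20)

# Hand-written, bare-bones package parts (an assumption for illustration).
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)


def plain_text_size(text: str) -> int:
    """Bytes needed to store the text as UTF-8 plain text."""
    return len(text.encode("utf-8"))


def docx_like_size(text: str) -> int:
    """Bytes of a minimal DOCX-style ZIP container holding the same text."""
    # One paragraph element per line, with XML special characters escaped.
    paragraphs = "".join(
        "<w:p><w:r><w:t>" + escape(line) + "</w:t></w:r></w:p>"
        for line in text.splitlines()
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>' + paragraphs + "</w:body></w:document>"
    )
    # Build the archive in memory so nothing is written to disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", RELS)
        archive.writestr("word/document.xml", document)
    return len(buffer.getvalue())


print("plain text:", plain_text_size(POEM), "bytes")
print("DOCX-style container:", docx_like_size(POEM), "bytes")
```

Running it prints both sizes side by side. Swap in your own text for POEM to see how much of the file is wrapper rather than content.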

I am so tired of people using big and complicated formats to store information that doesn’t need them. Why do some bosses send out 30-word memos as DOCX documents attached to HTML emails? Why use HTML email at all if the message doesn’t even contain bold or italics?

Of course, there are some exceptions. If I want other people on the Internet to read my poems, I’ll render them as HTML, because that’s what browsers expect. And if my presentation has cool animations, it’s okay to keep it as a PPTX.

Although I’d prefer it if Microsoft Office formats just stopped existing altogether. Do we really need more than what Markdown can offer?

This is post 007 of #100DaysToOffload.
