Beautiful reprs

When creating libraries in Python, I have always strived to make their behavior transparent to other programmers. No matter how complex the library is internally, it should have a clear and obvious facade externally. One of the important parts of such a facade is the way objects are presented as strings.

What does a user do when they first encounter my library? I assume they try it out in the REPL, a handy tool that immediately prints the result of every expression to the console, like this:

>>> "hello,  " + "world!"
'hello,  world!'
>>> 13 + 666
679

Under the hood, the REPL calls each object’s so-called «magic method» __repr__. If a class does not define it, the default implementation inherited from object is used, and something like the following is printed:

<my_module.MySuperClass object at 0x7f905814cf90>

Not very pretty, is it? This means that we need to define this method in our classes if we want them to display nicely for the user in the console.
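To make the difference concrete, here is a minimal sketch with two hypothetical classes (the names Plain and Pretty are mine, chosen only for illustration). One relies on the default representation, the other defines __repr__ itself.

```python
class Plain:
    """No __repr__ defined: falls back to the default inherited from object."""

    def __init__(self, value):
        self.value = value


class Pretty:
    """Defines __repr__, so the REPL shows something readable."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        # !r applies repr() to the stored value.
        return f"{type(self).__name__}({self.value!r})"


plain_repr = repr(Plain(42))    # something like '<__main__.Plain object at 0x...>'
pretty_repr = repr(Pretty(42))  # 'Pretty(42)'
print(plain_repr)
print(pretty_repr)
```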

There are several mistakes that beginners often make when writing __repr__:

Not formatting the string as valid Python code. Ideally, your __repr__ should return a string that can be pasted into code in its «raw» form and produce an identical object.
Using «naive» f-strings or the str() function in __repr__. A correct __repr__ is recursive, i.e., it also calls repr() on all other objects whose contents it needs to output. Thus, if all other objects also adhere to the rule of complete reproducibility, your __repr__ output will be reproducible too.
Placing complex code in the __repr__ method. The string representation of an object is extremely important for debugging your program, and it would be extremely annoying to run into an overlooked bug here. The more complex the code in this method, the more likely such a bug becomes.
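The first two points can be illustrated with a small sketch. The Tag class below is hypothetical; it compares a naive f-string with one that applies repr() to the field through the !r conversion.

```python
class Tag:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, type(self)) and self.name == other.name

    def naive_repr(self):
        # Uses str() implicitly: the quotes around the string are lost.
        return f"Tag({self.name})"

    def __repr__(self):
        # !r calls repr() on the field, so nested values keep their own repr.
        return f"Tag({self.name!r})"


tag = Tag("hello")
print(tag.naive_repr())  # Tag(hello)   <- not valid Python: hello is an undefined name
print(repr(tag))         # Tag('hello') <- can be pasted back into code

nested = Tag(Tag("inner"))
print(repr(nested))      # Tag(Tag('inner'))

# The output is valid Python, so evaluating it rebuilds an equal object.
restored = eval(repr(nested))
```

Because every level calls repr() on what it contains, the nested object is printed correctly without any extra effort.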

I have created quite a few libraries in Python and over time noticed that the code for the string representation of objects often turns out to be quite similar, and so do the unit tests I have to write for it. This code never took up much space, but it came up so often that I decided to extract it into a separate micro-library. I named it printo; you can install it with the following command:

pip install printo

How does it work? The library has only one function. You pass it the class name and the arguments that need to be passed to the constructor, positional and named ones separately, and the output is a nice representation of the object as a string:

from printo import descript_data_object

print(
    descript_data_object(
        'MyClassName',
        (1, 2, 'some text'),
        {'variable_name': 1, 'second_variable_name': 'kek'},
    )
)
#> MyClassName(1, 2, 'some text', variable_name=1, second_variable_name='kek')
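The core of what such a function does can be sketched in a few lines of plain Python. This is an illustration of the idea, not the library's actual source: the function name describe_call and the Point class are my own inventions for this example and are not part of printo.

```python
def describe_call(class_name, args=(), kwargs=None):
    """Build a constructor-like string, calling repr() on every argument."""
    kwargs = kwargs or {}
    parts = [repr(value) for value in args]
    parts.extend(f"{key}={value!r}" for key, value in kwargs.items())
    return f"{class_name}({', '.join(parts)})"


class Point:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y

    def __repr__(self):
        # All formatting is delegated, so __repr__ itself stays trivial.
        return describe_call(type(self).__name__, (self.x,), {"y": self.y})


result = describe_call("Line", (Point(1, y=2), Point(3)), {"label": "diagonal"})
print(result)  # Line(Point(1, y=2), Point(3, y=0), label='diagonal')
```

Keeping the formatting in one small, separately tested function means each __repr__ is a single line, which leaves very little room for bugs.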

This library is designed to automatically avoid the beginner mistakes described above. For example, it uses the repr() function to ensure that the output is recursive (that is, the objects passed to the constructor are also displayed beautifully). However, over time, I decided to expand the library’s functionality to provide more options for formatting objects in various niche cases.

One such case is the security requirements that often apply to objects. Sometimes we don’t fully trust the system where our program logs are stored and don’t want sensitive data to end up there by accident. This is common in corporations: logs go to some kind of centralized storage that a lot of people can access, and secrets from corporate applications can leak through it. For such cases, there is an option to set placeholders for specific arguments, addressed by zero-based position or by name:

print(
    descript_data_object(
        'MySuperClass',
        (1, 2, 'lol'),
        {'variable_name': 1, 'second_variable_name': 'kek'},
        placeholders={
            1: '***',
            'variable_name': '***',
        },
    )
)
#> MySuperClass(1, ***, 'lol', variable_name=***, second_variable_name='kek')
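The same idea can be tried by hand, without the library. The sketch below is only an illustration and not how printo is implemented: the hypothetical DatabaseConfig class always prints a fixed placeholder instead of its password, so the secret cannot reach the logs through repr().

```python
class DatabaseConfig:
    def __init__(self, host, password):
        self.host = host
        self.password = password

    def __repr__(self):
        # The password is never formatted; a fixed placeholder is printed instead.
        return f"{type(self).__name__}({self.host!r}, password=***)"


config = DatabaseConfig("db.internal", password="hunter2")
text = repr(config)
print(text)  # DatabaseConfig('db.internal', password=***)
```

Note that a masked representation is deliberately no longer valid Python code. That is the price of keeping the secret out of the output.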

I also encountered situations where some arguments do not need to be displayed at all if their value matches a certain default. That is, we may want to print a value only if it differs from the default. For this, I provided a filtering mechanism: a filter is a function that receives the value and returns False when the value should be hidden. It looks like this:

print(
    descript_data_object(
        'MyClassName',
        (1, 2, 'some text'),
        {'variable_name': 1, 'second_variable_name': 'kek'},
        filters={
            1: lambda x: x != 2,
            'second_variable_name': lambda x: False,
        },
    )
)
#> MyClassName(1, 'some text', variable_name=1)
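For comparison, here is how the same effect might look when written by hand. This is a sketch under my own assumptions: the Connection class and its default values are hypothetical, and the filtering is done with plain comparisons rather than with printo.

```python
class Connection:
    def __init__(self, host, port=5432, timeout=None):
        self.host = host
        self.port = port
        self.timeout = timeout

    def __repr__(self):
        parts = [repr(self.host)]
        # Show keyword arguments only when they differ from the defaults.
        if self.port != 5432:
            parts.append(f"port={self.port!r}")
        if self.timeout is not None:
            parts.append(f"timeout={self.timeout!r}")
        return f"{type(self).__name__}({', '.join(parts)})"


default_repr = repr(Connection("localhost"))
custom_repr = repr(Connection("localhost", port=6543, timeout=2.5))
print(default_repr)  # Connection('localhost')
print(custom_repr)   # Connection('localhost', port=6543, timeout=2.5)
```

Hiding defaults keeps the output short while it remains valid Python: evaluating the shortened string still builds an equivalent object, because the omitted arguments take their default values again.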

Finally, in some cases you may want special logic for displaying the arguments themselves, for example nested objects. This is also possible: you can pass your own serialization function, which is applied to each argument:

print(
    descript_data_object(
        'MyClassName',
        (1, 2, 'lol'),
        {'variable_name': 1, 'second_variable_name': 'kek'},
        serializator=lambda x: repr(x * 2),
    )
)
#> MyClassName(2, 4, 'lollol', variable_name=2, second_variable_name='kekkek')

I hope these few simple techniques will help you enjoy the programming process more, and make your code look as beautiful to other people as it does in your head.