October 16, 2013

Justification. Is it really so hard?

Have you ever thought about what right-aligned text looks like? Doesn't it look gnawed by an army of hungry little text-string eaters? Next time you type something, remember the justification feature. Just think about it.

October 09, 2013

Strange thing about hashables in Python 2

I was digging through Python's docs about sets and dictionaries and found a strange thing about them in Python 2. As the set and dict docs say:

"A set object is an unordered collection of distinct hashable objects."
"A mapping object maps hashable values to arbitrary objects."

The old documentation says that the set classes are implemented using dictionaries, so the two are rather similar things.
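To picture what "implemented using dictionaries" can mean, here is a minimal sketch. It is not the actual source of the old set classes, just an illustration: a hypothetical DictBackedSet class keeps its members as dictionary keys and ignores the values. Since members become keys, they have to meet the same hashability requirement as dictionary keys.

```python
class DictBackedSet(object):
    """Hypothetical toy set that keeps its members as dictionary keys."""

    def __init__(self, items=()):
        self._data = {}
        for item in items:
            self.add(item)

    def add(self, item):
        # Re-adding an existing key only overwrites the dummy value,
        # so duplicates collapse automatically.
        self._data[item] = True

    def __contains__(self, item):
        return item in self._data

    def __len__(self):
        return len(self._data)


s = DictBackedSet([1, 2, 2, 3])
print(len(s))   # 3
print(2 in s)   # True
```

Trying to add a list to such a set fails with TypeError, exactly as it would for a dictionary key.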

It follows from the definition of hashable that an object is considered hashable if it has both a __hash__() method and an __eq__() or __cmp__() method. The Python 3 docs say that an object is hashable if it has both __hash__() and __eq__() methods. So, let's check that in Python 3 and Python 2.

>>> import platform
>>> platform.python_version()
'3.3.1'
>>> platform.python_implementation()
'CPython'

Let's define a couple of classes and try to use their instances as set members and dict keys.

>>> class A(object):
...     pass
... 
>>> class B:
...     pass
...
>>> a = A()
>>> b = B()
>>>
>>> '__hash__' in dir(a)
True
>>> '__eq__' in dir(a)
True
>>> '__hash__' in dir(b)
True
>>> '__eq__' in dir(b)
True

Well, class A is defined in the new style and class B in the old style. In Python 3 all classes are new-style anyway, so everything is OK: instances of both classes have __hash__() and __eq__() methods, and we can create a set and a dictionary:

>>> {A(), B(), }
{<__main__.B object at 0xb717cb4c>, <__main__.A object at 0xb717cb6c>}
>>> {A(): 'foo', B(): 'bar', }
{<__main__.A object at 0xb70c66ac>: 'foo', <__main__.B object at 0xb69edf6c>: 'bar'}
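Looking at dir() is only one way to test this. As a rough alternative sketch for Python 3 (the helper name is_hashable is made up for illustration), one can simply call hash() on the object, or ask the collections.abc.Hashable abstract base class from the standard library.

```python
from collections.abc import Hashable


def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class A(object):
    pass


print(is_hashable(A()))              # True
print(is_hashable([1, 2]))           # False
print(isinstance(A(), Hashable))     # True
print(isinstance([1, 2], Hashable))  # False
```

Calling hash() is the more direct check, since it is what sets and dictionaries themselves do when an object is added.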

So, everything is fine with Python 3. Let's switch to Python 2:

>>> import platform
>>> platform.python_version()
'2.7.4'
>>> platform.python_implementation()
'CPython'

and define the same classes and their instances:

>>> class A(object):
...     pass
... 
>>> class B:
...     pass
... 
>>> a = A()
>>> b = B()

and make some checks:

>>> '__hash__' in dir(a)
True
>>> '__eq__' in dir(a)
False
>>> '__cmp__' in dir(a)
False

Oops! New-style classes in Python 2 have neither __eq__() nor __cmp__() methods, so by definition their instances are not hashable. Let's make the galaxy collide!

>>> set([a, ])
set([<__main__.A object at 0xb74deacc>])
>>> 
>>> d = {a: 'foo', }
>>> d
{<__main__.A object at 0xb74deacc>: 'foo'}
>>> d = {a: 'bar', }
>>> d
{<__main__.A object at 0xb74deacc>: 'bar'}
>>> d.update({A(): 'baz', })
>>> d
{<__main__.A object at 0xb74deacc>: 'bar', <__main__.A object at 0xb74debec>: 'baz'}

Emm, stop. Everything is good, right? No. But, well, yes, we've got a set and a valid dictionary. But how? I don't get it. Let's move on:

>>> '__hash__' in dir(b)
False
>>> '__eq__' in dir(b)
False
>>> '__cmp__' in dir(b)
False
>>> dir(b)
['__doc__', '__module__']

T-sss! We are going to act like real badass developers and use this really non-hashable (by definition) object in our experiments:

>>> s = set([b, ])
>>> s
set([<__main__.B instance at 0xb74deb0c>])
>>> s.update(set([B(), b, ]))
>>> s
set([<__main__.B instance at 0xb74deb0c>, <__main__.B instance at 0xb74ded0c>])
>>>
>>> d = {b: 'foo', }
>>> d
{<__main__.B instance at 0xb74deb0c>: 'foo'}
>>> d.update({b: 'bar', B(): 'baz', })
>>> s
set([<__main__.B instance at 0xb74deb0c>, <__main__.B instance at 0xb74ded0c>])

"Oops, I did it again!" Can you hear that? The time hasn't stopped and the clock is going on. How do you like that? Non-hashables are treated as valid hashables! I'm quite happy that nothing had crashed and it is not an 8th wonder of the world. But isn't it strange when some things act in the way they really should not?