python-dev Summary for 2006-08-01 through 2006-08-15
- Summaries
- Mixing str and unicode dict keys
- Rounding floats to ints
- Assigning to function calls
- PEP 357: Integer clipping and __index__
- OpenSSL and Windows binaries
- Type of range object members
- Distutils version number
- Dict containment and unhashable items
- Returning longs from __hash__()
- instancemethod builtin
- Unicode versions and unicodedata
- Elementtree and Namespaces
- Previous Summaries
- Skipped Threads
- Epilogue
[The HTML version of this Summary is available at http://www.python.org/dev/summary/2006-08-01_2006-08-15]
Summaries
Mixing str and unicode dict keys
Ralf Schmitt noted that in Python head, inserting str and unicode keys into the same dictionary would sometimes raise UnicodeDecodeErrors:
>>> d = {}
>>> d[u'm\xe1s'] = 1
>>> d['m\xe1s'] = 1
Traceback (most recent call last):
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)
This error showed up as a result of Armin Rigo's patch to stop dict lookup from hiding exceptions, which meant that the UnicodeDecodeError raised when a str object is compared to a non-ASCII unicode object was no longer silenced. In the end, people agreed that UnicodeDecodeError should not be raised for equality comparisons, and that in general, __eq__() methods should not raise exceptions. But comparing str and unicode objects is often a programming error, so in addition to returning False, equality comparisons between str and non-ASCII unicode objects now issue a warning with the UnicodeDecodeError message.
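The agreed rule (return False and warn instead of raising) can be sketched as follows. This is only an illustration in Python 3 syntax, where the str/unicode split discussed above no longer exists; the AsciiBytes wrapper is hypothetical and is not how the interpreter implements the comparison.

```python
import warnings


class AsciiBytes:
    """Hypothetical byte-string wrapper that can be compared against text."""

    def __init__(self, raw):
        self.raw = raw

    def __eq__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        try:
            return self.raw.decode("ascii") == other
        except UnicodeDecodeError as exc:
            # Do not let the decoding failure escape from __eq__:
            # warn with the error message and report "not equal".
            warnings.warn(str(exc), UnicodeWarning)
            return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ascii_equal = AsciiBytes(b"mas") == "mas"
    non_ascii_equal = AsciiBytes(b"m\xe1s") == "m\xe1s"
```

Here ascii_equal is true, non_ascii_equal is false, and the failed decode is recorded in caught as a warning rather than propagating as an exception.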
Contributing threads:
Rounding floats to ints
Bob Ippolito pointed out a long-standing bug in the struct module where floats were automatically converted to ints. Michael Urman showed a simple case that would provoke an exception if the bug were fixed:
pack('>H', round(value * 32768))
The source of this bug is the expectation that round() returns an int, when it actually returns a float. There was then some discussion about splitting the round functionality into two functions: __builtin__.round() which would round floats to ints, and math.round() which would round floats to floats. There was also some discussion about the optional argument to round() which currently specifies the number of decimal places to round to -- a number of folks felt that it was a mistake to round to decimal places when a float can only truly reflect binary places.
In the end, there were no definite conclusions about the future of round(), but it seemed like the discussion might be resumed on the Python 3000 list.
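One way to write such code so that it does not depend on what round() returns is to convert explicitly before packing. This is a minimal sketch in Python 3 syntax; the function name and the sample value are assumptions made for illustration, and only the pack format and the scaling by 32768 come from the example above.

```python
import struct


def pack_scaled(value):
    """Scale value by 32768 and pack it as an unsigned 16-bit big-endian int."""
    # int() makes the float -> int conversion explicit, so the call does
    # not rely on round() returning an int or on struct converting floats.
    scaled = int(round(value * 32768))
    return struct.pack(">H", scaled)


packed = pack_scaled(0.5)
```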
Contributing threads:
Assigning to function calls
Neal Becker proposed that code like X() += 2 be allowed so that you could call __iadd__ on objects immediately after creation. People pointed out that allowing augmented assignment is misleading when no assignment can occur, and it would be better just to call the method directly, e.g. X().__iadd__(2).
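The suggested alternative looks like this in practice. The Counter class is hypothetical and exists only to show the direct method call; the compile() check confirms that the proposed syntax is rejected.

```python
class Counter:
    """Hypothetical class used only to illustrate __iadd__."""

    def __init__(self):
        self.total = 0

    def __iadd__(self, amount):
        self.total += amount
        return self


# Call the method directly on the freshly created object.
counter = Counter().__iadd__(2)

# The augmented-assignment form is a syntax error.
try:
    compile("Counter() += 2", "<example>", "exec")
    augmented_call_allowed = True
except SyntaxError:
    augmented_call_allowed = False
```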
Contributing threads:
PEP 357: Integer clipping and __index__
After some further discussion on the __index__ issue of last fortnight, Travis E. Oliphant proposed a patch for __index__ that introduced three new C API functions:
- PyIndex_Check(obj) -- checks for nb_index
- PyObject* PyNumber_Index(obj) -- calls nb_index if possible or raises a TypeError
- Py_ssize_t PyNumber_AsSsize_t(obj, err) -- converts the object to a Py_ssize_t, raising err on overflow
After a few minor edits, this patch was checked in.
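From the Python side, the effect of these functions can be observed roughly as follows. This is a sketch in Python 3 syntax, not the C code from the patch; the Offset class is hypothetical, and operator.index() is used as the Python-level way to invoke __index__.

```python
import operator


class Offset:
    """Hypothetical integer-like type that supports __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


letters = "abcdef"

# Convert explicitly through __index__.
as_int = operator.index(Offset(3))

# Sequences accept the object wherever an index is expected.
single = letters[Offset(3)]

# In a slice, an oversized index is clipped to the sequence bounds.
clipped = letters[:Offset(2**100)]

# As a plain index, the same value overflows and raises IndexError.
try:
    letters[Offset(2**100)]
    overflow_error = None
except IndexError as exc:
    overflow_error = exc
```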
Contributing threads:
- Bad interaction of __index__ and sequence repeat
- __index__ clipping
- Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c Modules/operator.c Objects/abstract.c Objects/classobject.c Objects/
- Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c Modules/operator.c Objects/abstract.c Objects/class
OpenSSL and Windows binaries
Jim Jewett pointed out that a default build of OpenSSL includes the patented IDEA cipher, and asked whether that needed to be kept out of the Windows binary versions. There was some concern about dropping a feature, but Gregory P. Smith pointed out that IDEA isn't directly exposed to any Python user, and suggested that IDEA should never be required by any sane SSL connection. Martin v. Löwis promised to look into making the change.
Update: The change was checked in before 2.5 was released.
Contributing threads:
Type of range object members
Alexander Belopolsky proposed making the members of the range() object use Py_ssize_t instead of C longs. Guido indicated that this was basically wasted effort -- in the long run, the members should be PyObject* so that they can handle Python longs correctly, so converting them to Py_ssize_t would be an intermediate step that wouldn't help in the transition.
There was then some discussion about the int and long types in Python 3000, with Guido suggesting two separate implementations that would be mostly hidden at the Python level.
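For illustration, range objects in Python 3 accept members larger than any C long, which is the behaviour Guido described as the long-run goal. This sketch assumes a Python 3 interpreter and says nothing about how 2.5 behaved.

```python
big = 2**70  # too large for a 64-bit C long

r = range(big, big + 3)
values = list(r)
has_middle = (big + 1) in r
```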
Contributing thread:
Distutils version number
A user noted that Python 2.4.3 shipped with distutils 2.4.1, while the version number of distutils in the repository was only 2.4.0, and requested that Python 2.5 include the newer distutils. In fact, the repository already held the newest distutils; its version number simply had not been bumped. For a short while, the distutils number was automatically generated from the Python one, but Marc-Andre Lemburg volunteered to manually bump it so that it would be easier to use the SVN distutils with a different Python version.
Contributing threads:
Dict containment and unhashable items
tomer filiba suggested that dict.__contains__ should return False instead of raising a TypeError in situations like:
>>> a={1:2, 3:4}
>>> [] in a
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list objects are unhashable
Guido suggested that swallowing the TypeError here would be a mistake as it would also swallow any TypeErrors produced by faulty __hash__() methods.
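Guido's objection can be demonstrated with a small sketch in Python 3 syntax. Both contains_or_false() and BrokenKey are hypothetical names introduced here: the helper mimics the proposed behaviour, and the class has a deliberately buggy __hash__().

```python
def contains_or_false(mapping, key):
    """Hypothetical helper: treat any TypeError as 'not present'."""
    try:
        return key in mapping
    except TypeError:
        return False


class BrokenKey:
    def __hash__(self):
        # A bug: adding a str and an int raises TypeError.
        return "oops" + 1


a = {1: 2, 3: 4}

# The unhashable list is reported as absent, as proposed.
list_result = contains_or_false(a, [])

# The buggy __hash__ is also reported as "absent"; the bug is hidden.
broken_result = contains_or_false(a, BrokenKey())
```

Both results are False, so the caller cannot tell an unhashable key from a key whose __hash__() is broken.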
Contributing threads:
Returning longs from __hash__()
Armin Rigo pointed out that Python 2.5's change that allows id() to return ints or longs would have caused some breakage for custom hash functions like:
def __hash__(self):
    return id(self)
Though it has long been documented that the result of id() is not suitable as a hash value, code like this is apparently common. So Martin v. Löwis and Armin arranged for PyLong_Type.tp_hash to be called in the code for hash().
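The resulting behaviour can be checked from Python. This is a sketch in Python 3 syntax; the Node class is hypothetical, and the constant 2**70 stands in for an id() value too large for a C long.

```python
import sys


class Node:
    def __hash__(self):
        # Stand-in for an id() that does not fit in a C long.
        return 2**70


h = hash(Node())

# hash() reduces the oversized value by hashing the integer itself.
same_as_int_hash = h == hash(2**70)
```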
Contributing thread:
instancemethod builtin
Nick Coghlan suggested adding an instancemethod() builtin along the lines of staticmethod() and classmethod() which would allow arbitrary callables to act more like functions. In particular, Nick was considering code like:
class C(object):
    method = some_callable
Currently, if some_callable did not define the __get__() method, C().method would not bind the C instance as the first argument. By introducing instancemethod(), this problem could be solved like:
class C(object):
    method = instancemethod(some_callable)
There wasn't much of a reaction one way or another, so it looked like the idea would at least temporarily be shelved.
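A pure-Python approximation of the idea might look like the following. This is only a sketch of one possible implementation, written in Python 3 syntax; it is not Nick's proposed builtin, and the Greeter class is a hypothetical callable without a __get__() method.

```python
import types


class instancemethod:
    """Sketch of a descriptor that binds any callable to the instance."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: return the raw callable.
            return self.func
        # Accessed on an instance: bind it as the first argument.
        return types.MethodType(self.func, obj)


class Greeter:
    """Hypothetical callable object that defines no __get__()."""

    def __call__(self, instance):
        return "hello from %s" % type(instance).__name__


class C:
    plain = Greeter()                   # not bound on attribute access
    bound = instancemethod(Greeter())   # bound like a normal method


bound_result = C().bound()

try:
    C().plain()
    plain_failed = False
except TypeError:
    # The C instance was not passed, so the argument is missing.
    plain_failed = True
```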
Contributing thread:
Unicode versions and unicodedata
Armin Ronacher noted that Python 2.5 implements Unicode 4.1 but while a ucd_3_2_0 object is available (implementing Unicode 3.2), no ucd_4_1_0 object is available. Martin v. Löwis explained that the ucd_3_2_0 object is only available because IDNA needs it, and that there are no current plans to expose any other Unicode versions (and that ucd_3_2_0 may go away when IDNA no longer needs it).
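The two objects can be inspected as below. The sketch assumes an interpreter in which ucd_3_2_0 is still exposed (Martin noted it may go away), and the version reported for the main database depends on the Python release in use.

```python
import unicodedata

# Version of the main database; varies with the Python release.
current_version = unicodedata.unidata_version

# The frozen Unicode 3.2 database kept for IDNA.
legacy_version = unicodedata.ucd_3_2_0.unidata_version
legacy_category = unicodedata.ucd_3_2_0.category("A")

# No object is exposed for Unicode 4.1.
has_4_1 = hasattr(unicodedata, "ucd_4_1_0")
```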
Contributing thread:
Elementtree and Namespaces
Elements (and attributes) can be associated with a namespace, such as
http://www.w3.org/XML/1998/namespace:id
The xmlns attribute creates a "prefix" (alias) for a namespace, so that you can abbreviate the above as
xml:id
ElementTree treats the prefix as just an aid to human readers, and creates its own abbreviations that are consistent throughout a document. Some tools (including w3 recommendations for canonicalization) treat the prefix itself as meaningful.
ElementTree may support this in version 1.3, but it wasn't going to be there in time for 2.5, and it wasn't judged important enough to keep etree out of the release.
If you need it sooner, then http://codespeak.net/lxml supports the etree API and does retain prefixes.
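The prefix handling described above can be seen with the standard library's ElementTree. This is a sketch in Python 3 syntax; the "my" prefix and the http://example.com/ns namespace are made up for the example.

```python
import xml.etree.ElementTree as ET

source = (
    '<doc xmlns:my="http://example.com/ns" xml:id="top">'
    '<my:item/>'
    '</doc>'
)
root = ET.fromstring(source)

# Names are stored with the full namespace URI, not the prefix.
child_tag = root[0].tag
attr_names = list(root.attrib)

# On output, ElementTree chooses its own prefix for the namespace.
serialized = ET.tostring(root, encoding="unicode")
```

After parsing, child_tag holds the namespace URI in braces followed by the local name, and the author's "my" prefix does not appear in serialized.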
Contributing thread:
[Thanks to Jim Jewett for this summary.]
Skipped Threads
- clock_gettime() vs. gettimeofday()?
- Strange memo behavior from cPickle
- internal weakref API should be Py_ssize_t?
- Weekly Python Patch/Bug Summary
- Releasemanager, please approve #1532975
- FW: using globals
- TRUNK FREEZE 2006-07-03, 00:00 UTC for 2.5b3
- segmentation fault in Python 2.5b3 (trunk:51066)
- using globals
- uuid module - byte order issue
- RELEASED Python 2.5 (beta 3)
- TRUNK is UNFROZEN
- 2.5 status
- Python 2.5b3 and AIX 4.3 - It Works
- More tracker demos online
- need an SSH key removed
- BZ2File.writelines should raise more meaningful exceptions
- test_mailbox on Cygwin
- cgi.FieldStorage DOS (sf bug #1112549)
- 2.5b3, commit r46372 regressed PEP 302 machinery (sf not letting me post)
- free(): invalid pointer
- should i put this on the bug tracker ?
- Is this a bug?
- httplib and bad response chunking
- cgi DoS attack
- DRAFT: python-dev summary for 2006-07-01 to 2006-07-15
- SimpleXMLWriter missing from elementtree
- DRAFT: python-dev summary for 2006-07-16 to 2006-07-31
- Is module clearing still necessary? [Re: Is this a bug?]
- PyThreadState_SetAsyncExc bug?
- Errors after running make test
- What is the status of file.readinto?
- Recent logging spew
- [Python-3000] Python 2.5 release schedule (was: threading, part 2)
- test_socketserver failure on cygwin
- ANN: byteplay - a bytecode assembler/disassembler
- Arlington VA sprint on Sept. 23
- IDLE patches - bugfix or not?
- Four issue trackers submitted for Infrastructue Committee's tracker search
Epilogue
This is a summary of traffic on the python-dev mailing list from August 01, 2006 through August 15, 2006. It is intended to inform the wider Python community of on-going developments on the list on a semi-monthly basis. An archive of previous summaries is available online.
An RSS feed of the titles of the summaries is available. You can also watch comp.lang.python or comp.lang.python.announce for new summaries (or through their email gateways of python-list or python-announce, respectively, as found at http://mail.python.org).
This python-dev summary is the 10th written by Steve Bethard.
To contact me, please send email:
- Steve Bethard (steven.bethard at gmail.com)
Do not post to comp.lang.python if you wish to reach me.
The Python Software Foundation is the non-profit organization that holds the intellectual property for Python. It also tries to advance the development and use of Python. If you find the python-dev Summary helpful please consider making a donation. You can make a donation at http://python.org/psf/donations.html . Every cent counts so even a small donation with a credit card, check, or by PayPal helps.
Commenting on Topics
To comment on anything mentioned here, just post to comp.lang.python (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!
How to Read the Summaries
This summary is written using reStructuredText. Any unfamiliar punctuation is probably markup for reST (otherwise it is probably regular expression syntax or a typo :); you can safely ignore it. We do suggest learning reST, though; it's simple and is accepted for PEP markup and can be turned into many different formats like HTML and LaTeX.
