Line length should not exceed 80 characters, in the spirit of PEP 8 (which itself sets the limit at 79).
Every Python source file should be encoded as UTF-8. As per PEP 263, the first or the second line must be:
# coding=utf-8
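The declaration above can be checked mechanically. The following is a minimal sketch, not part of this project's tooling: the helper name `declared_encoding` is hypothetical, and the regular expression is the one given in PEP 263 for recognizing an encoding comment on the first or second line.

```python
import re

# Pattern from PEP 263 for an encoding declaration comment.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source):
    """Return the encoding declared on line 1 or 2 of `source`, or None.

    `declared_encoding` is a hypothetical helper used for illustration only.
    """
    # PEP 263 only honors the declaration on the first two lines.
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None
```

For example, `declared_encoding("#!/usr/bin/env python\n# coding=utf-8\n")` finds the declaration on the second line, while a declaration on the third line or later is ignored.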
For each data set, please add a comment that references the source and/or origin of the data.
When you have long lists of names, please order them alphabetically. Keep line lengths as close as possible to 80 characters without exceeding the limit.
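One way to apply the ordering and line-length rules together is to generate the list body with a small script. This is only a sketch under stated assumptions: `wrap_names` is a hypothetical helper, the four-space indent is an assumption, and plain `sorted()` is used, which orders uppercase letters before lowercase ones.

```python
def wrap_names(names, width=80, indent="    "):
    """Return source lines listing `names` alphabetically.

    Hypothetical helper for illustration. Lines are packed greedily so
    each one gets as close to `width` as possible without exceeding it.
    A single name longer than `width` still ends up on its own line.
    """
    lines = []
    current = indent
    for name in sorted(names):
        token = repr(name) + ","
        if current == indent:
            # First item on the line: no leading space needed.
            current += token
        elif len(current) + 1 + len(token) > width:
            # Adding this item would pass the limit, so start a new line.
            lines.append(current)
            current = indent + token
        else:
            current += " " + token
    if current != indent:
        lines.append(current)
    return lines
```

Calling `"\n".join(wrap_names(my_names))` produces text that can be pasted between the brackets of a tuple or list literal.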