pipeline.extern.asizeof

This module exposes 9 functions and 2 classes to obtain lengths (in items) and sizes (in bytes) of Python objects for Python 2.6 and later, including Python 3+ 2.

Public Functions 1

Function asizeof calculates the combined (approximate) size in bytes of one or several Python objects.

Function asizesof returns a tuple containing the (approximate) size in bytes for each given Python object separately.

Function asized returns for each object an instance of class Asized containing all the size information of the object and a tuple with the referents 4.

Functions basicsize and itemsize return the basic size and the item size of the given object, respectively, both in bytes. For objects such as array.array, numpy.array, numpy.matrix, etc., where the item size varies depending on the instance-specific data type, function itemsize returns that item size.

Function flatsize returns the flat size of a Python object in bytes defined as the basic size plus the item size times the length of the given object.

Function alen 3 returns the length of an object like standard function len but extended for several types. E.g. the alen of a multi-precision int (or long) is the number of digits 6. The length of most mutable sequence objects includes an estimate of the over-allocation and therefore, the alen value may differ from the standard len result. For objects like array.array, numpy.array, numpy.matrix, etc. function alen returns the proper number of items.
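The over-allocation mentioned above can be observed with the standard library alone. The following is a minimal sketch for CPython, not this module's implementation: the helper name allocated_slots is hypothetical, and it assumes that sys.getsizeof includes the list's item array and that each slot is one pointer wide.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # bytes per object reference on this build


def allocated_slots(lst):
    """Estimate how many item slots a list has allocated (CPython assumption)."""
    # An empty list owns no item array, so the difference is the array size.
    return (sys.getsizeof(lst) - sys.getsizeof([])) // POINTER_SIZE


items = []
for i in range(10):
    items.append(i)  # appending lets the list grow in over-allocated steps

print(len(items), allocated_slots(items))
```

The second number printed is typically larger than the first, which is why a length that accounts for allocated items can exceed the standard len result.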

Function refs returns (a generator for) the referents 4 of the given object.
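To make the notion of referents concrete, here is a small sketch based on the standard gc.get_referents function. It only illustrates the concept; function refs of this module may select and order referents differently.

```python
import gc

first, second = object(), object()
key, value = object(), [1, 2, 3]

# Referents of a list: the objects held in the list.
list_referents = gc.get_referents([first, second])

# Referents of a dict: objects reachable from the dict, such as its values.
dict_referents = gc.get_referents({key: value})

print(len(list_referents), len(dict_referents))
```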

Certain classes are known to be sub-classes of or to behave as dict objects. Function adict can be used to install other class objects to be treated like dict.

Public Classes 1

Class Asizer may be used to accumulate the results of several sizing calls. After creating an Asizer instance, use methods asizeof and asizesof as needed to size any number of additional objects and accumulate the sizes.

Call methods exclude_refs and/or exclude_types to exclude references to instances or to types of certain objects, respectively.

Use one of the print_… methods to report the statistics.

An instance of class Asized is returned for each object sized with the asized function or method.

Duplicate Objects

Duplicate given objects are sized only once, and their size is included in the combined total only once. However, functions asizesof and asized return a size value or an Asized instance, respectively, for each given object, including duplicates.
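The duplicate rule can be sketched with the standard library. This is an illustration only: both helper names are hypothetical, and sys.getsizeof stands in for the sizing logic, so referents are not followed here.

```python
import sys


def combined_flat_size(*objs):
    """Sum the flat sizes of the given objects, counting duplicates once."""
    seen = set()
    total = 0
    for obj in objs:
        if id(obj) not in seen:  # identity, not equality, defines a duplicate
            seen.add(id(obj))
            total += sys.getsizeof(obj)
    return total


def flat_sizes(*objs):
    """Return one flat size per given object, duplicates included."""
    return tuple(sys.getsizeof(obj) for obj in objs)


a = "x" * 100
b = (1, 2, 3)

print(combined_flat_size(a, a, b))  # a contributes once
print(flat_sizes(a, a, b))          # three entries, the first two equal
```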

Definitions 5

The length of objects like dict, list, set, str, tuple, etc. is defined as the number of items held in or allocated by the object. Held items are references to other objects, called the referents.

The size of an object is defined as the sum of the flat size of the object plus the sizes of any referents 4. Referents are visited recursively up to the specified detail level. However, the size of objects referenced multiple times is included only once in the total size.
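The recursive definition above can be approximated in a few lines. This is a hedged sketch rather than the algorithm used by this module: deep_size and its limit parameter are hypothetical names, sys.getsizeof supplies the flat size, gc.get_referents supplies the referents, and the sketch is only meant for simple containers.

```python
import gc
import sys


def deep_size(obj, limit=100):
    """Flat size of obj plus the sizes of its referents, up to limit levels."""
    seen = set()

    def visit(current, depth):
        if id(current) in seen:
            return 0  # objects referenced multiple times are counted once
        seen.add(id(current))
        size = sys.getsizeof(current)
        if depth < limit:
            for referent in gc.get_referents(current):
                size += visit(referent, depth + 1)
        return size

    return visit(obj, 0)


text = "a" * 1000
shared = [text, text]  # the same string is referenced twice

print(deep_size(shared))           # list plus one copy of the string
print(deep_size(shared, limit=0))  # flat size of the list only
```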

The flat size of an object is defined as the basic size of the object plus the item size times the number of allocated items (the references to referents). The flat size does include the size of the references to the referents, but not the size of the referents themselves.

The flat size returned by function flatsize equals the result of function asizeof with options code=True, ignored=False, limit=0 and option align set to the same value.

The accurate flat size for an object is obtained from function sys.getsizeof() where available. Otherwise, the length and size of sequence objects such as dicts, lists, sets, etc. are based on an estimate of the number of allocated items. As a result, the reported length and size may differ substantially from the actual length and size.

The basic size and the item size are obtained from the __basicsize__ and __itemsize__ attributes, respectively, of the (type of the) object. Where necessary (e.g. sequence objects), a zero __itemsize__ is replaced by the size of a corresponding C type.
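The two attributes can be inspected directly. The sketch below assumes CPython; naive_flat_size is a hypothetical helper that applies the flat size formula without any of the adjustments described in this section.

```python
import sys


def naive_flat_size(obj):
    """Basic size plus item size times length, with no further adjustments."""
    kind = type(obj)
    return kind.__basicsize__ + kind.__itemsize__ * len(obj)


pair = (1, 2)
triple = (1, 2, 3)

print(tuple.__basicsize__, tuple.__itemsize__)
print(naive_flat_size(pair), sys.getsizeof(pair))
print(naive_flat_size(triple), sys.getsizeof(triple))

# A list reports a zero item size because its items live in a separate array.
print(list.__itemsize__)
```

Each extra tuple item adds exactly one __itemsize__ to both figures. Any constant gap between the naive value and sys.getsizeof comes from per-object overhead that is not part of __basicsize__.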

The overhead for Python’s garbage collector (GC) is included in the basic size of (GC managed) objects as well as the space needed for refcounts (used only in certain Python builds).

Optionally, size values can be aligned to a multiple of any power of 2.
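Alignment to a power of 2 amounts to rounding a size up to the next multiple. A minimal sketch follows; the function name align_up is hypothetical and not part of this module.

```python
def align_up(size, align=8):
    """Round size up to the next multiple of align, a power of 2."""
    if align < 1 or align & (align - 1):
        raise ValueError("align must be a power of 2")
    # Adding align - 1 and clearing the low bits rounds up to a multiple.
    return (size + align - 1) & ~(align - 1)


print(align_up(13, 8))
print(align_up(16, 8))
```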

Size of (byte)code

The (byte)code size of objects like classes, functions, methods, modules, etc. can be included by setting option code=True.

Iterators are handled like sequences: iterated object(s) are sized like referents 4 but only up to the specified level or recursion limit (and only if function gc.get_referents() returns the referent object of iterators).

Generators are sized as (byte)code only, but the generated objects are never sized.
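The reason generated objects cannot be sized is that they do not exist until the generator runs. The standard-library sketch below illustrates this: count_up is a hypothetical example generator, and sys.getsizeof is used as a stand-in for the sizing functions of this module.

```python
import sys


def count_up(n):
    """Yield the integers 0 .. n-1 lazily."""
    for i in range(n):
        yield i


small = count_up(10)
large = count_up(10**6)

# Both generator objects share the same code, so their own size is the same
# no matter how many values they would eventually produce.
print(sys.getsizeof(small), sys.getsizeof(large))
```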

Old- and New-style Classes

All old- and new-style class, instance and type objects are handled uniformly such that (a) instance objects are distinguished from class objects and (b) instances of different old-style classes can be dealt with separately.

Class and type objects are represented as <class ....* def> and <type ... def>, respectively, where the * indicates an old-style class and the ... def suffix marks the definition object. Instances of classes are shown as <class module.name*> without the ... def suffix. The * after the name indicates an instance of an old-style class.

Ignored Objects

To avoid excessive sizes, several object types are ignored 5 by default, e.g. built-in functions, built-in types and classes 7, function globals and module referents. However, any instances thereof and module objects will be sized when passed as given objects. Ignored object types are sized only when option ignored is set to False.

In addition, many __...__ attributes of callable objects are ignored 5, except crucial ones, e.g. class attributes __dict__, __doc__, __name__ and __slots__. For more details, see the type-specific _..._refs() and _len_...() functions below.

Footnotes

1

The functions and classes in this module are not thread-safe.

2

Earlier editions of this module supported Python versions down to Python 2.2. To use Python 2.5 or older, try module asizeof from project Pympler 0.3.x.

3

Former function leng, class attribute leng and keyword argument leng have all been renamed to alen. However, function leng is still available for backward compatibility.

4

The referents of an object are the objects referenced by that object. For example, the referents of a list are the objects held in the list, the referents of a dict are the key and value objects in the dict, etc.

5

These definitions and other assumptions are rather arbitrary and may need corrections or adjustments.

6

See Python source file .../Include/longintrepr.h for the C typedef of digit used in multi-precision int (or long) objects. The C sizeof(digit) in bytes can be obtained in Python from the int (or long) __itemsize__ attribute. Function alen determines the number of digits of an int (or long) object.

7

Types and classes are considered built-in if the __module__ of the type or class is listed in the private _builtin_modules.

Functions

adict(*classes)

Installs one or more classes to be handled as dict.

alen(obj, **opts)

Returns the length of an object (in items).

asized(*objs, **opts)

Returns a tuple containing an Asized instance for each object passed as positional argument.

asizeof(*objs, **opts)

Returns the combined size (in bytes) of all objects passed as positional arguments.

asizesof(*objs, **opts)

Returns a tuple containing the size (in bytes) of each object passed as a positional argument.

basicsize(obj, **opts)

Returns the basic size of an object (in bytes).

flatsize(obj[, align])

Returns the flat size of an object (in bytes), optionally aligned to a given power of 2.

itemsize(obj, **opts)

Returns the item size of an object (in bytes).

leng(obj, **opts)

Returns the length of an object (in items). Retained for backward compatibility; function alen is the current name.

named_refs(obj, **opts)

Returns (a generator for) all named referents of an object (re-using functionality from asizeof).

refs(obj, **opts)

Returns (a generator for) specific referents of an object.

Classes

Asized(size, flat[, refs, name])

Stores the results of an asized object in 4 attributes named after the constructor arguments: size, flat, refs and name.

Asizer(**opts)

Sizer state and options.