Bounding Box

The ezdxf.bbox module provides tools to calculate bounding boxes for many DXF entities, but not for all. The bounding box calculation is based on the ezdxf.disassemble module and therefore has the same limitations.

Warning

If accurate bounding boxes for text entities are important for you, read this first: Text Boundary Calculation. TL;DR: Bounding boxes for text entities are not accurate!

Unsupported DXF entities:

All ACIS based types like BODY, 3DSOLID or REGION

External references (XREF) and UNDERLAY objects

RAY and XLINE, which extend to infinity

ACAD_TABLE, no basic support - only preserved by ezdxf

Unsupported entities are silently ignored, so filtering these DXF types beforehand is not necessary.

The base type for bounding boxes is the BoundingBox class from the module ezdxf.math.

The input entities iterable can be the whole modelspace, an entity query or any iterable container of DXF entities.

Bounding boxes of curves are calculated by flattening the curve with a default flattening distance of 0.01. Set the argument fast to True to speed up the calculation at the cost of precision for curved objects: in that case only the control vertices are used.
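The trade-off between flattening a curve and using only its control vertices can be illustrated without ezdxf. The following is a plain Python sketch, not the implementation used by ezdxf: the helper names cubic_bezier_point and extents_2d are hypothetical, and a fixed segment count stands in for a real flattening distance.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier_point(ctrl: List[Point], t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t (0 <= t <= 1)."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    a = (1 - t) ** 3
    b = 3 * (1 - t) ** 2 * t
    c = 3 * (1 - t) * t ** 2
    d = t ** 3
    return (
        a * x0 + b * x1 + c * x2 + d * x3,
        a * y0 + b * y1 + c * y2 + d * y3,
    )


def extents_2d(points: List[Point]) -> Tuple[Point, Point]:
    """Return (min corner, max corner) of the given points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Example control points of a single cubic Bezier curve.
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]

# Precise variant: approximate the curve by many points on the curve.
segments = 100  # assumption: fixed count instead of a flattening distance
flattened = [cubic_bezier_point(ctrl, i / segments) for i in range(segments + 1)]
precise_box = extents_2d(flattened)

# Fast variant: use only the control points.
fast_box = extents_2d(ctrl)

print(precise_box)
print(fast_box)
```

The control points always enclose the curve, so the fast box can never be too small, only larger than the flattened one.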

The optional caching object Cache has to be instantiated by the user; this is only useful if the same entities are processed multiple times.

Example usage with caching:

from ezdxf import bbox

msp = doc.modelspace()
cache = bbox.Cache()
# get overall bounding box
first_bbox = bbox.extents(msp, cache=cache)
# bounding box of all LINE entities
second_bbox = bbox.extents(msp.query("LINE"), cache=cache)

Functions

ezdxf.bbox.extents(entities: Iterable[DXFEntity], *, fast=False, cache: Cache | None = None) → BoundingBox

Returns a single bounding box for all given entities.

If argument fast is True, the calculation of Bézier curves is based on their control points; this may return a slightly larger bounding box.

ezdxf.bbox.multi_flat(entities: Iterable[DXFEntity], *, fast=False, cache: Cache | None = None) → Iterable[BoundingBox]

Yields a bounding box for each of the given entities.

If argument fast is True, the calculation of Bézier curves is based on their control points; this may return a slightly larger bounding box.

ezdxf.bbox.multi_recursive(entities: Iterable[DXFEntity], *, fast=False, cache: Cache | None = None) → Iterable[BoundingBox]

Yields all bounding boxes for the given entities or all bounding boxes for their sub entities. If an entity (INSERT) has sub entities, only the bounding boxes of these sub entities will be yielded, not the bounding box of the entity (INSERT) itself.

If argument fast is True, the calculation of Bézier curves is based on their control points; this may return a slightly larger bounding box.

Caching Strategies

Because ezdxf is not a CAD application, it does not manage data structures that are optimized for use by a CAD kernel. This means that the content of complex entities like block references or leaders has to be created on demand from DXF primitives. These temporarily created entities are called virtual entities; they have no handle and are not stored in the entity database.

All this is required to calculate the bounding box of complex entities, which makes it a very time-consuming task. A Cache object can speed up these calculations, but it is not a magic feature: achieving any performance gain requires an understanding of what is happening under the hood.

For a single bounding box calculation without any reuse of entities, using a Cache object makes no sense, e.g. calculating the modelspace extents:

from pathlib import Path
import ezdxf
from ezdxf import bbox

CADKitSamples = Path(ezdxf.EZDXF_TEST_FILES) / 'CADKitSamples'

doc = ezdxf.readfile(CADKitSamples / 'A_000217.dxf')
cache = bbox.Cache()
ext = bbox.extents(doc.modelspace(), cache=cache)

print(cache)

1226 cached objects and not a single cache hit:

Cache(n=1226, hits=0, misses=3273)

Using UUIDs to cache virtual entities does not improve the result:

Cache(n=2206, hits=0, misses=3273)

Same count of hits and misses, but now the cache also references ~1000 virtual entities, which occupy memory until the cache is deleted. Luckily this is a small DXF file (~838 kB).

For bounding box calculations on multiple entity queries with overlapping results, a Cache object may speed up the calculation:

doc = ezdxf.readfile(CADKitSamples / 'A_000217.dxf')
msp = doc.modelspace()
cache = bbox.Cache(uuid=False)

ext = bbox.extents(msp, cache=cache)
print(cache)

# process modelspace again
ext = bbox.extents(msp, cache=cache)
print(cache)

Processing the same data again leads to some hits:

1st run: Cache(n=1226, hits=0, misses=3273)
2nd run: Cache(n=1226, hits=1224, misses=3309)

Using uuid=True does not lead to more hits, only to more cache entries:

1st run: Cache(n=2206, hits=0, misses=3273)
2nd run: Cache(n=2206, hits=1224, misses=3309)

Creating stable virtual entities by disassembling the entities first leads to more hits:

from ezdxf import disassemble

entities = list(disassemble.recursive_decompose(msp))
cache = bbox.Cache(uuid=False)

bbox.extents(entities, cache=cache)
print(cache)

bbox.extents(entities, cache=cache)
print(cache)

First without UUID for stable virtual entities:

1st run: Cache(n=1037, hits=0, misses=4074)
2nd run: Cache(n=1037, hits=1037, misses=6078)

Using UUID for stable virtual entities leads to more hits:

1st run: Cache(n=2019, hits=0, misses=4074)
2nd run: Cache(n=2019, hits=2018, misses=4116)

But caching virtual entities also needs more memory.

In conclusion: using a cache is only useful if you often process nearly the same data; only then can an increase in performance be expected.
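The hit and miss patterns above can be reproduced with a toy model. This is only a sketch under stated assumptions and not how ezdxf implements its cache: ToyEntity and ToyCache are hypothetical classes, entities stored in the database are identified by their handle, and virtual entities can only be identified by a UUID.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List, Optional

_uuid_counter = itertools.count(1)


@dataclass
class ToyEntity:
    """Stand-in for a DXF entity; virtual entities have no handle."""
    handle: Optional[str] = None
    uuid: str = field(default_factory=lambda: f"uuid-{next(_uuid_counter)}")


class ToyCache:
    def __init__(self, uuid: bool = False):
        self.use_uuid = uuid
        self.boxes: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def key(self, entity: ToyEntity) -> Optional[str]:
        if entity.handle is not None:
            return entity.handle
        return entity.uuid if self.use_uuid else None

    def get_box(self, entity: ToyEntity) -> str:
        key = self.key(entity)
        if key is not None and key in self.boxes:
            self.hits += 1
            return self.boxes[key]
        self.misses += 1
        box = "calculated box"  # placeholder for the expensive calculation
        if key is not None:
            self.boxes[key] = box
        return box


def process_twice(cache: ToyCache, entities: List[ToyEntity]) -> None:
    for _ in range(2):
        for entity in entities:
            cache.get_box(entity)


# Two database entities and three stable (reused) virtual entities.
entities = [ToyEntity("A1"), ToyEntity("A2"), ToyEntity(), ToyEntity(), ToyEntity()]

no_uuid = ToyCache(uuid=False)
process_twice(no_uuid, entities)
print(len(no_uuid.boxes), no_uuid.hits, no_uuid.misses)

with_uuid = ToyCache(uuid=True)
process_twice(with_uuid, entities)
print(len(with_uuid.boxes), with_uuid.hits, with_uuid.misses)
```

In this model the UUID variant only pays off because the same virtual entity objects are processed again; virtual entities that are recreated for every run would get new UUIDs and only add cache entries.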

Cache Class

class ezdxf.bbox.Cache(uuid=False)

Caching object for ezdxf.math.BoundingBox objects.

Parameters: uuid – use UUIDs for virtual entities

has_data: True if the cache contains any bounding boxes

hits

misses

invalidate(entities: Iterable[DXFEntity]) → None

Invalidate cache entries for the given DXF entities.

If entities are changed by the user, it is possible to invalidate individual entities. Use with care - discarding the whole cache is the safer workflow.

Ignores entities which are not stored in cache.