Recover

This module provides functions to “recover” ASCII DXF documents with structural flaws that prevent the regular ezdxf.read() and ezdxf.readfile() functions from loading the document.

The read() and readfile() functions repair as many flaws as possible, run the required audit process automatically afterwards, and return the loaded document together with the Auditor that holds the result of this audit process:

import sys
import ezdxf
from ezdxf import recover

try:
    doc, auditor = recover.readfile("messy.dxf")
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file.')
    sys.exit(2)

# The DXF file can still have unrecoverable errors, but these may only
# matter when saving the recovered DXF file.
if auditor.has_errors:
    auditor.print_error_report()

The loading functions also decode DXF-Unicode encoding automatically, e.g. “\U+00FC” -> “ü”. All these efforts cost some time, so loading the DXF document with ezdxf.read() or ezdxf.readfile() is faster.
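The DXF-Unicode decoding mentioned above can be pictured with a minimal sketch. It assumes that an escape always has the form \U+ followed by exactly four hexadecimal digits. The function name decode_dxf_unicode is hypothetical, and this is not the implementation used inside ezdxf:

```python
import re

# Assumption: a DXF-Unicode escape is "\U+" followed by exactly four hex digits.
_DXF_UNICODE = re.compile(r"\\U\+([0-9A-Fa-f]{4})")


def decode_dxf_unicode(text: str) -> str:
    """Replace every DXF-Unicode escape by the character it encodes."""
    return _DXF_UNICODE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The raw string keeps the backslash literal, as it would appear in a DXF file.
name = decode_dxf_unicode(r"M\U+00FCller")  # "Müller"
```

Text without any escape sequence passes through unchanged.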

Warning

This module will load DXF files which have decoding errors, most likely caused by binary data stored in XRECORD entities. These errors are logged as unrecoverable AuditError.DECODING_ERROR entries in the Auditor.errors attribute, but no DXFStructureError exception is raised, because for many use cases these errors can be ignored.

Writing such files back with ezdxf may create invalid DXF files, or at least some information will be lost - handle with care!

To avoid this problem, use recover.readfile(filename, errors='strict'), which raises a UnicodeDecodeError exception for such binary data. Catch the exception and handle these DXF files as unrecoverable.

Loading Scenarios

1. It will work

This applies mostly to DXF files from AutoCAD or BricsCAD, e.g. for in-house solutions:

try:
    doc = ezdxf.readfile(name)
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

2. DXF file with minor flaws

The DXF files have only minor flaws, such as undefined resources:

try:
    doc = ezdxf.readfile(name)
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

auditor = doc.audit()
if auditor.has_errors:
    auditor.print_error_report()

3. Try Hard

Use this for DXF files from trusted or untrusted sources that you expect to be mostly valid. The worst case works like a cache miss: you pay for the first try and then pay the extra fee for the recover mode:

try:  # Fast path:
    doc = ezdxf.readfile(name)
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
# Catch all DXF errors:
except ezdxf.DXFError:
    try:  # Slow path including fixing low level structures:
        doc, auditor = recover.readfile(name)
    except ezdxf.DXFStructureError:
        print(f'Invalid or corrupted DXF file: {name}.')
        sys.exit(2)

    # The DXF file can still have unrecoverable errors, but these may
    # only matter when saving the recovered DXF file.
    if auditor.has_errors:
        print(f'Found unrecoverable errors in DXF file: {name}.')
        auditor.print_error_report()

4. Just use the slow recover module

Use this for untrusted sources when you expect many invalid or corrupted DXF files. You always pay the extra fee for the recover mode:

try:  # Slow path including fixing low level structures:
    doc, auditor = recover.readfile(name)
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

# The DXF file can still have unrecoverable errors, but these may
# only matter when saving the recovered DXF file.
if auditor.has_errors:
    print(f'Found unrecoverable errors in DXF file: {name}.')
    auditor.print_error_report()

5. Unrecoverable Decoding Errors

If files contain binary data which cannot be decoded by the document encoding, it may be best to ignore these files. This works in normal and recover mode:

try:
    doc, auditor = recover.readfile(name, errors='strict')
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)
except UnicodeDecodeError:
    print(f'Decoding error in DXF file: {name}.')
    sys.exit(3)

6. Ignore/Locate Decoding Errors

Sometimes ignoring decoding errors is enough to recover a DXF file, or at least it lets you detect where the decoding errors occur:

try:
    doc, auditor = recover.readfile(name, errors='ignore')
except IOError:
    print(f'Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)
if auditor.has_errors:
    auditor.print_error_report()

The error messages with code AuditError.DECODING_ERROR show the approximate line number of the decoding error: “Fixed unicode decoding error near line: xxx.”
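To collect the approximate line numbers programmatically, the message format quoted above can be parsed with a regular expression. The following is a minimal sketch that works on plain message strings; the helper name decoding_error_lines is hypothetical, and it assumes the messages keep exactly the quoted wording:

```python
import re
from typing import Iterable

# Message format as quoted in this documentation:
# "Fixed unicode decoding error near line: xxx."
_NEAR_LINE = re.compile(r"decoding error near line:\s*(\d+)")


def decoding_error_lines(messages: Iterable[str]) -> list[int]:
    """Return the approximate line numbers found in the given messages."""
    lines = []
    for message in messages:
        match = _NEAR_LINE.search(message)
        if match:  # skip messages that do not report a decoding error
            lines.append(int(match.group(1)))
    return lines
```

The message strings have to be gathered from the Auditor first. How the individual entries expose their text (for example through a message attribute) is an assumption to verify against the Auditor documentation.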

Hint

These functions can handle only ASCII DXF files!

ezdxf.recover.readfile(filename: str | Path, errors: str = 'surrogateescape') → tuple[Drawing, Auditor]

Read a DXF document from the file system similar to ezdxf.readfile(), but this function repairs as many flaws as possible, runs the required audit process automatically afterwards, and returns the DXF document and the Auditor.

Parameters:
  • filename – file-system name of the DXF document to load

  • errors

    specify decoding error handler

    • “surrogateescape” to preserve possible binary data (default)

    • “ignore” to use the replacement char U+FFFD “�” for invalid data

    • “strict” to raise a UnicodeDecodeError exception for invalid data

Raises:
  • DXFStructureError – for invalid or corrupted DXF structures

  • UnicodeDecodeError – if errors is “strict” and a decoding error occurs
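The difference between preserving binary data and raising on it can be illustrated with Python's built-in codec error handlers of the same names. This is a sketch using only the standard library: how ezdxf maps its errors argument onto these handlers is an assumption, the helper names are hypothetical, and the “ignore” option is deliberately not covered here:

```python
def decode_preserving_binary(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, mapping undecodable bytes to lone surrogates."""
    return raw.decode(encoding, errors="surrogateescape")


def restore_binary(text: str, encoding: str = "utf-8") -> bytes:
    """Reverse decode_preserving_binary() and get the original bytes back."""
    return text.encode(encoding, errors="surrogateescape")


def is_strictly_decodable(raw: bytes, encoding: str = "utf-8") -> bool:
    """Return False if the "strict" handler raises UnicodeDecodeError."""
    try:
        raw.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


raw = b"1000\nbinary \xff\xfe data\n"  # 0xFF and 0xFE are never valid in UTF-8
text = decode_preserving_binary(raw)
assert restore_binary(text) == raw      # the binary bytes survive a round trip
assert not is_strictly_decodable(raw)   # "strict" rejects the same input
```

The round trip shows why “surrogateescape” is described as preserving possible binary data, while “strict” turns the same input into an exception that the caller has to handle.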

ezdxf.recover.read(stream: BinaryIO, errors: str = 'surrogateescape') → tuple[Drawing, Auditor]

Read a DXF document from a binary stream similar to ezdxf.read(), but this function detects the text encoding automatically, repairs as many flaws as possible, runs the required audit process afterwards, and returns the DXF document and the Auditor.

Parameters:
  • stream – data stream to load in binary read mode

  • errors

    specify decoding error handler

    • “surrogateescape” to preserve possible binary data (default)

    • “ignore” to use the replacement char U+FFFD “�” for invalid data

    • “strict” to raise a UnicodeDecodeError exception for invalid data

Raises:
  • DXFStructureError – for invalid or corrupted DXF structures

  • UnicodeDecodeError – if errors is “strict” and a decoding error occurs
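Because read() expects a stream in binary read mode, the data can come from a file opened with mode "rb" or from bytes already held in memory. The following is a minimal sketch; the helper names recover_from_bytes and recover_from_path are hypothetical, and the import is deferred only so that the sketch can be imported without ezdxf installed:

```python
import io
from typing import BinaryIO


def recover_from_bytes(data: bytes):
    """Load a DXF document in recover mode from in-memory bytes."""
    from ezdxf import recover  # deferred import, see lead-in

    stream: BinaryIO = io.BytesIO(data)
    # Returns the document and the Auditor of the automatic audit run.
    return recover.read(stream)


def recover_from_path(path: str):
    """Load a DXF document in recover mode from a file opened as binary."""
    from ezdxf import recover  # deferred import, see lead-in

    with open(path, "rb") as stream:  # binary mode: no text decoding here
        return recover.read(stream)
```

Both helpers propagate DXFStructureError to the caller, so they can be wrapped in the same try/except pattern as in the loading scenarios.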

ezdxf.recover.explore(filename: str | Path, errors: str = 'ignore') → tuple[Drawing, Auditor]

Read a DXF document from the file system similar to readfile(), but this function uses a special tag loader which tries to recover the tag stream if invalid tags occur. This function is intended to load corrupted DXF files and should only be used to explore such files, because data loss is very likely.

Parameters:
  • filename – file-system name of the DXF document to load

  • errors

    specify decoding error handler

    • “surrogateescape” to preserve possible binary data

    • “ignore” to use the replacement char U+FFFD “�” for invalid data (default)

    • “strict” to raise a UnicodeDecodeError exception for invalid data

Raises:
  • DXFStructureError – for invalid or corrupted DXF structures

  • UnicodeDecodeError – if errors is “strict” and a decoding error occurs
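Since explore() is meant for inspection rather than for producing a reliable document, a typical use is to load the file and report what could be salvaged. The following is a minimal sketch; the helper name explore_summary and the choice of reported values are illustrative assumptions, and the import is deferred only so that the sketch can be imported without ezdxf installed:

```python
def explore_summary(filename: str) -> dict:
    """Load a corrupted DXF file for exploration and summarize the result."""
    from ezdxf import recover  # deferred import, see lead-in

    doc, auditor = recover.explore(filename)
    return {
        "dxfversion": doc.dxfversion,
        # Number of entities that survived in the modelspace.
        "modelspace_entities": len(doc.modelspace()),
        # Entries collected by the automatic audit run.
        "errors": len(auditor.errors),
        "fixes": len(auditor.fixes),
    }
```

The summary should be read as a rough indication only: entities that could not be recovered are missing from the counts, and the loaded document should not be saved back over the original file.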