Recover¶
This module provides functions to “recover” ASCII DXF documents with structural flaws that prevent the regular ezdxf.read() and ezdxf.readfile() functions from loading the document.
The read() and readfile() functions of this module repair as many flaws as possible, run the required audit process automatically afterwards, and return the result of this audit process:
import sys
import ezdxf
from ezdxf import recover

try:
    doc, auditor = recover.readfile("messy.dxf")
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print('Invalid or corrupted DXF file.')
    sys.exit(2)

# DXF file can still have unrecoverable errors, but this is maybe just
# a problem when saving the recovered DXF file.
if auditor.has_errors:
    auditor.print_error_report()
The loading functions also decode the DXF-Unicode encoding automatically, e.g. “\U+00FC” -> “ü”.
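The DXF-Unicode mapping mentioned above can be illustrated with a few lines of standard-library Python. This is only a sketch: the helper name decode_dxf_unicode is hypothetical, the pattern assumes exactly four hexadecimal digits as in the “\U+00FC” example, and the decoder used inside ezdxf may work differently.

```python
import re

# Assumed notation: a backslash, "U+", then four hexadecimal digits.
DXF_UNICODE = re.compile(r"\\U\+([0-9A-Fa-f]{4})")


def decode_dxf_unicode(text: str) -> str:
    """Replace each DXF-Unicode sequence with the character it encodes."""
    return DXF_UNICODE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Raw string literal, so the backslash reaches the function unchanged.
print(decode_dxf_unicode(r"M\U+00FCller"))  # Müller
```

Text without such sequences passes through unchanged, so the function is safe to apply to any string value.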
All these efforts cost some time; loading the DXF document with ezdxf.read() or ezdxf.readfile() is faster.
Warning
This module will load DXF files which have decoding errors, most likely binary data stored in XRECORD entities. These errors are logged as unrecoverable AuditError.DECODING_ERROR in the Auditor.errors attribute, but no DXFStructureError exception will be raised, because for many use cases these errors can be ignored.
Writing such files back with ezdxf may create invalid DXF files, or at least some information will be lost - handle with care!
To avoid this problem use recover.readfile(filename, errors='strict'), which raises a UnicodeDecodeError exception for such binary data.
Catch the exception and handle these DXF files as unrecoverable.
Loading Scenarios¶
1. It will work¶
Mostly DXF files from AutoCAD or BricsCAD (e.g. for in-house solutions):
try:
    doc = ezdxf.readfile(name)
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)
2. DXF file with minor flaws¶
DXF files have only minor flaws, like undefined resources:
try:
    doc = ezdxf.readfile(name)
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

auditor = doc.audit()
if auditor.has_errors:
    auditor.print_error_report()
3. Try Hard¶
Use this for files from trusted and untrusted sources that you expect to load normally in most cases. The worst case works like a cache miss: you pay for the first try and then pay the extra fee for the recover mode:
try:  # Fast path:
    doc = ezdxf.readfile(name)
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
# Catch all DXF errors:
except ezdxf.DXFError:
    try:  # Slow path including fixing low level structures:
        doc, auditor = recover.readfile(name)
    except ezdxf.DXFStructureError:
        print(f'Invalid or corrupted DXF file: {name}.')
        sys.exit(2)

    # DXF file can still have unrecoverable errors, but this is maybe
    # just a problem when saving the recovered DXF file.
    if auditor.has_errors:
        print(f'Found unrecoverable errors in DXF file: {name}.')
        auditor.print_error_report()
4. Just use the slow recover module¶
Use this for untrusted sources when you expect many invalid or corrupted DXF files. You always pay the extra fee for the recover mode:
try:  # Slow path including fixing low level structures:
    doc, auditor = recover.readfile(name)
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

# DXF file can still have unrecoverable errors, but this is maybe
# just a problem when saving the recovered DXF file.
if auditor.has_errors:
    print(f'Found unrecoverable errors in DXF file: {name}.')
    auditor.print_error_report()
5. Unrecoverable Decoding Errors¶
If files contain binary data which cannot be decoded by the document encoding, it may be best to ignore these files. This works in normal and recover mode:
try:
    doc, auditor = recover.readfile(name, errors='strict')
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)
except UnicodeDecodeError:
    print(f'Decoding error in DXF file: {name}.')
    sys.exit(3)
6. Ignore/Locate Decoding Errors¶
Sometimes ignoring decoding errors can recover DXF files or at least you can detect where the decoding errors occur:
try:
    doc, auditor = recover.readfile(name, errors='ignore')
except IOError:
    print('Not a DXF file or a generic I/O error.')
    sys.exit(1)
except ezdxf.DXFStructureError:
    print(f'Invalid or corrupted DXF file: {name}.')
    sys.exit(2)

if auditor.has_errors:
    auditor.print_error_report()
The error messages with code AuditError.DECODING_ERROR show the approximate line number of the decoding error:
“Fixed unicode decoding error near line: xxx.”
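To jump to the reported locations, the line numbers can be pulled out of such messages with a small regular expression. The sketch below works on plain strings only; the helper name decoding_error_lines is hypothetical, the message wording is assumed from the quoted example, and how the message strings are collected from the auditor is left out.

```python
import re

# Assumed message shape, taken from the quoted example; the colon is
# treated as optional in case the exact wording varies.
NEAR_LINE = re.compile(r"near line:?\s*(\d+)")


def decoding_error_lines(messages: list[str]) -> list[int]:
    """Return the line numbers mentioned in decoding-error messages."""
    lines = []
    for message in messages:
        match = NEAR_LINE.search(message)
        if match:
            lines.append(int(match.group(1)))
    return lines


sample = [
    "Fixed unicode decoding error near line: 1042.",
    "Some unrelated audit message.",
]
print(decoding_error_lines(sample))  # [1042]
```

Messages that do not mention a line number are skipped, so the helper can be fed every message of a report.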
Hint
These functions can handle only ASCII DXF files!
- ezdxf.recover.readfile(filename: str | Path, errors: str = 'surrogateescape') -> tuple[Drawing, Auditor] ¶
Read a DXF document from the file system similar to ezdxf.readfile(), but this function repairs as many flaws as possible, runs the required audit process automatically afterwards, and returns the DXF document and the Auditor.
- Parameters:
filename – file-system name of the DXF document to load
errors – specify the decoding error handler:
“surrogateescape” to preserve possible binary data (default)
“ignore” to use the replacement char U+FFFD “�” for invalid data
“strict” to raise a UnicodeDecodeError exception for invalid data
- Raises:
DXFStructureError – for invalid or corrupted DXF structures
UnicodeDecodeError – if errors is “strict” and a decoding error occurs
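The handler names correspond to Python's built-in codec error handlers. As a rough illustration of why “surrogateescape” can preserve binary data while “strict” rejects it, the following sketch calls bytes.decode() directly. The UTF-8 encoding, the sample bytes, and the helper name decode_tag_value are assumptions for the example; it does not show how the recover module applies the handlers internally.

```python
def decode_tag_value(raw: bytes, errors: str = "surrogateescape") -> str:
    """Decode raw bytes as UTF-8 with the given Python error handler."""
    return raw.decode("utf-8", errors=errors)


raw = b"TEXT \xfc\xff"  # hypothetical value containing undecodable bytes

# "surrogateescape" maps each undecodable byte to a lone surrogate code
# point, so encoding with the same handler restores the original bytes.
preserved = decode_tag_value(raw)
assert preserved.encode("utf-8", errors="surrogateescape") == raw

# "strict" raises instead of guessing.
try:
    decode_tag_value(raw, errors="strict")
except UnicodeDecodeError as exc:
    print(f"undecodable byte at offset {exc.start}")
```

The round trip only works when the same handler is used for encoding again, which is one reason writing such documents back needs care.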
- ezdxf.recover.read(stream: BinaryIO, errors: str = 'surrogateescape') -> tuple[Drawing, Auditor] ¶
Read a DXF document from a binary stream similar to ezdxf.read(), but this function detects the text encoding automatically, repairs as many flaws as possible, runs the required audit process afterwards, and returns the DXF document and the Auditor.
- Parameters:
stream – data stream to load in binary read mode
errors – specify the decoding error handler:
“surrogateescape” to preserve possible binary data (default)
“ignore” to use the replacement char U+FFFD “�” for invalid data
“strict” to raise a UnicodeDecodeError exception for invalid data
- Raises:
DXFStructureError – for invalid or corrupted DXF structures
UnicodeDecodeError – if errors is “strict” and a decoding error occurs
- ezdxf.recover.explore(filename: str | Path, errors: str = 'ignore') -> tuple[Drawing, Auditor] ¶
Read a DXF document from the file system similar to readfile(), but this function uses a special tag loader, which tries to recover the tag stream if invalid tags occur. This function is intended to load corrupted DXF files and should only be used to explore such files; data loss is very likely.
- Parameters:
filename – file-system name of the DXF document to load
errors – specify the decoding error handler:
“surrogateescape” to preserve possible binary data
“ignore” to use the replacement char U+FFFD “�” for invalid data (default)
“strict” to raise a UnicodeDecodeError exception for invalid data
- Raises:
DXFStructureError – for invalid or corrupted DXF structures
UnicodeDecodeError – if errors is “strict” and a decoding error occurs