Scopr Change History
====================

XBow SDK
--------
6.0.867   Added InstallShield CAB file type identification.
          Added XISCAB engine to extract InstallShield CAB v5+ files.
6.0.865   Added file sub-type identification of MagicDraw Project ZIPs.
          Added .mdzip as a valid ZIP extension.
          Added file sub-type identification of Microsoft Audio Compression Manager DLLs.
          Added .acm as a valid DLL extension.
          Added RemoveSoftHyphens and ReplaceSoftHyphens options to the XText engine.
          * Added corresponding per-object details to count the number of soft-hyphen characters removed or replaced.
            * RemovedSubjectSoftHyphens
            * ReplacedSubjectSoftHyphens
            * RemovedBodySoftHyphens
            * ReplacedBodySoftHyphens
          Added "DRMEncrypted;" to details of OLE2 documents that are IRM/DRM-encrypted.
          * This flag can be used to differentiate auto-decryptable encrypted OLE2 documents from those that are not.
          * No attempt is made to auto-decrypt DRMEncrypted documents.
          Added extraction of 0-byte MIME attachments that have a filename.
          Added de-obfuscation for https://virusreports.shapeofemail.com/293809.
          Added ConvertToScoprHTMLImage per-engine XAPI option.
          * For supported image types, extracts an HTML that duplicates the original image using only HTML 5.
          * Since this is effectively a new sub-type of HTML, these HTMLs are identified as "Scopr HTML Image" or SHTMLI.
          If the extracted URL context is URL_CONTEXT_UNKNOWN, de-duplicate the URL itself not including the context.  Otherwise de-duplicate *including* the context.
          Added document screenshot options:
          * ExtractScreenshot
            * Per-file-format screenshot enable/disable.
          * ExtractScreenshotDepths
            * Per-file-format bit-flags for which depths (0-31) are screenshot-enabled.
          * ExtractScreenshotMinSize
            * Per-file-format minimum file size to screenshot.
          * ExtractScreenshotMaxSize
            * Per-file-format maximum file size to screenshot.
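          A minimal Python sketch of how a per-depth bit-flag value such as ExtractScreenshotDepths could be tested. It assumes that bit N (with the least-significant bit as depth 0) enables depth N; that mapping and the function name are illustrative assumptions, not taken from the SDK.

```python
def screenshot_depth_enabled(depth_flags: int, depth: int) -> bool:
    """Return True if the bit for `depth` is set in a 32-bit flag value."""
    if not 0 <= depth <= 31:
        return False  # only depths 0-31 can be represented in 32 bits
    return (depth_flags >> depth) & 1 == 1


# Example: enable screenshots at depths 0 and 2 only.
flags = (1 << 0) | (1 << 2)
print([d for d in range(32) if screenshot_depth_enabled(flags, d)])
```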
          Fix for auto-decrypting Office documents that use the more generic "encryptedKey" property instead of "p:encryptedKey".
          If a child data stream of a PDF contains a URL identified by a "<<S/URI/Type/Action/URI (" sequence, pre-pend "http://" if the URL has no schema.
          Fixed minor bug in URL extraction from JavaScript.
          * Remove any contiguous LWSP or CR or LF characters from the end of re-direct URL strings because none of those characters can actually be part of a valid URL.
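          The trailing-character cleanup described above can be sketched in Python as follows. LWSP is taken here to mean space and horizontal tab; the function name is hypothetical and this is not the SDK's code.

```python
def trim_redirect_url(url: str) -> str:
    """Drop any run of trailing LWSP (space, tab), CR, and LF characters."""
    return url.rstrip(" \t\r\n")


print(repr(trim_redirect_url("http://example.com/next \t\r\n")))
```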
          Fixed Bech32 Bitcoin address extraction.
          Fixed HTML file type identification FPs.
          Added de-obfuscation engine to address https://virusreports.shapeofemail.com/293790.
          * De-obfuscates powershell.exe command-line script.  
          * Added "deobfuscatedpowershell" to object's details.
          Moved ARC file type identification after terminfo.
          Added ATR, E57, FDF file type identification.
6.0.835   Added de-obfuscation of powershell.exe command line attack.
          * Thus the infected .bat file reports Scopr:AntiMalware:Malware=ObfuscatedPowerShell and extracts the obfuscated URL.
          Added extraction of the MSIP_Labels MIME header (Microsoft Information Protection SDK Metadata).
          * The labels are added to the MIME object's details.
          * If the parent is XBOW_TYPE_MIME and the parent has an MSIP_Labels header (that is, MSIP_Label_xxx_Enabled=true; in its GDPRDetails) and the document does not have an MSIP_Label_xxx_Enabled=true; in its GDPRDetails, add "nomsiplabel;" to the object's details.
          Fixed setting of qwUncompressedSize for extracted binarized images.
          Fixed crash on de-allocated subject CkString buffer during recursive MIME part processing.
          * Now subject string is copied to a stack buffer instead.
          Added per-engine XAPI option StripComments.
          * If set on a text engine that supports it, produces child objects where comments have been stripped.
          Added XAPI configuration options BarcodeShrinkPercent, MinBarcodeShrinkFrameX, and MinBarcodeShrinkFrameY.
          * If the original image width >= MinBarcodeShrinkFrameX or height >= MinBarcodeShrinkFrameY, shrink/reduce the size of the image by BarcodeShrinkPercent (1-99; all other values do nothing).
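          A small Python sketch of the shrink decision described above. It assumes that "shrink by N percent" means both dimensions are reduced to (100 - N) percent of their original size; the function name and the rounding are illustrative assumptions.

```python
def barcode_shrink_size(width, height, shrink_percent, min_frame_x, min_frame_y):
    """Return the (width, height) to use for barcode scanning."""
    if not 1 <= shrink_percent <= 99:
        return width, height  # values outside 1-99 do nothing
    if width >= min_frame_x or height >= min_frame_y:
        keep = 100 - shrink_percent
        # Integer math; never shrink a dimension below one pixel.
        return max(1, width * keep // 100), max(1, height * keep // 100)
    return width, height


print(barcode_shrink_size(1000, 500, 25, 800, 800))
```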
          Added XEPS engine to convert Encapsulated Postscript (EPS) vector drawings to PNG.
          Added XJXR engine to convert JPEG XR (eXtended Range) images to BMP.
          * JXR is natively supported by Windows 11.
          Added XBOW_FLAG2_NO_DOCUMENT_TEXT (JSON "NoDocumentText") flag.
          * Set on PDF, OLE2, and OneNote documents that do not have any body text.
          Added Alpine Linux x64 build of scoprd.
          * Includes Chilkat 10.1.3 MIME parser.
          Added extraction of SegWit and Taproot Bitcoin addresses.
          Improved SVG file type identification.
          Added EPS, JXR, BPG, APNG, HEIF, CRAMFS, EXT, NTFS, SQUASHFS, HDF5, terminfo, and scr_dump file type identification.
          Added check of extraction limits even after base64 extraction fails.
          * It is common to find a long series of base64-looking strings that are not processed because of configuration settings, but extraction limits (particularly time) must still be checked.
          * Such a string is not counted as an extracted object, because nothing was actually extracted.
          Improved BIFF extraction.
          Improved UDF file type identification.
          Added MaxBarcodeZoomFrameX. Maximum image width for zoomed barcode processing.
          Added MaxBarcodeZoomFrameY. Maximum image height for zoomed barcode processing.
          Added MinBarcodeZoomFrameX. Minimum image width for zoomed barcode processing.
          Added MinBarcodeZoomFrameY. Minimum image height for zoomed barcode processing.
          Added stability/security enhancements to OLE2 engine.
          * Prevents arbitrarily large heap allocations due to malicious/malformed/corrupt OLE2 file data.
          Added logic to prevent extraction of raw data from SVG xlink:href="<url>" references where the url starts with "data:".
          Upgraded to plutosvg 0.0.7.
          * Fixes an infinite loop that is causing minor problems in production.
          * Significantly better SVG support in general.
          Added logic to convert UTF-16 and UTF-32 XML and SVG documents to UTF-8 children before processing.
          Added JSON response fields "autodecryptable" (for encrypted files where auto-decrypt is supported) and "version" (the scoprd version that produced the JSON response).
          Added logic to generically extract 34-character bitcoin addresses that have been split in half and are separated by a colon within the 32 characters following the first half.
          * If the length of the string following the colon is exactly 17 (half of 34), that string is appended to the first 17 characters to produce a possible 34-character Bitcoin address. If the resulting address is valid, including its checksum, it is extracted.
          * Added "splitbtc=<number of pieces>" to the object's details.
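          The split-address rule above can be sketched in Python as follows. This is a simplified illustration, not the SDK's logic: it assumes the first 17-character half has already been located, treats "the string following the colon" as the run of Base58 characters immediately after it, and validates the result with a standard Base58Check checksum. All names are hypothetical.

```python
import hashlib
import re

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
B58_RUN = re.compile("[" + B58_ALPHABET + "]+")


def base58check_valid(addr: str) -> bool:
    """Check that addr decodes to 25 bytes with a valid 4-byte checksum."""
    n = 0
    for ch in addr:
        idx = B58_ALPHABET.find(ch)
        if idx < 0:
            return False
        n = n * 58 + idx
    try:
        raw = n.to_bytes(25, "big")
    except OverflowError:
        return False
    # Each leading '1' must correspond to exactly one leading zero byte.
    leading_ones = len(addr) - len(addr.lstrip("1"))
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    if leading_ones != leading_zeros:
        return False
    checksum = hashlib.sha256(hashlib.sha256(raw[:21]).digest()).digest()[:4]
    return checksum == raw[21:]


def join_split_address(first_half: str, following: str):
    """Join a 17-character first half with the 17 characters after a colon."""
    if len(first_half) != 17:
        return None
    colon = following.find(":", 0, 32)  # colon must be within 32 characters
    if colon < 0:
        return None
    run = B58_RUN.match(following, colon + 1)
    if run is None or len(run.group()) != 17:
        return None
    candidate = first_half + run.group()
    return candidate if base58check_valid(candidate) else None


print(join_split_address("1A1zP1eP5QGefi2DM", " part two:PTfTL5SLmv7DivfNa"))
```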
          Improved metadata extraction by using an up to 5MB read buffer (based on the file size) instead of a constant 8K buffer.
          Added new XEVENT callback XBOW_EVENT_OBJECT_MALFORMED to notify XAPI clients when a malformed file/object is detected. Detected malformations:
          * XAPI definition                                    XAPI value  Description
            -------------------------------------------------  ----------  -----------------------------------------------------------------------
            XBOW_MALFORM_NONE                                  0           No malform reason
            XBOW_MALFORM_JPEG_CORRUPT                          1           JPEG data is corrupt
            XBOW_MALFORM_RTF_BIN_NEGATIVE_N                    2           RTF \bin control word is followed by a negative value of N
            XBOW_MALFORM_MISSING_HTML_TAG_ATTRIBUTE_NAME       3           Missing HTML tag attribute name
            XBOW_MALFORM_VBE_ENCODED_SIZE_TOO_LARGE            4           Advertised VBE encoded data size exceeds file size
            XBOW_MALFORM_VBE_ENCODED_SIZE_TOO_SMALL            5           Advertised VBE encoded data size is less than file size
            XBOW_MALFORM_VBE_ENCODED_SIZE_MISMATCH             6           Advertised VBE encoded data size does not match actual size
            XBOW_MALFORM_PDF_HEADER_AT_NONZERO_OFFSET          7           PDF header found at non-zero offset
            XBOW_MALFORM_RAR_HEADER_AT_NONZERO_OFFSET          8           RAR header found at non-zero offset
            XBOW_MALFORM_MS04_028                              9           MS04-028: Buffer overrun in JPEG processing could allow code execution
            XBOW_MALFORM_CVE_2018_11212                        10          CVE-2018-11212: JPEG divide-by-zero denial of service vulnerability
            XBOW_MALFORM_PDF_CORRUPT                           11          PDF header is present, but PDFlib TET failed to parse as a valid PDF
            XBOW_MALFORM_INVALID_OFFICEARTRECORDHEADER_RECLEN  12          Invalid OLE2 Data stream OFFICEARTRECORDHEADER recLen field (32-bit image size)
6.0.802   Added XAPI configuration option AddBarcodeBorderPixels.
          * 0 = off (default)
          * 1-5 = width of border to add to image
          * Useful for detecting QR codes that have been intentionally cut off at the border.
          * If AddBarcodeBorderPixels is set, the position of the barcode in the original image is reported, not the position in the larger image where the border has been added.
          * ZBar barcode detections report when the detected & extracted barcode crosses the edge of the original image into the optionally added black border.
            * The detail is called bordercross and its possible values are left, right, top, bottom, or a comma-separated combination of these.
            * For example:
              bordercross=left,bottom;
              means the barcode was detected in the bottom-left corner of the image and required both the left and bottom borders to be present for the barcode to be detected.
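          A rough Python sketch of how a bounding box found in the bordered image could be mapped back to original-image coordinates and turned into a bordercross detail. The box layout, the side ordering, and the function name are assumptions made for illustration; only the detail format comes from the entry above.

```python
def describe_border_cross(box, border, width, height):
    """box = (left, top, right, bottom), inclusive, in bordered-image pixels.

    width and height are the dimensions of the original image.
    Returns the box in original-image coordinates plus a detail string.
    """
    left, top, right, bottom = (v - border for v in box)
    sides = []
    if left < 0:
        sides.append("left")
    if right > width - 1:
        sides.append("right")
    if top < 0:
        sides.append("top")
    if bottom > height - 1:
        sides.append("bottom")
    detail = "bordercross=" + ",".join(sides) + ";" if sides else ""
    return (left, top, right, bottom), detail


# 100x100 original image with a 5-pixel border added on every side.
print(describe_border_cross((3, 55, 45, 108), 5, 100, 100))
```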
          Moved much of the XAPI-internal (non-engine) logging to the debug log level to reduce the amount of info log level output.
          Added XAPI option BarcodeBorderGrayscaleColor.
          * When AddBarcodeBorderPixels is not 0, use this grayscale color value for the border.
          * The range is 0-255 where 0=100% black and 255=100% white.
          * This option is used to increase the contrast between the border and the barcode.
          * This allows ZBar to detect more barcodes at the border.
6.0.779   Added OLE2 'Data' stream extraction of BMP, JPEG, PNG, and TIFF images.
          * Addresses QR codes stored in the 'Data' stream.
          Added identification of XOR and RC4-encrypted Office 95, 97, 2000, XP, and 2003 documents.
          The original 3rd-party Cybozulib minixml.hpp code written in 2012 does not support newer encrypted XLSX documents where the EncryptedPackage stream's XML data has a UTF-8 Byte Order Mark (BOM) of 0xEF, 0xBB, 0xBF in front of the usual "<?xml" XML-start header.
          * Added logic to detect encrypted XLSX documents with a UTF-8 BOM.
          * Set XBOW_FLAG_HAS_ENCRYPTED_CHILDREN on the encrypted legacy OLE2 document if it is encrypted with XOR or RC4.
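          The UTF-8 BOM case described above comes down to skipping three bytes before looking for the XML declaration. A minimal Python sketch of that check follows; the function name is hypothetical and this is not the patched minixml.hpp code.

```python
import codecs


def starts_with_xml_declaration(data: bytes) -> bool:
    """Accept '<?xml' at offset 0, or right after a UTF-8 BOM (EF BB BF)."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.startswith(b"<?xml")


print(starts_with_xml_declaration(b"\xef\xbb\xbf<?xml version=\"1.0\"?>"))
```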
          Added options to control the range of image sizes to zoom when barcode zooming is enabled.
          * MinBarcodeZoomFrameX: Minimum image width (in pixels) to zoom
          * MinBarcodeZoomFrameY: Minimum image height (in pixels) to zoom
          * MaxBarcodeZoomFrameX: Maximum image width (in pixels) to zoom
          * MaxBarcodeZoomFrameY: Maximum image height (in pixels) to zoom
6.0.745   Fixed additional OLE2 auto-decrypt issues.
6.0.742   Added logic to identify and extract PDF vector image streams.
          Added text engine options ExtractPDFVectorImageStreams and ExtractPDFVectorImages.  Operational, but functionally very limited.  Needs work.
          Added barcode configuration options.
          * ZBarSymbologyTypes - Enable/disable the individual barcode types supported by ZBar:
            EAN-2
            EAN-5
            EAN-8
            UPC-E
            ISBN-10
            UPC-A
            EAN-13
            ISBN-13
            Composite
            I2/5
            GS1 DataBar
            GS1 DataBar Expanded
            Codabar
            Code 39
            PDF 417
            QR-Code
            SQ Code
            Code 93
            Code 128
          * MaxBarcodeWidth - Maximum barcode bounding box width in pixels; 0 = none
          * MaxBarcodeHeight - Maximum barcode bounding box height in pixels; 0 = none
          * MinBarcodeWidth - Minimum barcode bounding box width in pixels; 0 = none
          * MinBarcodeHeight - Minimum barcode bounding box height in pixels; 0 = none
          Modified ZBar logic to return no more than 4 points to describe the bounding box surrounding any given symbol.
          Added identification of encrypted ZIPs that match the PKWARE Strong Encryption Specification (a.k.a. SecureZIP).
          Added XAPI configuration options for detecting keywords (byte sequences) in all extracted files:
          * FindKeywords
          * Keywords
          * MaxFindKeywordsBytes
          Added 3 new per-object details:
          * Max URLs Length Exceeded
          * Extracted Document Text
          * PDF Stream
          Added ZIP:EPUB file type identification.
          Completed ASCII art QR code extraction engine.
          Output unknown 64-bit compressed/uncompressed sizes to JSON as -1 instead of the internally defined/used largest unsigned 64-bit integer 2^64 - 1 == 18,446,744,073,709,551,615.
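          A short Python sketch of the size mapping described above. The JSON field names used here are placeholders chosen for the example, not the actual scoprd response fields.

```python
import json

UNKNOWN_SIZE_U64 = 2**64 - 1  # internal "unknown" marker: 18,446,744,073,709,551,615


def size_for_json(size: int) -> int:
    """Map the internal unknown-size marker to -1; pass other sizes through."""
    return -1 if size == UNKNOWN_SIZE_U64 else size


record = {
    "compressedSize": size_for_json(UNKNOWN_SIZE_U64),  # hypothetical field name
    "uncompressedSize": size_for_json(4096),            # hypothetical field name
}
print(json.dumps(record))
```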
          Added support for correctly reporting ZIP64 compressed and uncompressed file sizes.
          Re-factored problematic RIFF parser.
          Added CDR, DLS, RMID file type identification.
          Upgraded to PDFlib TET 5.5 to address remaining PDF parser infinite loops.
          Finished adding support for the f: parameter in the scoprd protocol to upload a file, process the file, and store the resulting metadata in a configured database.
          Added logic to remove any extracted files that are not included in the scoprd response.
          If outputting JSON and an object is hidden from the JSON output and the object is not the root (depth == 0) object and it created a temporary file, delete the temporary file.
          Updated list of OLE2 extensions.
          Added log/debug output to CMainProcessing::ObjectFound.
          Fixed ordinal counting for child objects extracted from OLE2 objects.
          Added per-object 32-bit unique ID.  Used to differentiate siblings at the same depth and ordinal.
          * Added logic to make it possible for XAPI clients to uniquely identify objects based on their depth, ordinal, and an object ID (or just the object ID alone).
          * If object ID is 0 when an event for it is fired, assign the object the next one via IXAPI::GetNextObjectID().
          Moved per-object detail string definitions to a common header file.
          Call xcommon_child_autodecrypt_failed() after all attempts to auto-decrypt an object have failed.
          Correctly set auto-decrypt failed flags in XALZIP engine.
          Added -v command-line option to show scoprd version number.
          Added -c command-line option to specify scoprd.conf filepath.
          Added SquelchExtractedFilepath option.
          Added logic to deal with ASCII art QR codes using white foreground on blue background.
          Cleaned up URL options/flags.
          Added ReportURLContext=1 to all XAPI configs.
          Re-factored URL processing options.
          * XBOW_URL_FLAG_xxx definitions are now used to store new urlflags column in xbsuser table.
          * urlflags replaces extracturls and extractrelurls columns.
          Added MIME, POWERSHELL, and RTF to list of text formats for which common metadata elements are extracted (URLs, email addresses, phone numbers, etc.).
          Treat left/right parens and equal sign as email address delimiters.
          Treat the close-paren character as part of the URL unless the first character left of the URL schema is a left-paren.
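          One way to read the close-paren rule above, sketched in Python. The sketch assumes the URL start offset is already known and that the URL otherwise ends at the first whitespace character; both are simplifications, and the function name is hypothetical.

```python
def extract_url_at(text: str, start: int) -> str:
    """Return the URL beginning at text[start].

    A ')' ends the URL only when the character just left of the URL is '('.
    """
    enclosed = start > 0 and text[start - 1] == "("
    end = start
    while end < len(text) and not text[end].isspace():
        if enclosed and text[end] == ")":
            break
        end += 1
    return text[start:end]


print(extract_url_at("(http://example.com/x) more", 1))
print(extract_url_at("see http://example.com/x_(y) now", 4))
```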
          Added EnhanceBarcodeBorderPixels XAPI option.
          * Specifies the width of additional image border space to add to all images scanned for barcodes.
          * The new border is a mirror image of the existing border except all pixels that are not 100% white are set to 100% black.
          * Useful for detecting barcodes that are cut off near one or more edges.
          Fixed decoded base64 size computation.
          Added limited C, C++, and C# file type identification.
          Valid barcode URLs must contain only these characters:
          * ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=
          * All other characters must be percent-encoded.
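          The character rule above can be expressed as a single regular expression. The Python sketch below assumes that "percent-encoded" means a '%' followed by exactly two hex digits and that an empty payload is not a valid URL; the function name is hypothetical.

```python
import re

# Allowed literal characters from the list above, or a %XX escape.
_BARCODE_URL = re.compile(
    r"(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})+"
)


def is_valid_barcode_url(payload: str) -> bool:
    """Return True if every character is allowed or correctly percent-encoded."""
    return _BARCODE_URL.fullmatch(payload) is not None


print(is_valid_barcode_url("http://example.com/a%20b"))
print(is_valid_barcode_url("http://example.com/a b"))
```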
          Fixed max extract ratio logic.
          Improved HTML parsing logic to only set XBOW_FLAG_OBFUSCATED_URL_ATTRIBUTE when the HTML tag context is known.
          Added support for XBOW_URL_FLAG_EXTRACT_SCREENSHOT feature which uses Microsoft's Playwright to render a URL via a headless browser and capture the page as a PNG.  The name of the extracted browser page screenshot is "browser_page_screenshot.png".
          Reduced size of HTML read/decode buffers from ridiculous 10MB to 8K.
          Re-factored hex sequence extraction logic to use an 8K read buffer instead of 10MB.
          Fixed memory leak in XQR engine; the magnified image was not being freed.
          Fixed HTML entity decoder; use the original 2012 version, not the corrupt 2016 version.
          Consolidated multiple engine-specific versions of extract_base64() to xcommon_extract_base64().
          Fixed script extraction now that HTML read buffer size is 8K.
          Always output the number of extracted, not-squelched barcodes first.
          Fixed PDF auto-decrypt failure when the password is a single (unescaped) backslash, left-brace, right-brace, or double-quote character.
          Added ExtractHexPattern XAPI option and associated support for extracting hex sequences obfuscated by interlacing single random characters.
          WinRAR searches the first 2,097,115 bytes for a valid CAB file header and will successfully extract CABs that start at non-zero offsets. The XCAB engine's MSPack search() function searches the entire file.
          * For now Scopr will not go that far.
          * Since Scopr is already loading the first MIN_DETECT_BUFFER_SIZE (8K) bytes of every file, the CAB file identification logic now scans the first 8K for a valid CAB header.
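          A simplified Python sketch of scanning the first 8K of a file for a CAB header. It looks for the "MSCF" signature and, as a deliberately minimal validity check, requires the 32-bit reserved field that follows it to be zero. What Scopr actually treats as a "valid CAB header" is not stated above, so that check and the function name are assumptions.

```python
import struct

MIN_DETECT_BUFFER_SIZE = 8 * 1024  # 8K detection window, as described above
CAB_SIGNATURE = b"MSCF"


def find_cab_header(data: bytes):
    """Return the offset of the first plausible CAB header in the first 8K."""
    window = data[:MIN_DETECT_BUFFER_SIZE]
    pos = window.find(CAB_SIGNATURE)
    while pos >= 0:
        if pos + 8 <= len(window):
            (reserved1,) = struct.unpack_from("<I", window, pos + 4)
            if reserved1 == 0:
                return pos
        pos = window.find(CAB_SIGNATURE, pos + 1)
    return None


sample = b"\xff" * 100 + CAB_SIGNATURE + b"\x00" * 32
print(find_cab_header(sample))
```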
          Removed support for ZXing barcode scanner.
          * The ZBar barcode scanner out-performs ZXing in most respects.
          Completed March 30, 2024 fix for CAB files where the coffFiles offset must be used to correctly extract the contents of the CAB and the CAB header's start offset must be included in coffFiles.
          During PDF auto-decrypt, when the correct key (either user or owner) is not found, mark the PDF itself with the XBOW_FLAG_AUTODECRYPT_KEY_NOT_FOUND flag.
          Added scoprd -K command-line parameter.
          * -K:<filepath> Auto-decrypt key list filepath
          Re-factored scoprd protocol handling logic.
          * Fixed regression in logic that handled the keys: protocol parameter.
          Fixed RAR auto-decrypt regression.
          Reverted Hex.cpp to the 6.0.637 code level.
          * The re-factored hex sequence extraction logic in 6.0.730 was found to have a minor problem.
          * Rather than risk trying to fix it, chose the faster/safer option of reverting the code to the last-known-good version.
          Minor re-factoring of the XML engine's base64 extraction logic to improve performance.
          * Now uses a 64K read buffer instead of reading one byte at a time.
          If the to-be-extracted MIME message body size returned by the 3rd-party Chilkat MIME parser is the same as the current MIME object's uncompressed size, do not extract it.
          * This means Chilkat considers the entire object to be a MIME message body.
          * Thus, extracting it would mean processing the exact same object over and over until an extraction limit is exceeded.
          Fix for an OLE2 auto-decrypt logic change in 6.0.706 ("Call xcommon_child_autodecrypt_failed after all attempts to auto-decrypt an object have failed").
          * The change did not take into account the OLE2 engine's unique two-pass auto-decrypt design.
          * The auto-decrypt completes successfully, but the encrypted OLE2 document ends up with conflicting auto-decrypt flags.
          * The regression was introduced on November 10, 2024.
          * Impacts encrypted OLE2 documents only.
6.0.637   Fixed failure to extract some URLs.
          * The original legacy 3rd-party logic was checking for the :// sequence, but instead of treating the string as an invalid URL if that sequence is not present it was allowing URLs like http:foobar.com/path/filename.ext which are not valid.
          Fixed crash after successful auto-decrypt when the key contains at least one high-ASCII character.
          * Note that it is very easy to mistakenly and unknowingly copy+paste passwords containing high-ASCII characters as UTF-8 sequences instead.
          * When this happens the password will still look correct, but the bytes are actually different and thus the password will not work.
          Fixed Ole10Native stream extraction.
          * Off-by-one bug in stream name comparison was skipping Ole10Native streams.
6.0.626   Added XBOW_BARCODE_READER_FLAG_ENABLE_MONOCHROME option to BarcodeReaderFlags.
          * Not enabled by default.
          * Converts grayscale image to black and white only based on threshold options.
          To match the behavior of barcode scanners, added logic to pre-pend "http://" to any barcode payload strings that look like URLs.
          * Specifically, if the payload string consists of:
            * One or more characters followed by a period (which must be the last period in the string).
            * The period must be followed by at least two characters.
            * Followed by two or more characters in the set [A-Z,a-z].
            * And contains only alphanumeric characters.
          Fixed XOLESS engine regression caused by a fix that was intended to address corrupt FAT/miniFAT crashes.
          * This involved a significant re-factoring of the original 3rd-party cfb.hpp logic.
          Added .png as a valid extension for ICO files.
          Added optional per-URL context to extracted URLs.
          Fixed off-by-one bug in HTML engine's buffered URL search and extract loop.
          Fixed crash in XARJ engine's original 3rd-party code (found during stress testing).
          Added file type identification for Universal Data Link (UDL) files.
          Added MaxImageMemory XAPI option to control the maximum number of bytes allowed to be allocated to process an image.
6.0.613   Fixed crash in XQR engine if malloc fails (i.e. out of memory) when processing a "double-QR" (QR within a QR).
          Applied fix for CVE-2023-40889 to zbar 0.23.90 library.
          Added logic to prevent an infinite recursion loop in zbar 0.23.90 qr_reader_match_centers().
          XQR engine now pre-pends "http://" to extracted barcode payloads that look like URLs without a schema.
          * This behavior is a catch-22.
          * Some barcode readers do this, while others do not.
          XTEXT engine now supports optional extraction of ASCII-art barcodes as PNG images.
          Added barcode type to ASCII-art object's details.
          XPDF engine now sets PDFlib TET's outputformat option to utf8.
          * This forces TET to return the PDF document body text as a UTF-8 string instead of UTF-16 or UTF-32.
          * This is necessary in order to correctly process ASCII-art barcodes.
          Significantly improved XQR engine barcode processing logic.
          Added several new XAPI configuration options to enhance barcode processing.
          XTEXT engine now treats SVG files the same as HTML with respect to HTML and ASCII-art processing.
          Added info, warning, and error XAPI logging options.
          Added MSC (OLESS and XML formats) and MSCIL file type identification.
          Added XMSCIL engine to extract BMP icons from MSC files.
          Fixed XTEXT engine's ExtractHex option (it was always enabled).
          Fixed numerous data validation issues (crashes) in XACE engine's BASE_EXTRACT_DecompressFile() function and its dependencies.
          Fixed XOLESS engine's legacy parsing logic that was not validating the SectorShift and MiniSectorShift header fields and added validation of block numbers vs. the file size.
          Disabled all remaining printf() debug output.
          * Such output interferes with the scoprd and XRay protocols.
          Fixed a crash in the XISO9660 engine's legacy code.
          * Found during corrupt data stress testing.
          Fixed a crash in the XEAPPX engine's legacy code.
          * Found during corrupt data stress testing.
          * If the input APPX XML contained Name=" but the closing " was missing, the bug caused a NULL pointer reference.
          Fixed a crash in the XACE engine's legacy code.
          * Found during corrupt data stress testing.
          * Added bullet-proofing logic to the BASE_EXTRACT_DecompressFile() function and its dependencies.
6.0.572   Disabled SQ code scanning in zbar library due to performance issues.
          Moved load of configuration options KEYNAME_PORT, KEYNAME_BINDIPV4, and KEYNAME_MAXSCOPRDCONNECTIONS to LoadConfiguration() in scoprd.cpp.
          * Previously these global connection-related variables were being loaded via the COptions constructor which was too late for the first connection.
          Improved logging; added LOGFLAG_INFO, LOGFLAG_WARNING, and LOGFLAG_ERROR.
          Re-factored LZMA SDK logic in order to support double-quote characters in 7ZIP passwords.
          Added SOReuseAddr and SOReusePort options to scoprd configuration.
          BarcodeReaderFlags
          * 0x00000004 - If set, the XQR engine will generate and barcode scan a 2nd, inverse-contrast version of the original image
          * 0x00000008 - If set, the XQR engine will apply/use the MinBarcodeWhiteThreshold and MaxBarcodeBlackThreshold options
          OCRFlags
          * 0x00000004 - If set, the XOCR engine will apply/use the OCRMinWhiteThreshold and OCRMaxBlackThreshold options
          Added scannum=1 or 2 to object details when at least one barcode is found
          * Indicates which barcode scan (1st or 2nd) detected at least one barcode
6.0.563   Critical fix for semi-infinite loop in iCalendar file type identification logic for files that exceed the max extract ratio (e.g. 500:1).
          Optimized performance of iCalendar and PowerShell file type identification logic.
          Added JPEG 2000 file type identification (JP2, JPM, JPX, MJP2).
6.0.430   Re-factored XONE OneNote engine to support auto-decrypt key discovery.
          Improved PowerShell file type identification. [VR226393]
          When decoding hex-encoded strings, allow commas to separate each 2-character hex value (e.g. '0d,0a,45,51').  [VR226393]
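          A minimal Python sketch of decoding hex strings with optional comma separators, as in the '0d,0a,45,51' example above. The exact grammar the SDK accepts is not specified here, so the pattern and function name are assumptions.

```python
import re

# Two hex digits per value, with an optional comma between values.
_HEX_SEQUENCE = re.compile(r"[0-9A-Fa-f]{2}(?:,?[0-9A-Fa-f]{2})*")


def decode_hex(text: str):
    """Decode 'AABB' or 'AA,BB' style hex; return None if malformed."""
    if not _HEX_SEQUENCE.fullmatch(text):
        return None
    return bytes.fromhex(text.replace(",", ""))


print(decode_hex("0d,0a,45,51"))
```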
          Fixed rare infinite loop in iCalendar parser.
          Added MFS, HFS, and HFS+ file type identification.
          Added logic to the XONE engine to identify encrypted OneNote sections.
          * The XONE engine does not yet support auto-decrypt for OneNote documents containing one or more encrypted sections.
          Increased CxImage library's MAX_DIB_SIZE from 25MB to 50MB to support 2x larger images of any/all types.
          Use .mp4 as the data-type-correct extension for files identified as either ISO-14496-MP41 or ISO-14496-MP42.
          Fixed crash when transient key list file specified by a "keys:<filepath>" parameter in the scoprd protocol has a combined/cumulative key length of over 256 characters.
          Removed scoprd_filters.csv from scoprd installer RPM.
6.0.412   Added XONE OneNote extraction engine.
          * Extracts strings, URLs, and embedded files.
          Added support for buffer-based file type identification for ICalendar/VCalendar files.
          Fixed XZIP engine crash caused by very specific values in the compressed file size field of the ZIP header for files that are both STORED and AES-encrypted.
          Added .wmz (gzip'd WMF) as a valid GZIP extension.
          Fixed XBCRYPT engine logic such that it will identify BCrypt files even when they can not be auto-decrypted.
          Enhanced XONE OneNote extraction engine.
          * XONE now supports the ExtractText=1 option and behaves the same as the XOLESS and XPDF engines when it is enabled.
          * That is, all strings are extracted to a single file, with each string terminated by a line-feed character.
          * This addresses OneNote files containing large numbers of strings and thus unnecessarily exceeding extraction limits.
          Added MaxDepthToHash option to scoprd.conf.
6.0.394   Improved TAR file type identification and parsing.
          Convert '#' characters in filenames to underscores because they can't be used directly in a URL.
          Added MaxHeapSize XAPI configuration option.
          Replaced all malloc, realloc, and free calls with equivalent centralized heap management functions.
          Heap management includes MaxHeapSize enforcement and peak heap usage logging.
          Added RAR major format version to RAR object details.
          Added extraction of Javascript code blocks from SVG XML documents (VR211891).
          Added logic to ignore possible auto-decrypt keys that are valid base64 sequences of MaxDiscoverableBase64KeySize or more characters.
          Files initially identified as XBOW_TYPE_JAVASCRIPT are no longer adjusted to XBOW_TYPE_UNKNOWN when duktape fails to compile the file as JavaScript.
          Fixed BASE64 discovery code such that it correctly ignores "" (empty) quoted sequences.
          Fixed XRAR engine to use xcommon_wchar_t_to_utf8 to convert Unicode RAR filenames to UTF8.
          The base64 decoder in the XText engine now supports decoding reversed base64.
          * This means equal sign characters are valid at the beginning of an otherwise valid base64 sequence.
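          A small Python sketch of the reversed-base64 idea. It handles only the case called out above, where leading equal signs reveal that the sequence is reversed; how the XText engine detects reversed input without padding is not described, so that case is left out. The function name is hypothetical.

```python
import base64
import binascii


def decode_possibly_reversed_base64(text: str):
    """Decode base64; if it starts with '=' padding, reverse it first."""
    if text.startswith("="):
        text = text[::-1]
    try:
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None


forward = base64.b64encode(b"hello").decode("ascii")
print(decode_possibly_reversed_base64(forward[::-1]))
```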
          Fixed PDF processing cases where the master key is set, but the user key is not.
          Re-factored the libpng source code in order to remove all use of the dangerous and problematic setjmp/longjmp functions.
          Added logic to extract an HTML entity-decoded version of any HTML file that contains at least one HTML entity.
          Re-factored the X7ZIP engine's LZMA decompression logic to minimize heap usage.
          Added MinDiscoverableKeySize XAPI option.
          * Minimum auto-decrypt key size to discover.
          * Highly recommended that this option be set to 3.
          * Bad actors rarely use 1 or 2-character keys because it is relatively trivial to brute-force all such key combinations.
          Added MaxDiscoverableBase64KeySize XAPI option.
          * Maximum auto-decrypt key size for alpha-numeric keys that may also simply be part of a base64-encoded block of data.
          * Highly recommended that this option be set to at least 32.
          * This will generally cause auto-decrypt key discovery to ignore any large sequences of base64-encoded data.
          * This option is critically important in order to discover the correct auto-decrypt key if the key follows a large amount of base64-encoded data.
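          As a rough illustration of how the two size options interact, the sketch below filters a list of candidate keys. The function name and the alphabet-only base64 test are assumptions; the defaults of 3 and 32 are the recommended values from the notes above.

```python
import re

# Assumption: "valid base64" is approximated by an alphabet check plus optional padding.
BASE64_LIKE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def filter_candidate_keys(candidates, min_key_size=3, max_base64_key_size=32):
    """Drop keys that are too short or that look like long base64 runs."""
    kept = []
    for key in candidates:
        if len(key) < min_key_size:
            continue  # below MinDiscoverableKeySize
        if len(key) >= max_base64_key_size and BASE64_LIKE.fullmatch(key):
            continue  # probably base64 data rather than a key
        kept.append(key)
    return kept
```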
          Fixed infinite loop regression in CxImage PNG library caused by setjmp/longjmp re-factoring.
          Fixed logic error in original implementation of decrypted PDF child generation.
          * If a PDF's auto-decrypt key is found, that's enough to generate the Decrypted PDF Child.
          * Original implementation required at least one child file to be auto-decrypted + extracted first.
          Fixed rare infinite loop in ACE engine's auto-decrypt logic.
          Added support for four new scoprd.conf AutoDecryptFlags to begin dealing with one and two-character encryption keys:
          * 0x00000008 - Add the 62 one-character alpha-numeric keys a-z, A-Z, 0-9 to the transient key list
          * 0x00000010 - Add the 100 two-character 0-9 key combinations to the transient key list
          * 0x00000020 - Add the 676 two-character a-z key combinations to the transient key list
          * 0x00000040 - Add the 3,844 two-character alpha-numeric a-z, A-Z, 0-9 key combinations to the transient key list
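          The key counts quoted for each flag can be reproduced with a short Python sketch. The flag values come from the list above; the constant names and the transient_keys helper are hypothetical and do not reflect scoprd internals.

```python
import itertools
import string

# Flag values from the list above; the names are illustrative only.
FLAG_ONE_CHAR_ALNUM = 0x00000008
FLAG_TWO_CHAR_DIGITS = 0x00000010
FLAG_TWO_CHAR_LOWER = 0x00000020
FLAG_TWO_CHAR_ALNUM = 0x00000040

ALNUM = string.ascii_lowercase + string.ascii_uppercase + string.digits


def _pairs(alphabet):
    return ["".join(p) for p in itertools.product(alphabet, repeat=2)]


def transient_keys(flags):
    """Return the short keys selected by the given AutoDecryptFlags bits."""
    keys = []
    if flags & FLAG_ONE_CHAR_ALNUM:
        keys.extend(ALNUM)
    if flags & FLAG_TWO_CHAR_DIGITS:
        keys.extend(_pairs(string.digits))
    if flags & FLAG_TWO_CHAR_LOWER:
        keys.extend(_pairs(string.ascii_lowercase))
    if flags & FLAG_TWO_CHAR_ALNUM:
        keys.extend(_pairs(ALNUM))
    return keys
```

          Note that the 0x40 set already contains every key produced by 0x10 and 0x20, so this sketch yields duplicates when those flags are combined.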
          Additional improvements to make sure infinite loops do not occur in the RIFF and PNG parsers.
          Added VHDX engine.
          If a BZIP2, GZIP, or XZ file's extension indicates that it is expected to contain a TAR (e.g. .tbz, .tar.bz, .tgz, .tar.gz, .txz, .tar.xz), append .tar to the extracted child object's name.
          * There is no guarantee that .tar will be correct, but the hint is now preserved.
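          A minimal sketch of the naming rule, assuming the child name is derived by stripping the container's extension (the engine's actual derivation is not documented here). The suffix list is the one given in the entry; child_name is a hypothetical helper.

```python
import os

# Extensions that hint at a TAR payload, as listed in the entry above.
TAR_HINT_SUFFIXES = (".tbz", ".tar.bz", ".tgz", ".tar.gz", ".txz", ".tar.xz")


def child_name(container_name: str) -> str:
    """Name the decompressed child, preserving a TAR hint when the extension gives one."""
    lower = container_name.lower()
    for suffix in TAR_HINT_SUFFIXES:
        if lower.endswith(suffix):
            stem = container_name[: len(container_name) - len(suffix)]
            return stem + ".tar"
    # No TAR hint: just drop the compression extension.
    return os.path.splitext(container_name)[0]
```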
          Stop processing a TAR if it contains two consecutive headers containing nothing but 0x00 and/or 0x30.
          Allow TARs with an all-null magic field.
          Increased size of HTML parse buffer from 1MB to 10MB to deal with recent malicious HTML samples.
          Added support for converting SVG (XML-based vector images) to PNG.
          Added SVG to list of file types supported by XQR/zbar engine.
          Added SVG to XAPI configuration option BarcodeReaderFlags.
          Fixed a crash that can occur when parsing a long non-null-terminated XML line.
          Fixed corrupt data vulnerability in the XLHA engine.
          * The XLHA engine reads the LHA header's original size field as a 32-bit signed value.
          * If the high bit is set, the original size can be negative.
          * The original code did not expect such a case and behaved incorrectly.
          * Now when this happens, the corrupt LHA header and associated file is skipped.
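          The signed-read problem is easy to reproduce in Python. This sketch assumes the size field is stored little-endian; read_original_size is a hypothetical helper that returns None where the engine would skip the entry.

```python
import struct


def read_original_size(raw4: bytes):
    """Read a 4-byte size field as signed; return None when it is negative (corrupt)."""
    (size,) = struct.unpack("<i", raw4)
    return size if size >= 0 else None


# The same four bytes, viewed as unsigned and as signed.
field = struct.pack("<I", 0x80000001)
as_unsigned = struct.unpack("<I", field)[0]
as_signed = struct.unpack("<i", field)[0]
```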
          Added default value documentation to default scoprd configuration files (no changes to any values).
          Added XAPI per-engine configuration option ExtractDeleted.
          * If set on an engine that supports it (e.g. XMBR), extracts deleted files.
          Added MaxExtractVirtualDiskSize XAPI configuration option.
          Added MaxSizeToHash XRay/scoprd configuration option.
          Added FAT as a possible secondary file type for MBR file type identifications.
6.0.360   Added logic to extract and decode rot13-encoded data as in VR199717.
          Added MKV (Matroska audio/video) and ONE (OneNote section/page) file type identification.
          Extract metadata from WAV INFO list chunks: IART, ICMT, ICOP, iurl.
          Added .onepkg to valid CAB NCE extension list.
6.0.355   Fixed crash in CxImage 7.0.2.
          * CxImage is a 3rd-party library which has not received an update in over 11 years.
          * The CxImageGIF::out_line() method in ximagif.cpp and related code in ximaiter.h do pointer arithmetic with a variable that could be NULL.
          * The crash is generally triggered by corrupting the GIF image frame's biWidth and/or biHeight header values such that they are higher than they should be.
          Fixed regression/crash in XZIP engine when identifying some relatively obscure ZIP-based sub-types.
6.0.354   Added logic to XZIP engine to re-process encrypted ZIP children that failed to auto-decrypt on the first pass.
          * Addresses malware campaigns that intentionally place the decrypt key in email attachment(s) such that the key does not get discovered until after the first unzip pass.
          Added file type identification for appx, appxbundle, eappx, and eappxbundle.
          Added XEAPPX engine.
          * XEAPPX can only extract non-encrypted files.
          Added PDFlib PLOP support to XPDF engine.
          * The XPDF engine now converts and extracts encrypted PDFs to functionally equivalent non-encrypted PDFs.
          Added QOI file type identification and engine to extract QOI images as BMPs.
          (Black Duck) Upgraded nlohmann json-develop json.hpp 3.9.1 to 3.10.4.
          (Black Duck) Upgraded XZUtils 5.2.3 to 5.2.5.
          (Black Duck) Upgraded pcre2 10.33 to 10.37.
          (Black Duck) Upgraded libcurl 7.64 to 7.80.
          (Black Duck) Upgraded libxls 1.5 to 1.6.2.
          (Black Duck) Upgraded libzstd 1.4.3 to 1.5.0.
          (Black Duck) Upgraded libjpeg to version 9d.
          (Black Duck) Upgraded unrar 5.8.3 to 6.1.3.
          (Black Duck) Upgraded libressl 3.0.2 to 3.4.2.
          No longer extract raw binary VERSIONINFO resource from EXEs; redundant since VERSIONINFO gets extracted in text/RC (Resource Compiler) form.
          Added appx, appxbundle, eappx, eappxbundle, qoi, tvg, vbe, vba, and xap file type identification.
          Added eappx/eappxbundle engine.
          Added qoi engine.
          Added PDFlib PLOP 5.4p4 to PDF engine to convert encrypted PDFs to their decrypted equivalent PDF for AV scanning purposes.
          Added .lz as a valid LZIP extension.
          Fixed extremely rare infinite loop in RIFF parser.
          Fixed 7ZIP engine's handling of LZIP, LZMA, and PPMD archive files; re-factored/corrected 7ZIP engine's auto-decrypt logic to handle a wider range of encrypted 7z cases.
          Fixed BCRYPT engine's auto-decrypt logic.
          Added support for detecting Log4Shell attack strings; added DetectLog4Shell XAPI configuration option (default is off).
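          For readers unfamiliar with the attack string, a deliberately simple Python sketch follows. It only matches the plain "${jndi:" lookup prefix; it is not the SDK's detection logic, and obfuscated variants that build the prefix from nested lookups are out of scope. The function name is hypothetical.

```python
import re

# Matches the unobfuscated JNDI lookup prefix, case-insensitively.
JNDI_LOOKUP = re.compile(r"\$\{jndi:", re.IGNORECASE)


def looks_like_log4shell(text: str) -> bool:
    """Return True when the text contains a plain ${jndi: lookup."""
    return JNDI_LOOKUP.search(text) is not None
```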
          (Black Duck) Upgraded LZMA SDK 19.00 to 21.07.
          Enabled X7ZIP engine's support for all 7Zip compression methods that are available via LZMA SDK 21.07.
          Added XUDF engine to support extraction of Universal Disk Format images.
          Added MP3, PGF (Progressive Graphics File), PNM (Portable AnyMap and associated sub-types), and Egress SWITCH file type identification.
          Improved batch, Perl, JS, and VBS file type identification reliability.
          Fixed relatively rare (1:256) auto-decrypt bug in XZIP engine.
          Added "msftremoteobjecttargetusesie" boolean flag to JSON response; set to 1 when an XML object contains a Microsoft-specific remote object Target URL that ends with either .htm! or .html!; useful for identifying Follina attacks.
          Improved scoprd license key log output to make it clear when a license key has or has not expired and when a license key never expires.
          Improved JavaScript file type identification.
          * Specifically, to deal with use of a single line of JavaScript containing document.write(window.atob('<base64>'))
          Added support for extracting base64 sequences following window.atob where the base64 string is delimited by single quote characters.
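          The pattern described here can be sketched with a regular expression that accepts either quote style. This is an illustration only; the regex and the extract_atob_payloads helper are assumptions, not the engine's parser.

```python
import base64
import binascii
import re

# window.atob('<base64>') or window.atob("<base64>"); the closing quote must match the opening one.
ATOB_CALL = re.compile(r"""window\.atob\(\s*(['"])([A-Za-z0-9+/=]+)\1\s*\)""")


def extract_atob_payloads(script: str):
    """Return the decoded payload of every window.atob() call with a literal argument."""
    payloads = []
    for match in ATOB_CALL.finditer(script):
        try:
            payloads.append(base64.b64decode(match.group(2), validate=True))
        except binascii.Error:
            continue  # looked like base64 but did not decode
    return payloads


encoded = base64.b64encode(b"<b>hi</b>").decode("ascii")
sample = "document.write(window.atob('" + encoded + "'))"
```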
          Added "location.replace" and "window.location" to JavaScript file type identification keyword list.
          Re-added "let" as both VBS and JS file type identification keyword.
          Fully enabled the XZ engine on non-Windows platforms.
          Added logic to detect when an HTML tag is missing an attribute name between LWSP and '='.
          * When detected, sets XBOW_FLAG_MALFORMED and XBOW_FLAG_SUSPICIOUS and malformoffset and malformreason to "Missing HTML tag attribute name".
          (Black Duck) Upgraded duktape 2.6.0 to 2.7.0.
          Improved HTML file type identification.
          Improved CxImage malformed image reporting infrastructure.
          Disabled XBOW_MALFORM_MISSING_HTML_TAG_ATTRIBUTE; too many FPs.
          Added logic to XPDF engine to call PDFlib TET APIs to extract all URL annotations when the ExtractAbsoluteURLs option is enabled.
6.0.294   Temporarily disabled ZIPX compression method 95 (XZ) support due to discovered instability.
          Added file type identification and details for the B1 archive file format.
          Added "pkcs7encrypted" and "pkcs7signed" properties for S/MIME objects.
          Added APK as a ZIP file sub-type.
          * This also fixes issue with APKs being identified as JARs.
          Added XAPI option DisableFastBase64=0|1.
          * If set to 1 on the xtext engine, a hardware-agnostic base64 decoder (same one used in Chromium) is used instead of the default, faster, Intel 4th gen, AVX-based assembly-language base64 decoder.
          Skip metadata extraction for text files if none of the metadata extraction options are enabled.
          * This is a performance enhancement for text files.
          Set MAXIMUM_POWERSHELL_LINES to 1000.
          * Stop analyzing text files for PowerShell keywords at that point.
          * This is a performance enhancement for large text files.
          The filename given to base64-encoded data URIs extracted from HTML documents now includes the extension associated with the URI's media type.
          Added CHECKEXTRACTLIMITS calls to all RAR extraction I/O loops to ensure the timeout limit is enforced.
          Fixed RAR 4.x auto-decrypt.
          Upgraded to PDFlib TET 5.3p3.
          Re-introduced split ZIP identification.
          Added .ntx as a valid yEnc extension.
          Added ZIP:PPAM file sub-type identification.
          Improved JavaScript file type identification.
          Added PDF auto-decrypt support.
          Re-factored the ZIP engine's auto-decrypt loop such that it runs through the very fast 1-byte password validation check first, then only decrypts the entire file when that check passes.
          * This is an auto-decrypt performance enhancement.
          * On average the 1-byte validation will match 1 out of every 256 keys attempted.
          * Only then compute the CRC32 of the decrypted data.
          * If the CRC32 matches, we know we found the correct key.
          * Otherwise the auto-decrypt loop continues.
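          The two-stage check can be sketched in Python using the traditional PKWARE stream cipher. Assumptions: the member is STORED (no decompression step), the check byte is the high byte of the CRC32, and the names ZipCrypto and find_key are illustrative rather than the XZIP engine's own.

```python
import zlib

# Standard CRC-32 table (reflected polynomial 0xEDB88320).
CRC_TABLE = []
for n in range(256):
    c = n
    for _ in range(8):
        c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
    CRC_TABLE.append(c)


def _crc32_step(crc, byte):
    return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]


class ZipCrypto:
    """Traditional PKWARE encryption key schedule and stream cipher."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, b):
        self.k0 = _crc32_step(self.k0, b)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = _crc32_step(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)  # keys are always updated with the plaintext byte
            out.append(p)
        return bytes(out)

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)
        return bytes(out)


def find_key(candidates, enc_header, enc_data, expected_crc):
    """Cheap 1-byte header check first; full decrypt plus CRC32 only when it passes."""
    check_byte = expected_crc >> 24
    for key in candidates:
        cipher = ZipCrypto(key)
        if cipher.decrypt(enc_header)[11] != check_byte:
            continue  # rejects about 255 of every 256 wrong keys
        if zlib.crc32(cipher.decrypt(enc_data)) == expected_crc:
            return key
    return None


# Build a small encrypted STORED member to exercise the loop.
plain = b"stored member data"
crc = zlib.crc32(plain)
_enc = ZipCrypto(b"secret")
enc_header = _enc.encrypt(bytes(range(11)) + bytes([crc >> 24]))
enc_data = _enc.encrypt(plain)
```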
          Completed PDF auto-decrypt support.
          PDF engine now extracts attachments and annotations in addition to images and document body text.
          Increased TAR extraction I/O buffer size from 8K to 64K to improve performance; 8x fewer loops on large TAR files.
          Added hexchars and hexvalues counts to JS object details.
          Added support for extracting QR codes out of TIFF images.
          Added WOFF and WOFF2 file type identification.
          Reduced JS file type identification FPs.
          Upgraded CentOS 7 build environment to gcc 9.3.0.  Required for CxImage TIFF library (C++11).
          Fixed XBIFF engine's extraction logic.
          Added JavaScript extraction (including via auto-decrypt) to XPDF engine.
          Re-factored icon resource extraction in XEXE engine to make it more reliable and architecturally consistent.
          Upgraded to libmspack 0.10.1alpha; high priority per Black Duck.
          Added .xll as a valid NCE for the DLL file type.
          Fixed relatively rare temporary file leak in XOLESS engine.
          Disabled CxImage TIFF engine's warning and error handlers because they generate very undesirable message boxes on Windows and stderr output everywhere else.
6.0.226   Added support for extracting document metadata as per-object details from Office (OLE2 and new ZIP-based format) documents.
          Added EnableMetadata=0/1 option to scoprd.conf.
          * This option can be used to completely disable scoprd-metadata.csv output.
          When xapi.conf GDPRCompliantMetadata=1, OLE2, PDF, and DOCX engines only extract non-security-sensitive metadata.
          Globally added and enforced default MIN_BASE64_FILE_SIZE == 10 to prevent short base64-looking, but not actual base64 sequences from getting extracted.
          Added XAPI option MinExtractBase64Size to override the MIN_BASE64_FILE_SIZE 10 default if desired.
          Added per-object XBOW_FLAG_WRONG_EXTENSION bit-flag.  Set when the file extension does not match the primary file type.
          Added per-object secondary file type and sub-type.
          * These are set when a compound file is identified (e.g. BMP+RAR, GIF+ZIP, JPEG+ACE).
          Fixed auto-decrypt of RAR files with encrypted filenames.
          Added schemas.openxmlformats.org to scoprd excludedurllist.txt.
          If an extracted URL matches a URL in the excluded URL list, it is no longer added to the per-object URLs field.
          * Previously excluded URLs were extracted, but not followed.
          * Now they are neither extracted nor followed.
          Fixed Windows EXE VersionInfo resource extraction.
          Improved scoprd's xid.py CSV output.
          Fixed minor scoprd bug in non-Windows EraseDirectory logic.
          * It was not ignoring the '.' and '..' directory entries.
          * Although this bug was causing many file remove() errors to be logged, it was not possible for the bug to do any damage to the file system.
          Fixed scoprd auto-decrypt key management system.
          * The scoprd logic now uses the same logic as Scopr XRay.
          Added scoprd.conf option ShowUnidentifiedFiles.
          * If set to 1, the output will include unidentified files.
          * Default is 0 (i.e. unidentified files are hidden, thus improving processing times).
          The password discovery loop in the MIME engine now correctly captures the first and last words in the message body if the word is not followed by a delimiter.
          Fixed malformed JSON output when an unidentified file is squelched due to the new ShowUnidentifiedFiles option.
          Added "xbowflags" to JSON output.
          * It is more efficient to manage this one 32-bit set of bit-flags rather than each bit separately.
          * No change to the existing JSON output of the individually-named bit-flags.
          Added "attrs" to JSON output.
          * This is the same as the "attributes" field except represented as an unsigned integer instead of a hex string.
          * This simplifies queries for this field.
          * The "attributes" field remains unchanged for now, but may be removed in a future release.
          Removed the redundant/implied leading period from the "datatypeextension" field.
          Now enforcing XAPI MinExtractedBase64Size configuration option for JavaScript and XML.
          Added a per-object GDPRDetails string.
          * If the XAPI GDPRCompliantMetadata option is disabled (0), this field contains any extracted metadata that is considered security-sensitive.
          * If the XAPI GDPRCompliantMetadata option is enabled (1), this field will be empty (i.e. security-sensitive data is not extracted).
          Added "gdprdetails" to JSON output.
          * Added a similar field to CSV, HTML, and XML output.
          * Same semi-colon-delimited format as "details".
          Added extraction of custom properties from OOXML documents as per-object "gdprdetails" because there is no way to determine which custom properties are security-sensitive.
          * Thus, all custom properties are security-sensitive.
          * The syntax of extracted custom properties is "customproperty_<Custom Property Name>=<Custom Property Value>;".
          Treat the last underscore in a filename as a possible double-extension separator (i.e. the next-to-last extension is assumed to follow the underscore).
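          A minimal sketch of the underscore rule, assuming the caller then checks the result against a list of known extensions. next_to_last_extension is a hypothetical helper name.

```python
def next_to_last_extension(filename: str):
    """Return the possible next-to-last extension, treating '_' like '.' as a separator."""
    stem, dot, _last = filename.rpartition(".")
    if not dot:
        return None  # no extension at all
    # Use whichever separator, '.' or '_', appears last in the stem.
    pos = max(stem.rfind("."), stem.rfind("_"))
    if pos < 0 or pos == len(stem) - 1:
        return None
    return stem[pos + 1:].lower()
```

          For example, "invoice_pdf.exe" and "invoice.pdf.exe" both yield "pdf", while "my_report.doc" yields "report", which a known-extension list would then reject.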
          Improved OOXML document file type identification.
          Disabled XZIP engine's ZIP 1.0 logic since there is no reliable way to tell when encrypted ZIPs were created with a 2-byte ZIP 1.0 password validation or 1-byte.
          * Thus auto-decrypt is again susceptible to a 1 in 256 chance of using the wrong auto-decrypt key.
          * This will be addressed in a future release.
          Replaced error-prone manually-generated scoprd JSON output with nlohmann JSON library.
          Added --no-clobber switch to cp command when installing scoprd *.conf files.
          * Thus existing scoprd.conf and xapi.conf files will not get overwritten during installation.
          Added getconf:[.conf filename] command to scoprd protocol.
          * Returns the default /usr/local/scopr/scoprd/xapi.conf or specified XAPI configuration file in JSON form.
          Added getconf.py sample script to scoprd installation.
          Added xapi_docsonly.conf sample configuration to scoprd installation.
          Added .tif as a legitimate TIFF extension.
          Moved output of successful auto-decrypt key to gdprdetails field.
          Fixed ZIP engine's handling of STORED AES-encrypted files.
          * It was extracting 10 more bytes than it should because it was not accounting for the 10 bytes of AES authentication data that follows the raw file data.
          Added logic to extract custom properties from OLE2 DocumentSummaryInformation streams.
          Added logic to MIME extraction engine to identify text/html utf-8 attachments containing a contiguous sequence of at least 5,462 high-ASCII characters as Suspicious:CVE-2020-16497.
          Added PDF custom property extraction.
          Fixed custom property extraction for OLE2 documents created on non-Windows platforms.
          Fixed a PDF extraction bug that would sometimes cause PDF processing to stop prematurely.
          Changed the custom property name prefix to "cp_".
          Added KeyLengthWeights option to scoprd.conf.
          * Scales the auto-decrypt score for each key based on the key's length.
          Added KeyContextKeywords option to scoprd.conf.
          * Comma-delimited list of lowercase strings that are commonly expected to be within MaxKeyContext keys of the actual auto-decrypt key.
          * The auto-decrypt score for keys surrounding keys that contain one or more of the context keywords is increased by the inverse of the distance between the two.
          Added MaxKeyContext option to scoprd.conf.
          * This is the maximum number of previously discovered auto-decrypt keys on either side of an auto-decrypt key to consider more probable (using the inverse of the proximity) than others.
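          One possible reading of the context-keyword scoring is sketched below. The function name, the list-based data layout, and the plain 1/distance bonus are all assumptions; KeyLengthWeights is not modeled because its format is not described here.

```python
def apply_context_bonus(keys, scores, keywords, max_key_context):
    """Boost the scores of keys located near a key that contains a context keyword."""
    boosted = list(scores)
    for i, key in enumerate(keys):
        if not any(word in key.lower() for word in keywords):
            continue
        lo = max(0, i - max_key_context)
        hi = min(len(keys), i + max_key_context + 1)
        for j in range(lo, hi):
            if j != i:
                boosted[j] += 1.0 / abs(j - i)  # closer neighbours gain more
    return boosted


keys = ["Hello", "password:", "Xy7!q", "thanks", "bye"]
scores = apply_context_bonus(keys, [0.0] * len(keys), ["password"], max_key_context=2)
```

          With these inputs the neighbours at distance 1 gain 1.0 and the neighbour at distance 2 gains 0.5; the key outside the window is unchanged.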
          Improved password discovery logic to ignore all HTML tag sequences and thus only look for passwords between HTML tags.
          * This considerably reduces the average number of auto-decrypt attempts needed before finding the correct key.
          The ZIP engine was artificially limited to 80 character passwords (hard-coded in the legacy InfoZIP logic).
          * Increased this to 256 characters - which is the maximum auto-decrypt password length scoprd is capable of supporting.
          Discovered that the legacy InfoZIP implementation limits the number of bytes at the end of the ZIP file it searches for the central directory structure.
          * Its default behavior is to scan only up to 66000 bytes at the end of the file.
          * ZIP examples exist where extraction fails because of this.
          * Changed the logic so that it will now search the entire ZIP file.
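          The effect of the search window can be demonstrated with a small Python sketch that looks for the end-of-central-directory signature. find_eocd and its window parameter are hypothetical; the 66000-byte figure is the legacy default quoted above.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_MIN_SIZE = 22  # fixed-size part of the end-of-central-directory record


def find_eocd(data: bytes, window=None) -> int:
    """Return the offset of the last EOCD record, searching only the tail when window is set."""
    start = 0 if window is None else max(0, len(data) - window)
    pos = data.rfind(EOCD_SIGNATURE, start)
    while pos != -1:
        if pos + EOCD_MIN_SIZE <= len(data):
            return pos
        pos = data.rfind(EOCD_SIGNATURE, start, pos)
    return -1


# A valid ZIP followed by 70000 bytes of trailing junk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
zip_bytes = buf.getvalue()
padded = zip_bytes + b"\x00" * 70000
```

          Here a 66000-byte window misses the record, while the whole-file search finds it at the end of the original ZIP data.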
          scoprd no longer changes the scoprd-metadata.csv filename to "x" when the XAPI GDPRCompliantMetadata option is enabled.
          The XZIP engine now supports the zstd compression method.
          Added a new XZSTD engine that supports extracting .zst zstd-compressed files.
          Added compressionmethodid detail to ZIP objects; useful for recognizing ZIPs using unsupported or undocumented compression methods.
          Added support for recognizing undocumented ZIP compression method 92.
          * Since ZIP compression method 92 is undocumented, calling it "duplicatesha1" because that is literally all it is - the 20-byte SHA1 of a file that is present earlier in the ZIP.
          Added per-object "xbowproperties" 32-bit bit-field value to scoprd JSON response.
          Fixed OLE2 Unicode custom property extraction crash.
          Added prototype BIFF engine for extracting images and URLs from OLE2 XLS Workbook streams.
          Added a "keys:" parameter to scoprd protocol.
          * Allows scoprd clients to pass a line-feed-delimited list of keys to try during auto-decrypt operations.
          Added a MinBase64Size XAPI option to control the minimum length (in bytes) of valid base64 sequences to extract.
          Always set XBOW_FLAG_ENCRYPTED and XBOW_FLAG_NO_KEY for PGP and MCrypt files since auto-decrypt is not supported for either.
          The ZIP engine now reports the existence of encrypted children even if they fail to auto-decrypt.
          Added .asc and .sig as valid PGP extensions.
          Fixed OLE2 EncryptedPackage processing when auto-decrypt fails.
          Added initial minimal support for reporting new per-object flag XBOW_FLAG_HAS_ENCRYPTED_CHILDREN.
          * This flag is very useful in queries as it identifies the container files that have encrypted content - as opposed to the encrypted files themselves which are identified by XBOW_FLAG_ENCRYPTED.
          Added a hard-coded 15 second timeout to scoprd's ComputeHashes function.
          * This prevents huge files from tying up a scoprd instance for a lengthy period of CPU-bound time.
          Fixed an infinite loop in the RAR engine's ReadHeader50 and ReadHeader15 logic when auto-decrypting RARs with encrypted headers.
          Completely disabled the CXJavaScript::extract_string_fromcharcode method due to it causing an infinite loop.
          * This method will be re-enabled later after an appropriate fix has been verified.
          Fixed infinite loops in the XTEXT engine's XML parsing logic that extracts various types of scripts out of XML documents.
          Added support for the "keys:<filepath>:" protocol parameter.
          Added a key list filepath parameter to scoprc: "scoprc <servername> <port> <key list filepath> <filepath to process>".
          * The key list filepath need not exist.
          * Scoprd ignores (but logs) any key list filepaths that it cannot access or open.
          Fixed handling of the MinExtractedBase64Size XAPI option when extracting base64 sequences out of XML documents.
          * The option is now enforced correctly in this scenario.
          Added scoprd configuration options KeysFilepathPrefix and ProcessFilepathPrefix to lock down the locations that are allowed to be used in the scoprd protocol for the respective fully-qualified filepaths.
          * These options should be used to prevent unexpected and malicious filepaths from being opened by scoprd.
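          A minimal sketch of a prefix check of this kind. Whether scoprd canonicalizes paths before comparing is not stated here, so the realpath step is an assumption, as are the helper name and the example directory.

```python
import os


def is_allowed(filepath: str, allowed_prefix: str) -> bool:
    """Return True only when filepath resolves to a location under allowed_prefix."""
    real_path = os.path.realpath(filepath)
    real_prefix = os.path.realpath(allowed_prefix)
    try:
        return os.path.commonpath([real_path, real_prefix]) == real_prefix
    except ValueError:
        return False  # e.g. paths on different drives


PREFIX = "/srv/scopr/in"  # hypothetical ProcessFilepathPrefix value
```

          Comparing resolved paths component by component rejects both "../" traversal and sibling directories that merely share a string prefix, such as /srv/scopr/incoming.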
          Fixed an infinite loop in the AVI and WAV (a.k.a. RIFF) file type identification logic.
          * In rare cases it would get stuck trying to read the next RIFF chunk (a 4 byte RIFF chunk ID) when the EOF has already been reached.
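          The class of bug described here is avoided by treating a short read of the chunk header as end of file. The Python sketch below shows one such loop; iter_riff_chunks is a hypothetical helper, not the SDK's parser.

```python
import io
import struct


def iter_riff_chunks(stream):
    """Yield (chunk_id, data) pairs, stopping cleanly at EOF or on a truncated chunk."""
    header = stream.read(12)  # "RIFF", size, form type
    if len(header) < 12 or header[:4] != b"RIFF":
        return
    while True:
        chunk_header = stream.read(8)  # 4-byte ID plus 4-byte little-endian size
        if len(chunk_header) < 8:
            return  # EOF or partial header: stop instead of retrying
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        data = stream.read(size)
        yield chunk_id, data
        if len(data) < size:
            return  # chunk body was truncated
        if size & 1:
            stream.read(1)  # chunks are padded to an even length


body = b"WAVE" + b"fmt " + struct.pack("<I", 4) + b"abcd" \
    + b"data" + struct.pack("<I", 3) + b"xyz\x00" + b"LI"  # trailing partial chunk ID
sample = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = list(iter_riff_chunks(io.BytesIO(sample)))
```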
          For encrypted OLE2 documents, XBOW_FLAG_HAS_ENCRYPTED_CHILDREN is now set on the EncryptedPackage stream's parent object - which is typically the OLE2 document itself.
          Removed XBOW_FLAG_NO_KEY from processing of raw EncryptedPackage streams - where no auto-decrypt is even attempted.
          Fixed VDI file type identification.
          Fixed 1-byte buffer overflow in ZIP engine's auto-decrypt logic if the key to try is 256 characters long.
          Set XBOW_FLAG_HAS_ENCRYPTED_CHILDREN on the OLE2 EncryptedPackage stream's parent - which is typically the OLE2 document itself.
          Added slx, slxc, slxp to allowed ZIP extensions.
          Added mexw32, mexw64 to allowed DLL extensions.
          After a successful auto-decrypt of an OLE2 EncryptedPackage stream, do not set XBOW_FLAG_ENCRYPTED on the resulting DecryptedPackage stream.
          Do not set XBOW_FLAG_NO_KEY when processing raw OLE2 EncryptedPackage streams - where no auto-decrypt is even attempted.
          Fixed infinite loop in WAV and AVI (RIFF) file type identification.
          Fixed possible infinite or very long loops in PDF engine.
          Fixed high impact, but extreme end-case buffer overrun issue in ZIP auto-decrypt logic.
          Added diagcab, nupkg, nupack, and xdp as allowed extensions.
          Added auto-decrypt performance metrics: autodecryptattempts, autodecrypttime, autodecryptaverage, autodecryptslowest, autodecryptfastest, and keynotfound.
          Fixed an infinite loop in the XTEXT engine's PowerShell parser.
          Added ACE auto-decrypt key to GDPR details.
          Added ALZip auto-decrypt key to GDPR details.
          Added XAPI option MaxAutoDecryptAttempts.
          Fixed extraction of LZMA and LZMA2-compressed DAA, ZIP, and 7ZIP files.
          Improved malformed PDF file type identification and processing.
          Upgraded to PDFlib TET 5.2.0.  Includes fixes for numerous PDF parser stability issues.
          Added XBOW_SUBTYPE_PAC to support the JS:PAC (Proxy Auto-Configuration) file type.
          Fixed small buffer overflow in RIFF parser.
          Disabled MP3 file type identification due to its inaccuracy.  Needs work.
          Updated TET logging options in scoprd.
          Improved RAR file type identification to identify RAR headers at offsets up to 8K - 7.
          Upgraded to PDFlib TET 5.2.10.
          Added XBOW_LOGGING_PDFLIB_TET XAPI configuration option. If set, PDFlib TET logging is enabled in the XPDF engine.
          Added XBOW_FLAG_MIME_BODY_PART to indicate when an extracted object is a MIME body part as opposed to a MIME attachment.
          scoprd.conf options ShowUnidentifiedFiles and ShowUnidentifiedStreams can now be set to 2 to only show files or streams that are both encrypted and unidentified in the JSON response, omitting all other unidentified objects.
          If auto-decrypt fails for an encrypted object, XBOW_FLAG_HAS_ENCRYPTED_CHILDREN must still be set on the parent object.
          Fixed infinite loop in 3D Studio Max (.3ds) file type identification.
          Fixed setting of XBOW_FLAG_HAS_ENCRYPTED_CHILDREN.
          Fixed regression introduced on Sept. 6, 2020 where RAR auto-decrypt only worked for the first encrypted file if all subsequent files use the same key.
          Show/enumerate encrypted RAR files even if they cannot be auto-decrypted.
          Added support for detecting suspiciously long underscore sequences as double-extension separators.
          Added logic to set the auto-decrypt key on OLE2 DecryptedPackage objects.
          Improved asterisk-delimited auto-decrypt key discovery logic.
          Fixed how the X7ZIP engine sets the XBOW_FLAG_HAS_ENCRYPTED_CHILDREN flag.
          Added logging of PDFlib TET open_document I/O to help track down the cause of a rare infinite loop.
          Added HTML entity decoding for strings extracted out of HTML that are expected to be URLs.
          When extracting EXE resource images via the XEXE engine, if a RESTYPE_BITMAP resource has a PNG header, use .png for the object's extension instead of .bmp.
          Upgraded to PDFlib TET 5.3.0 which includes a new timeout option.
          Added a timeout check to XPDF engine's PDF body text extraction loop that calls PDFlib TET's get_text() API which fails to return an empty string in some rare cases.
          Added getver: option to scoprd protocol to have scoprd return its version number.
          Added horizqtr=1-4 and vertqtr=1-4 details of first bar/QR code symbol.
          Added size of first bar/QR code symbol's bounding box to object's details.
          Switched to case-insensitive comparison of special OLE2 stream names like Ole10Native to match MS-Office behavior.
          Added initial support for extracting ZIPX compression method 95 (XZ).
          Added call to CHECKEXTRACTLIMITS in pdf-text extraction loop that calls PDFlib TET get_text().
          * Prevents long or infinite loops if get_text() fails to return an empty string in a timely fashion (extremely rare, but happens).
          Improved URL extraction from HTML attributes.
          * Specifically malicious URL obfuscation cases where bytes in the range 0-32 come immediately before or after the URL (browsers ignore these bytes).
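          The normalization described above amounts to trimming bytes 0-32 from both ends of the attribute value. A minimal Python sketch, with a hypothetical helper name:

```python
# Characters 0x00 through 0x20 (control characters and space).
_IGNORED = "".join(chr(c) for c in range(0x21))


def clean_attribute_url(raw: str) -> str:
    """Strip leading and trailing characters in the range 0-32; interior characters are kept."""
    return raw.strip(_IGNORED)
```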
          Due to malicious encrypted files, the characters '?' and '@' are no longer treated as password delimiters by the key discovery logic.
          If the MIME child object's Content-Disposition is not "attachment", set the child object's XBOW_FLAG_MIME_BODY_PART flag.
          Set some reasonable defaults for the two XAPI base64 extraction options in case the options are not specified via an XAPI config file.
          Improved ZIP auto-decrypt reliability.
          * Addresses rare cases where an incorrect key gets used.
          Added <br as HTML identifier.
          Reduced set of identified VBScript keywords to reduce VBS identification FPs.
          Added XBOW_FLAG_HAS_DECRYPTED_CHILDREN and associated hasdecryptedchildren per-object JSON response field.
          * Set on parent objects when one or more of their encrypted child objects has been successfully decrypted.
          Added <a, <p, and <img as HTML identifiers.
          Added logic to set XBOW_FLAG_REDIRECT on JavaScript files that contain auto-redirect URLs.
          Upgraded duktape 2.4 JavaScript parser to 2.6.
          Increased size of HTML parsing buffer to 1MB.
          * This enables extraction of JavaScript sections up to that size.
          * Future work is to re-factor the HTML engine's logic to extract scripts of any size.
          Reduced VBScript and JavaScript file type identification FPs further by removing more common words.
          Improved WIM/SWM file type identification.
          Added obsolete BAG archive file type identification.
          Added or improved file type identifications for APPLESINGLE, APPLEDOUBLE, CRX, EOT, FLAC, FLIC, HA, HYP, KGB, MIDI, RZIP, and XCF to more closely match VirusTotal capability.
          Added support for XBOW_FLAG_OBFUSCATED; indicates when a script contains obvious obfuscation logic.
          Added support for XBOW_FLAG_SPLIT_ARCHIVE; indicates when an archive object is one of two or more split archive volumes.
          Added support for XBOW_FLAG_SPLIT_ARCHIVE_BEFORE and XBOW_FLAG_SPLIT_ARCHIVE_AFTER; indicates when the data associated with a child object within a split archive volume spans two or more archive volumes and in which direction(s) the split occurs.
          Bad actors have discovered that MS-Word will load XML documents (normally having a .xml extension) if they are renamed to have a .doc extension.
          * Thus, added .doc as a valid NCE for the XML file type.
          Added logic to extract <w:binData> base64-encoded objects and their names from XML MS-Word documents.
          Upgraded to PDFlib TET 5.3p2.
          Various NCE updates.
5.0.1309  Added Bitcoin address extraction support to XAPI.
5.0.1301  Added XBOW_TYPE_RPMSG, associated Restricted Permission Message (RPMSG) file type identification and
          extraction engine.  Note: Auto-decrypt of the DRMContent stream is not supported.
5.0.1291  Added XBOW_TYPE_SYLK and associated Microsoft Symbolic Link (SYLK) file type identification.
5.0.1274  Added ALZip auto-decrypt support.
5.0.1270  Added XBOW_SUBTYPE_FPX and associated OLE2 object subtype identification for the Kodak FlashPix file format.
5.0.1244  Added Quick Response (QR) code extraction engine.
5.0.1235  Added XAPI configuration options:
          * MaxExtractTotalSize
          * MaxTotalItems
5.0.1233  Added XBOW_TYPE_PYC and Python bytecode file type identification.
5.0.1228  Added XAPI configuration options:
          * GDPRCompliantMetadata
          * OLESSParallelDecryptedPackage
5.0.1216  Added Direct Access Archive (DAA) extraction engine.
5.0.1212  Added XBOW_TYPE_DAA and Direct Access Archive file type identification.
5.0.1199  Added decoding and extraction of JavaScript string arrays via the Text Extract Hex option.
5.0.1197  Added RTF metadata and URL extraction.
5.0.1192  Fixed XRay's reported processing elapsed time.
5.0.1177  Improved RTF hex data extraction.
5.0.1173  Added xbowddd.cgi query string parameters documentation page.
5.0.1169  Added extraction of base64-encoded objects out of window.atob("<base64>") JavaScript code.
          Added extraction of <script> blocks out of HTML.
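          Illustration only, not the engine's actual code: a minimal Python sketch of the
          window.atob("<base64>") extraction described above.  The regular expression and the
          function name extract_atob_payloads are assumptions made for this example; only
          literal string arguments are handled.

```python
import base64
import binascii
import re

# Hypothetical pattern: window.atob("...") or window.atob('...') with a literal argument.
ATOB_CALL = re.compile(r"""window\.atob\(\s*(["'])([A-Za-z0-9+/=\s]+)\1\s*\)""")


def extract_atob_payloads(script: str) -> list[bytes]:
    """Return the decoded bytes of every literal window.atob() argument in script."""
    payloads = []
    for match in ATOB_CALL.finditer(script):
        encoded = "".join(match.group(2).split())  # drop embedded whitespace
        try:
            payloads.append(base64.b64decode(encoded, validate=True))
        except (binascii.Error, ValueError):
            continue  # not valid base64; skip rather than guess
    return payloads
```

          Arguments built at run time (concatenation, variables) are out of scope for this sketch.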
5.0.1161  Added normalizedname and datatypeextension to JSON and XML output.
5.0.1159  Always append the file-type-correct extension to the end of the random filename link.
          This guarantees that all returned URL links to the extracted data use the file-type-correct extension.
          Thus applications that use these URLs to download extracted files can depend on the extension in their
          decision-making.
5.0.1156  Added per-user enable/disable URL processing option to XRay.
          Can only be changed by an XRay administrator.
5.0.1141  Added PDF and JPG links to XRay JSON output.
5.0.1135  Fixed XRay per-user key handling.
5.0.1122  Added XRay server log display pages - for administrators only.
5.0.1110  Improved XRay user interface.  Removed use of HTML frames.
5.0.1106  Duplicated all XRay local file processing options on URL processing page.
5.0.1103  Added heuristic detection of malicious sequences of periods before the file extension.
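          Illustration only: one way a period-run heuristic of this kind could be written in
          Python.  The threshold MIN_PERIODS and the function name has_suspicious_period_run are
          assumptions for this example; the actual detection rule is not documented here.

```python
import re

# Assumed threshold: two or more consecutive periods directly before the final extension.
MIN_PERIODS = 2

# A run of periods followed by a final extension containing no dots or path separators.
PERIOD_RUN = re.compile(r"\.{%d,}[^./\\]+$" % MIN_PERIODS)


def has_suspicious_period_run(filename: str) -> bool:
    """Return True if the name ends with a run of periods followed by an extension."""
    return PERIOD_RUN.search(filename) is not None
```

          For example, "invoice.pdf.....exe" is flagged, while "archive.tar.gz" is not.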
5.0.1073  Added XRay xbowddd.ini max log size options:
          * MaxLogSize
          * MaxXAPILogSize
5.0.1068  Added excludedurllist.txt and onetimeuseurlkeywordslist.txt to XRay configuration.
          Added EnableVirusTotalURL XRay configuration option.
5.0.1065  Added ClamAV scan of all processed objects to XRay.
5.0.1050  Added XRay metadata extraction options for email addresses, IPv4 addresses, phone numbers, credit card
          numbers, and URLs.
5.0.1039  Added subsystem to EXE/DLL object details.
5.0.1010  Decode and extract HTML entities in Microsoft Office online video embeddedHTML objects.
5.0.1007  Added XBOW_TYPE_AU3 and AutoIt3 file type identification.
5.0.1006  Removed non-Windows x86 downloads from XRay.
5.0.1005  Added SHA-1 and SHA-256 hashes to XRay Show Activity page.
5.0.1004  Added XRay per-user option to show/hide unidentified data streams.
5.0.1003  Added xid -i option to report per-engine information: version, file type, engine name, module name,
          and license info.
          Auto-decrypt read-only Excel documents using Microsoft Excel's default password "VelvetSweatshop".
5.0.1002  Added APM, Intel Hex, PPMD, QCOW, VHD, VHDX, FreeARC, BH, PA, ZPAQ and FLV file type identification.
          Added FTP/HTTP URL processing XAPI configuration options:
          * MaxHTTPCacheSize
          * MaxFTPCacheSize
          * ConnectTimeout
          * ReceiveTimeout
          * SendTimeout
          * DataReceiveTimeout
          * DataSendTimeout
          * MaxFTPIndexMemorySize
          * MaxHTTPIndexMemorySize
          Added Intel Hex and VHD file extraction engines.
          Fixed CPIO binary file extraction.
5.0.1000  Added support for inline processing of extracted URLs.
          Added XAPI options to control the maximum number of URLs processed per object and per object chain.
          Added button to clear user's HTTP/FTP cache.
          Added logic to prevent recursive URL processing loops.
          Added extraction of ftp:// and sftp:// URLs.
          Added username and password support for FTP URLs.
          Added link to VirusTotal graph in SHA-256 column.
5.0.998   Added xid option -m to scan all processed files with AntiMalware Cloud Scan.
          Extract all extractable OLESS streams up to the point where the FAT chain is invalid/broken rather than
          aborting the entire extraction process.
          Added Depth column to HTML output.
          Added support for cloud scan URL options in XRay xbowddd.ini:
          * AntiMalwareCloudScanURL1
          * AntiMalwareCloudScanURL2
          Added .vba as an allowed extension for XBOW_TYPE_VBSCRIPT.
5.0.997   Improved extraction of hex objects from RTF files.
          Improved URL extraction.
          Added option to show/hide extracted URLs in HTML view.
          Added PDF details.
          Added extraction of embedded images from PDF files.
5.0.996   Added URL extraction from TEXT, HTML, MIME bodies, and some XML files.
          Improved how encryption keys are shown in object details.
          Minimum length of automatically discovered encryption keys reduced from 4 to 3 characters.
          Upgraded liblzma and XZUtils to 5.2.3.
5.0.995   Added file type identification:
          * Microsoft InternetShortcut .url with associated target URL extracted as a detail.
5.0.992   Improved ISO-14496 file type identification.
          Improved base64 extraction.
          Improved hex extraction.
          Fixed ZIP AES-encrypted+stored file extraction.
          Upgraded 7zip engine to LZMA SDK 18.01.
5.0.990   Added file type identification:
          * DDS
          * INDD
          * MOV/QT
          * OLESS:MPP
          * VCF
          * OLESS:VSD
          * ZIP:VSDX
5.0.980   Added file type identification:
          * Mach-O
          * Mach-O Universal Binary
          * BALZ
          * EGG
          * ISZ
          * LZ4X
          * PEA
          * ZPAQ
          * ZIP:HWP
          * ZIP:IPA
          * ZIP:XPS
          Improved UUE and XXE extraction engines.
5.0.950   Added extraction of VBA scripts from OLESS documents.
5.0.910   Added "Show Activity" button on XRay "My Account" page.  Shows statistics for all files processed
          by your user account.
5.0.900   Added file type identification for:
          * ARM Image Format (.aif)
          * AutoCAD Drawing Exchange Format (.dxf)
          * Audio Interchange File Format (.aiff)
          * Audio Interchange File Format - Compressed (.aifc)
          * Packet Capture (.pcap)
          * TrueType Font (.ttf)
          Added Content-Transfer-Encoding to extracted MIME object's details as "encoding".
5.0.800   Added Microsoft Icon/Cursor (.ico/.cur) file type identification and extraction engine.
5.0.700   Added file type identification for:
          * AxCrypt (.axx)
          * Digital Imaging and Communications in Medicine (.dcm)
          * iCalendar (.ics)
          * vCalendar (.vcs)
          * TIFF (.tif)
          * AutoCAD Drawing (.dwg)
          Added file sub-type identification for ZIP-based PowerPoint, Excel, and Word file formats
          (e.g. .pptx, .xlsx, .docx).
          Added file sub-type identification for Java Archives (.jar).
          Text engine extracts hex-encoded sequences of 32 bytes or more (e.g. "090AF0FF...").
          Text engine extracts Base64-encoded embedded objects out of HTML.
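          Illustration only: a minimal Python sketch of extracting hex-encoded sequences of 32
          bytes or more from text.  The 32-byte minimum comes from the entry above; the pattern
          and the function name extract_hex_objects are assumptions for this example, and the
          Text engine's actual matching rules may differ.

```python
import re

MIN_BYTES = 32  # from the changelog entry: sequences of 32 bytes or more

# A maximal run of hex digit pairs, at least MIN_BYTES pairs long, not embedded in a longer
# hex run.  Runs with an odd number of hex digits are skipped by this sketch.
HEX_RUN = re.compile(
    r"(?<![0-9A-Fa-f])(?:[0-9A-Fa-f]{2}){%d,}(?![0-9A-Fa-f])" % MIN_BYTES
)


def extract_hex_objects(text: str) -> list[bytes]:
    """Decode every qualifying hex run in text into raw bytes."""
    return [bytes.fromhex(m.group(0)) for m in HEX_RUN.finditer(text)]
```

          A 64-digit run decodes to one 32-byte object; shorter runs such as "090AF0FF" are ignored.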
5.0.610   Added Nested Container Extensions (NCE) path to object details.
          Added support for displaying UTF-8 object names.
          Added option to show the encryption key/password for encrypted objects.
          Infected objects are highlighted with a red background in the HTML view.
          Upgraded RAR engine to the unrar 5.5.7 source code level.
5.0.540   Added extraction of base64-encoded pkg:binaryData objects out of XML documents.
          Added two new XRay output formats:
          * JSON (JavaScript Object Notation)
          * HTML Zoomable Bubble Graph
5.0.530   Added LZip and LZMA file extraction engines.
5.0.520   Added PHP Archive (PHAR) file extraction engine.
5.0.510   Added ActiveMime, Snappy, WARC, XZ, XXE and ACE v2 file extraction engines.
          Added password-protected ACE and Microsoft Office document extraction.
          Added automatic decryption of password-protected mail attachments if the correct password is found
          within the body of the mail message.
          New per-user options:
          * Extract text from Microsoft Word documents.
          * Extract text from PDF documents.
          * Redact sensitive information from MIME message header fields.
          * Redact the MIME message body.
          * Redact all MIME message attachments.
          Added embedded HTML img src base64 object extraction.
          Added JavaScript file type identification.
          Added JavaScript, Python, VBScript, and Perl script extraction out of Windows Script File (WSF) XML documents.
5.0.371   Added ALZip, LHA, RTF, TNEF, and ISO 9660 CD/DVD image extraction.
          Added password-protected 7Zip extraction.