Unicode error converting Microsoft Documentation PDF to markdown file. #1597

@perpetualconflict

Description

Open https://learn.microsoft.com/en-us/windows/win32/direct3d12/direct3d-12-graphics and click "Download PDF" in the bottom left. The file is too large to upload here.

Terminal output:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "...\Roaming\Python\Python314\Scripts\markitdown.exe\__main__.py", line 6, in <module>
    sys.exit(main())
             ~~~~^^
  File "...\Roaming\Python\Python314\site-packages\markitdown\__main__.py", line 93, in main
    _handle_output(args, result)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "...\Roaming\Python\Python314\site-packages\markitdown\__main__.py", line 102, in _handle_output
    print(result.text_content)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python314\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\uff89' in position 425: character maps to <undefined>
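
The traceback ends in `encodings\cp1252.py` while executing `print(result.text_content)`, which suggests the converted text itself was produced and the failure happens when the output stream encodes it as cp1252. That code page has no mapping for `'\uff89'`. Below is a minimal sketch that reproduces the same error without markitdown, assuming the stream's encoding is the only factor. `write_text` and `sample` are hypothetical names used for illustration and are not part of markitdown.

```python
import io

def write_text(text, encoding, errors="strict"):
    """Write text through a text stream with the given encoding, return the raw bytes."""
    buf = io.BytesIO()
    stream = io.TextIOWrapper(buf, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return buf.getvalue()

# Hypothetical stand-in for the converted text; it contains the character from the traceback.
sample = "abc\uff89"

# A cp1252 stream with strict error handling fails the same way as the report.
try:
    write_text(sample, "cp1252")
    failed = False
except UnicodeEncodeError:
    failed = True

# A UTF-8 stream can represent the character.
utf8_bytes = write_text(sample, "utf-8")

# errors="replace" keeps cp1252 but substitutes "?" for unmappable characters.
replaced = write_text(sample, "cp1252", errors="replace")
```

If this is the cause, possible workarounds are setting the environment variable `PYTHONIOENCODING=utf-8` or `PYTHONUTF8=1` before running the command, or calling `sys.stdout.reconfigure(encoding="utf-8")` when using the library from a script. None of these has been verified against this particular PDF.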
