Korean Encoding Issue in P4Python with Non-Unicode Perforce
This content was translated from Korean using AI.

Non-Unicode Perforce

A Perforce server runs in either Unicode or Non-Unicode mode. A newly installed server runs in Non-Unicode mode by default, and converting a server that has been in use for a long time to Unicode mode can be difficult. For details, see "Set up a server for Unicode" in the P4 Server Administration Documentation (2025.1).

For that reason, if file names or metadata will contain non-ASCII characters such as Korean, it is generally safer to set up the server in Unicode mode from the start.

P4Python

In addition to the GUI environment of P4V and the command-line interface of P4, Perforce provides APIs that allow you to use its features in various programming languages, including C/C++, C#, and Python.

Among these, P4Python is the Python API, a wrapper around the C++ API. For more information, see the P4 API for Python Documentation (2025.1).

With this API, you can call P4 commands from Python as shown below:

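from P4 import P4, P4Exception

p4 = P4()
p4.port = "127.0.0.1:1666"
p4.user = "HAPPY"
p4.client = "HAPPY_test"

try:
    p4.connect()
    result = p4.run("fstat", "//...")
    print(result)
except P4Exception:
    # p4.errors holds the error messages from the failed command
    for e in p4.errors:
        print(e)
finally:
    # Disconnect even when the command raised an exception
    if p4.connected():
        p4.disconnect()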

Problematic Configuration

Server

OS: Windows Server
Locale: ko; Korean (Unicode UTF-8 support disabled)
Non-Unicode mode

Client

OS: Windows 10
Locale: ko; Korean (Unicode UTF-8 support disabled)

Encoding Issue

When files with Korean paths are submitted to a Non-Unicode server and then queried through P4Python, the returned strings may come back garbled, or the command may fail outright.

[Image: 3_encoding_failed.png]

In such cases, setting the encoding on the P4 object before connecting resolves the issue:

p4 = P4()
p4.port = "127.0.0.1:1666"
p4.user = "HAPPY"
p4.client = "HAPPY_test"
p4.encoding = "cp949"        # Decode server strings as CP949 (Korean Windows code page)

[Image: 3_encoding_success.png]

The official documentation briefly states that you should specify the encoding for receiving strings from a Non-Unicode server:

encoding

Encoding to use when receiving strings from a non-Unicode server. If unset, use UTF8.
Can be set to a legal Python encoding, or to `raw` to receive Python bytes instead of Unicode strings. Requires Python 3.
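
The mismatch behind this setting can be reproduced without a server. The sketch below is only an illustration in plain Python: the depot path is hypothetical, and `try_decode` is a helper made up for this example, not part of P4Python. It assumes the Non-Unicode server hands back path bytes in CP949, which matches the Korean Windows configuration described above.

```python
# Hypothetical depot path; on a Korean-locale Windows host the bytes
# stored by a Non-Unicode server would be CP949, not UTF-8.
path = "//depot/문서/한글.txt"
server_bytes = path.encode("cp949")


def try_decode(data, encoding):
    """Return the decoded string, or None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Default behaviour (UTF-8): CP949 Hangul bytes are not valid UTF-8.
as_utf8 = try_decode(server_bytes, "utf-8")

# With the matching encoding the original path is recovered.
as_cp949 = try_decode(server_bytes, "cp949")

# Lenient decoding does not fail, but yields U+FFFD replacement characters.
lossy = server_bytes.decode("utf-8", errors="replace")

print(as_utf8 is None)    # True
print(as_cp949 == path)   # True
print("\ufffd" in lossy)  # True
```

If the server's code page is not known in advance, setting the encoding to `raw` and decoding the returned bytes yourself, in the same way as `try_decode`, is one way to try several candidate encodings.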

Additional Issues

I could not pin down the exact conditions, partly because documentation on this is sparse, but some commands still failed even with the encoding specified.

The P4 command-line tool did not show these problems, so I also wrote a wrapper that runs P4 commands as subprocesses. P4's global output options, tagged output (-Ztag) and custom formatting (-F), make the output somewhat less cumbersome to parse.
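
A minimal sketch of such a wrapper is shown here. It is not the exact code used in this post: `run_p4` and `parse_ztag` are hypothetical helper names, and it assumes that `p4` is on the PATH, that connection settings come from the environment, and that the command-line output is CP949 on this setup. The parser only handles the common `... key value` lines of tagged output, with records separated by blank lines, and ignores anything else.

```python
import subprocess


def parse_ztag(text):
    """Parse `p4 -Ztag` output into a list of dicts (one per record)."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("... "):
            rest = line
            # Some commands nest fields as "... ... key value".
            while rest.startswith("... "):
                rest = rest[4:]
            key, _, value = rest.partition(" ")
            current[key] = value
        elif not line.strip() and current:
            # A blank line ends the current record.
            records.append(current)
            current = {}
    if current:
        records.append(current)
    return records


def run_p4(*args, encoding="cp949"):
    """Run a p4 command with tagged output and return the parsed records.

    Assumption: the console output uses `encoding` (CP949 here).
    """
    completed = subprocess.run(
        ["p4", "-Ztag", *args],
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_ztag(completed.stdout.decode(encoding))


# Parsing a hand-written sample; no server is needed for this part.
sample = (
    "... depotFile //depot/a.txt\n"
    "... headRev 3\n"
    "\n"
    "... depotFile //depot/b.txt\n"
    "... headRev 1\n"
)
sample_records = parse_ztag(sample)
print(len(sample_records))  # 2
```

With a reachable server, `run_p4("fstat", "//...")` would return one dict per file. Decoding explicitly in the wrapper keeps the encoding decision in one place instead of depending on the console code page.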

For more on the -F option, see "Formatting P4 command output using the -F global option with examples".