[SOLVED] Getting a std::string or C string from a QString representing an arbitrary filename on Windows

Issue

I’m using QFileDialog::getOpenFileName() to have the user select a file, but I need the result to be a C string, since I have to pass it to something written in C which uses fopen(). I cannot change this.

The problem I’m finding is that, on Windows/MinGW, using toStdString() on the resulting QString doesn’t work well with Unicode/non-ASCII filenames. Trying to open the file based on the std::string fails, because some character set conversion seems to be occurring. Sometimes using toLocal8Bit() to convert works, but sometimes it doesn’t.

Consider the following (MinGW) program:

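#include <cstdio>
#include <iostream>

#include <QApplication>
#include <QFileDialog>
#include <QFile>

// Returns true if fopen() can open the given narrow path; closes the handle again.
static bool canFopen(const char *path)
{
    std::FILE *fp = std::fopen(path, "r");
    if (fp == nullptr)
        return false;
    std::fclose(fp);
    return true;
}

int main(int argc, char **argv)
{
    QApplication app(argc, argv);
    auto filename = QFileDialog::getOpenFileName();
    QFile f(filename);

    // toStdString() produces UTF-8 (Qt 5 and later); toLocal8Bit() produces the local 8-bit encoding.
    std::cout << "fopen: " << canFopen(filename.toStdString().c_str()) << std::endl;
    std::cout << "fopen (local8bit): " << canFopen(filename.toLocal8Bit().constData()) << std::endl;
    std::cout << "Qt can open: " << f.open(QIODevice::ReadOnly) << std::endl;
}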

  • For a file called ☢.txt, toStdString() works, toLocal8Bit() doesn’t.
  • For a file called ä.txt, toStdString() doesn’t work, toLocal8Bit() does.
  • For a file called Ȁ.txt, neither works.

In all cases, though, QFile is able to open the file. I suppose it’s using the Unicode Windows functions, while the C code is using fopen(), which, to my understanding, is a so-called ANSI function on Windows. But is there any way to get a “bag of bytes”, so to speak, from a QString? I don’t care about the encoding of the filename; I just want something that can be passed to fopen() to open the file.

I’ve found that using GetShortPathName to get a short filename from filename.toWCharArray() seems to work, but that’s very cumbersome, and my understanding is that NTFS filesystems can be told not to support short names, so it’s not a viable solution in general anyway.

Solution

The non-Unicode (“ANSI”) Windows API interprets file paths either in the current ANSI code page or in the OEM code page (see also https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfope). ANSI is the default.

So your question translates to: How can I convert a UTF-8 or UTF-16 string to ANSI or OEM?

There’s an answer for the ANSI conversion: How to convert from UTF-8 to ANSI using standard c++

Keep in mind, though, that not every Unicode string can be represented in these narrower code pages, so for some filenames no such conversion can succeed.
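
To illustrate why some names cannot be converted, here is a minimal, portable sketch. It is not what Qt or Windows actually do: it uses ISO-8859-1 (Latin-1) as a simplified stand-in for a Western European ANSI code page such as Windows-1252 (the two differ in the range 0x80 to 0x9F), and the helper names toLatin1 and checkQuestionFilenames are made up for this example. On Windows, the real conversion would typically go through WideCharToMultiByte with CP_ACP, whose lpUsedDefaultChar argument reports whether a character had to be substituted.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: converts UTF-16 to ISO-8859-1 (Latin-1), used here as a
// simplified stand-in for a Western European ANSI code page.
// Returns false, leaving `out` untouched, if any character has no
// single-byte representation.
bool toLatin1(const std::u16string &utf16, std::string &out)
{
    std::string converted;
    converted.reserve(utf16.size());
    for (char16_t unit : utf16) {
        if (unit > 0xFF)
            return false; // not representable (this also rejects surrogate pairs)
        converted.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
    }
    out = converted;
    return true;
}

// Mirrors the three filenames from the question.
void checkQuestionFilenames()
{
    std::string bytes;
    assert(toLatin1(u"\u00E4.txt", bytes));   // ä (U+00E4) fits in one byte
    assert(!toLatin1(u"\u2622.txt", bytes));  // ☢ (U+2622) does not
    assert(!toLatin1(u"\u0200.txt", bytes));  // Ȁ (U+0200) does not
}
```

Assuming the asker’s system uses a Western European code page, this matches the toLocal8Bit() results reported in the question: ä.txt converts, while ☢.txt and Ȁ.txt have no representation at all.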

Answered By – kkoehne

Answer Checked By – Clifford M. (BugsFixing Volunteer)
