C++ std::isspace in Practice: Concrete Examples of String Trimming and Word Counting

2025-05-31

std::isspace is a function declared in the C++ standard library header <cctype> (the counterpart of C's <ctype.h>). It tests whether a given character is a whitespace character.

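Purpose

The main purpose of std::isspace is to make it easy to decide whether a particular character should be treated as whitespace when parsing strings or processing text, for example when detecting word or line boundaries.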

Which characters count as whitespace

In the default C locale ("C" locale), the following characters are recognized as whitespace:

Carriage return ('\r', 0x0d)
Form feed ('\f', 0x0c)
Vertical tab ('\v', 0x0b)
Newline ('\n', 0x0a)
Horizontal tab ('\t', 0x09)
Space (' ', 0x20)
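
The list can be checked mechanically. The following is a minimal sketch, assuming the program is still running under the default "C" locale; the helper name ascii_whitespace_codes is hypothetical and introduced only for illustration.

```cpp
#include <cctype>
#include <vector>

// Hypothetical helper: collects every code in the ASCII range (0..127) that
// std::isspace reports as whitespace under the current C locale.
std::vector<int> ascii_whitespace_codes() {
    std::vector<int> codes;
    for (int ch = 0; ch < 128; ++ch) {
        // Values 0..127 are always representable as unsigned char,
        // so no cast is needed here.
        if (std::isspace(ch)) {
            codes.push_back(ch);
        }
    }
    return codes;
}
```

Under the "C" locale the returned codes are 0x09 through 0x0d and 0x20, which are the six characters listed above.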

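Note
Depending on the locale, characters other than those above may also be treated as whitespace. For example, some locales also treat the full-width space as whitespace, although in practice this applies to the wide-character functions, because a full-width space does not fit in a single char.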

How to use the function

std::isspace takes the character to test as an argument of type int.

```cpp
#include <cctype> // required for std::isspace
#include <iostream>

// Prints whether ch is whitespace; label is a printable name for ch.
void report(const char* label, char ch) {
    // isspace returns non-zero (true) for whitespace and zero (false) otherwise.
    // The cast to unsigned char keeps the call well-defined for any char value.
    if (std::isspace(static_cast<unsigned char>(ch))) {
        std::cout << label << " is a whitespace character." << std::endl;
    } else {
        std::cout << label << " is not a whitespace character." << std::endl;
    }
}

int main() {
    report("' '", ' ');    // half-width space
    report("'\\t'", '\t'); // tab
    report("'\\n'", '\n'); // newline
    report("'A'", 'A');    // letter
    report("'1'", '1');    // digit

    return 0;
}
```

Return value

If the argument ch is a whitespace character, the function returns a non-zero value (interpreted as true).
Otherwise, it returns zero (interpreted as false).
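
Because the result is only guaranteed to be "non-zero", not 1, comparing it against 1 or true is fragile. The sketch below shows one way to normalize it to bool; is_space and count_spaces are hypothetical names, and the cast to unsigned char is the precaution discussed in the notes on the argument type.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical wrapper: turns the "non-zero or zero" int result into a bool.
bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Counts the whitespace characters in s using the wrapper above.
std::size_t count_spaces(const std::string& s) {
    return static_cast<std::size_t>(std::count_if(s.begin(), s.end(), is_space));
}
```

A bool-returning wrapper like this is also convenient as a predicate for standard algorithms, as count_spaces illustrates.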

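Notes

Locale
As mentioned above, the behavior of std::isspace depends on the current C locale, which is set with std::setlocale; calling std::locale::global() with a named locale also updates it. If you need the behavior of a specific locale, consider the locale-aware overload in <locale>, template<class CharT> bool std::isspace(CharT ch, const std::locale& loc), rather than changing global state.
Argument type
std::isspace takes an int. A char argument is promoted to int automatically. However, the value must be representable as unsigned char or be equal to EOF; on systems where char is signed, passing a character with a negative value therefore results in undefined behavior. To avoid this, cast to unsigned char before the call.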
```
char c = some_char_value;
if (std::isspace(static_cast<unsigned char>(c))) {
    // ...
}
```

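std::isspace is a very handy function, but using it incorrectly can lead to unexpected behavior and bugs. This section describes common errors and how to resolve them.

Error: a full-width space is not detected as whitespace

Example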

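```cpp
#include <cctype>
#include <iostream>
#include <string>

int main() {
    // A full-width space (U+3000) does not fit in one char. In UTF-8 it is the
    // three bytes E3 80 80, so it is stored in a std::string here.
    std::string zenkaku_space = "\xE3\x80\x80"; // full-width space
    char half_width_space = ' ';                // half-width space

    // Only a single byte can be passed to isspace, so the full-width space
    // is not detected as whitespace.
    if (std::isspace(static_cast<unsigned char>(zenkaku_space[0]))) {
        std::cout << "The full-width space is whitespace." << std::endl;
    } else {
        std::cout << "The full-width space is not whitespace." << std::endl; // usually this is printed
    }

    if (std::isspace(static_cast<unsigned char>(half_width_space))) {
        std::cout << "The half-width space is whitespace." << std::endl;
    } else {
        std::cout << "The half-width space is not whitespace." << std::endl;
    }

    return 0;
}
```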

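Troubleshooting / solutions

Setting the locale
The behavior of std::isspace depends on the locale. By setting an appropriate locale (for example, the system's default locale), the whitespace characters defined by that locale are recognized.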

```cpp
#include <cctype>
#include <iostream>
#include <string>
#include <locale> // required for std::locale and the locale-aware std::isspace

int main() {
    // Use the locale of the current user environment (for example ja_JP.UTF-8).
    // Note: std::locale("") throws std::runtime_error if the environment names
    // a locale that is not available.
    std::locale::global(std::locale(""));
    // Or name one explicitly: std::locale::global(std::locale("ja_JP.UTF-8"));

    // Setting the global locale can be enough for simple char checks, but the
    // recommended approach is the overload that takes a locale object:
    //   template<class CharT> bool std::isspace(CharT ch, const std::locale& loc)

    // Example: passing a std::locale explicitly to isspace (C++11 or later)
    std::string s = "これは 全角スペースの テストです。";
    std::string result_string;
    std::locale current_locale(""); // locale of the current user environment

    for (char c : s) {
        // Pass the char as-is: this overload looks up std::ctype<char>, and the
        // standard library provides no std::ctype<unsigned char> facet.
        if (!std::isspace(c, current_locale)) {
            result_string += c;
        } else {
            result_string += '_'; // replace whitespace with an underscore
        }
    }
    std::cout << "Result with the locale applied: " << result_string << std::endl;

    return 0;
}
```

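Note
A multibyte character such as the full-width space cannot be represented in a single char. When a std::string is processed one char at a time, each character is split into its individual bytes, so isspace sees only fragments and cannot recognize the character. Handling multibyte characters accurately requires wchar_t or std::wstring together with std::iswspace (the wide-character version), plus knowledge of character-encoding conversion facilities such as std::codecvt.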
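
To make the byte-splitting concrete, here is a small sketch that assumes UTF-8 text and the default "C" locale. The names any_byte_is_space and kFullWidthSpaceUtf8 are hypothetical.

```cpp
#include <cctype>
#include <string>

// U+3000 (full-width space) encoded as UTF-8: the three bytes E3 80 80.
const std::string kFullWidthSpaceUtf8 = "\xE3\x80\x80";

// Hypothetical helper: true if any single byte of s is whitespace
// according to std::isspace.
bool any_byte_is_space(const std::string& s) {
    for (char c : s) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            return true;
        }
    }
    return false;
}
```

In the "C" locale none of the three bytes is classified as whitespace, so a byte-by-byte scan never notices the full-width space.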

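Error: undefined behavior caused by the argument type

Problem
The parameter of std::isspace is an int whose value must be representable as unsigned char or be equal to EOF. When a char variable is passed directly on a system where char is signed (the default in many environments), a character with a negative value is promoted to a negative int, which is outside that range and causes undefined behavior. The one negative value that is accepted is EOF (usually -1), so a char holding -1 is silently treated as EOF rather than as the character 0xFF.

Example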

```cpp
#include <cctype>
#include <iostream>

int main() {
    char ch = -50; // a char with a negative value (outside the ASCII range)

    // Passing it as-is may be undefined behavior
    if (std::isspace(ch)) {
        std::cout << "Possibly undefined behavior" << std::endl;
    } else {
        std::cout << "Possibly undefined behavior" << std::endl;
    }

    return 0;
}
```

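Troubleshooting / solutions
Casting the char to unsigned char before it is promoted to int avoids this problem. An unsigned char is always zero or greater, so the value is always in the valid range and cannot collide with EOF.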

```cpp
#include <cctype>
#include <iostream>

int main() {
    char ch1 = 'A';
    char ch2 = static_cast<char>(200); // outside the ASCII range (negative where char is signed)

    // The safe way to pass a char
    if (std::isspace(static_cast<unsigned char>(ch1))) {
        std::cout << "'" << ch1 << "' is a whitespace character." << std::endl;
    } else {
        std::cout << "'" << ch1 << "' is not a whitespace character." << std::endl;
    }

    if (std::isspace(static_cast<unsigned char>(ch2))) {
        std::cout << "The character with code 200 is a whitespace character." << std::endl;
    } else {
        std::cout << "The character with code 200 is not a whitespace character." << std::endl;
    }

    return 0;
}
```

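Error: unexpected characters are (or are not) classified as whitespace

Problem
Depending on the locale, characters the programmer did not expect may be classified as whitespace, or characters that should be whitespace may not be recognized. This tends to happen when a locale other than the C locale is in use, or when the program runs on a different system.

Example
Internationalized text may contain non-ASCII whitespace (for example the no-break space, U+00A0), but the C locale does not recognize it as whitespace.

Troubleshooting / solutions

Whitelist / blacklist approach
In addition to the result of isspace, you can add logic that checks specific character codes yourself.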

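```cpp
#include <cctype>
#include <iostream>
#include <locale>

bool is_my_custom_space(char ch, const std::locale& loc) {
    // Pass the char as-is: the locale-aware overload looks up std::ctype<char>,
    // and there is no std::ctype<unsigned char> facet to fall back on.
    if (std::isspace(ch, loc)) {
        return true;
    }
    // Check separately for characters you want to treat as whitespace but that
    // isspace may not pick up. Characters such as U+200B (zero width space)
    // cannot be held in a single char, so this is shown only as a concept:
    // if (ch == some_other_whitespace_char) return true;
    return false;
}

int main() {
    std::locale current_locale(""); // locale of the environment
    char test_char = ' '; // half-width space as an example
    // char non_breaking_space_char = (char)0xA0; // U+00A0 is rarely representable as a single char

    if (is_my_custom_space(test_char, current_locale)) {
        std::cout << "Classified as whitespace by the custom function" << std::endl;
    }
    return 0;
}
```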
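
For UTF-8 text, the whitelist idea can be applied to byte sequences rather than single chars. The following is only a sketch under stated assumptions: the input is valid UTF-8, the "C" locale is in effect, and the list of extra whitespace is illustrative and incomplete. The helper name space_length_at is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the byte length of the whitespace sequence that
// starts at s[pos], or 0 if no whitespace starts there.
std::size_t space_length_at(const std::string& s, std::size_t pos) {
    if (pos >= s.size()) {
        return 0;
    }
    // Single-byte whitespace known to std::isspace.
    if (std::isspace(static_cast<unsigned char>(s[pos]))) {
        return 1;
    }
    // Extra whitespace we choose to accept (illustrative, not exhaustive).
    static const std::string extras[] = {
        "\xC2\xA0",     // U+00A0 no-break space
        "\xE3\x80\x80", // U+3000 full-width space
    };
    for (const std::string& e : extras) {
        if (s.compare(pos, e.size(), e) == 0) {
            return e.size();
        }
    }
    return 0;
}
```

Returning a length rather than a bool lets the caller skip the whole multibyte sequence in one step instead of advancing byte by byte.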

Explicit locale control
Set the locale used by the program explicitly so that it is clear which locale isspace is evaluated under. In particular, the overload that takes a locale object, std::isspace(char ch, const std::locale& loc), is recommended.
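
As a minimal sketch of explicit control, the function below always classifies characters with the "C" locale via std::locale::classic(), regardless of the global locale. The name count_spaces_classic is hypothetical.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: counts whitespace using the "C" locale explicitly.
std::size_t count_spaces_classic(const std::string& s) {
    const std::locale& c_locale = std::locale::classic();
    std::size_t count = 0;
    for (char c : s) {
        // The <locale> overload is a template on the character type and uses
        // std::ctype<char>, so the char is passed as-is here.
        if (std::isspace(c, c_locale)) {
            ++count;
        }
    }
    return count;
}
```

Because the locale is an argument rather than global state, the result does not change if another part of the program calls std::locale::global().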

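std::isspace is used frequently for whitespace detection in string processing and text parsing. This section presents several concrete examples.

Trimming a string (removing leading and trailing whitespace)

Removing unwanted whitespace (spaces, tabs, newlines, and so on) from the beginning and end of a string is a very common operation.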

```cpp
#include <iostream>
#include <string>
#include <cctype>    // required for std::isspace
#include <algorithm> // required for std::find_if_not

// Removes leading whitespace
std::string trim_left(const std::string& s) {
    auto it = std::find_if_not(s.begin(), s.end(), [](unsigned char c) {
        return std::isspace(c); // the unsigned char parameter makes this call safe
    });
    return std::string(it, s.end());
}

// Removes trailing whitespace
std::string trim_right(const std::string& s) {
    auto it = std::find_if_not(s.rbegin(), s.rend(), [](unsigned char c) {
        return std::isspace(c); // the unsigned char parameter makes this call safe
    });
    return std::string(s.begin(), it.base());
}

// Removes whitespace at both ends
std::string trim(const std::string& s) {
    return trim_left(trim_right(s));
}

int main() {
    std::string text1 = "   Hello, World!   ";
    std::string text2 = "\t\n  C++ Programming  \r\n";
    std::string text3 = "NoTrim";
    std::string text4 = "   "; // a string that is all whitespace

    std::cout << "Original: '" << text1 << "'" << std::endl;
    std::cout << "Trimmed:  '" << trim(text1) << "'" << std::endl;
    std::cout << std::endl;

    std::cout << "Original: '" << text2 << "'" << std::endl;
    std::cout << "Trimmed:  '" << trim(text2) << "'" << std::endl;
    std::cout << std::endl;

    std::cout << "Original: '" << text3 << "'" << std::endl;
    std::cout << "Trimmed:  '" << trim(text3) << "'" << std::endl;
    std::cout << std::endl;

    std::cout << "Original: '" << text4 << "'" << std::endl;
    std::cout << "Trimmed:  '" << trim(text4) << "'" << std::endl;
    std::cout << std::endl;

    return 0;
}
```

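Explanation

Calling isspace safely
Note that the lambdas declare their parameter as unsigned char. Each char is converted to unsigned char when the lambda is called, which avoids the undefined behavior that occurs when a char has a negative value.
trim_left: uses std::find_if_not to find the first non-whitespace character from the start of the string, and returns a new string from that iterator to the end.
trim_right: uses rbegin() and rend() to scan the string in reverse, and std::find_if_not to find the first non-whitespace character from the end. It returns a new string from the start of the string up to that iterator's base().
trim: passes the result of trim_right to trim_left, removing whitespace at both ends.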
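
The trim functions above each return a new string. When the original can be modified, an in-place variant avoids the intermediate copies. This is a sketch of that alternative, not a replacement for the functions above; trim_in_place is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical in-place variant: erases whitespace at both ends of s.
void trim_in_place(std::string& s) {
    // The unsigned char parameter keeps the isspace call well-defined.
    auto not_space = [](unsigned char c) { return std::isspace(c) == 0; };

    // Erase trailing whitespace first. For an all-whitespace string,
    // find_if returns rend(), whose base() is begin(), so s becomes empty.
    s.erase(std::find_if(s.rbegin(), s.rend(), not_space).base(), s.end());

    // Then erase leading whitespace.
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), not_space));
}
```

Erasing the tail first means the second erase shifts fewer characters.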

Counting the words in a string

std::isspace is also useful for counting the words in a string. Words are normally separated by whitespace.

```cpp
#include <iostream>
#include <string>
#include <cctype>

int count_words(const std::string& s) {
    int word_count = 0;
    bool in_word = false; // whether we are currently inside a word

    for (char c : s) {
        // Cast to unsigned char so that isspace is called safely
        if (std::isspace(static_cast<unsigned char>(c))) {
            in_word = false; // whitespace means we have left the word
        } else {
            if (!in_word) { // the moment we enter a word from outside
                word_count++;
                in_word = true;
            }
        }
    }
    return word_count;
}

int main() {
    std::string text1 = "Hello world";
    std::string text2 = "  One   Two   Three  ";
    std::string text3 = "SingleWord";
    std::string text4 = "   "; // all whitespace

    std::cout << "Words in '" << text1 << "': " << count_words(text1) << std::endl; // prints 2
    std::cout << "Words in '" << text2 << "': " << count_words(text2) << std::endl; // prints 3
    std::cout << "Words in '" << text3 << "': " << count_words(text3) << std::endl; // prints 1
    std::cout << "Words in '" << text4 << "': " << count_words(text4) << std::endl; // prints 0

    return 0;
}
```

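Explanation

The in_word flag tracks whether the character currently being processed is part of a word.
When the first non-whitespace character is found (in_word is false), a new word has started, so word_count is incremented.
When a whitespace character is found, in_word is reset to false in preparation for the next word.
Here too, static_cast<unsigned char>(c) is used for safety.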
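
For comparison, the same count can be obtained without an explicit flag by letting a stream do the whitespace skipping. This is a sketch of an alternative, assuming the default stream locale; count_words_stream is a hypothetical name.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical alternative: counts whitespace-separated words by iterating
// over the tokens that operator>> extracts from a string stream.
std::size_t count_words_stream(const std::string& s) {
    std::istringstream iss(s);
    return static_cast<std::size_t>(std::distance(
        std::istream_iterator<std::string>(iss),
        std::istream_iterator<std::string>()));
}
```

This version allocates a stream and a temporary string per word, so the hand-written loop is likely the better fit when the function is called very often.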

Splitting a string (whitespace-delimited)

This splits a string on whitespace and collects each substring (token).

```cpp
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
#include <sstream> // required for std::istringstream

// Splits a string on whitespace
std::vector<std::string> split_by_whitespace(const std::string& s) {
    std::vector<std::string> tokens;
    std::string current_token;

    for (char c : s) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            // Whitespace
            if (!current_token.empty()) {
                tokens.push_back(current_token); // add the current token to the list
                current_token.clear();           // reset the token
            }
        } else {
            // Not whitespace
            current_token += c; // append the character to the current token
        }
    }
    // A token may remain at the end of the string, so check once more
    if (!current_token.empty()) {
        tokens.push_back(current_token);
    }
    return tokens;
}

// (Reference) A more idiomatic C++ split using std::istringstream
std::vector<std::string> split_with_stringstream(const std::string& s) {
    std::vector<std::string> tokens;
    std::istringstream iss(s);
    std::string token;
    while (iss >> token) { // operator>> splits on whitespace by default
        tokens.push_back(token);
    }
    return tokens;
}

int main() {
    std::string text1 = "  apple banana  cherry   date";
    std::string text2 = "  single  ";
    std::string text3 = "";

    std::cout << "--- split_by_whitespace ---" << std::endl;
    std::vector<std::string> result1 = split_by_whitespace(text1);
    for (const auto& token : result1) {
        std::cout << "'" << token << "'" << std::endl;
    }
    std::cout << "Text2: " << std::endl;
    std::vector<std::string> result2 = split_by_whitespace(text2);
    for (const auto& token : result2) {
        std::cout << "'" << token << "'" << std::endl;
    }
    std::cout << "Text3 (empty): " << std::endl;
    std::vector<std::string> result3 = split_by_whitespace(text3);
    for (const auto& token : result3) {
        std::cout << "'" << token << "'" << std::endl;
    }

    std::cout << "\n--- split_with_stringstream (reference) ---" << std::endl;
    std::vector<std::string> result_ss1 = split_with_stringstream(text1);
    for (const auto& token : result_ss1) {
        std::cout << "'" << token << "'" << std::endl;
    }

    return 0;
}
```

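Explanation

The split_by_whitespace function scans the string one character at a time and uses std::isspace to decide whether each one is whitespace.
A non-whitespace character is appended to current_token.
When a whitespace character is found and current_token is not empty, that marks the end of a word, and the token is added to the tokens list.
Finally, because the string may end in the middle of a word, whatever remains in current_token is checked and added if necessary.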
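
If copying each token is a concern, the same scan can return views into the original text instead. The sketch below assumes C++17 (std::string_view); split_views is a hypothetical name, and the returned views are only valid while the underlying character data is alive.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical variant: splits on whitespace without copying the tokens.
std::vector<std::string_view> split_views(std::string_view s) {
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    std::vector<std::string_view> tokens;

    auto it = s.begin();
    while (it != s.end()) {
        // Skip the whitespace before the next token.
        it = std::find_if_not(it, s.end(), is_space);
        // Find where the token ends.
        auto token_end = std::find_if(it, s.end(), is_space);
        if (it != token_end) {
            tokens.push_back(s.substr(static_cast<std::size_t>(it - s.begin()),
                                      static_cast<std::size_t>(token_end - it)));
        }
        it = token_end;
    }
    return tokens;
}
```

Passing a temporary std::string to this function would leave the views dangling, so it is best suited to text that outlives the result.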

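std::isspace is very convenient for particular use cases, but there are reasons to consider other approaches:

When performance is critical and simple comparisons are enough
The set of whitespace characters to test is small, and you want to compare directly and avoid the overhead of a function call.
When you want strict control over the locale
std::isspace depends on the locale, but passing a std::locale object explicitly every time is tedious, or you want behavior specific to one locale.
When you need higher-level string operations
You want more than a single-character test, such as parsing a whole string or matching with regular expressions.

Alternatives for these situations are shown below.

Combining std::string member functions and algorithms

These do not replace std::isspace directly, but they are higher-level ways to achieve the goal of detecting whitespace in string processing.

std::string::find_first_of / find_first_not_of
These can be used to search a string for a specific set of characters (including whitespace).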

```cpp
#include <iostream>
#include <string>

int main() {
    std::string s = "   Hello World   ";
    std::string whitespace_chars = " \t\n\r\f\v"; // the whitespace std::isspace recognizes in the C locale

    // Find the first non-whitespace character
    size_t first_non_space = s.find_first_not_of(whitespace_chars);
    // Find the last non-whitespace character
    size_t last_non_space = s.find_last_not_of(whitespace_chars);

    if (first_non_space != std::string::npos && last_non_space != std::string::npos) {
        std::string trimmed_s = s.substr(first_non_space, last_non_space - first_non_space + 1);
        std::cout << "Original: '" << s << "'" << std::endl;
        std::cout << "Trimmed (using find_first_not_of): '" << trimmed_s << "'" << std::endl;
    } else {
        // The string is all whitespace, or empty
        std::cout << "String is all whitespace or empty: '" << s << "'" << std::endl;
    }

    return 0;
}
```

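Advantages
Specialized for string operations, and can search for several characters at once.
Disadvantages
Unlike std::isspace, it does not classify each character dynamically, so whitespace_chars must be defined explicitly. Including locale-dependent whitespace accurately requires extra care.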
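
The same idea can be packaged as a reusable function that also handles the all-whitespace and empty cases by returning an empty string. This is a minimal sketch limited to the six ASCII whitespace characters; trim_ascii is a hypothetical name.

```cpp
#include <string>

// Hypothetical helper: trims the six ASCII whitespace characters from both
// ends of s using only std::string member functions.
std::string trim_ascii(const std::string& s) {
    static const char* const kWhitespace = " \t\n\r\f\v";
    const std::string::size_type first = s.find_first_not_of(kWhitespace);
    if (first == std::string::npos) {
        return std::string(); // empty or all whitespace
    }
    // A non-whitespace character exists, so this search cannot fail.
    const std::string::size_type last = s.find_last_not_of(kWhitespace);
    return s.substr(first, last - first + 1);
}
```

Checking only the first search result is enough: if one non-whitespace character exists, the search from the end finds one as well.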

Splitting a string with std::istringstream

For extracting whitespace-delimited tokens, std::istringstream is a powerful and concise alternative. By default, it treats whitespace (the characters isspace recognizes) as the delimiter.

```cpp
#include <iostream>
#include <string>
#include <vector>
#include <sstream> // for std::istringstream

int main() {
    std::string text = "  apple   banana\tcherry\n";
    std::istringstream iss(text);
    std::string token;
    std::vector<std::string> tokens;

    // operator>> skips whitespace and reads the next run of non-whitespace characters
    while (iss >> token) {
        tokens.push_back(token);
    }

    std::cout << "Original: '" << text << "'" << std::endl;
    std::cout << "Tokens:" << std::endl;
    for (const auto& t : tokens) {
        std::cout << "  '" << t << "'" << std::endl;
    }

    return 0;
}
```

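Advantages
The code is very concise and well suited to the common task of token extraction. Whitespace recognized by std::isspace is skipped automatically.
Disadvantages
Not suitable when whitespace itself should be treated as a token, or when only specific whitespace characters should be skipped.

Using std::regex (regular expressions)

For more complex whitespace patterns, or for searching, replacing, and splitting strings that contain whitespace, the regular expression library std::regex, available since C++11, is a powerful alternative.

\s: a regular-expression character class escape that matches every whitespace character std::isspace recognizes (the set may be extended depending on the locale).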

```cpp
#include <iostream>
#include <string>
#include <regex> // for std::regex

int main() {
    std::string text = "   Hello\tWorld!\nThis is a test.   ";

    // A named regex object: std::sregex_token_iterator keeps a pointer to the
    // regex, and constructing the iterator from a temporary is rejected since C++14.
    const std::regex whitespace("\\s+");

    // Replace each run of whitespace with a single space
    std::string result = std::regex_replace(text, whitespace, " ");
    std::cout << "Original: '" << text << "'" << std::endl;
    std::cout << "Replaced multiple spaces: '" << result << "'" << std::endl;

    // Split the string on whitespace. Because text starts with whitespace,
    // the first token is an empty string.
    std::sregex_token_iterator it(text.begin(), text.end(), whitespace, -1);
    std::sregex_token_iterator end;
    std::cout << "\nTokens (using regex_token_iterator):" << std::endl;
    for (; it != end; ++it) {
        std::cout << "  '" << *it << "'" << std::endl;
    }

    return 0;
}
```

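Advantages
Very flexible and powerful. Allows more complex pattern matching that includes what std::isspace does.
Disadvantages
The learning cost is high, and the overhead is large for a simple whitespace test.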
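
When splitting with a regex, input that begins with whitespace produces an empty first token. The sketch below filters empty tokens out and keeps the regex in a named, long-lived object so the iterator never refers to a temporary; regex_split is a hypothetical name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits s on runs of whitespace and drops empty tokens.
std::vector<std::string> regex_split(const std::string& s) {
    // Constructed once and reused; building a std::regex is relatively costly.
    static const std::regex whitespace("\\s+");

    std::vector<std::string> tokens;
    // Submatch index -1 selects the text between matches.
    std::sregex_token_iterator it(s.begin(), s.end(), whitespace, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() > 0) {
            tokens.push_back(it->str());
        }
    }
    return tokens;
}
```

Keeping the regex static also avoids recompiling the pattern on every call.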

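Using std::iswspace (the wide-character version)

When a string contains multibyte or Unicode characters, the char-based std::isspace is not sufficient. In that case, convert the text to wide characters (wchar_t) and use the wide-character version, std::iswspace.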

```cpp
#include <codecvt> // std::codecvt_utf8 (deprecated since C++17)
#include <cwctype> // for std::iswspace
#include <iostream>
#include <locale>  // std::locale, std::wstring_convert
#include <string>

// Prints whether wc is whitespace, together with its code in hexadecimal.
void report(const wchar_t* label, wchar_t wc) {
    std::wcout << label << L" (U+" << std::hex << static_cast<unsigned long>(wc) << std::dec
               << (std::iswspace(wc) ? L") is whitespace." : L") is not whitespace.")
               << std::endl;
}

int main() {
    // Set the locale to the system default (e.g. ja_JP.UTF-8 in a Japanese
    // environment) so that iswspace may recognize the full-width space.
    std::locale::global(std::locale(""));
    // Only std::wcout is used below: mixing std::cout and std::wcout on the
    // same output is not reliable.
    std::wcout.imbue(std::locale());

    std::wcout << L"--- iswspace ---" << std::endl;
    report(L"Full-width space", L'\u3000');
    report(L"Half-width space", L' ');
    report(L"Ordinary character", L'A');

    // Example: convert a std::string to std::wstring and apply iswspace
    std::string s_mb = "これは　全角スペースを含む　文字列です。";
    // codecvt_utf8 always decodes UTF-8, independent of the locale, so this
    // assumes the string literal above is encoded as UTF-8.
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    std::wstring s_w = converter.from_bytes(s_mb);

    std::wcout << L"\n--- std::wstring and iswspace ---" << std::endl;
    for (wchar_t wc : s_w) {
        if (std::iswspace(wc)) {
            std::wcout << L"['" << wc << L"'] is whitespace." << std::endl;
        } else {
            std::wcout << L"['" << wc << L"'] is not whitespace." << std::endl;
        }
    }

    return 0;
}
```

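Advantages
Can accurately classify locale-dependent multibyte/wide whitespace characters that the char-based isspace cannot handle.
Disadvantages
Requires additional knowledge of wchar_t and std::wstring, as well as conversion between string encodings (std::wstring_convert and similar). It may not work as expected if the locale is not set correctly.

Direct comparison against specific character codes

If the set of whitespace characters to test is very small and limited to ASCII, you can compare against the character codes directly when performance is the top priority. This should be reserved for very limited cases, however.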

```cpp
#include <iostream>
#include <string>

bool is_basic_whitespace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f' || c == '\v';
}

int main() {
    char ch1 = ' ';
    char ch2 = '\t';
    char ch3 = 'A';

    if (is_basic_whitespace(ch1)) {
        std::cout << "'" << ch1 << "' is basic whitespace." << std::endl;
    }
    if (is_basic_whitespace(ch2)) {
        std::cout << "'\\t' is basic whitespace." << std::endl;
    }
    if (is_basic_whitespace(ch3)) {
        std::cout << "'" << ch3 << "' is basic whitespace." << std::endl;
    } else {
        std::cout << "'" << ch3 << "' is not basic whitespace." << std::endl;
    }

    return 0;
}
```

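Advantages
No locale lookup is involved, and the comparisons are simple enough for the compiler to inline, so it is typically very fast.
Disadvantages
Cannot handle locale-dependent whitespace. The list of whitespace characters must be maintained by hand, so it does not extend well.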
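
A related technique is a 256-entry lookup table, which replaces the chain of comparisons with a single indexed read. The sketch below is one possible form, with the hypothetical name is_ascii_space_table. Whether it is actually faster than the comparison chain or std::isspace depends on the compiler and standard library, so it should be measured before being adopted.

```cpp
#include <array>

// Hypothetical helper: classifies the six ASCII whitespace characters with a
// lookup table that is built once, on first use.
bool is_ascii_space_table(char c) {
    static const std::array<bool, 256> table = [] {
        std::array<bool, 256> t{}; // all entries start as false
        for (const char* p = " \t\n\r\f\v"; *p != '\0'; ++p) {
            t[static_cast<unsigned char>(*p)] = true;
        }
        return t;
    }();
    // Index with unsigned char so that negative char values stay in range.
    return table[static_cast<unsigned char>(c)];
}
```

Like the direct comparison, the table is fixed to ASCII whitespace and ignores the locale.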
