C++ Tips : String

string, string::npos, sso, sbo, string_view.

std::string, std::string_view


std::string

  • Capacity grows automatically as the string grows, but it does not shrink on its own.
  • To reduce capacity, call shrink_to_fit(). It is a non-binding request, so the implementation may ignore it.
std::string s;

for (int i = 0; i < 100'000; ++i)
	s.push_back('A');
// This loop triggers more than one reallocation.
// If you know the final size, call s.reserve(100'000) before the loop.
#include <iostream>
#include <string>

int main() 
{
	std::string str;
	char arr[] = "Hello World!";

	str.assign(arr, 5);
	std::cout << str << '\n'; // Hello

	std::string s1(12, 'A');
	std::cout << s1 << '\n'; // AAAAAAAAAAAA

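	// Braces select the initializer_list<char> constructor: {char(52), 'A'}.
	// 52 is '4' in ASCII, so this is NOT 52 copies of 'A'.
	std::string s2{52, 'A'};
	std::cout << s2 << '\n'; // 4A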
}

std::string::npos

  • npos indicates that no position was found.
  • Functions like .find(), .rfind(), and .find_first_of() return an index if they find a match.
  • Otherwise they return npos.
  • Prefer .at() over operator[] when the index might be out of range.
  • operator[] won’t throw. If pos > size, the behaviour is undefined.
  • .at() throws std::out_of_range if pos >= size.

Be careful with .c_str()

#include <iostream>
#include <string>

int main() 
{
	std::string str = "Hi ";
	auto p = str.c_str();
	str += "Hello World! Hello World! Hello World! Hello World!";

	std::cout << p << '\n';
	// Appending may reallocate the buffer, which invalidates p.
	// If that happens, reading through p here is undefined behaviour.
}

SSO

  • Short String Optimization
  • Short strings are stored directly inside the std::string object, so no heap allocation is needed.
  • It saves both memory allocations and CPU cycles.
  • The buffer size depends on the standard library implementation and the platform.
  • SSO is just SBO applied to strings.
// Conceptually, std::string looks something like this (real layouts differ by implementation)
struct StringImpl {
    union {
        char* heap_ptr;          // for large strings
        char sso_buffer[15];     // for short strings (implementation-defined size)
    };
    size_t size;
    size_t capacity;
};

SBO

  • Small Buffer Optimization
  • A class stores small data directly inside itself, avoiding heap allocation for small sizes (usually a few bytes).
  • It saves both mem allocs and cpu cycles.
  • The buffer size depends on the library implementation and the platform.

How to detect SBO in action?

  • Check sizeof(std::function<…>): the object's size puts an upper bound on its internal buffer.
  • Use tools like Valgrind to see whether a heap allocation happens.
  • Overload new and delete to log allocations.

std::string_view

  • #include <string_view>
  • C++17
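  • Read-only.
  • A non-owning view of a string.
  • It does not allocate or copy memory.
  • It just points to existing character data (a pointer plus a length).
  • Cheap to copy and pass around: no allocation, unlike copying a std::string or building a substring.
  • A view can dangle if the underlying string is destroyed or modified, so be careful!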
std::string_view sv;
{
    std::string temp = "hello";
    sv = temp;  
} 

// temp has been destroyed, so sv dangles. Reading it is undefined behavior.
std::cout << sv << "\n";

Prefer swap() for exchanging strings and containers. It is cheap because it typically exchanges internal pointers and sizes instead of copying the contents. Also prefer STL algorithms over hand-written loops wherever one fits.

std::string str;
// std::getline reads an entire line of text, including spaces,
// and discards the trailing newline.
std::getline(std::cin, str);

// Example: input "Hello World" -> str == "Hello World"
This post is licensed under CC BY 4.0 by the author.