Performance of std::string_view vs. C++17 std::string - DZone (2023)

How much faster is std::string_view than standard std::string operations?

Check out some examples where I compare std::string_view against std::string.

Introduction

I was looking for some examples of string_view, and after a while I got curious about the performance gain we might get.

string_view is conceptually only a view of a string: it is usually implemented as [ptr, length]. When a string_view is created, the data does not have to be copied (unlike when you create a copy of a string). What's more, string_view is smaller than std::string in terms of its size on the stack/heap.

For example, let's consider a possible (pseudo) implementation:

string_view {
    size_t _len;
    const CharT* _str;
}

Depending on the architecture, the total size is either 8 or 16 bytes.

Due to Small String Optimization (SSO), std::string is usually 24 or 32 bytes, so double or triple the size of string_view. In that form, such a string can hold between 15 (GCC, MSVC) and 22 characters (Clang) without having to allocate memory on the heap. Of course, a longer string uses more memory, but 24/32 bytes is the minimum size of std::string.
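To check these sizes on your own toolchain, here is a minimal sketch. The names StringFootprint, measureFootprint and footprintDemo are made up for this illustration, and the numbers depend on the standard library and architecture.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper struct, for illustration only.
struct StringFootprint {
    std::size_t viewBytes;     // sizeof(std::string_view)
    std::size_t stringBytes;   // sizeof(std::string), excluding any heap buffer
    std::size_t emptyCapacity; // capacity() of a default-constructed std::string;
                               // on SSO implementations this is the in-place buffer size
};

inline StringFootprint measureFootprint() {
    return { sizeof(std::string_view), sizeof(std::string), std::string{}.capacity() };
}

inline void footprintDemo() {
    const StringFootprint f = measureFootprint();
    // Holds on libstdc++, libc++ and MSVC (pointer + length),
    // but it is not guaranteed by the standard.
    assert(f.viewBytes == sizeof(const char*) + sizeof(std::size_t));
}
```

Printing the three fields on different compilers is a quick way to compare against the figures quoted above.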

For more details on SSO, check out this excellent post: Exploring std::string. Or this one: SSO-23 (as suggested in a comment).

Obviously, returning string views, creating string views, and using substr is much faster than making deep copies of std::string. However, the initial performance tests showed that std::string is usually highly optimized, and sometimes string_view doesn't win that much.

The Series

This article is part of my C++17 Library Utilities series. Here is the list of other topics I will cover:


string_view Operations

string_view is modeled to be very similar to std::string. However, the view is non-owning, so no operation that modifies the data can be part of the API. Here's a short list of methods you can use with this new type:

  • operator[]
  • at
  • front
  • back
  • data
  • size/length
  • max_size
  • empty
  • remove_prefix
  • remove_suffix
  • swap
  • copy (not constexpr)
  • substr - complexity O(1) and not O(n) as in std::string
  • compare
  • find
  • rfind
  • find_first_of
  • find_last_of
  • find_first_not_of
  • find_last_not_of
  • Lexicographic comparison operators: ==, !=, <=, >=, <, >
  • operator <<

An important note is that all of the above methods (except for copy and operator <<) are also constexpr! This capability allows you to work with strings in constant expressions.
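As a small illustration of the constexpr support, the sketch below trims spaces at compile time. trimSpaces, kGreeting, kTrimmed and trimDemo are hypothetical names introduced for this example, and it assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: strips leading and trailing spaces.
// Every member used here is constexpr in C++17.
constexpr std::string_view trimSpaces(std::string_view sv) {
    while (!sv.empty() && sv.front() == ' ') sv.remove_prefix(1);
    while (!sv.empty() && sv.back() == ' ')  sv.remove_suffix(1);
    return sv;
}

constexpr std::string_view kGreeting = "  Hello World  ";
constexpr std::string_view kTrimmed  = trimSpaces(kGreeting);

// All of these are evaluated by the compiler.
static_assert(kTrimmed == "Hello World");
static_assert(kTrimmed.size() == 11);
static_assert(kTrimmed.find("World") == 6);
static_assert(kTrimmed.substr(0, 5) == "Hello");

// The same function also works at run time.
inline void trimDemo() {
    assert(trimSpaces("  a b ") == "a b");
}
```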

Also, in C++20 we will get at least two new methods:

  • starts_with
  • ends_with

They are implemented for both std::string_view and std::string. As of now (July 2018), Clang 6.0 supports these functions, so you can experiment with them.
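If your compiler does not offer the C++20 members yet, the same checks can be approximated in C++17. The free functions startsWith and endsWith below are hypothetical stand-ins written for this sketch, not the standard library implementation; with C++20 you would call the members on the view directly.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical C++17 stand-in for the C++20 member starts_with.
constexpr bool startsWith(std::string_view sv, std::string_view prefix) {
    return sv.size() >= prefix.size() &&
           sv.substr(0, prefix.size()) == prefix;
}

// Hypothetical C++17 stand-in for the C++20 member ends_with.
constexpr bool endsWith(std::string_view sv, std::string_view suffix) {
    return sv.size() >= suffix.size() &&
           sv.substr(sv.size() - suffix.size()) == suffix;
}

static_assert(startsWith("string_view", "string"));
static_assert(endsWith("string_view", "view"));
static_assert(!startsWith("view", "string_view"));

inline void prefixSuffixDemo() {
    const std::string_view fileName = "report.txt";
    assert(endsWith(fileName, ".txt"));
    assert(!startsWith(fileName, "draft"));
}
```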

A Basic Test - substr

substr probably offers the biggest advantage over the standard string substr. Its complexity is O(1) and not O(n) as with regular strings.
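The difference is easy to observe: a view returned by substr still points into the original buffer, while std::string::substr produces a new string with its own storage. A minimal sketch (substrDemo is a name made up for this example):

```cpp
#include <cassert>
#include <string>
#include <string_view>

inline void substrDemo() {
    const std::string s = "Hello Super Extra Programming World";

    // string_view::substr only adjusts a pointer and a length: O(1), no copy.
    const std::string_view sv = s;
    const std::string_view superSv = sv.substr(6, 5);
    assert(superSv == "Super");
    assert(superSv.data() == s.data() + 6);   // still points into s

    // std::string::substr builds a new string and copies the characters: O(n).
    const std::string superStr = s.substr(6, 5);
    assert(superStr == "Super");
    assert(superStr.data() != s.data() + 6);  // separate storage
}
```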

I created a simple test with Quick C++ Benchmark and got the following results:

[Chart: substr benchmark results, std::string vs. std::string_view]

Using Clang 6.0.0, -O3, libc++

Code:

#include <benchmark/benchmark.h>
#include <string>

static void StringSubStr(benchmark::State& state) {
    std::string s = "Hello Super Extra Programming World";
    for (auto _ : state) {
        auto oneStr = s.substr(0, 5);
        auto twoStr = s.substr(6, 5);
        auto threeStr = s.substr(12, 5);
        auto fourStr = s.substr(18, 11);
        auto fiveStr = s.substr(30, 5);
        // Make sure the variables are not optimized away by the compiler
        benchmark::DoNotOptimize(oneStr);
        benchmark::DoNotOptimize(twoStr);
        benchmark::DoNotOptimize(threeStr);
        benchmark::DoNotOptimize(fourStr);
        benchmark::DoNotOptimize(fiveStr);
    }
}

And for string_view:

#include <string_view>

static void StringViewSubStr(benchmark::State& state) {
    // Code before the loop is not measured
    std::string s = "Hello Super Extra Programming World";
    for (auto _ : state) {
        std::string_view sv = s;
        auto oneSv = sv.substr(0, 5);
        auto twoSv = sv.substr(6, 5);
        auto threeSv = sv.substr(12, 5);
        auto fourSv = sv.substr(18, 11);
        auto fiveSv = sv.substr(30, 5);
        // Make sure the variables are not optimized away by the compiler
        benchmark::DoNotOptimize(oneSv);
        benchmark::DoNotOptimize(twoSv);
        benchmark::DoNotOptimize(threeSv);
        benchmark::DoNotOptimize(fourSv);
        benchmark::DoNotOptimize(fiveSv);
    }
}

Here is the full experiment: @Quick C++ Bench

For this test, we have a 10x speed-up!

Can we get similar results in other cases?

String Split

After the basic tests, we can go one step further and try to compose a more complicated algorithm: let's take string splitting.

For this experiment, I collected code from these resources:

Here are the two versions, one for std::string and the second for std::string_view:

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string& str, const std::string& delims = " ")
{
    std::vector<std::string> output;
    auto first = std::cbegin(str);

    while (first != std::cend(str))
    {
        // Find the next delimiter (or the end of the string)
        const auto second = std::find_first_of(first, std::cend(str),
                                               std::cbegin(delims), std::cend(delims));

        // Skip empty tokens produced by consecutive delimiters
        if (first != second)
            output.emplace_back(first, second);

        if (second == std::cend(str))
            break;

        first = std::next(second);
    }

    return output;
}

Now the string_view version:

#include <string_view>
#include <vector>

std::vector<std::string_view> splitSV(std::string_view strv, std::string_view delims = " ")
{
    std::vector<std::string_view> output;
    size_t first = 0;

    while (first < strv.size())
    {
        // Find the next delimiter (npos if there is none)
        const auto second = strv.find_first_of(delims, first);

        // Skip empty tokens produced by consecutive delimiters
        if (first != second)
            output.emplace_back(strv.substr(first, second - first));

        if (second == std::string_view::npos)
            break;

        first = second + 1;
    }

    return output;
}

And here is the benchmark:

const std::string_view LoremIpsumStrv{ /* one paragraph of lorem ipsum */ };

static void StringSplit(benchmark::State& state) {
    std::string str { LoremIpsumStrv };
    for (auto _ : state) {
        auto v = split(str);
        benchmark::DoNotOptimize(v);
    }
}
// Register the function as a benchmark
BENCHMARK(StringSplit);

static void StringViewSplit(benchmark::State& state) {
    for (auto _ : state) {
        auto v = splitSV(LoremIpsumStrv);
        benchmark::DoNotOptimize(v);
    }
}
BENCHMARK(StringViewSplit);

Will we get the same 10x speed-up as in the previous benchmark? Hmm:

[Chart: string split benchmark results]

This is GCC 8.1, -O3.

Slightly better with Clang 6.0.0, -O3:

[Chart: string split benchmark results]

A slightly better result when I run it locally in MSVC 2017:

string length: 486
test iterations: 10000
string split: 36.7115 ms
string_view split: 30.2734 ms

Here is the benchmark: @Quick C++ Bench

Any idea why we don't see a 10x speed-up like in the first experiment?

Of course, we cannot assume that 10X is realistic in this case.

First of all, we have a container, std::vector, which the algorithm uses to output the results. The memory allocations inside std::vector affect the overall speed.

If we run the iteration once and replace operator new, I can see the following numbers (MSVC):

string length: 486
test iterations: 1
string split: 0.011448 ms, Allocation count: 15, size 6912
string_view split: 0.006316 ms, Allocation count: 12, size 2272
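Counters like these can be collected by replacing the global operator new. The article does not show the exact instrumentation, so the following is only a sketch of one common way to do it; the counter names g_allocCount and g_allocBytes and the function allocationCountDemo are hypothetical, and the code is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical global counters, for illustration only (not thread-safe).
static std::size_t g_allocCount = 0;
static std::size_t g_allocBytes = 0;

// Replacement for the global allocation function: count, then defer to malloc.
void* operator new(std::size_t size) {
    ++g_allocCount;
    g_allocBytes += size;
    if (void* p = std::malloc(size == 0 ? 1 : size)) return p;
    throw std::bad_alloc{};
}

// Matching deallocation functions, since the memory now comes from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

inline void allocationCountDemo() {
    const std::size_t before = g_allocCount;
    // 100 characters exceed the usual SSO capacity, so the string needs the heap.
    std::string longText(100, 'x');
    assert(longText.size() == 100);
    assert(g_allocCount > before);
}
```

Resetting both counters before a call to split or splitSV and reading them afterwards gives an allocation count and a byte total in the style of the output above.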

We have 69 words in this string. The std::string version generated 15 memory allocations (both for the strings and for growing the vector's storage) and allocated 6912 bytes in total.

The string_view version used 12 memory allocations (only for the vector, since no memory needs to be allocated for string_view) and consumed 2272 bytes in total (3x less than the std::string version).

Some ideas for improvement

See the comment by JFT, where he reimplemented the split algorithms using raw pointers instead of iterators and made many other performance improvements.

Another option is to reserve some space in the vector up front (and later we can use shrink_to_fit). This way we save a lot of memory allocations.
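Here is a sketch of the reserve idea, assuming the caller can estimate the number of tokens in advance. The function splitSVReserved and its expectedTokens parameter are hypothetical additions; the loop itself is the same as in splitSV.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Same algorithm as splitSV, plus a hypothetical expectedTokens hint.
std::vector<std::string_view> splitSVReserved(std::string_view strv,
                                              std::size_t expectedTokens,
                                              std::string_view delims = " ") {
    std::vector<std::string_view> output;
    output.reserve(expectedTokens);  // one up-front allocation instead of repeated growth

    std::size_t first = 0;
    while (first < strv.size()) {
        const auto second = strv.find_first_of(delims, first);
        if (first != second)
            output.emplace_back(strv.substr(first, second - first));
        if (second == std::string_view::npos)
            break;
        first = second + 1;
    }

    output.shrink_to_fit();          // non-binding request to release unused capacity
    return output;
}

inline void reserveDemo() {
    const auto words = splitSVReserved("lorem ipsum  dolor", 8);
    assert(words.size() == 3);
    assert(words[0] == "lorem" && words[2] == "dolor");
}
```

If the estimate is too low the vector simply grows as before, so a wrong guess costs performance but not correctness.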

Comparing with boost::split:

For the sake of completeness, I also ran the benchmark against boost::split (1.67), and both of our versions are much faster:

Running on WandBox, GCC 8.1

string length: 489
test iterations: 10000
string split: 42.8627 ms, Allocation count: 110000, size 82330000
string_view split: 45.6841 ms, Allocation count: 80000, size 40800000
boost split: 117.521 ms, Allocation count: 160000, size 0.839300

So the handcrafted version is almost three times faster than the boost::split algorithm!

Play with the code @WandBox

String Split and Loading From a File

You may notice that my test string is just one paragraph of "lorem ipsum". Such a simple test case may cause some additional optimizations in the compiler and give unrealistic results.

I found a good post by Rainer Grimm: C++17 - Avoid Copying with std::string_view - ModernesCpp.com. In the article, he used TXT files to process strings. It's a much better idea to work on some real, large text files rather than simple strings.

Instead of my lorem ipsum paragraph, I'm just loading a file, say ~540KB of text (Project Gutenberg). Here is the result of a test run over that file:

string length: 547412
test iterations: 100
string split: 564.215 ms, Allocation count: 191800, size 669900000
string_view split: 363.506 ms, Allocation count: 2900, size 221262300

The test runs 100 times, so for one iteration we have 191800/100 = 1918 memory allocations (in total we use 669900000/100 = 6699000 bytes per iteration) for std::string.

For string_view we only have 2900/100 = 29 memory allocations and 221262300/100 = 2212623 bytes used per iteration.

While it's still not a 10x win, we have 3x less memory usage and about a 1.5x increase in performance.


Risks With Using string_view

I think every article about string_view should also mention the potential risks associated with this new type:

  • Taking care of (non-)null-terminated strings: a string_view may not contain NULL at the end of the string. So you have to be prepared for such a case.
    • Problematic when calling functions like atoi or printf that accept null-terminated strings
    • Conversion to strings
  • References and temporary objects: string_view does not own its memory, so you must be very careful when working with temporary objects.
    • When returning a string_view from a function
    • When storing a string_view in objects or containers.
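The null-termination problem can be sketched as follows. The helper toInt is a hypothetical name; copying into a std::string is just one safe option, and it gives up the zero-copy benefit for that call.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <string_view>

// Hypothetical helper: copy the view into a std::string so that the C function
// receives a null-terminated buffer holding exactly the viewed characters.
inline int toInt(std::string_view sv) {
    const std::string terminated{sv};
    return std::atoi(terminated.c_str());
}

inline void nullTerminationDemo() {
    const std::string_view whole = "123456";  // string literal: null-terminated
    const std::string_view firstThree = whole.substr(0, 3);
    assert(firstThree == "123");

    // data() ignores the view's length, so atoi reads on to the literal's terminator.
    assert(std::atoi(firstThree.data()) == 123456);

    // Going through a std::string respects the view's length.
    assert(toInt(firstThree) == 123);
}

// Lifetime: a view must not outlive the string it refers to.
// std::string_view dangling = std::string("temporary");  // dangles after this statement
```

The commented-out line at the end shows the temporary-object case: the std::string is destroyed at the end of the statement, leaving the view pointing at freed memory.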

Wrap Up

By leveraging string_view, you can achieve a big performance boost in many use cases. However, it is important to know that there are caveats, and sometimes the performance can be even slower compared to std::string!

The first thing is that string_view does not own the data, so you have to be careful not to end up with references to deleted memory.

The second thing is that compilers are very smart when it comes to strings, especially when the strings are short (so they work well with SSO, Small String Optimization), and in that case the performance gain might not be as visible.

A Few Questions for You

What is your experience with string_view performance?

Can you share some results and examples?


Author: Ouida Strosin DO

Last Updated: 03/05/2023
