C++17: std::byte « Marc Gregoire’s Blog

C++17: std::byte

3 Jun, 2018 C++ C++17 Visual C++ 2017

What is it?

C++17 introduced a new type: std::byte.

Previously, when you needed to access raw memory, you would use an unsigned char or a char data type. However, these data types give the impression that you are working with characters or with numeric values. The new std::byte data type does not convey character or arithmetic semantics; it is just a collection of bits. As such, it’s ideal to represent raw memory. An std::byte only supports initialization from an integral type, and can be converted back to an integral type using std::to_integer(). Apart from copying and comparing, the only other operations supported are bit-wise operations. Both std::byte and std::to_integer() are defined in <cstddef>.
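
As a quick sanity check of these properties, here is a small sketch using type traits. The RoundTrip() helper is a hypothetical name made up for this illustration; it is not part of the standard library.

```cpp
#include <cstddef>
#include <type_traits>

// std::byte is a scoped enumeration, not a character or arithmetic type.
static_assert(std::is_enum_v<std::byte>);
static_assert(!std::is_arithmetic_v<std::byte>);

// It occupies exactly one byte and uses unsigned char as its underlying type.
static_assert(sizeof(std::byte) == 1);
static_assert(std::is_same_v<std::underlying_type_t<std::byte>, unsigned char>);

// There is no implicit conversion in either direction.
static_assert(!std::is_convertible_v<int, std::byte>);
static_assert(!std::is_convertible_v<std::byte, int>);

// Hypothetical helper: converts a value to an std::byte and back to an int.
constexpr int RoundTrip(unsigned char value) noexcept
{
    const std::byte asByte{ value };
    return std::to_integer<int>(asByte);
}
```

Because everything here is checked at compile time, the file only compiles if all of the statements above hold for your compiler in C++17 mode.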

How to use it?

Let’s look at some short examples. First, here is how you can initialize an std::byte:

#include <cstddef>

int main()
{
    std::byte myByte{ 2 };
}

Next, you can use std::to_integer() to convert an std::byte back to an integral type of your choice:

#include <iostream>
#include <cstddef>

int main()
{
    std::byte myByte{ 2 };
    std::cout << std::to_integer<int>(myByte) << std::endl;
}

The following bitwise operations are supported: <<, >>, |, &, ^, and ~, as well as the compound assignments <<=, >>=, |=, &=, and ^=. Here is an example:

#include <iostream>
#include <cstddef>

using namespace std;

void PrintByte(const byte& aByte)
{
    cout << to_integer<int>(aByte) << endl;
}

int main()
{
    byte myByte{ 2 };
    PrintByte(myByte);	// 2

    // A 2-bit left shift
    myByte <<= 2;
    PrintByte(myByte);	// 8

    // Initialize two new bytes using binary literals.
    byte byte1{ 0b0011 };
    byte byte2{ 0b1010 };
    PrintByte(byte1);	// 3
    PrintByte(byte2);	// 10

    // Bit-wise OR and AND operations
    byte byteOr = byte1 | byte2;
    byte byteAnd = byte1 & byte2;
    PrintByte(byteOr);	// 11
    PrintByte(byteAnd);	// 2
}
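
Building on these operators, here is a minimal sketch of three bit-manipulation helpers. TestBit(), SetBit(), and ClearBit() are hypothetical names invented for this example, and the sketch assumes 8-bit bytes with position 0 being the least significant bit.

```cpp
#include <cstddef>

// Hypothetical helpers, not part of the standard library.
// All of them expect a position in the range [0, 7].

// Returns true if the bit at the given position is set.
constexpr bool TestBit(std::byte value, unsigned int position) noexcept
{
    return (value & (std::byte{ 1 } << position)) != std::byte{ 0 };
}

// Returns a copy of value with the bit at the given position set to 1.
constexpr std::byte SetBit(std::byte value, unsigned int position) noexcept
{
    return value | (std::byte{ 1 } << position);
}

// Returns a copy of value with the bit at the given position set to 0.
constexpr std::byte ClearBit(std::byte value, unsigned int position) noexcept
{
    return value & ~(std::byte{ 1 } << position);
}
```

For example, SetBit(std::byte{ 0 }, 3) yields a byte that converts to 8 with std::to_integer<int>().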

Why use it?

You might wonder what the difference is with just using the existing uint8_t instead of std::byte. Well, std::byte is really just a bunch of un-interpreted bits. If you use uint8_t, you are actually interpreting the bits as an 8-bit unsigned numerical value, which might convey the wrong semantics. Also, std::byte will not allow accidental arithmetic on it, while uint8_t does.
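
To make the difference concrete, here is a minimal sketch. IsAddable and AddAndWrap() are names invented for this illustration; they are not part of the standard library.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Detects at compile time whether "T + T" is a valid expression.
template <typename T, typename = void>
struct IsAddable : std::false_type {};

template <typename T>
struct IsAddable<T, std::void_t<decltype(std::declval<T>() + std::declval<T>())>>
    : std::true_type {};

// uint8_t happily takes part in arithmetic, std::byte does not.
static_assert(IsAddable<std::uint8_t>::value);
static_assert(!IsAddable<std::byte>::value);

// Hypothetical helper showing the kind of arithmetic uint8_t allows:
// the sum is computed as an int and wraps around when cast back.
constexpr std::uint8_t AddAndWrap(std::uint8_t a, std::uint8_t b) noexcept
{
    return static_cast<std::uint8_t>(a + b);
}
```

With uint8_t, AddAndWrap(200, 100) compiles without complaint and produces 44. Writing the same addition with two std::byte values is a compile-time error.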

Raw memory buffers and interoperability with legacy C-style APIs

If you need a buffer of raw memory, then you can use std::vector<std::byte>.

Additionally, if you need to pass such a buffer to a legacy C-style API accepting, for example, an unsigned char*, then you will have to add a cast. Here is a brief example:

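#include <cstddef>
#include <iostream>
#include <vector>

// Stand-in for a legacy C-style API that fills a buffer.
void SomeCApi(unsigned char* buffer, unsigned int size)
{
    // Use an unsigned int index so the loop also terminates for sizes above 255.
    for (unsigned int index = 0; index < size; ++index) {
        buffer[index] = static_cast<unsigned char>(index);
    }
}

void PrintByte(const std::byte& aByte)
{
    std::cout << std::to_integer<int>(aByte) << std::endl;
}

int main()
{
    // Parentheses make it clear that this creates 100 bytes,
    // not a single byte with the value 100.
    std::vector<std::byte> buffer(100);
    SomeCApi(reinterpret_cast<unsigned char*>(buffer.data()),
        static_cast<unsigned int>(buffer.size()));

    for (const auto& element : buffer) { PrintByte(element); }
}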
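
If you need this cast in several places, one option is to wrap it in a small helper so that the reinterpret_cast lives in a single spot. The following is only a sketch: AsUnsignedChar() and MakeFilledBuffer() are hypothetical names, and std::memset() simply stands in for whatever C-style API you need to call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical wrapper: keeps the reinterpret_cast in a single place.
inline unsigned char* AsUnsignedChar(std::vector<std::byte>& buffer) noexcept
{
    // data() of an empty vector may be a null pointer, so require a non-empty buffer.
    assert(!buffer.empty());
    return reinterpret_cast<unsigned char*>(buffer.data());
}

// Hypothetical example: lets a C function (std::memset here) fill a new buffer.
inline std::vector<std::byte> MakeFilledBuffer(std::size_t size, unsigned char value)
{
    std::vector<std::byte> buffer(size);
    if (!buffer.empty()) {
        std::memset(AsUnsignedChar(buffer), value, buffer.size());
    }
    return buffer;
}
```

The rest of your code can then keep working with std::byte, and only the wrapper knows about unsigned char.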

Conclusion

When you are working with raw bits, use an std::byte instead of an unsigned char or char data type.

10 Comments so far »

Jacob Lifshay said,

Wrote on June 4, 2018 @ 10:44 pm

~= is not an operator
torusl said,

Wrote on June 5, 2018 @ 5:12 am

And what’s the benefit over using good old uint8_t?
Marc Gregoire said,

Wrote on June 5, 2018 @ 6:26 am

Jacob: You are right, my mistake.
I have updated the article.
Thank you for letting me know 🙂
Marc Gregoire said,

Wrote on June 5, 2018 @ 6:30 am

torusl: std::byte is really just a bunch of un-interpreted bits. If you use uint8_t, you are actually interpreting the bits as an 8-bit numerical value, which might not be the right semantics that you want. Also, std::byte will not allow accidental arithmetic on it, while uint8_t does.
Robert Dailey said,

Wrote on June 6, 2018 @ 6:24 pm

Hey Marc, long time no see! I liked this article, and I actually didn’t know about std::byte until now. I did have the same question that @torusl had, so I’m glad it was asked. I think because traditionally to get a buffer of bytes prior to C++17 people did this:

std::vector<std::byte>

It might be worth it to amend your article with the following:

1. Your comment about std::byte vs std::uint8_t, there seem to be unexpected semantic differences that I think are extremely important to point out.
2. Give some examples of `std::vector<std::byte>` (assuming that’s valid) and if it’s inter-operable with fixed buffers when interfacing with C-APIs, such as:

std::vector<std::byte> buffer{100};
SomeCAPI(&buffer[0], buffer.size());

Stay well friend!

[Marc] Fixed HTML.
Robert Dailey said,

Wrote on June 6, 2018 @ 6:26 pm

Looks like your blog parsed my C++ code as HTML haha.. my vector above was actually:

std::vector<std::byte>
Marc Gregoire said,

Wrote on June 7, 2018 @ 12:40 pm

Robert Dailey: Added a section “Why use it?” 🙂
Marc Gregoire said,

Wrote on June 7, 2018 @ 1:06 pm

Robert Dailey: Added a section on raw memory buffers and interoperability with C-style API’s 🙂
Thank you for the suggestion.
Kyetuur said,

Wrote on June 11, 2018 @ 7:21 pm

What is better to use, std::byte or bitset ?
Marc Gregoire said,

Wrote on June 13, 2018 @ 5:13 am

Kyetuur: It depends on your use case. If you really want to work on individual bits, then probably a bitset is best. However, if you are working with blocks of raw memory, then std::byte is best. I would use a bitset for a small number of bits, for example, to represent a collection of 32 bits to be used as boolean flags. On the other hand, if I need a block of raw memory of 4MB, I’ll use a vector of std::byte, instead of a bitset of 4 million bits.