C++17: std::byte
What is it?
C++17 introduced a new type: std::byte.
Previously, when you needed to access raw memory, you would use an unsigned char or a char data type. However, these data types give the impression that you are working with characters or with numeric values. The new std::byte data type does not convey character or arithmetic semantics; it is just a collection of bits. As such, it’s ideal for representing raw memory. An std::byte can only be initialized from an integral value, using braces, and can be converted back to an integral type using std::to_integer(). Apart from comparisons, the only other operations supported are bitwise operations. Both std::byte and std::to_integer() are defined in <cstddef>.
How to use it?
Let’s look at some short examples. First, here is how you can initialize an std::byte:
#include <cstddef>

int main()
{
    std::byte myByte{ 2 };
}
Next, you can use std::to_integer() to convert an std::byte back to an integral type of your choice:
#include <iostream>
#include <cstddef>

int main()
{
    std::byte myByte{ 2 };
    std::cout << std::to_integer<int>(myByte) << std::endl;
}
The following bitwise operators are supported: <<, >>, |, &, ^, and ~, along with the compound assignments <<=, >>=, |=, &=, and ^=. Here is an example:
#include <iostream>
#include <cstddef>

using namespace std;

void PrintByte(const byte& aByte)
{
    cout << to_integer<int>(aByte) << endl;
}

int main()
{
    byte myByte{ 2 };
    PrintByte(myByte);   // 2

    // A 2-bit left shift
    myByte <<= 2;
    PrintByte(myByte);   // 8

    // Initialize two new bytes using binary literals.
    byte byte1{ 0b0011 };
    byte byte2{ 0b1010 };
    PrintByte(byte1);    // 3
    PrintByte(byte2);    // 10

    // Bit-wise OR and AND operations
    byte byteOr = byte1 | byte2;
    byte byteAnd = byte1 & byte2;
    PrintByte(byteOr);   // 11
    PrintByte(byteAnd);  // 2
}
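The example above does not exercise >>, ^, or ~. As a small sketch of those operators, the helpers below split a byte into nibbles and toggle selected bits. HighNibble, LowNibble, and Toggle are hypothetical names chosen for illustration, not standard library functions. Because the std::byte operators and std::to_integer() are constexpr, the results can be checked at compile time.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; not part of the standard library.

// Returns the upper four bits of aByte, shifted down into the low nibble.
constexpr std::byte HighNibble(std::byte aByte)
{
    return aByte >> 4;
}

// Returns the lower four bits of aByte.
constexpr std::byte LowNibble(std::byte aByte)
{
    return aByte & std::byte{ 0x0F };
}

// Flips every bit selected by aMask and leaves the other bits untouched.
constexpr std::byte Toggle(std::byte aByte, std::byte aMask)
{
    return aByte ^ aMask;
}

// Compile-time checks of the operators used above.
static_assert(std::to_integer<int>(HighNibble(std::byte{ 0xA5 })) == 0x0A);
static_assert(std::to_integer<int>(LowNibble(std::byte{ 0xA5 })) == 0x05);
static_assert(std::to_integer<int>(~std::byte{ 0xF0 }) == 0x0F);
static_assert(std::to_integer<int>(
    Toggle(std::byte{ 0b1100 }, std::byte{ 0b1010 })) == 0b0110);
```

Note that every mask is itself an std::byte; mixing in a plain integer, as in aByte & 0x0F, does not compile.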
Why use it?
You might wonder how std::byte differs from simply using the existing uint8_t. Well, std::byte is really just a bunch of uninterpreted bits. If you use uint8_t, you are actually interpreting the bits as an 8-bit unsigned numerical value, which might convey the wrong semantics. Also, std::byte will not allow accidental arithmetic on it, while uint8_t does.
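One way to make the "no accidental arithmetic" point concrete is to ask the compiler directly. The sketch below uses the detection idiom; CanAdd and Increment are hypothetical names introduced only for this illustration. It assumes a C++17 compiler and checks everything at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Detection idiom: CanAdd<T>::value is true when the expression "T + T" compiles.
// CanAdd is a hypothetical helper, not a standard trait.
template <typename T, typename = void>
struct CanAdd : std::false_type {};

template <typename T>
struct CanAdd<T, std::void_t<decltype(std::declval<T>() + std::declval<T>())>>
    : std::true_type {};

// uint8_t is an ordinary integer type, so arithmetic compiles
// (the operands are promoted to int).
static_assert(CanAdd<std::uint8_t>::value);

// std::byte is an enum class with only bitwise operators defined,
// so "byte + byte" is rejected.
static_assert(!CanAdd<std::byte>::value);

// There is also no implicit conversion in either direction.
static_assert(!std::is_convertible_v<int, std::byte>);
static_assert(!std::is_convertible_v<std::byte, int>);

// If arithmetic on the stored value is really intended, it has to be spelled
// out: convert to an integer, compute, and convert back. The intermediate
// cast to unsigned char makes 255 + 1 wrap around to 0.
constexpr std::byte Increment(std::byte aByte)
{
    return static_cast<std::byte>(
        static_cast<unsigned char>(std::to_integer<unsigned int>(aByte) + 1u));
}

static_assert(std::to_integer<int>(Increment(std::byte{ 41 })) == 42);
```

The explicit round trip through std::to_integer() is the design point: arithmetic is still possible, but it can no longer happen by accident.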
Raw memory buffers and interoperability with legacy C-style APIs
If you need a buffer of raw memory, then you can use std::vector<std::byte>.
Additionally, if you need to pass such a buffer to a legacy C-style API accepting, for example, an unsigned char*, then you will have to add a cast. Here is a brief example:
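#include <cstddef>
#include <iostream>
#include <vector>

void PrintByte(const std::byte& aByte)
{
    std::cout << std::to_integer<int>(aByte) << std::endl;
}

// Legacy C-style API: fills the buffer with 0, 1, 2, ...
void SomeCApi(unsigned char* buffer, unsigned int size)
{
    for (unsigned int index = 0; index < size; ++index) {
        buffer[index] = static_cast<unsigned char>(index);
    }
}

int main()
{
    // A buffer of 100 zero-initialized bytes.
    std::vector<std::byte> buffer(100);
    SomeCApi(reinterpret_cast<unsigned char*>(buffer.data()),
        static_cast<unsigned int>(buffer.size()));
    for (const auto& element : buffer) {
        PrintByte(element);
    }
}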
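If several call sites need the same C function, one possible design is to keep the cast in a single small wrapper so the rest of the code only sees std::byte. The sketch below assumes a legacy function shaped like SomeCApi above; LegacyFill and FillBytes are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for a legacy C-style API (hypothetical, shaped like SomeCApi):
// fills the buffer with the values 0, 1, 2, ...
void LegacyFill(unsigned char* buffer, unsigned int size)
{
    for (unsigned int index = 0; index < size; ++index) {
        buffer[index] = static_cast<unsigned char>(index);
    }
}

// Hypothetical wrapper that keeps the reinterpret_cast in one place.
// Accessing std::byte storage through an unsigned char* is permitted.
void FillBytes(std::vector<std::byte>& buffer)
{
    // The legacy API takes an unsigned int, so guard the size conversion.
    assert(buffer.size() <= std::numeric_limits<unsigned int>::max());
    LegacyFill(reinterpret_cast<unsigned char*>(buffer.data()),
        static_cast<unsigned int>(buffer.size()));
}
```

A caller then writes std::vector<std::byte> buffer(100); FillBytes(buffer); and never touches unsigned char directly.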
Conclusion
When you are working with raw bits, use an std::byte instead of an unsigned char or char data type.
Jacob Lifshay said,
Wrote on June 4, 2018 @ 10:44 pm
~= is not an operator
torusl said,
Wrote on June 5, 2018 @ 5:12 am
And what’s the benefit over using good old uint8_t?
Marc Gregoire said,
Wrote on June 5, 2018 @ 6:26 am
Jacob: You are right, my mistake.
I have updated the article.
Thank you for letting me know 🙂
Marc Gregoire said,
Wrote on June 5, 2018 @ 6:30 am
torusl: std::byte is really just a bunch of un-interpreted bits. If you use uint8_t, you are actually interpreting the bits as an 8-bit numerical value, which might not be the right semantics that you want. Also, std::byte will not allow accidental arithmetic on it, while uint8_t does.
Robert Dailey said,
Wrote on June 6, 2018 @ 6:24 pm
Hey Marc, long time no see! I liked this article, and I actually didn’t know about std::byte until now. I did have the same question that @torusl had, so I’m glad it was asked. I think because traditionally to get a buffer of bytes prior to C++17 people did this:
std::vector<std::byte>
It might be worth it to amend your article with the following:
1. Your comment about std::byte vs std::uint8_t, there seem to be unexpected semantic differences that I think are extremely important to point out.
2. Give some examples of `std::vector<std::byte>` (assuming that’s valid) and if it’s inter-operable with fixed buffers when interfacing with C-APIs, such as:
std::vector<std::byte> buffer{100};
SomeCAPI(&buffer[0], buffer.size());
Stay well friend!
[Marc] Fixed HTML.
Robert Dailey said,
Wrote on June 6, 2018 @ 6:26 pm
Looks like your blog parsed my C++ code as HTML haha.. my vector above was actually:
std::vector<std::byte>
Marc Gregoire said,
Wrote on June 7, 2018 @ 12:40 pm
Robert Dailey: Added a section “Why use it?” 🙂
Marc Gregoire said,
Wrote on June 7, 2018 @ 1:06 pm
Robert Dailey: Added a section on raw memory buffers and interoperability with C-style API’s 🙂
Thank you for the suggestion.
Kyetuur said,
Wrote on June 11, 2018 @ 7:21 pm
What is better to use, std::byte or bitset ?
Marc Gregoire said,
Wrote on June 13, 2018 @ 5:13 am
Kyetuur: It depends on your use case. If you really want to work on individual bits, then probably a bitset is best. However, if you are working with blocks of raw memory, then std::byte is best. I would use a bitset for a small number of bits, for example, to represent a collection of 32 bits to be used as boolean flags. On the other hand, if I need a block of raw memory of 4MB, I’ll use a vector of std::byte, instead of a bitset of 4 million bits.