Back to Basics: Master C++ Value Categories With Standard Tools - Inbal Levi - CppCon 2022

Full Transcript

So, hello everyone. Thank you for getting up this early in the morning, I really appreciate it. My name is Inbal Levi, I'm from Israel. I'm the NB chair, the Israeli NB chair, and I'm a software developer at SolarEdge. I'm an active member of Work Group 21 as Ranges SG chair and LEWG chair. I love software design, and I also have a passion for cataloging, so this talk is sort of a catalog of the topic of value categories.

We're going to start with motivation: why should you care about value categories? Then we're going to have an intro to value categories, we'll see how we use them in real-life code and then in generic code, and at the end of the talk we'll go over some tools for handling them in your code base. Notice that I cut corners. Full definitions are, as always, at cppreference, which is a great site; I recommend all of you use it. The examples will ignore function pointers and arrays. I will focus on objects for this talk, but value categories are relevant for those as well.

So consider the following code. We have here a struct called Data, and it has a constructor, a copy constructor and a move constructor. It has members, which is not really crucial now. And we have this function getData, which returns Data, const Data. So here we have usage of this Data, and my question is: how many times is this Data object going to be created in each line? Now, of course, I gave you the first answer: in the first line we're clearly going to have one Data created, correct? This is a constructor. What about the second line? You can guess, you can just throw out numbers, I will not call you by name. So yes, one? You think on the second line one would be created? Okay, any other suggestions? Okay. We call std::move. We have one that was created on line one; what will happen on line two? Zero, right, because we're basically taking the resources from the object in line one. How many will be created on three? One, close enough. And what about four? What about four? You can throw out numbers: is
it zero, is it five, is it one? Okay, interesting. So actually here we're going to have two objects. Why is this happening? Yes, const, correct. Whoever said const, you are correct. Oh right, you're cheating.

So as I mentioned, on the first line we're going to get the constructor. Here's an object; we created an object d1. On the second line we call move on the object we created. Now, as you've noticed, the color of d1 just turned gray. This is my way of saying that this is now a temporary and we can steal its resources. And now d2 stole the resources from d1, d1 lost its resources, and the move constructor is invoked. Line three is not really interesting in this manner, but line four gets to call getData. We call getData with 42. getData returns Data, right? But this Data is actually const, so we did something that is probably not the best, and we have a miscommunication with the compiler. So the copy constructor is being called on the object that's been returned from getData, the const Data, and we copy, so we have a second object. The std::move is applied to the second object and the second object is being moved from. So we basically have two objects here, the first and the last one, right?

So what happened here is that we basically miscommunicated with our compiler. We confused our compiler. We gave it contradicting instructions, we did something that doesn't make sense, and our compiler is confused. So in this talk I'm going to talk about how we communicate better, how we avoid such miscommunications and such errors. And this is, by the way, not something that only beginners stumble into; this is a very easy mistake to make.

Okay, so I hope you got a bit of the motivation. I'm going to summarize it here as well. Your compiler sees those objects, and some of them it can see as temporaries, as I've colored them in gray, and it knows that it can basically steal from objects that are temporary.
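The quiz above refers to a slide that is not reproduced in this transcript. A minimal sketch of what it may have looked like follows, assuming a Data struct with a single int member and a getData that returns const Data. The Counts struct, the counts variable and the helper functions are hypothetical instrumentation added here to count constructor calls; they are not from the talk.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumentation: counts which constructors ran.
struct Counts { int ctor = 0; int copy = 0; int move = 0; };
static Counts counts;

struct Data {
    explicit Data(int v) : n(v) { ++counts.ctor; }
    Data(const Data& other) : n(other.n) { ++counts.copy; }
    Data(Data&& other) noexcept : n(other.n) { ++counts.move; }
    int n;
};

// Returning by const value is the trap: the result cannot bind to Data&&.
const Data getData(int v) { return Data(v); }

// Line two of the quiz: moving from a named, non-const object.
Counts moveFromNamed() {
    counts = Counts{};
    Data d1(42);
    Data d2(std::move(d1));           // Data&& -> move constructor
    (void)d2;
    return counts;
}

// Line four of the quiz: "moving" from a const temporary.
Counts moveFromConstTemporary() {
    counts = Counts{};
    Data d4(std::move(getData(42)));  // const Data&& -> copy constructor
    (void)d4;
    return counts;
}

// Checks the answers given in the talk (assumes C++17 copy elision in getData).
void checkQuizAnswers() {
    const Counts line2 = moveFromNamed();
    assert(line2.ctor == 1 && line2.move == 1 && line2.copy == 0);
    const Counts line4 = moveFromConstTemporary();
    assert(line4.ctor == 1 && line4.copy == 1 && line4.move == 0);
}
```

For line four this reports one regular construction plus one copy and no move, which matches the "two objects" answer. Dropping const from the return type of getData lets the move constructor be selected instead.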
So you can use std::move explicitly to color objects in gray, basically telling your compiler that this object is something that it can steal the resources from. But there are some other conditions that mark objects in gray implicitly. So again, once you have these objects that are marked in gray, the compiler knows it can steal the resources from the object, just like that.

So now let's move to the intro to value categories. We're going to see what value categories are and their evolution throughout the lifetime of C++. They came from C. C used to have the lvalue expression; it originally referred to the location of the expression with regard to assignment. So we have lvalue, this is on the left, and we have rvalue on the right. So convenient that I have those two strings. The value category of an entity defines its lifetime, whether it can be moved from or whether it is a temporary, whether it is observable after changes, and also its identity: does it have an address, can its address be taken and used safely? The "used safely" part is really important. Value categories affect very important aspects: performance and overload resolution. Now, this is a Back to Basics talk, I don't want to spread into too many topics, and there are a lot of good talks in this conference, but if you're writing a library, or just in your own code, the sort of bug that you saw in the first slide can happen to you and affect performance.

Before we go deeper into value categories I do want to mention references, because this talk will also talk about references. They're often a source of confusion when we talk about value categories. So here we have our struct Data again that we saw before, and we create an object, and then we create an lvalue ref to the object; we can initialize it. The rvalue ref we cannot initialize with the object; what we can initialize it with is a temporary, correct? So this is just a brief reminder of how references work.
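The reference reminder above can be approximated as follows. This is a sketch under assumptions: Data is reduced to a single int member, and binds is a hypothetical helper introduced here to express "can a reference of type Ref be initialized from an expression described by Expr" through std::is_convertible, so that the rejected case can be shown without breaking compilation.

```cpp
#include <type_traits>

struct Data { int n = 0; };

// Hypothetical helper: Data& as Expr stands for a named object (lvalue),
// plain Data stands for a temporary.
template <class Ref, class Expr>
constexpr bool binds = std::is_convertible<Expr, Ref>::value;

static_assert(binds<Data&, Data&>,   "lvalue reference binds to a named object");
static_assert(!binds<Data&&, Data&>, "rvalue reference rejects a named object");
static_assert(binds<Data&&, Data>,   "rvalue reference binds to a temporary");

int referencesRecap() {
    Data d;
    Data& lref = d;         // OK: lvalue reference to an existing object
    // Data&& bad = d;      // error: d is an lvalue
    Data&& rref = Data{};   // OK: binds to a temporary and extends its lifetime
    lref.n = 1;             // modifies d through the reference
    rref.n = 2;             // the temporary is modifiable through rref
    return d.n + rref.n;
}
```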
Again, we have here the lvalue reference, the one with the single ampersand, and the rvalue reference.

So, value categories. We now know what references are, or at least we're reminded, and we now know why value categories are important, so we're going to dive deeper into what they are. A very important thing to remember, and really, if there's a single thing that you're taking from my talk, please do take this sentence: value category is the quality of an expression. It's not a quality of an object, it's not a quality of a function. The same thing can have different value categories in different contexts. This is very important.

So again we have our struct Data, and we have some foo function that takes Data by rvalue reference and does something with the data. Here we create an rvalue reference to Data, a, and we initialize it, but we are unable to call foo with this a, which is an rvalue reference, right? This is weird. We just said foo takes an rvalue reference, we created an rvalue reference, and we're unable to pass it to foo. Do you have any idea what happens here? Andre, go ahead. Okay, yes, it's an lvalue, that is correct. So what may look like the value category can in fact be the type. We call it rvalue reference, but this is confusing, because a has a name, we can take its address, we can change it, which means this is an lvalue. Its type is rvalue reference, but its value category is lvalue.

So this is very important. Whenever I say something that is very, very important, there's going to be a small image of the compiler, because, you know, communicate better with your compiler, and this is something that I think needs to be mentioned. What may look like the same entity is in fact not; it depends on the context. The entity can have a different value category in different contexts. So let's look at this function called foo that takes Data, and at what happens during this function call. So first we create this
unnamed temporary, Data(73), right? We pass the Data(73) and it's bound by the foo function. The foo function has an rvalue reference parameter, so it binds the temporary to this rvalue reference x. And this entity that used to be Data(73) now has a name inside the scope of the foo function. So inside the scope of the foo function this entity is an lvalue, because we can take its address, we can assign to it. The different scope gave a different value category to this thing that used to be Data(73); it's now x. I hope this helps you a bit to understand how complex this topic is. And yet our compiler is happy.

So each expression has two properties, as I mentioned: a type and a value category. And value category, I think you already saw this sentence, is a quality of an expression. The people working on the standard also understood that, and they've put in this diagram that you may have encountered in some places. This diagram tries to describe all the value categories that we have in C++, and it comes from the paper "A Taxonomy of Expression Value Categories" by William M. Miller, proposed in 2010. As you can see underneath this diagram, and I'm going to go back to this diagram later, value category is part of an expression taxonomy, so at least we have that.

Value categories changed dramatically throughout the lifetime of C++, so we're now going to see how they evolved. How beautiful. As I mentioned before, in C we had three types of expressions: lvalue expression, non-lvalue expression (later to be known as rvalue expression) and function designator. I'm not going to talk about those. Oh, you can almost not see it. Okay, interesting. You're supposed to see it in a gray shade, but that doesn't matter, because I'm not going to talk about it. Yeah, it's very light, but that's fine, because the whole point was that I'm not going to talk about it in this presentation, but
please feel free to go and check it later. C++98 added the lvalue reference, so an expression is either an lvalue or an rvalue. Lvalues are objects, functions and references; an rvalue is a non-lvalue. As I mentioned, whatever we inherited from C as the non-lvalue now becomes an rvalue. C++03 had no significant change; in general this version was not very large.

C++11 added rvalue references and move semantics. So far we only had qualities of expressions, rvalue and lvalue; now we have references, and references, whose name may be unfortunate because it's misleading, are now adding some more complexity. This was added by Howard Hinnant, Peter Dimov and Dave Abrahams, in, again, a few papers that you cannot see, but it doesn't matter, you can look for them later. And we got this table: we got three types of value categories. The first one has identity and can't be moved from, and this is an lvalue. Now what does it mean that it can't be moved from? It means that our program will not work correctly if we move from it, I mean if we try to take its resources. Of course it also means that this is not a temporary. An xvalue has identity and can be moved from, so for example the Data(73) that we saw, and other things as well. And a prvalue doesn't have identity and can be moved from.

C++17 added guaranteed copy elision. Guaranteed copy elision means that we can basically save time: instead of creating temporaries we can just take resources, and our program gets faster. The main thing that this paper added, and I don't want to go too much into the details of the paper, but I think it's very interesting, is that instead of just the prvalue it now defines something called prvalue materialization. The idea of prvalue materialization is that you still have prvalues, but you can do things with this prvalue
before it materializes that you could not do with a prvalue before. So for example you can create this thing, an object that represents a prvalue which hasn't been materialized, even if it's not a complete type; even if you only have a declaration and you don't know what this thing looks like, you can still sort of pass it on until you materialize it. We're going to talk a bit about that later. And the result of the prvalue is the value that the expression stores into its context. So then we materialize this thing and we get the prvalue.

C++20 added implicit move, so basically more move abilities, and also, which I feel is very important and interesting, it finally moved the value category section from "basic" to "expressions". I emphasized to you quite a lot how important it is to remember that value categories are a quality of an expression and not of an entity or object, and I guess people from the community have realized that, so that's good.

And for C++23 we got a really interesting feature called deducing this. I'm going to talk about it a bit at the end of this talk. It was suggested by Gašper Ažman, Sy Brand, Ben Deane and Barry Revzin, who is supposed to be here at the end of the conference, so you should go to him and thank him; it's a great feature. The paper also added like_t, which is a utility that lets you look at some expression with a different value category, and forward_like, which is very similar.

So let's go over this very intimidating table that you're probably going to see in every talk about value categories, and try to make it a bit less intimidating. We have the main categories. These are just for classification; you'll never see an object that is of those types, glvalue and rvalue. But glvalue represents things that have identity, and rvalue represents things that can be moved from. And we have our subcategories. The subcategories are lvalue, xvalue and prvalue, that we just saw.
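The table described above can be checked mechanically. The following is a minimal sketch that relies on the decltype((expr)) rule: the result is T& for an lvalue, T&& for an xvalue and plain T for a prvalue. Category, CategoryOf, hasIdentity and canBeMovedFrom are hypothetical names introduced for illustration, and Data is a reduced stand-in for the struct on the slides.

```cpp
#include <utility>

enum class Category { lvalue, xvalue, prvalue };

// Map the type produced by decltype((expr)) to a value category.
template <class T> struct CategoryOf      { static constexpr Category value = Category::prvalue; };
template <class T> struct CategoryOf<T&>  { static constexpr Category value = Category::lvalue; };
template <class T> struct CategoryOf<T&&> { static constexpr Category value = Category::xvalue; };

// glvalue: has identity (lvalue or xvalue).
constexpr bool hasIdentity(Category c)    { return c != Category::prvalue; }
// rvalue: can be moved from (xvalue or prvalue).
constexpr bool canBeMovedFrom(Category c) { return c != Category::lvalue; }

struct Data { int n; };
Data d{73};

constexpr Category named     = CategoryOf<decltype((d))>::value;
constexpr Category moved     = CategoryOf<decltype((std::move(d)))>::value;
constexpr Category temporary = CategoryOf<decltype((Data{73}))>::value;

static_assert(named == Category::lvalue, "a name is an lvalue");
static_assert(hasIdentity(named) && !canBeMovedFrom(named), "identity, not movable");

static_assert(moved == Category::xvalue, "std::move yields an xvalue");
static_assert(hasIdentity(moved) && canBeMovedFrom(moved), "identity and movable");

static_assert(temporary == Category::prvalue, "an unnamed temporary expression is a prvalue");
static_assert(!hasIdentity(temporary) && canBeMovedFrom(temporary), "no identity, movable");
```

Note that a bare temporary expression such as Data{73} is classified as a prvalue here; it is converted to an xvalue once it is materialized, for instance when a member is accessed on it.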
Now we're going to see some examples. Before that I just want to briefly go over the terminology, but don't feel that you have to capture all of it right away; it's just to give you some sense of how complex those definitions are. A glvalue is an expression whose evaluation determines the identity of an object or a function. An xvalue is a glvalue that denotes an object whose resources can be reused, usually before the end of the lifetime of this thing. An lvalue is a glvalue that is not an xvalue; this is a somewhat strange definition, right? A prvalue is an expression whose evaluation initializes an object or computes the value of an operand of an operator, as specified by the context, etc. And an rvalue is a prvalue or an xvalue. So you're now introduced to the exact terminology that the standard is using for value categories, and it's okay to feel uncomfortable with this terminology. I also feel uncomfortable with it, it's somewhat confusing. But we'll now see some examples, so sit tight.

Okay, so lvalue. An lvalue is something that has identity, right? This is something that has a name, we can pass it on. For example a declaration of an array, for example members: if I have Data d, I don't have my laser pointer, but if you can see, we create Data and then we get to n and pn, which are members; these are all lvalues. And also assignments, and string literals. This is something that you need to take into consideration, because string literals are not rvalues. Pre-increments, and of course the return of a function that creates this Data. And notice that ra in this slide has the type, again we saw it already, rvalue reference to int, but its value category is lvalue. It's very important.

Prvalues are literals like 42, which is strange, right, because string literals are lvalues. Also nullptr, the address of d, this object, post-increment, the return of a
lambda like this, and even throw, because throw's type is void and its value category is prvalue.

Let's now look at xvalues; they're in the middle. So for example this Data(42) that I've mentioned before, the temporary created, and also the result of the move that we apply on this object d1, the return of this ternary operator that returns those temporaries, etc. And for this example you can see that accessing n on the temporary Data is actually an xvalue and not a prvalue. The reason for that is that in order to get to n we need to actually create this object, even if it's temporary, because the way the compiler looks at objects is that it has to have this root, the initial address, and then look for the member. So that's why, with Data, which is a temporary, if we get to its member it is in fact an xvalue and not a prvalue. So I hope you now feel more comfortable with all this terminology and these definitions, and we're now going to dive into real-life code to see how those things work.

I think the first, most important thing to remember about value categories is that this is terminology for expressions. But the second thing that you need to know is how things bind in the language. This is very confusing. Expressions with different value categories bind to different types of references. We've seen, a few slides ago, lvalue references and rvalue references, and now we're going to see which value categories of which expressions can bind to each. So here we have our lvalue reference la; it can bind to a, and this works, that's great. Here we have our const lvalue ref, and of course we cannot change it because it's const. All of the ones on the right are rvalues. If I haven't mentioned, this is an rvalue reference, and you can actually assign to it, which is great. But you can also have a const rvalue ref, which you cannot change because it's const. Makes sense. So the binding rules are... so this is just
an example of how we bind prvalues to references, but binding rules apply in three different events. The first event, which we just saw, is initialization or assignment: if you have something that you want to bind to, then you have to consider binding rules. The second event is the function call, including non-static class member functions, etc.; we're not going to talk about them too much in this talk. And the third is return statements. So all of those three different events that happen in your code need to take the value category and the binding rules into consideration.

For initialization or assignment, again, I don't want to go into all of the details of each one of those expressions, but one thing I did want to mention, and it's very important: rvalues can be bound to a const lvalue reference, to an rvalue reference and to a const rvalue reference. Lvalues can be bound to an lvalue reference and a const lvalue reference. So basically the bug that we saw in the first slide is a result of the fact that we can bind rvalues either to a const lvalue reference or to an rvalue reference. This is very important when talking about the move constructor: once we want to trigger a move constructor and this const somehow gets into our code, then the move constructor may not be triggered as we expected, and this is not great. And the lifetime of an object can be extended by those bindings. A const lvalue reference can extend the lifetime of an object, but it doesn't allow modifications. So, as I've mentioned, if you have an rvalue you can bind it to a const lvalue reference and it will extend its lifetime. An rvalue reference can also extend the lifetime of a temporary.

So let's look at what happens in a function call. Here we have our struct Data that we're already familiar with, with the constructor, and we have four different functions declared for this Data: lvalue reference, const lvalue
reference, rvalue reference and const rvalue reference. Now we declare those parameters, one of each type, and we're going to try and understand which function is going to bind which object. So let's look at the first one. We have the lvalue ref d. Which of the functions one, two, three, four do you think is going to bind this lvalue reference d? I'll give you a hint: it's more than one, but there's a priority. So again you can just shout numbers, feel free to be wrong. Okay, one and three? Anyone else want to try? Yes, one and two, great. So one is obvious; two also comes from the const rules of the language. What happens for the const lvalue ref d? Which ones of those overloads are going to bind the const lvalue reference d? And say it in order. So we have const lvalue reference d... yeah, two, correct. Whoever said two, you are correct. How about the next one, the rvalue? Which ones of the overloads are going to bind the rvalue? One, two, three? Okay. One, two? Almost. Good. And how about the const rvalue? Nice, wow. Okay, so I have to say this is not a Back to Basics crowd. I mean, you've been able to know the correct ones even better than experts, it's very impressive; I've asked that question at other conferences as well.

So Data(73), that's our prvalue. Who's going to bind this one? Two and three? Three, four, two, close enough. And how about getData? Remember, getData returns const Data. So who's going to bind getData? And a hint: remember our bug from the first slides. Two and four, okay. So two would be our equivalent of the copy constructor, and four is the equivalent... sorry, three actually is the equivalent of the move constructor, right? And that was a hint. Yeah, okay, correct: we didn't get the move constructor called. And of course this is not something that you need to memorize, but if you get yourself familiar with those
binding rules, you will be able to more easily locate bugs like the one you've seen in the first slide. So even though it's not a simple thing to keep in mind, I think these are very important rules to know. The limitations on whatever we just bound are according to the function. Of course, if we got a function that takes something by const, we will not be able to change this thing inside the function. In the scope of the function, the rules applying to the object are according to whichever function bound it, right? So it has nothing to do with the original object; it only has to do with what happens inside the function that was called.

Okay, so now let's look at return statements. Starting from C++17 we have guaranteed copy elision; there are mandatory elisions of copy and move constructors. So basically you can see here Data(42), and then a Data wrapping Data(42). Normally you'd have a constructor and then a copy constructor, right? But what happens with guaranteed copy elision is that you just have one constructor. This saves time in our code, this improves performance, and this comes from guaranteed copy elision. For return statements we get similar behavior. Again we have this getData function. I don't know if you can see it... oh, okay, good, so it's not hiding getData. We have the getData function, and instead of creating this returned Data, then copying it to the temporary and then initializing d, we just get a single constructor call.

One more thing I want to talk about regarding return statements, and I've mentioned it before, is materialization. I've mentioned that we had this change: instead of just the prvalue we now have something that is prvalue materialization, and we can do more things with it than with just the prvalue. So I'm just going to briefly read the standard terminology that came from this paper: a prvalue of type T can be
converted to an xvalue of type T. This conversion initializes a temporary object of type T from the prvalue by evaluating the prvalue with the temporary object as its result object, and produces an xvalue denoting the temporary object. So, a lot of words to basically describe that we get something that we can pass around. I look at it as a ghost, for example. We have this ghost of an object, and then at some point we materialize this ghost, and now it's a prvalue, and then we initialize the xvalue with it. But in order to materialize, T shall be a complete type. Basically what it means is that if you have something that is not a complete type, you are now able to pass it along, and unless you materialize it, it works; the compiler doesn't yell at you.

So to summarize, binding rules apply in the following events: initialization or assignment, function calls, including non-static class member functions, and return statements. And the behavior of an entity is defined by the thing that binds it. So for example in initializations, limits are according to the reference which binds; in function calls, limits inside the function are according to the overload that was chosen; and in return statements, limits are as in initialization, with additional rules that apply from optimizations and so on. So this summarizes how we bind objects.

Our next topic is value categories in generic code. Any questions so far? I hope you now feel more comfortable with this very strange diagram that is very intimidating. Yeah, Andre? Yes, exactly, that's exactly what it means.

Okay, so value categories in generic code: reference collapsing and forwarding references. When the compiler sees multiple ampersand symbols, either in generic code or in code using type aliases, there's something called reference collapsing. Let's very briefly look at an example. We have the LR typedef and we have the RR typedef, an rvalue reference. So now if we look at this, we create
this object called lrfb. Because of the typedef we now basically have something that looks like this, right? So this makes sense, the compiler knows what this is and feels comfortable with it. But here we have three ampersands. So instead of those three ampersands we're going to have int&; here, instead of those three, we're again going to get an int&; and only the last one is going to be an rvalue ref. So this is something that is important to recognize: reference collapsing can change the type of the expression. It's not like copy-paste, it's not exactly the number of ampersands that appear in the type; the compiler will collapse them into a different type.

And for forwarding references we take this into consideration. The forwarding reference was initially suggested by Scott Meyers under the term universal reference, and later it was formalized as forwarding reference. It's a tool that allows you to keep the value category of an object, and to take its value category into consideration when you forward it on. In order to have this forwarding reference in your code you need to have a template function and define something that looks like an rvalue reference but is in fact a forwarding reference. And this is very important: only in the scope of a template function will this rvalue reference actually be a forwarding reference. And now we get some behavior that is very similar to what we saw before with reference collapsing. So basically if you call this foo function with a, we get int& and not int& && as you may have expected if you just looked at the code. Again, similarly, const int&, and int at the end. So the reason that the forwarding reference works the way it works is that the compiler takes the number of ampersands into consideration. And to summarize, we get the value category of the object being passed.

Okay, so we now got to
our final part. I can see I still have around 15 minutes, but it will be enough to go over them, I think. We're going to see six utilities from the standard library and from the language that help you manipulate value categories, to understand better what you just saw and to control them in your code.

The first one is std::move. As we've mentioned before, move basically allows you to explicitly mark something as temporary. It produces an xvalue expression of type T&&. It's equivalent to doing a static_cast to an rvalue reference type, and it may not always do what you hoped. As I mentioned in the first slide, and this is basically the same example, we get the proper calls for most of the objects, but once const is in the picture we may get something a bit different than expected. By the way, I forgot to mention: on all of those code snippets you can see numbers on the right side, the one and two and all of that. These are all code snippets that are attached to the talk, so I will publish my slides later and you can go into Godbolt, which is a great tool, and play with those examples and try to understand how things work.

The next utility we have is std::forward. We already saw std::forward. Just a bit of background: it was suggested in this paper by Peter Dimov, Howard Hinnant and Dave Abrahams in 2002, and it suggests a solution for the forwarding problem. The paper was called "The Forwarding Problem"; they recognized that there is an issue, and that value categories are something that needs to be preserved, and suggested this utility. forward uses some other utilities from the standard library in its implementation, and it's commonly used combined with forwarding references. So, an additional example to the one that we saw before. I have here three overloads of the function: one for int lvalue ref, one for const lvalue ref and one for rvalue ref. And we have two
template functions declared: the first one is with forwarding, where we std::forward in the return, and the second one is without. So let's see what happens for each of those cases. For int we get int& in the return; for a const lvalue we get a const lvalue in the return; but for a prvalue we may get different results, depending on whether or not we added std::forward to the return of the function. So this is the problem, and this is the solution.

decay. decay is a type trait, part of the type traits library, and its result is accessible through _t, just like other traits that you may have seen. It performs the conversions of array to pointer, function to pointer and lvalue to rvalue. If you're familiar with the behavior of auto, and I assume that you all have seen auto before, decay is doing something very similar to what auto is doing. So basically if you use auto you'll get an automatic decay process for your type.

decltype is a language utility, and it basically gives you back the type of the object, including its value category, which is very important. So unlike auto it preserves the value category, because, as we said, auto decays and doesn't preserve value categories. And it can be used instead of a type, as a placeholder which preserves the value category. So for example here we have a, which is declared with auto, and we have decltype(auto) b, and you can see that the type is different; the value category in this case is actually the same. Just notice that the prvalue doesn't have to be materialized, which is what I've mentioned before: you can use decltype on things that are unmaterialized, starting from C++17. And if evaluation fails, the program is ill-formed. Notice also, and this is very important, people sometimes miss it, that using decltype with double parentheses creates different results than with single ones. So decltype's main use cases: when the type is
unknown, we can use it to retrieve the type, as here for example; and to preserve the value category when we declare something. So if you want to preserve the value category, you should use decltype instead of auto.

declval does something completely different. It basically takes an object and returns the value... sorry, yeah, it can be used on an expression to return the expression's reference type, and it can return incomplete types, just like I said before. So for example this type is an incomplete type: we try to create this type and it doesn't work, but we can actually use declval on this type in order to get some information from the compiler. So this is again a way for you to communicate with your compiler. And combined with decltype we can get the type of a member of an object, for example, or other things as well, like in this example. I'm going to leave this here for a moment. Any questions? Okay. So as I mentioned, it allows access to members, and we get the value category of the object, and combined together they can be used to transform between type and instance.

Okay, so we've now reached the final utility that I wanted to mention in this talk, and this really is advanced stuff, at least in my mind. Deducing this got into the standard for C++23.
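The decltype and declval points above can be sketched as follows. This is an illustration under assumptions, not the code from the slides: Data, getRef, Incomplete and NoDefault are hypothetical names, and all checks are static_asserts evaluated at compile time.

```cpp
#include <type_traits>
#include <utility>

struct Data { int n = 0; };
Data global;
Data& getRef() { return global; }

void autoVersusDecltype() {
    auto a = getRef();             // auto decays: a is a copy of type Data
    decltype(auto) b = getRef();   // decltype(auto) keeps the reference: Data&
    static_assert(std::is_same<decltype(a), Data>::value, "auto drops the reference");
    static_assert(std::is_same<decltype(b), Data&>::value, "decltype(auto) preserves it");

    int x = 0;
    // Single parentheses give the declared type of the name;
    // double parentheses give the type and value category of the expression.
    static_assert(std::is_same<decltype(x), int>::value, "declared type");
    static_assert(std::is_same<decltype((x)), int&>::value, "x as an expression is an lvalue");
    (void)a; (void)b; (void)x;
}

// declval works in unevaluated contexts, even for an incomplete type.
struct Incomplete;  // declared but never defined
static_assert(std::is_same<decltype(std::declval<Incomplete>()), Incomplete&&>::value,
              "declval<T>() has type T&&");

// A type we cannot default-construct still lets us ask about its members.
struct NoDefault {
    NoDefault() = delete;
    int value() const;   // declared only; never called at run time
    double weight;
};
static_assert(std::is_same<decltype(std::declval<NoDefault>().value()), int>::value,
              "return type of a member function");
static_assert(std::is_same<decltype(std::declval<NoDefault>().weight), double>::value,
              "declared type of a data member");
```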
Basically it's been proposed by Gašper, Sy, Ben Deane and Barry, and it was voted into the standard. It allows specifying, from within a member function, the value category of the expression it's invoked on. So here we have three different overloads inside of a struct, one for each case, and instead of those three functions we can basically take all three of them and use a single declaration, a template function, that combines all three together. Also, as you can see, the explicit this keyword is basically the addition of the feature. This helps you a lot if you need to write libraries, for example, or need to write multiple overloads for your type: now, instead of writing all those different overloads, you can just write a single template function and get them all together. It also comes with some other utilities: like_t will take an object and apply a value category to it, and forward_like, for an instance of type U, applies the const and ref qualifiers of T. I see I don't have enough time to go over all of those, so I'll just skip.

So to summarize: we saw what value categories are, how we use them in practice, and the fact that bindings are very important in this case. For value categories in generic code you need to consider forwarding references if you start writing generic code and care about value categories. And we saw how we manipulate value categories using tools from the standard library. But I will mention that there are a lot more tools that you need to take into consideration when you do start working with code that cares about value categories. The talk is too short to cover all of them, but I recommend that you go and look for them on cppreference; it's very interesting. There are a lot of type traits that you may need to use, a lot of library tools, and language utilities. Thank you for listening, and I
just want to thank Karen devere and Andre here for helping me review this talk, and I would love to get your input. Thank you. [Applause]
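As a closing illustration of the deducing this part of the talk, here is a minimal sketch, not the slide code. Widget, its name member and get are hypothetical. Because explicit object parameters need C++23 compiler support, the single-template form is guarded by the __cpp_explicit_this_parameter feature-test macro, and the three ref-qualified overloads it replaces are used otherwise.

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct Widget {
    std::string name;

#if defined(__cpp_explicit_this_parameter)
    // C++23: one template; Self is deduced from the value category
    // (and constness) of the object that get() is invoked on.
    template <class Self>
    auto&& get(this Self&& self) { return std::forward<Self>(self).name; }
#else
    // Before C++23: one overload per case.
    std::string&       get() &      { return name; }
    const std::string& get() const& { return name; }
    std::string&&      get() &&     { return std::move(name); }
#endif
};

// The same checks hold for both branches.
static_assert(std::is_same<decltype(std::declval<Widget&>().get()),
                           std::string&>::value, "lvalue object");
static_assert(std::is_same<decltype(std::declval<const Widget&>().get()),
                           const std::string&>::value, "const lvalue object");
static_assert(std::is_same<decltype(std::declval<Widget>().get()),
                           std::string&&>::value, "rvalue object");
```

In both branches the value category of the object expression decides what get returns: an lvalue reference to the member for lvalues and an rvalue reference to the member for temporaries.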
