Multisensory interaction

Hello all, welcome back to our course on digital accessibility. Today we'll talk about what multisensory interaction is, and we'll walk you through some examples of existing tech. We have discussed various aspects of including multisensory feedback systems and multisensory input systems in some previous lectures, most likely in the universal principles of design session. You can go back and check it out. We also discussed it in the emerging input and output technologies lecture, and you can go back and check that out as well. What we are going to talk about today is some specific aspects of multisensory interaction. I will also introduce a new term, multimodal affective computing, and give you a glimpse of it so that you are aware of what is happening in this domain.

So let us start by understanding what multisensory interaction is. Multisensory interaction in accessibility uses sight, sound, touch, or even smell or taste to create inclusive digital and physical experiences. I think we are all aware that we have these five senses: touch, sight, taste, hearing and smell. This is something we all learned in school. What we are talking about now, in emerging tech and new solutions, is catering to all or several of these modalities at once, so that we get more than just inclusive solutions. For example, if somebody is unable to see, they are able to experience the system or interact with it through other means. Or, if I am able to see, having multisensory modes enables me to interact with the system even in situational or temporary disability conditions. For example, I may be unable to give visual attention to a screen at a given point of time, but the information is still communicated to me through audio or through vibrotactile feedback systems.
So I'm still able to get an idea about where the video is or what the system is talking about, without needing to continuously interact with the system visually. It also enhances the overall experience of a person, because the system is not constantly seeking your visual attention, your audio attention, or any other modality for that matter. It is therefore not a constant cognitive load on the user, and that is what makes it a little easier to interact with. That's why we say that it enhances user experience. Particularly in terms of accessibility, it is of course going to enable inclusion: people who have a visual impairment, say, have no access to one of the modalities, so relying on other modalities becomes even more necessary, and that is why the system needs to have multimodal interactive elements.

So first, having multimodal interactive elements helps in providing alternative information channels for users with sensory impairments, such as audio cues for persons with visual impairments, or visual alerts or closed captioning for persons with hearing loss. Second, it enhances engagement for everyone by catering to diverse cognitive needs. As I mentioned before, if I am not constantly required to engage one of the modalities with 100% attention, I am able to relax that modality for snippets of time. I am also able to divide my cognitive load between modalities. For a bit I am not watching the screen but just listening; for a bit I am letting my ears, my auditory modality, relax and just looking at the pictures. This combination can enable a better, enhanced user experience for everyone, and particularly for people with diverse cognitive needs, for example people with attention deficits.
So dividing the limited amount of attention that they may have between modalities can be really helpful.

It also helps in making your content more memorable and intuitive through combined sensory inputs. So why do you think it makes it more memorable and intuitive? Imagine that you're walking in a forest. We are not talking about any systems here; we are trying to understand how the human brain and modalities function, how human attention functions. So we are not engaging with any system, we are just walking in a forest. Can you imagine how different modalities are engaged in experiencing the real world, in real time, in a natural setting where there is no artificially created sound or visuals? How does one interact with or experience the forest?

At all times you may have visual input: you are able to see the pathways, the trees, the brooks and everything. You are constantly able to hear different sounds, like the rustling of trees and leaves. There may be insects, crickets, animals, bird sounds. There may be certain smells coming at all times from different flowers and different types of trees. There is also tactile feedback, where you are stepping on a softer surface, say mud, or on stones which are pinching your feet. You may also be touching different trees, or leaves may touch your skin, and you are interacting with that as well. And you may also pluck an apple or a wild fruit and have a bite of it. So basically, humans have innately interacted with the world in a multimodal fashion, and that is why it is more intuitive to interact in a multimodal fashion. In an artificial environment, things are different.
What happens is that traditionally developed technologies might require one particular modality to be employed far more than the others. That may not have been the case when our ancestors were walking through the forest, or even for you today: if you go and walk through a forest, all your senses are engaged, like hearing, listening for those little sounds, and smell, catching those smells. All of those have to be used. What has happened in the recent past is that most artificially created, man-made technologies have required more and more visual attention as the primary mode of interaction, and that has led users, humans, to dilute their other modalities, like hearing and smell and taste and tactile acuity. So we are hardly interacting through all of the other modalities. We are largely relying on visual attention, and that has helped in advancing technology over the previous decades, the past two decades in particular.

But now emerging and upcoming technologies are realizing this and trying to make use of multimodal interaction: visual plus audio plus tactile plus this plus that. All of those facets are now coming together in the form of interactive systems, making them more intuitive and more tailored to the human way of thinking. Anything that is intuitive is, literally, tailored to a more human way of thinking. It also makes everything a little more memorable, because you are gathering information from multiple channels, and if there was an attention deficit in one of the channels, there is a better chance that your brain registered the information through another channel. That makes multimodal information more memorable as well.
So this approach moves beyond the traditional accessibility thought process and creates richer, more holistic experiences for users with visual, auditory, motor, cognitive, or any other disability for that matter. In this image we can see how multisensory design functions: designing experiences that involve all five senses, or as many senses as possible, rather than just one or two. We are talking about sight, which involves movement, light and color. We are talking about sound, which helps us in gauging distance, loud, quiet, all of those aspects. We are talking about smell, which helps us in identifying whether something is fresh, stale or stinky. Taste helps us in assessing the temperature of something that we are going to eat, as well as its texture and flavors. Touch, again, helps us in assessing textures, temperature, vibrations and so on. A combination of several of these parameters can help us engage with the world a little better.

So why are we using multisensory design? We are trying to create more inclusive environments for people with sensory impairments, and we may be asking some specific questions. Why: what is the desired effect? How: what is the desired experience? What objects elicit that experience? Then there is sampling similar experiences, primarily relating to real life rather than man-made objects, so that it is easier to understand the intuitive aspects of multimodal design. Then, what are the sensory experiences going to be? And finally, translating the findings into a coherent design.

So what are the benefits, or aspects, of multisensory design? It is a cross-modality design approach: it combines information across sensory channels.
By definition it is inclusive in its approach, because we are designing from the outset to cater to a broader spectrum of abilities, and not as an afterthought. From the beginning we say that we will be looking at all or several of these modalities, and that automatically enables inclusion, because we are creating a system which can effectively be used by people with sensory deprivations and sensory impairments. It also helps in making holistic experiences, integrating sensory inputs like sight, sound, touch and smell into a unified, meaningful experience. In order to get there, all of these steps may be useful, so you can check on that.

Another aspect, which is again part of emerging technology and emerging frameworks, is something called multimodal affective computing, or MAC. It also helps us in developing holistic, unified experiences. Unified is very important, and it is also a very important aspect of something you might have heard about, the Internet of Things, or IoT. Why are we saying unified? Because the inputs may come not just through different modalities but also through other parameters, which are emotional or stress related, and the environment, as a system, is also adapting to your emotions, reacting, and giving you feedback. That is where such systems come in.

Multimodal affective computing is an interdisciplinary field. It finds its roots in multisensory design or multimodal interaction. It uses multiple data sources: modalities, of course, like facial expressions, voice and text, but it also looks at psychological and physiological signals like heart rate, stress levels and brain waves.
So it basically recognizes, interprets and simulates human emotions, creating a more robust and accurate emotion-aware system than a single-modality approach. In a modality-based approach, you await an input and, based on that, give a reactive output, or you have to build in a lot of environmental conditions. But with emerging tech we want customized, personalized solutions, and the more personalization one requires, the more emotion- and context-aware the system has to be. For the system to be emotion- and context-aware about its users, there have to be other physiological signals, such as EEG signals and heart rate signals, so that the system can respond in an appropriate manner.

So what are the basic fundamentals of affective computing? First, the fundamental theory: emotions are commonly defined according to models from the field of psychology. Then there is signal collection: text, speech, facial expression data, gestures and physiological data all signal emotion. Then there is analysis. This is where ML and AI models come in: machine learning and deep learning models are used for modeling and recognition. Then there is multimodal fusion: fusing different types of emotional signals improves classification accuracy. Finally, there is generation and expression: robots, for example, need methods for producing recognizable emotional signals as well. So with MAC we are also talking about humanoid interaction. We are not just talking about humans interacting with systems; we are talking about humans interacting with robots as well.
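To make the multimodal fusion step a little more concrete, here is a minimal sketch in Python of one common way to do it, usually called late fusion: each modality produces its own probability estimate over a set of emotions, and the estimates are combined by a weighted average. This is only an illustration and not the method of any particular system. The emotion labels, the modality names, the weights and all the numbers are assumptions made up for the example.

```python
from typing import Dict

# Hypothetical emotion labels, chosen only for this illustration.
EMOTIONS = ("calm", "stressed", "happy")


def fuse_late(
    predictions: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-modality emotion probabilities by a weighted average.

    predictions maps a modality name to its probability for each emotion.
    weights maps a modality name to its relative trust (default 1.0).
    """
    total = sum(weights.get(m, 1.0) for m in predictions)
    if total <= 0:
        raise ValueError("at least one modality needs a positive weight")

    fused = {emotion: 0.0 for emotion in EMOTIONS}
    for modality, probs in predictions.items():
        share = weights.get(modality, 1.0) / total
        for emotion in EMOTIONS:
            fused[emotion] += share * probs.get(emotion, 0.0)
    return fused


def top_emotion(probs: Dict[str, float]) -> str:
    """Return the emotion label with the highest probability."""
    return max(probs, key=probs.get)


# Made-up outputs of three separate recognizers (assumed values).
predictions = {
    "face": {"calm": 0.5, "stressed": 0.4, "happy": 0.1},
    "voice": {"calm": 0.3, "stressed": 0.6, "happy": 0.1},
    "heart_rate": {"calm": 0.2, "stressed": 0.7, "happy": 0.1},
}
weights = {"face": 1.0, "voice": 1.0, "heart_rate": 1.0}

fused = fuse_late(predictions, weights)
print("face alone:", top_emotion(predictions["face"]))
print("fused:", top_emotion(fused))
```

With these assumed numbers, the facial expression on its own leans towards calm, while the voice and heart rate lean towards stressed, so the fused estimate comes out as stressed. That is the point of fusion: an ambiguous signal in one channel can be corrected by the other channels. Real systems often learn the weights, or fuse the signals earlier inside a model, rather than averaging fixed outputs like this.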
So by fusing information from vision, audio, text and biosensors, MAC overcomes the ambiguity in single signals, enabling better understanding and modeling of complex human affective states for applications in personalized assistive technology. This is just an intro to the terminology, and you can go and read a little about it yourself. If you are interested, you can search for affective computing and for all of these terms: signal collection in affective computing, theories of multimodal affective computing, fusion of different signals, modal signals and emotional signals, and sentiment or emotion analysis in affective computing. You can search and learn a little on your own, because this is a very technical topic and the course does not cover all of these aspects in much detail. But I can have a separate session covering some other aspects of multimodal affective computing, particularly how it is used in personalized assistive technologies.

So now let us walk through some examples of multimodal assistive technologies. To begin with, there can be multisensory alarm systems. Imagine a fire alarm that is only blaring sound: a person with hearing loss will not even be able to understand what the commotion is for. So there are multimodal alarm signals which emit light as well as sound. There can also be tactile beepers or vibrators, particularly for people who are deafblind or who may have an attention deficit. Then there are multisensory alarm systems which are triggered by multisensory input as well, so not just fire but also heat or smoke, so that they can be more effective. Then there can be braille plus audio textbooks.
These kinds of systems are also available, where students with blindness interact with braille material which is also audio-tagged, either through electronic systems or, as we are now also seeing, through solutions which use augmented reality and gesture recognition for audio tagging of tactile material. Actually, there is a demo in an upcoming session of a project called talk tile, where we are using augmented reality audio tagging for tactile books and maps. So in one of the following sessions you can have a look at that demo as well, and there are several more solutions available throughout the world in that domain.

Then there can be interactive whiteboards, using sound, visuals and touch. There is also vibratory feedback, and now, with haptic gloves and virtual reality learning environments, this can become even more immersive. Smartboards are already implemented in a lot of classrooms around the globe, including in India, and they enable real-time visual, auditory and tactile feedback for engagement. They are also more fun for children as well as teachers to engage with.

Then there can be tactile maps and 3D models for geography or STEM learning. These can be 3D printed, or they can be models where audio is enabled through gestures: as soon as you touch a particular part of the model, the system, based on the position of your hand, is able to produce audio feedback and tell you about that particular building.

Then, I'm sure you have all already seen that in assisted navigation and mobility there is tactile paving on the road, as well as beeping systems on sidewalks and railway platforms. So one can also receive an audio signal when the traffic stops and you can cross the road.
Then there is also tactile paving which helps you understand at what point on the sidewalk the zebra crossing is placed. In India we may see this primarily inside metro stations or railway stations; globally, however, it is very much built into public road infrastructure as well.

Then there can be multisensory interactive rooms, particularly for autism therapy, using calming lights, textures, sounds and so on, for children who may be on the autism spectrum. As you can see in this image, there is light, and there are all kinds of different tactile furniture. So different kinds of tactile inputs, sensory inputs, visual inputs and screen-based inputs are all present.

Then in terms of wearables, which are again a very emerging aspect of technology today, haptic navigation wearables use vibrations as well as visuals. They largely rely on vibrations, because nobody is keeping an eye on their watch all the time. They are also experimenting a lot with vibrotactile patterns. For example, this haptic nav is providing haptic navigation solutions, guiding people through cities, campuses, airports and everyday life using the universal language of touch. It is like a navigation system which runs on your phone, with audio feedback as well as tactile feedback through a wearable device.

Then gaming and virtual reality. Controller vibration signals in-game events for players with hearing impairments, so that they get an idea of whether the character is bumping into something or shooting: there is a vibration on the controller. Then there are more games which are enabling high-contrast visuals, or vibration and audio.
So people with some light perception and some visual impairment can also interact with these games.

Then there can be rehab technology. Multisensory rehabilitation tools help stroke patients relearn motor skills. There is a screen-based system, there is audio guidance, and there is motor guidance, where a robotic arm guides the hand for writing.

Again, in public spaces and smart cities, as I said, in metro stations and so on we might have seen that there are audio announcements as well as tactile floor guidance to help passengers navigate through the space and reach the respective doors. Then there can be other day-to-day interactions, like elevators with spoken floor numbers as well as braille buttons, and wayfinding signage which uses high-contrast icons as well as raised braille text.

So in summary, multimodal interaction enhances the overall experience of users, not just for users with certain impairments or disabilities but for everyone. It enables inclusion of various kinds of individual needs. With emerging technology, affective computing and multimodal affective computing can allow for a high degree of personalization, because now we are also looking at other physiological parameters, like heart rate, stress level and EEG signals, for interacting with the system. So adaptive system behaviors can be designed using all of those parameters.

So this is all for this topic. I'm sure you are intrigued to know more about multisensory design, multisensory interaction and multimodal affective computing, so I would request you all to please do a little research and reading at your end about these topics as well, and we'll be happy to discuss them in the forums or in the live discussion session. Thank you for joining today. I'll see you in the next session.
