Vagner Santana, Author of Interaction Data Analytics: Methods, Tools, and Applications
A Leanpub Frontmatter Podcast Interview with Vagner Santana, Author of Interaction Data Analytics: Methods, Tools, and Applications
Vagner Figueredo de Santana is the author of the Leanpub book Interaction Data Analytics: Methods, Tools, and Applications. In this interview, Leanpub co-founder Len Epp talks with Vagner about his background, getting a PhD, the fascinating details of human-computer interaction, computer anxiety, explicit and implicit forms of interaction, and his book, and at the end, they talk a little bit about his experience as a self-published author.
This interview was recorded on October 13, 2022.
The full audio for the interview is here: https://s3.amazonaws.com/leanpub_podcasts/FM240-Vagner-Santana-2022-10-13.mp3. The Frontmatter podcast is available on our YouTube channel at https://www.youtube.com/leanpub, in Apple Podcasts here https://podcasts.apple.com/ca/podcast/frontmatter/id517117137, on Spotify here https://open.spotify.com/show/00DiOFL9aJPIx8c2ALxUdz, and almost everywhere people listen to podcasts.
This interview has been edited for conciseness and clarity.
Transcript
Len: Hi, I’m Len Epp from Leanpub, and in this episode of the Frontmatter podcast I’ll be interviewing Vagner Santana.
Based in Yorktown Heights, New York, Vagner is a Research Scientist, Certified Data Scientist, and Master Inventor at IBM Research. He has been developing software professionally for over two decades, and is an expert in Human-Computer Interaction, with a Ph.D. in Computer Science from Unicamp in Brazil.
You can follow him on Twitter @santanavagner and check out his website at plasticdesign.eti.br.
Along with Unicamp Professor Maria Baranauskas, Vagner is co-author of the Leanpub book Interaction Data Analytics: Methods, Tools, and Applications.
In the book, Vagner presents methods, tools, and applications you can use to understand and uncover the value of interaction data when people interact with computing systems, going beyond just measuring clicks.
In this interview, we’re going to talk about Vagner’s background and career, professional interests, his book, and at the end we’ll talk about his experience using Leanpub to self-publish his book.
So, thank you Vagner for being on the Leanpub Frontmatter Podcast.
Vagner: Thanks for having me.
Len: I always like to start these interviews by asking people for their origin story. So I was wondering if you could talk a little bit about where you grew up, and how you first became interested in computers and human-computer interaction?
Vagner: Sure. Well, I’m Brazilian. I grew up in a small city in the state of São Paulo, close to the capital of the state, which is also São Paulo. I first got interested in computers because of my older brother. He did his undergrad in Computer Science as well, and he started working really early with programming languages.
So I believe the first time I saw a computer was in the mid 90s. I did a couple of courses on DOS operating systems back then, and a few times I saw my brother working on his notebook, and I got interested. And then I started taking more technical courses, just trying to understand how things worked, and trying to make sense of the programming languages and flow charts that I used to see back when I was living with my mom and my older brother.
And then, well, I noticed that I always liked math and the other more exact disciplines, right? And for the first time, I was able to do some courses on programming languages in school, also in the early to mid 90s.
And then I got the opportunity of working at a newspaper, called “Folha de S.Paulo.” At that time, I was only editing pictures. It was, I’d say, the early age of the web, in Brazil at least. I started working there just doing some HTML, and cropping and retouching photos for the newspaper.
And then things got serious. I started doing lots of courses on programming languages, and about one year later I was working as what we called at that time a “webmaster.” Nowadays it would be like a full stack developer, or something. And soon after, I started college. This I did in São Paulo as well, at a university called Mackenzie Presbyterian University.
And then I just didn’t stop, right? I continued studying and did my Master’s at Unicamp, and my PhD. And after that I joined IBM Research. I’ve been at IBM for almost 10 years now. For 9 years, I was based in São Paulo. And then this year I came to Yorktown Heights, working on a specific project for the next two years. So, in short, that’s how I started, and got interested in computing.
You also mentioned HCI. I think I started to get more of a sense of HCI in the early 2000s, especially when I was working at this newspaper. At the time, it was one of the most viewed newspapers in Brazil, and it had a really good audience. So I started to be concerned about how people would access it, and to try to make things always as easy as possible. Brazil is, well, a developing country, so it has the same socioeconomic challenges we see in almost all developing countries.
So at that time, I was worried about doing the right thing, and trying to reach more people. So I started studying usability and accessibility. And then I got interested in this aspect more connected to HCI. At that time, I didn’t even know the name of this branch of Computer Science.
And then when I was finishing my undergrad, I did what they call there the graduation project. I started studying more about the theoretical references behind HCI, and how guidelines can help people create more usable user interfaces. After that I got interested in reading more, and started publishing my first academic papers, right at the end of my undergraduate degree. And I got interested in pursuing my Master’s and PhD degrees.
And then after reading about different stuff, I got super interested in this aspect of analytics. Because I had this feeling about how we can get data, especially coming from the experience at the newspaper I mentioned before, where it is possible to understand how people use this kind of system at scale, just by analyzing how people interact, like analyzing logs.
And so I got really interested in this kind of dataset at the time. And then, when I was pursuing my Master’s degree, I tried to combine some of the things I had got interested in, like, “What if I get this kind of data, but a more nuanced kind of data, with more details about interaction, while having in mind accessibility and usability? What can be done?” Because at that time, I was already feeling that, “Okay, nowadays we only analyze clicks, but what happens between clicks is kind of interesting, and sometimes it’s really important to understand what happened.”
And then I started trying to understand the value we can extract from this kind of data. And that’s basically how my interest in HCI got, well, a space inside the community as well. And I also noticed that there was, I’d say, a gap in bringing more state-of-the-art data analysis to some areas of HCI. So that’s where I thought that I could contribute somehow, bringing this data analysis mindset to this specific realm of HCI. And so, in short, that’s how I got interested -
Len: Thanks very much for sharing that story. I’m really interested in getting into data and what you mean when you talk about things like what happens in between the clicks, right? Because for a lot of people, if they’re into analysis of interaction, the immediate thing they think about is clicks: of all the things people see, which one do they pick to click on? And of course, that’s usually a stat used in advertising pricing and stuff like that on the web, so that’s why people would think about clicks.
But before we do that actually, I’ve got - one of the sort of pleasures of this podcast, and sort of being part of Leanpub, is that we have authors from all around the world, who have all these different experiences. And I wanted to ask you specifically about what - how PhDs work in Brazil. I mean, assuming there’s a kind of general kind of way they work, right? Because it can be very different in the UK from the United States and Canada, for example.
And I’m looking at your LinkedIn profile here, and I can see that your PhD is 2009 to 2012. And that’s more of a UK style timeline than an American one, which might be like six or seven years, or something like that.
So I was wondering if you could just, yeah, explain a little bit about, are you all about dissertation from day one? Are you expected to teach and take classes when you do a PhD? Things, just, how does it work?
Vagner: Correct. Well, I didn’t mention how my transition went from the work I was doing at the newspaper, Folha de S.Paulo, to academia, and then to research at IBM.
At some point, when I was doing research as an undergrad student, I got super interested in doing research. And then I decided to go all in on academia, right? And this is one of the possibilities, right? So I applied for a position at the University of Campinas, Unicamp.
And then I joined a project that was running under the supervision of Professor [?]. And there was a scholarship for me there, so I was able to work and study full time there over the next few years. So this is one of the ways of engaging with research outside, receiving scholarships. Back in Brazil, when people are in this full-time dedication, the expected time for a Master’s project is two to three years. And for PhDs, the regular time is a four-year period for the whole thing. And I would say, for the PhD, usually you need to have some credits.
Len: Okay.
Vagner: And then, the PhD thesis. It is somewhat half and half. So if you have four years, usually you keep your first two years to complete all the credits, and then follow up with the dissertation. Sometimes, if people already have a really good idea of a project, and it’s a fit for the supervisor, then it can be thesis from day one, right? Especially if you are already coming with some background in the area from the Master’s degree, then it helps a lot.
And for instance, in my case, the PhD was the next step after my Master’s project. I had a backlog of publications. I had a backlog of ideas. So it was more in the sense of getting deeper into that subject, right? So that is one way. And to your point on the period, I decided that if I was doing it full time, I would do it as fast as I could, because I wanted to apply that. And at the time, I had a scholarship that was for three years. And that was also another, let’s say, motivator.
Len: Was a good incentive.
Vagner: Yeah.
Len: Yeah, thanks very much for that. That’s interesting. So for anyone not familiar with the world of PhDs, it might be surprising, or it might make sense, but some people do actually start the PhD without necessarily having a very clear idea of what they want to work on. And that can be in any subject, right? But I can say from my own experience, having gone in for my own doctorate with a very clear idea of what I wanted to do, that you will probably finish a lot faster if you have a worked out idea and plan. But of course, that involves a lot of work to get there. So you kind of frontload it, before you start the PhD, if you’ve done that kind of thing already.
Vagner: Yeah.
Len: So, yeah, so that was really interesting. And, yeah, as I said, if your scholarship runs out after three years, that’s a great incentive to finish. Because as a student, you’re probably not living the high life anyway. And so, yeah, would you mind actually talking a little bit -? I know it’s been 10 years, but would you mind talking a little bit about what your PhD thesis was about?
Vagner: Sure. I mentioned finding the area of usability and accessibility. I really wanted to understand ways that we could make user interaction as simple as possible, efficient, and effective, so that people feel satisfied after using a system. And I started looking at some papers and books around the subject. And when I started thinking about my Master’s project, in fact, the thing that I thought of at the beginning ended up being the whole thing, my Master’s and my PhD. Because what I wanted to do at the very beginning was too big to fit in a Master’s project. So I started to explore it piece by piece, right?
And then, the first thing I was concerned about was that we always need to have this kind of data super well structured to analyze what happened, and we need to have clicks and this kind of analysis. And I was thinking, “Okay, but what if we start analyzing everything? Like everything. Instead of using one type of event, like click, we have tens of events triggered while we interact with computers. And we can customize, we can trigger more events. Nowadays, we have more devices that trigger even more events. What if I could analyze all those events, and think about how people interact in different ways?”
And then, when I started to research this topic, I found that universal design was the philosophy I would like to follow, because it is something that opens your mind to so many different possibilities. And with the help of Professor [?], I was super intrigued by what universal design brings to you when you’re thinking about possibilities. Because usually you would say, “Okay, let’s analyze clicks.” But does everyone use a mouse? What about blind people? Then you start thinking, “Okay, but we can analyze if users did this, this and that, or if they downloaded something.” But what about places where the connectivity’s not so good? We can even analyze the speed at which the images are being downloaded. Because sometimes we can see that things are not working properly just because of the internet connection.
And then I started to see that there was this whole range of possibilities for analyzing this really rich dataset about how things actually happened. For instance, when I started to research usability, there’s this whole idea that we freeze when we don’t know what to do while looking at a user interface. You’re trying to make sense of it. “What should I do? Where can I go?” And so even the lack of interaction is interesting information that you can consider, right? Or people are moving the mouse, and then suddenly they speed up. Or they suddenly change the speed at which they are typing. So what does that mean, if you start analyzing the nuances of this information, and the possibilities of combining everything?
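To make the point about "lack of interaction" concrete, here is a minimal Python sketch of how pauses could be pulled out of a timestamped event log. It is only an illustration, not the tooling described in the interview: the `Event` record, the `find_pauses` function, and the 3-second threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float    # seconds since page load (assumed unit)
    kind: str   # e.g. "mousemove", "keydown", "click"


def find_pauses(events, min_gap=3.0):
    """Return (start, end) pairs where no event fired for at least min_gap seconds.

    min_gap is a hypothetical threshold; a real study would have to tune it.
    """
    ordered = sorted(events, key=lambda e: e.t)
    pauses = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.t - prev.t >= min_gap:
            pauses.append((prev.t, cur.t))
    return pauses


# A made-up session: the user moves, types, then freezes for about five seconds.
log = [
    Event(0.0, "mousemove"),
    Event(0.4, "mousemove"),
    Event(0.9, "keydown"),
    Event(6.2, "mousemove"),
    Event(6.5, "click"),
]
print(find_pauses(log))  # [(0.9, 6.2)]
```

A click-only analysis of this session would see a single click at 6.5 seconds; the pause before it only shows up when the other events are kept.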
And my first idea was, how can I analyze the differences between ways of navigating? For instance, people who are using the mouse to navigate through a newspaper website or an ecommerce site: “Okay, I just entered, and I want to go to this link.” It’s just a matter of moving the mouse over that 2D screen. But then there are people that are using screen readers. Screen readers are those systems that read the content behind the 19:55 [?], right? We have structural and textual information. And screen readers bring this information to people that cannot see, or even to people that use screen readers as an additional modality, to also hear what is on the screen.
And if we have, let’s say, blind people using the same web page I mentioned before, probably they will have to tab over the links and navigate through some structure. So instead of starting from A and going to B in one straight line in a few seconds, this other person will probably have to traverse multiple links until they reach the same point. So we have different ways of doing that. And I thought about creating a system that could, first, identify those differences. And the second aspect: how can I improve the UI, the user interface, for everybody? To make it accessible to the greatest extent possible.
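The A-to-B difference can be sketched as a simple count of actions per session. The snippet below is a hypothetical illustration in Python; the session format (pairs of action kind and element name), the element names, and the `steps_to_target` helper are assumptions for the example, not part of the system discussed here.

```python
def steps_to_target(actions, target):
    """Count actions up to and including the first one that activates `target`.

    `actions` is a list of (kind, element) pairs; returns None if the target
    is never activated. Both the format and the kinds are assumed here.
    """
    for i, (kind, element) in enumerate(actions, start=1):
        if kind in ("click", "enter") and element == target:
            return i
    return None


# Mouse user: one movement, one click.
mouse_session = [("mousemove", "header"), ("click", "sports-link")]

# Keyboard or screen reader user: tabs through every link on the way.
keyboard_session = [
    ("tab", "logo"),
    ("tab", "home-link"),
    ("tab", "news-link"),
    ("tab", "sports-link"),
    ("enter", "sports-link"),
]

print(steps_to_target(mouse_session, "sports-link"))     # 2
print(steps_to_target(keyboard_session, "sports-link"))  # 5
```

Both users reach the same point B, but the second path is more than twice as long, which is the kind of difference such a system would first need to identify.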
And that’s where universal design comes into play. Because it’s the idea that you shouldn’t create one specific solution for each specific group of people based on their abilities. You should create one that tries to bring everybody to the same level, right? One that allows everyone to use it the same way. And the classic example is the supermarket automatic door, right? When you are entering, the sensor doesn’t know if you’re carrying a baby, if you’re using a wheelchair, if you’re using a 21:53 [?] white cane, if you’re blind, if you’re walking. You just enter, the sensor identifies you, and the doors open. That’s it. That’s the classic example we usually give when we talk about universal design.
But if we’re now talking about computers, we need to think about the modalities and how people might, I’d say, decode that information, right? So we can think, “Okay, if I’m providing this information in a textual way, is it well structured to the point that a screen reader can access it, and decode it for people that cannot see?” And also, sign language: is there a way to convert that textual information in English or in Portuguese, for instance, to the sign language of that region?
And when you start to think about all of these possibilities, then - I really got interested in doing something around that. And after researching even more, I also changed a bit the way I was looking at this problem. Because at the beginning, I was like everybody: too focused on disabilities, on types of disabilities. But if you start paying attention to the world around you, you see that it’s not so much about the disability. Because if you think about one specific type of disability, let’s say blindness, well, I need to use glasses, because I cannot see at a certain distance. And we see lots of technologies, and lots of research projects, dealing with one type of accessibility, or one type of disability, and in a discrete 24:05 [?] way: either you have it or you don’t have it.
But the real world’s not this way. When we get older, we’ll have multiple types of disabilities or limitations, and these are on a continuum, from mild to severe, or something. And that is just to talk about this 24:32 [?] versus continuum way of looking at different abilities. But there’s also the context of use. Because if we are, let’s say, listening to music, and then our cell phone rings, my hearing capabilities are not available to that channel, to that modality. So accessibility goes beyond the disability. It’s more related to the context of use, right? Whether, in that specific situation, I can use all my capabilities. Because when you start talking about accessibility, I’ve heard this a couple of times: “But, oh, you’re talking only about 5% of the population, or even 10%, 12%.”
And when we talk about universal design, now we are talking about everybody. Because it is situational. It’s not something that either you have or you don’t have. It depends on the situation. So if you see a ramp on the sidewalk and you don’t use a wheelchair, you might think it’s not for you. But if you’re pushing your baby in a stroller, then it’s for you as well. If you’re pushing a supermarket cart or something, you can use it. So it’s a way of changing your mindset to see all of these possibilities. And I try to apply that in this specific area of capturing data, and trying to see, “Okay, what are these possibilities?” There are the multiple flows I mentioned before. We have groups of people that go from A to B really quickly, in a few steps. But there’s this other group that goes this challenging way, with lots of steps, to reach the same point B. What can we do? And to summarize -
Len: Yeah.
Vagner: My project was about getting this data, and trying to identify these problems or these barriers. And then, by applying rules, we could make small adjustments to the UI and identify if that improved things or not. So basically, we cluster groups of people based on the way they interact, not by disabilities. Because, for instance, I don’t use a mouse, but I use the keyboard a lot. And assistive technology that benefits blind users will also benefit people that use the keyboard a lot. That’s a way of interacting, and 27:16 [?] shortcuts, and so on.
So that was the first thing that I brought: analyzing the data based on interaction, not on capabilities, right? And then there are multiple possibilities: paths that are faster, others with loops, and some different paths. How can I improve those UIs to help these people get to the same point in fewer steps?
And then, the tool basically had a few rules, and this way of creating these adjustments. Based on these clusters, first the system identifies which adjustments can be applied to each cluster. Then it divides each cluster into an experimental group and a control group. And then it applies the adjustment and analyzes, in the next visits, if and how that path was improved.
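The split into experimental and control groups can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual tool: the clustering step is taken as already done, the random half-and-half split and the comparison of mean step counts are choices made for the example, and `split_cluster` and `adjustment_helped` are hypothetical names.

```python
import random
from statistics import mean


def split_cluster(user_ids, seed=0):
    """Randomly divide one cluster into (experimental, control) halves."""
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def adjustment_helped(steps_next_visit, experimental, control):
    """True if the adjusted group needed fewer steps on average next visit."""
    exp_mean = mean(steps_next_visit[u] for u in experimental)
    ctl_mean = mean(steps_next_visit[u] for u in control)
    return exp_mean < ctl_mean


# One cluster of users who all reached the target by a long path.
cluster = ["u1", "u2", "u3", "u4", "u5", "u6"]
experimental, control = split_cluster(cluster)

# Made-up step counts observed on the following visit:
# users who saw the adjusted UI took 4 steps, the others took 7.
steps_next_visit = {u: 4 for u in experimental}
steps_next_visit.update({u: 7 for u in control})

print(adjustment_helped(steps_next_visit, experimental, control))
```

A real evaluation would need more than a comparison of means, for example a significance test and enough visits per group, before keeping or discarding an adjustment.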
So that’s the whole story behind the system. It tries to do one simple thing, which is to get the number of actions as short as possible for everybody. But there’s this whole tooling aspect behind it. That involves capturing data in a lightweight manner and transferring it somewhere else, then identifying these multiple paths, how people are interacting with the UI, and clustering these kinds of datasets.
And then applying this: identifying properly which adjustments should be applied to this, this and that. The adjustments are supposed to be small improvements, or changes. We only know if things improved or not in the subsequent visits. But imagine that you have a small link on a certain page, let’s say a log out link. Some people can easily move the mouse over it, click, and log out. So we have this short set of actions. But we also have people that try to click on the link, and the link is so small for that specific group that the mouse goes out of the element, and they click outside. And then they hover over the link again, and click. So we have an additional click. We have a click on a non-clickable area.
So this is rich information we can use. And for instance, one adjustment for this specific situation could be to try increasing the size of this link by 10%. And on the next visit, we analyze the whole set. Okay, for this next visit, did it improve, or not? If not, we can increase it again. Or you can just forget that specific rule, because it’s not working. But the whole idea was to apply these small adjustments, and see if they improved the UI or not.
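The "increase by 10%, check the next visit, then increase again or forget the rule" loop can be written down as a small decision function. The Python below is a sketch only: the 10% step comes from the example above, while the `next_scale` name, the cap at 1.5 times the original size, and the choice to revert to the original size when giving up are assumptions added for illustration.

```python
def next_scale(current_scale, steps_before, steps_after, step=0.10, max_scale=1.5):
    """Decide the next size multiplier for an element after one round of visits.

    Returns (new_scale, keep_rule).
    - If the path got shorter, keep the current size and keep the rule.
    - Otherwise grow by `step`; if that would exceed `max_scale` (an assumed
      cap), revert to the original size and drop the rule.
    """
    if steps_after < steps_before:
        return current_scale, True
    grown = round(current_scale * (1 + step), 4)
    if grown > max_scale:
        return 1.0, False
    return grown, True


# Link at original size, no improvement yet: try it 10% larger.
print(next_scale(1.0, steps_before=4, steps_after=4))   # (1.1, True)

# At 1.1x the extra click disappeared: keep this size.
print(next_scale(1.1, steps_before=4, steps_after=2))   # (1.1, True)

# Already near the cap and still no improvement: forget the rule.
print(next_scale(1.45, steps_before=4, steps_after=4))  # (1.0, False)
```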
Len: Thanks very much for that. It’s interesting, you captured a lot of the range of things that people who are involved in human-computer interaction have to get involved with. Which includes really thinking through things at a deep level, about what human-computer interaction is, right? And so, I loved when you mentioned it’s situational. You reminded me of a very specific image actually. I interviewed a Leanpub author, and on her laptop she’d been using our in-browser editor writing mode, right? So you write in the browser. But she’d recently had a baby. And so, now she was one-handed.
And when you mentioned the sort of ramp, I mean a lot of people think, “Oh, that’s for -“ Sort of like, you did invoke that sometimes people get kind of sarcastic or even negative about it. They’ll be like, “Why are you designing this whole thing for 5% of the population?” And it’s like, “Well there’s, what if someone’s got a cart of groceries? What if they -? That they’re pulling behind them. What if they’ve had a baby, and now they’ve got a stroller or something like that?” And there’s all kinds of other things. What if they’ve kind of been injured, and steps are suddenly difficult? What if you break your hand? All of a sudden, one-handed kind of things. And maybe it’s your mouse hand that you’ve lost temporarily? Things like that.
And so, being able to design for all of these things is much more complex, and, as you say, more universal, than just designing for specific types of people who have coherently self-contained problems. And you need to design something for every case, if you can.
But that reminds me of something, I was looking at, I think it’s a relatively recent paper that you worked on, about computer anxiety. And so it’s interesting. You talk about what happens in-between the clicks, all that kind of stuff. But there’s actually, even before, there’s this human-computer interaction problem that happens even before people get to the computer. Which is, they’re anxious about interacting with computers.
And I remember once having a very striking example of that. I was, I’d moved to a new place and I was at the post office, and I was doing something. And there was a guy in front of me, who was my age, who was getting a post office box. And the employee at the post office asked him for his email address. And he said, “Oh I never learned computers.” And it was just so striking. Because, like, to me, I take it for granted that I use computers all the time for everything. They don’t usually frighten me too much. I’m usually not too intimidated by them. But there are a lot. And like, not to be too presumptuous about this guy.
But like, there are a lot of people for whom the computer is just a big scary thing. I remember a BBC tech writer, years ago, wrote about how he didn’t dare tell his mom that the remote control for her television was actually a computer. Because if he told her, she’d stop using it.
But just generally speaking on computer anxiety, can you talk a little bit about what that, the sort of like, I guess, sort of scholarly or kind of scientific approach to what that really is, and what you can do to address it?
Vagner: Sure. Well, I got interested in computer anxiety when I started trying to map all the barriers people face. And specifically, there’s this richer context related to the interaction. As I mentioned, when we talk about barriers, most of the time we are focused on the interaction itself: how people are interacting, why people are interacting. And when I started reading about computer anxiety, I got interested, because it’s a very 34:41 [?] before the actual interaction. So people feel that they’re going to do something wrong, right? They feel that, okay, it’s not going to happen.
And this whole idea that, “It’s not going to work, it’s going to break. I’ll look stupid using that.” This brings such a heavy burden to these people that they start the interaction with lots of preconceived ideas, right? And, well, the literature says that usually people that feel this way didn’t have the opportunity to interact with computers in a meaningful way. I mean, usually they had a bad first experience. There was, let’s say, a shared computer, and they were afraid of breaking or deleting something, of making mistakes, or of looking stupid when using it.
And usually the differentiation is something that pops up when we are discussing this. Because usually people say, “Oh, my grandson doesn’t care, he does a lot of stuff.” And usually kids do, because they aren’t afraid of breaking stuff, right? And this is the first barrier that we usually have. Like, “Okay, am I doing this the right way? Am I going to break something?” And this fear of also looking stupid in front of others is something that ends up holding people back. Because they are afraid of showing that they don’t know. And this just makes things worse and worse, right?
And we also try to study the role of mobile phones in that. Because if you have your smartphone, your mobile phone, you are comfortable doing things, and you’re the only person seeing what you’re doing. So that is an interesting aspect of the role of smartphones for people with computer anxiety: they can explore stuff. And that’s something that we saw in the literature as well. And in this specific research - and I’m saying “we,” because it is part of a PhD project I’m supervising at the Federal University of ABC in Brazil, where I’m a collaborator.
And the idea is to use the dataset I mentioned before, the tens of events that we can capture, and see if and how we can simplify the user interface for this population. As time passes, with technical education or with experience, we can reduce this kind of anxiety. This is also present in the literature. But what if we can bring meaningful first experiences to this group of people, and make the user interface as simple as possible for them? We can easily see that lots of the applications and websites and systems we have nowadays have so many distractors, like advertisements, things moving around. And these kinds of things end up calling attention to things that are not related to the main task, right? Or to the things that people want to do.
And we are trying to use this dataset to identify, first, what is the task people are performing? And second, is there anything that I can remove from the UI right now for this specific task, for this specific person? And we are trying to do this automatically, using this kind of data, and trying to have multiple ways of identifying these distractors.
And we started using an eye tracker to see how people’s gaze moves around the UI for a specific, reduced set of tasks. But we also thought that, well, having an eye tracker on every single, let’s say, personal computer is not feasible for some populations. For instance, in Brazil.
And then, what if we could use a more common dataset, like mouse movements, as a proxy for gaze interaction? So we are trying to identify these distractors, and this is all part of the whole challenge of, first, identifying the task, and then the distractors, and then simplifying when it makes sense.
But I think that said, there’s this whole pipeline of identifying first people that have these higher levels of computer anxiety, then identifying the task, and then identifying what are the distractors for them. And then trying to simplify the UI for that specific person performing that specific task. So -
Len: Yeah, that’s so fascinating. I mean, in the end of the whole - but we will actually talk a little bit probably about data, and how you actually gather it, right? Because you said you’ve got to have - when we’re talking about like eye movements and stuff like that, well there’s got to be something looking at it. And there’s probably, there’s hopefully someone who opted in to agree to have their eye movements tracked, and stuff like that. Which is a super sensitive thing in all this kind of analysis.
But I just wanted to point out that like in addition to computer anxiety, and I’m now going to just make up my own term. There’s computer hostility. And that’s sort of like getting a little bit - kind of going away now. But it reminds me of a very specific thing that happened in kind of the corporate world. Which was that, one of the reasons, if anyone remembers, the old Blackberries, which were these mobile phones. But mobile computing devices that had a little physical keypad on them.
And one of the reasons they became just so explosively popular in the business world, was that, there’s a certain kind of business executive culture, in which typing on a keyboard is seen as a lower status activity. That’s the kind of thing a worker does. Not the kind of thing a boss does. And the idea that you even knew how to kind of type. For people - I’m going to be generalizing negatively. But for people of a certain generation, it was actually kind of like a marker of lower status. And so, when finally people didn’t have to sit at a keyboard and type like a typist, and they could actually just sort of do things with their thumbs, that type of interaction unlocked all this amazing technology that these executives could finally use in a kind of culturally acceptable way.
And it’s amazing. So with computer hostility, there’s basically a kind of whole political dimension to this kind of stuff. And that can come up with sort of status symbols, and things like that. But again, in particular, the kind of work that people do on human-computer interaction can actually unlock like powers for people. And in many ways, it’s not always as sort of technical as it were, as sort of gathering data. There can be these high level kind of things that happen. Which make it such a fascinating field.
But just moving on on that topic. So your book is called, Interaction Data Analytics. So let’s sort of get into the heart of this here. So before we start talking specifically about how you get the data, I was wondering if you could talk about what interaction logs are?
Vagner: I mentioned this whole idea that people have about focusing too much on clicks, right? And then after researching this subject, I started seeing that there are so many possibilities in, let’s say, a log, that you can have more information. And when I started researching tools and different analytics platforms, back in the 90s and early 2000s, it was all server logs. At that time, that was the data source people used in research and so on. Because it is an actual product of web servers working. So we have everything there.
But when you start trying to get more information about how people actually use that, then this kind of data source is not that useful. Because, again, the page views are the downloaded pages, or the requests, more specifically. So we don’t know what happened between these requests. We can have requests that are triggered automatically. Nowadays we have scripts that trigger lots of requests, right?
And then I started looking into these possibilities of kind of using all available events. Combining them, creating new events. And then thinking, “What if I have like an additional sensor? What if I had like a sensor that could detect how much pressure I put on the screen? What if I had like a sensor that could detect my skin conductance, to identify events of frustration? What if we had like an eye tracker that could detect my pupil dilation, or something, right?”
Then we have lots of things that we can capture. We have the famous click, that it’s kind of an explicit way of doing something, right? And but we have lots of other types of information that we can capture, and have really interesting insights from that. So we have the way that you move the mouse, the way - and recently, what we saw that is interesting also, is the interval that you stop over a link, before clicking on it. It’s an interesting predictor for some of the machine learning models we are creating. Or any sort of interesting attribute, to be more technically - to be technically correct.
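To make the hover interval concrete, here is a minimal Python sketch. It assumes a hypothetical log of timestamped `mouseover`, `mouseout`, and `click` events; the `Event` fields and the `hover_before_click` helper are illustrative names, not taken from the book or from any particular analytics tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    t_ms: int    # timestamp in milliseconds (assumed unit)
    kind: str    # "mouseover", "mouseout", or "click"
    target: str  # hypothetical identifier of the UI element


def hover_before_click(events):
    """Return (target, hover_ms) for each click preceded by a mouseover
    on the same element, where hover_ms is the time the pointer rested
    on the element before the click."""
    entered = {}  # target -> time the pointer most recently entered it
    result = []
    for ev in sorted(events, key=lambda e: e.t_ms):
        if ev.kind == "mouseover":
            entered[ev.target] = ev.t_ms
        elif ev.kind == "mouseout":
            entered.pop(ev.target, None)
        elif ev.kind == "click" and ev.target in entered:
            result.append((ev.target, ev.t_ms - entered[ev.target]))
    return result


# Made-up example log: the pointer passes over one link, then rests on
# another for 850 ms before clicking it.
log = [
    Event(0, "mouseover", "link-home"),
    Event(120, "mouseout", "link-home"),
    Event(300, "mouseover", "link-buy"),
    Event(1150, "click", "link-buy"),
]
features = hover_before_click(log)
print(features)
```

Each pair could then be fed into a model as one attribute among others, such as cursor speed or path length, which would be computed separately.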
So after looking around at this range of possibilities, I saw that, okay, it’s beyond only looking at certain kinds of information. And when I tried to look around for a term, I didn’t find a term that could encompass everything. And then it’s okay, logs, we use logs, server logs from multiple systems. But to have this specific analysis of interaction, then we should have something that could cover this whole variety of data. And then we started discussing, okay, any kind of data that we can get, that is a result of the interaction, we are calling “interaction data.” And if we capture a bunch of these, we are calling it an “interaction log.”
Because, again, after interacting with systems, we - and we are trying to cover really the whole variety. Because if we talk about Internet of Things for instance, we have information of people entering and leaving, let’s say, a museum. And this way of looking, the technology, actually gave birth to lots of patents I have in IBM. Because it is a way of analyzing how people interact with technologies, and try to see, “Okay, what if I could, I -?” Let’s say, collected of how people interact with an environment. And a newcomer comes into this environment, and they don’t know how to interact with people and with the environment.
What if I use all the possibilities and how people usually interact, to inform this newcomer to this environment? So it would be like a way of presenting the social protocols around that specific environment for a newcomer. So we can think about a museum, we can think about the situation like in an airport, or in a smart city or in transit, right? If you’re driving here in New York, or in São Paulo, you’ll see different behaviors. Some of them are acceptable. Some of them are not. So, and we could use like logs from cars. Like how do you signal when you are changing lanes? How you’re behaving using lights. When you dim, when you keep your - it’s this whole range of possibilities of getting this specific piece of data of how people interact with systems.
Len: Yeah, that’s super interesting. One thing you touched on there was - You said “explicit interaction,” and then the - another, you introduced these terms at the beginning of the book, and then there’s this term “implicit interaction.” “Explicit interaction” is kind of what you consciously choose to do more or less, right? Like, “I’m going to move the mouse over there, and I’m going to click on that button, and I’m going to buy that, whatever, or go to that link.” And then implicit interaction is like, what you were talking about. Like involuntary eye movements, or, and kind of skin conduction, or something like that. Where you can sort of sense people’s anxieties or - even imagine like sort of heart rates, or something like that, right?
Vagner: Yeah.
Len: Which is really interesting. And it’s interesting too that you mention, sort of like, introducing people to things. That reminded me. I think the opening of Super Mario Bros. is sort of famous in the gaming world for like - in the first minute, you kind of have to survive. The first minute you kind of have to learn everything you’re going to need to know. And they very consciously designed the beginning of the game that way. But when it comes to museums, for example, when you walk in, and sort of know - walking in and even knowing what to do or where to go, is actually part of the, hopefully, part of the design of what’s happened there. And so these kinds of interaction logs are sort of useful. Not only for websites -
Vagner: Yeah.
Len: But for all kinds of different things, and the same kinds of principles apply. And, yeah, you mentioned patents. So we haven’t really talked too much about, at all, about your work for IBM. But if - I mean, people, interested, what a “Master Inventor” is, and sort of just, in general, what kind of work do you do for -? Have you been doing for them for the last 10 years?
Vagner: Yeah, well, IBM is known as one of the companies with the most patents at the USPTO. IBM was the leader for 29 years in a row in this ranking of companies patenting technologies. And it’s part of the culture in the company. And when I started learning about that, I got super interested. And I had with a few colleagues and friends, like some of the most incredible brainstormings I had in my whole life. Like thinking about new technologies, how to solve problems with people from different backgrounds, bringing different contributions to the table. And this is basically part of our daily work at IBM Research. And also in other business units, but I think, specifically inside IBM Research, it’s really present. So to say, at least based on my experience.
And, well, as soon as I started this whole journey on patenting and understanding what a patent is, and how to create a patent, and how a patent differs from a paper and how, and all this stuff. And I invited a mentor. I want people, that - a friend who was already a Master Inventor at the time. And he taught me a lot about the process about doing patents and about creating, preparing the documentation. And a few years later, I was nominated as Master Inventor. Master Inventor is an internal title at IBM. And it is a temporary one, because it lasts for three years. So basically, you need to keep up -
Len: Okay.
Vagner: Of inventing, to maintain the title, so to say. And that’s basically what it means. But also as part of this title, the Master Inventor title, is that you’re supposed to help people that want to start patenting or start exploring these possibilities. So as a Master Inventor, we are encouraged to mentor people interested on patenting, and also interested in the whole process of innovation, and how it connects with - against business, and how it connects with the role that our labs have inside IBM Research. So it’s interesting to connect with this way of impacting, the, IBM’s patent portfolio. And, well, I really liked this part of my job. Because, well, it’s super challenging, and super - I say that -
When we have a patent granted, it’s really interesting. Because we have a whole process inside IBM, where we have boards that assess the idea. And we have this approval internally, and the whole document preparation. And then it’s submitted to the USPTO, which is the patent and trademark office here in the US. And then there’s another assessment at the USPTO. So when we have an idea, something you’ve created with your colleagues and friends, it passes through this whole workflow. And at the end, there’s an expert on patents that says, “Okay, this is really new. This has something new, and it is worth a patent and it’s worth being granted.” So it’s really - how should I say? It’s reinvigorating. You see that something that you came up with passed through this whole workflow, and was assessed as valuable stuff, right? And it’s new. And that’s really, really cool when it happens.
Len: Yeah, that’s really, that sounds, it sounds like such an interesting job. I mean, you get to work with all these really smart people. You get to sort of write papers, and work on cutting edge, not just technologies, but ideas and things like that. And for anyone listening who hasn’t been through it before, I’ve been sort of, in a way, I was part of a project that ended up getting a US patent. And in the end, you sort of submit your ideas to some people at the US Patent and Trademark Office, who decide whether it really is a new and unique thing. And that’s got to be a kind of interesting job, of course, as well. But to get that validation that you really have done something new in the world, is, it must be really exciting.
I did mention earlier that we would, I would ask about actually how you go about in sort of, let’s say your ordinary human computer interaction experimentation, how do you actually get data? So do you, is it, do you get a group of volunteers and sort of set them up in like a lab in front of a bunch of computers, where they’re, they’ve got their heart monitor or something on their finger, or something like that to - how does that actually work?
Vagner: Well, it can be done in different ways. We can have this local user session, as you mentioned. Then we have like this whole process that we must do in research. Like going to an IRB board, to have the project assessed. And then we need to present a consent form, and make sure that people understand that they are participating in a study. And specifically for this one that you mentioned about computer anxiety, that’s how it was done. For this one around computer anxiety, we partnered with a community center in downtown São Paulo that supports older adults in that area. So they offer multiple services to older adults. And we went into this partnership with them. And by we, I mean my PhD student and this center.
And we have to communicate to the participants, and make sure that they understand what is the study about, and what are the data that we are collecting. And, well, as the - we do by the book. Like explain things, and if they are, they can interrupt their participation any time. So we go all the - we follow the, usually the - we follow the standard protocols for doing that, right? Just for local studies. We can also do like remote studies. And usually when we talk about scale and we want to know more about information, then usually we inform this, and require consent for capturing certain types of data. Nowadays we have regulations for cookies, for instance, and how to explain that, what kind of information, what’s your cookie policy? What are the goals of using that information?
And for the research I mentioned before related to interaction data analytics, usually we present that, we inform also the types of data we are capturing, how we are going to use that. And this is done only under a user’s consent. So that’s - I’d say that is the big difference from one to the other context. And the hard thing is that when you do - when we do local studies, usually we have, we’re not able to have like hundreds or thousands of sessions, due to space, time, money limitations in research, in general. And - but for these local studies, we ended up having really rich information about the interaction, and about how people felt during that specific interaction. We got opinion, we got - we see how they change their facial expressions while interacting with the system we are studying.
But on the other hand, when we go to scale, we have more data of how multiple people used that specific system we are evaluating, but then we miss something of that detail that we have when we are doing in-person sessions. So it’s a balance that usually we try to have, and depends on the system we are creating. It depends also, the way that you recruit participants, voluntary, sometimes. You need to have a specific group of participants with certain characteristics, to represent somehow the population that you want to - that system to be meaningful and easy to use.
So it’s a tricky thing to do, and to design this kind of experiment. But yeah, in some - this is the difference. In one situation, we need to explain, we follow the protocols, and we have this kind of partnership and multiple ways of recruiting participants, and also doing everything under the consent. And when we do like remote, at scale studies, we need to inform people that we are capturing certain types of data, what are we going to do with that specific type of data? And how they can like opt out, for instance, the study. We need to get a contact person for that specific study, so on and so forth.
Len: Speaking of reaching the right population, I did mention your book, Interaction Data Analytics, and I was just wondering if you could talk a little bit about - and we’ve covered a lot of the topics that are in there, but, obviously - but I was wondering if you could talk just a little bit about who the book is for?
Vagner: Yeah, I mentioned about this kind of gap that I saw between using state of the art, let’s say, algorithms and techniques of data analysis. And sometimes, we, and how to design experiments to capture certain nuances and certain details of how people interact with the systems. So first I thought about kind of summarizing, and how I could help people in the same, with the same interest. And also try to get some structure. Because, as I mentioned, I’m researching about this subject for some time.
And when I interact with people that are interested in this subject, then they ask for references, and I usually send a couple of papers that I did write with colleagues, or other papers I used as reference. And usually they have to kind of connect everything. I did teach a couple of courses around interaction data analysis. And I thought of - after a couple of these courses, I thought about having this collection or this thing from start to finish, talking about how to think about universal design, as I mentioned before. How to think - how to plan the data capture, and how to collect the data, how to analyze, and visualize.
And these steps started this idea of, “Okay, what if I write a book about that, like combining everything I did in these years, and leveraging the structure of the courses I already taught?” So it was like this idea of having something that could tell this story for people interested in this specific aspect, like bridging, let’s say, UX to data science. So how can we do that connection, right? How can we think about user tests to capture interesting data. And get that data, and transform somehow in a complex structure, like a graph, or a table with certain metrics I’m interested in. And then visualizing thousands of people interacting with that specific technology. Or even creating a machine learning model, to do something interesting around that specific type of data.
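As a rough illustration of turning captured events into a graph, the Python sketch below counts how often one event directly follows another across sessions. The session identifiers, the event names, and the `transition_counts` helper are all hypothetical, and this is only one simple way to build such a structure, not necessarily the one used in the book.

```python
from collections import Counter


def transition_counts(sessions):
    """Count directed edges (event -> next event) across all sessions.

    `sessions` maps a session id to its ordered list of event names.
    The result can be read as a weighted directed graph.
    """
    edges = Counter()
    for events in sessions.values():
        # Pair each event with the one that immediately follows it.
        for src, dst in zip(events, events[1:]):
            edges[(src, dst)] += 1
    return edges


# Made-up sessions: two users working toward the same submit button.
sessions = {
    "s1": ["load", "mousemove", "click:search", "keypress", "click:submit"],
    "s2": ["load", "mousemove", "click:search", "click:submit"],
}
graph = transition_counts(sessions)
for (src, dst), n in sorted(graph.items()):
    print(f"{src} -> {dst}: {n}")
```

With thousands of sessions, the edge weights show which paths through the interface are common and which are rare, and the same counts can be exported as a table of metrics.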
So I thought as being something that I could handle, like or suggest to someone, asking me, “Okay, I like that, I like this topic, what do you recommend?” And then I thought about the book as being this thing I would recommend. Okay, this is not the whole story, but it’s a good start for you to have a sense of what can be done, and how you can explore this kind of data with different tools and applications. So, that in short, is the whole motivation behind the book.
Len: Yeah, well thanks very much for that. It’s a great project, and it’s a great looking project. And I just wanted to talk to you a little bit about a couple of details about it. So we save for the last part of the interview, if the guest is an author, a little about their process. And the book is currently marked as 25% complete. So you’re writing it in progress.
Vagner: Yeah.
Len: And I was just wondering if you could talk a little bit about what your approach is to that way of publishing? Are you, do you have a plan? Like, “I’m going to work on it for five hours a week, and publish one chapter a month,” or something like that?
Vagner: Yeah, well, the first time I’ve heard about Leanpub was in a conference. And it was like a few colleagues. They were just transitioning from a different publisher to Leanpub, and they recommended it. And then I started reading about the whole proposal and different way of publishing. And the first - I have two books. This one you mentioned is Interaction Data Analytics that is in progress. And it’s 25%, because based on the material I have in mind and the courses I mentioned I taught about it, I think it’s, it will be a book, around 200 pages. So we have 50 pages.
Len: Okay.
Vagner: So that’s how I’m planning that.
Len: Yeah.
Vagner: And, well, I have some material. The material I have nowadays is in Portuguese, so I’m trying to get some of the papers and results I have obtained recently in this more fluid way along the book. But back to the point on publishing on Leanpub, is that, I have this project called WARAU in the past, that was about helping people to make accessible and usable code.
Len: How do you spell that, just so we’re clear?
Vagner: Oh, it’s an acronym. So it’s W-A-R-A-U.
Len: Okay, thank you.
Vagner: So it’s - And it’s available at leanpub.com/warau which is -
Len: Oh, I’ve got it right here. That’s what, that’s where it comes from.
Vagner: Yeah.
Len: Okay, got it.
Vagner: And then this was about a project, where there was this website hosted at Unicamp. At that time, I was doing my transition from the master’s degree to the PhD degree. And then there was this website. It was the reference website with lots of code, lots of examples for people to do things on a daily basis. And one thing that we identified at the time, is that when people start thinking about accessibility or usability, they sometimes leave it to the end of the project and that never comes up, right? We are always late coding stuff. So we thought about kind of digesting and creating lots of examples on how to, for instance, how to make accessible tables, accessible structure for multiple reasons. And then we had all of these examples on the website. And then, well, it was complicated to maintain the website, due to some security issues.
Then I thought about, “Okay, I’ve heard about Leanpub. What if we just get all the examples of the codes and the theories we have -?” It was not in a linear structure in the website. But then after discussing with the co-authors, we had, at that time, kind of suggested one sequence of reading that. Like going from some aspects to other more complicated ones. And at that time, we also identified the need for a reference in Portuguese for this specific subject. Then I thought, “Okay, if I would try just converting the website as a book,” and then the first book, we published on Leanpub. And then, with this one, I had this, these materials about the courses I taught, and the papers I’d been publishing. And then I thought, “Okay, probably I would not have time to put my head on this project for like the period I need. But I have the whole plan, I need to get it produced.” And I was afraid of losing the time of doing that.
Len: Right.
Vagner: So because, as I mentioned, I see value on that, I see that there’s something happening in this specific area. And I was like, I felt that I need to do something when I start seeing some positions, some new positions like on LinkedIn, or on different other platforms, talking about this connection. And so, okay, it’s not only - so other people are also seeing value on this connection between UX and data science. So I should do it, and also do it in a way that I can make the best of the feedback that I can have, and make it as a more iterative process than something that it’s a one time thing.
So that’s how I got to know Leanpub. And then I had this first experience just transitioning from this website, this reference website we had, to Leanpub. And then with this idea of this book, I had already an idea for the cover, I had an idea of the whole structure of the book, and there was something missing. Then I said, “Okay, lots of things to do. Lots of patents, lots of papers. So I need to get something done so that I can iterate over that.” So that was the idea behind that.
And, again, I have the whole structure, the whole idea of the book. And basically it’s to convert some of the courses I taught in a more fluid way, with references, with lots of examples. Trying to structure it in a more formal way with certain definitions. Trying also to invite people to think more about the definitions, and things we have. And trying to make people more aware of these definitions, and how the literature already treats that. So they shouldn’t get too worried about hypes, or influenced by hypes on certain terms. So that’s the idea behind publishing it in parts.
So, and to answer your question, you mentioned about plans. I don’t have a specific plan of concluding it, or a specific deadline. But one thing that I want to do is like to have like these chunks. Like 50 pages, for instance. So if I had, say - I’m planning on having like, more, three chunks of 50 pages added to this book throughout this year and the next one. So this is the - I’d say the way that I see the project right now. But again, that’s my - it might change. But I’m planning on adding these chunks of 50 pages, and improving as well as possible and making the best of the feedback that a community can also provide.
And well, I - and it’s funny, because, before starting this book, I had the idea. And after piloting the short course I mentioned, one person said, “Okay, you should write a book about it.” And I said, “I’m doing this, I’m doing that, I just don’t have the time to put everything together, but I’m doing this.” And it’s on my backlog, so. And now I’m just trying to bring this from the backlog to the in-progress part of my Kanban and to have iterations over that.
Len: Yeah, thanks very much for sharing that great story. That’s so interesting to hear about both of the projects, and how sort of Leanpub turned out to be useful. I mean, in particular, of course feedback and stuff like that, getting feedback while you’re writing. But also, it’s kind of just getting something out there, is sort of a - one of the main reasons we exist, right? Is sort of like, so that you don’t have to have a completed book. You don’t have to sort of face the daunting task of writing a whole book, before you get something out there, and actually start playing a role in the conversation. If it’s, just, people are talking about these ideas, and things like that. Get it out there, be playing a role, but in sort of book format, but before it’s finished. So that’s fantastic.
And the last question I always ask on the podcast, if the guest is a Leanpub author, is, if there was one magical feature we could build for you, or if there was one thing you really hate about Leanpub that we could fix for you, can you think of anything that you would ask us to do?
Vagner: Well, as researcher, I use Overleaf a lot for writing papers -
Len: Okay.
Vagner: In LaTeX. So some way of connecting Overleaf to Leanpub, I think that would be really interesting. Or maybe some way of integrating -? Well nowadays, what we are doing, is like, we are using Overleaf to write it. And then I generate the PDF, and then upload a PDF to Leanpub. And well, I like the idea of using LaTeX, because it generates PDFs with good accessibility level, due to the whole markup. So this is one aspect that we are interested also. And I think that’s it. Because, again, I’m using Overleaf because lots of materials I have, and the references, they are also from papers I’ve written and materials I have.
So, and I like the way that some documents look as well. Not only under the hood, like the accessibility and the whole structure. We are able to create really good looking documents. So that’s why I like using LaTeX. And, to answer your question, I think connecting maybe with Overleaf could be an interesting plus. But again, more for researchers and nerds like me, that like using Overleaf, instead of other word processors or -
Len: Yeah no, great. Thanks very much for that feedback. That’s really interesting and it’s definitely something I’ll let the team know about. Yeah, it’s just so - that’s one of the things I really like about doing these interviews, is finding like all the very specific things that people from different - I mean, like people from academic areas, like institutions might approach things one way. People from big companies might approach them another way. People who are just completely independent, might approach them another way. And, yeah, and I was actually very curious too. Because I saw you were using our upload writing mode, and then to see what tools you were using to produce this great looking book. So thanks very much for sharing that. And, yeah, thanks also very much for taking the time to be, out of your afternoon today, to be on the Frontmatter podcast.
Vagner: Thank you. Thanks for having me.
Len: Thanks.
Vagner: I really enjoyed it.
Len: Thanks.
And as always, thanks to all of you for listening to this episode of the Frontmatter podcast. If you like what you heard, please rate and review it wherever you found it, and if you’d like to be a Leanpub author, please visit our website at leanpub.com.
