AI Today Podcast #013: Avoiding Biases in AI Training Data – Interview with Tess Posner of AI4All

The basis for most modern AI systems is Machine Learning and Deep Learning. It goes without saying that in order for something to be “learning”, something has to teach it. Specifically, what makes these systems effective is a large quantity of good training data that is relevant to the area in which you’re trying to gather intelligence. However, what if there are implicit (or even explicit) biases in this training data? It follows that the AI system will inherit those same biases.

In this podcast, we interview Tess Posner, Executive Director of AI4ALL, an organization looking to address and resolve issues of bias in AI in all its forms. We go over what sorts of biases there are in AI data, how to address them, and the challenges in building AI systems that don’t have the same flaws as their human trainers.

Show Notes:

Episode Sponsors:

Fiverr is a marketplace for creative and digital freelance services. We use Fiverr for quite a few needs at Cognilytica including podcast editing, transcription, and more. Use promo code ‘AITODAY’ for 15% off your first purchase on Fiverr.com. Offer valid until December 31, 2018.

___________________________________________

A transcript of the podcast is available below:

Kathleen: Hello and welcome to the AI Today podcast. I’m your host, Kathleen Walch.

Ron: And I’m your host, Ronald Schmelzer. Our guest today is Tess Posner, who’s the executive director at AI4ALL. Hello, Tess.

Tess: Hi, thanks so much for having me on. It’s really great to be here with you both.

Kathleen: Yeah, thanks. We’re excited to interview you today. We’d like to get started by having you introduce yourself to our listeners and let them know what AI4ALL is all about.

Tess: Yeah, absolutely. I’m the executive director of a nonprofit organization called AI4ALL, and our mission is to increase diversity and inclusion in AI, or artificial intelligence.

Ron: Great. Well, actually, tell us a little bit more about that. What’s the scope of diversity and inclusion, and where do you see the current state of that within the market?

Tess: Yeah, absolutely. I mean, it’s no secret that AI is really going to be one of the most impactful and potentially disruptive technologies of our time; it’s really been called the driver of the fourth Industrial Revolution. However, what we’re seeing right now in AI is that the technologists that are building these incredible technology products and solutions, and the data that power them, are not representative of the wider society, which is very diverse and complex. So we might be familiar with some of the statistics around diversity generally that are getting a lot of attention right now. For example, only around 13% of AI and machine learning focused companies have female CEOs, and tenure-track engineering faculty are less than three percent African-American and less than four percent Hispanic. And so we’re seeing that those who will be affected by these technologies, who will be their users, drivers and innovators, are not represented among those building them today. So what AI4ALL is really trying to do is build a more diverse future pipeline of those working in AI education, AI development and research.

We start early with high school students and introduce them to basic technical concepts in AI, really sharing with them what AI can be used for and the broad implications and uses for the technology, and they get to work on a hands-on project using AI for good. Our programs also partner with some of the top AI labs at universities to really give these young people exposure to some of the most current research and trends in the industry, as well as faculty and other peers, role models and mentors that are working on this today.

Ron: That sounds good. Well, as a matter of fact, we saw you at the recent NYU Future Labs AI Summit and you gave a really awesome presentation there. For our listeners, this was the summit that was on October 30th and 31st, 2017. So if possible, can you give our listeners a two-minute recap or a short recap of what you talked about there at the conference and some of the challenges that you talked about?

Tess: Yeah, absolutely, and thank you so much. So I think, as we were just talking about, there’s really this lack of diversity in the AI field as well as in the data sets that are powering AI technology, solutions and algorithms. And so what I talked about was really, what are some of the impacts here? We know that diversity is good for business. More diverse teams tend to develop better, more robust or more creative solutions, and having a homogenous group is actually bad for business. There’s a lot of research that proves this. However, in AI specifically there’s another impact that we’re actually seeing today, other than just less profitable businesses and teams and less creativity, and this is really unconscious bias. We’re seeing that there are biases present in the data sets that are used to train AI algorithms, and these are being reflected, mirrored and sometimes even amplified in the technology that they’re creating.

So some examples that I touched on included a racially biased sentencing software called COMPAS, which ProPublica studied and released a study on last year, that actually predicts someone’s likelihood to commit a crime in the future. This software is being used today by courts in the US and it’s actually been shown to be racially biased against African Americans. We’ve also seen examples like translation software that displays gender bias because of gender bias that’s present in the existing data sets that were used to train the algorithm, and then also facial recognition software that doesn’t recognize non-white faces or expressions. So obviously this kind of software is affecting people’s daily lives and has life-altering consequences, and so when bias is getting embedded, it’s deeply problematic. And, you know, potentially it has even more far-reaching consequences as AI gets embedded into more tools that we use every day and use to make decisions for big things, like in the financial space, in the criminal justice space and otherwise.

Kathleen: Yeah. I know that AI bias has been something that we at Cognilytica have been looking at, and also, you know, companies as well, because it’s training data that starts to make the AI biased, and, you know, things continue on from there. So how is AI4ALL working to address this bias?

Tess: That is a great question. I’m glad you all are bringing attention to it. I think obviously there’s a lot that should be done to prevent bias, and steps that companies and others should take, including the transparency of the data set, how you analyze the data set, how you do pre-launch testing, and how you monitor these things after they have been developed. But we really believe that you need to get to the root and make the field more diverse from the start. That’s what AI4ALL is specifically focused on. But in order to do that, we really have to understand why it’s not diverse in the first place, and that there are barriers for people who are currently underrepresented in AI and computer science more generally. Some of these include actual discrimination, whether it’s in the hiring process or at the workplace, and lack of role models, because there’s homogeneity in the field. If you’re a young person looking at different careers, it’s hard to find people who look like you, who are in this field, and that you can be inspired by or learn from. Research also shows that lack of early technical exposure prevents you from going into the field in the future, and so really starting in high school is important to get that early exposure. And then there are not a lot of supports to counteract these barriers. As an individual it could be really hard to kind of break through; although we see many examples of people doing that, it can be really challenging. And so AI4ALL is trying to build in the early technical exposure, connecting young people to mentors and really building these diverse communities that can help be a vaccine against bias and discrimination in the field in the future, and really help people succeed, not just to get into it but to successfully move up in their careers and really thrive in AI and computer science.

Kathleen: Okay. So AI4ALL basically is trying to take a diversity approach to AI so that the people giving the training data will not be homogeneous, and therefore we’re hoping that the training data will not be homogeneous because it won’t be a certain type of person doing all the training.

Tess: Yeah, exactly. I mean, I think that it’s a lot easier for a more diverse team that’s representative of those that would be using the product to pick up on things that might be missed by another group. For example, we saw that facial recognition software that wasn’t trained on Asian faces didn’t recognize when Asians were blinking, and so, when it was on a camera, it wouldn’t actually take a picture. If it had been trained by a more diverse group of individuals, we wouldn’t see that happening. And so we really believe that yes, having more individuals that represent the society working on these things will prevent that, as well as just generally diversity helps create better solutions and more creative products and is better for business. So I think all of those are reasons why we believe it’s specifically important in AI.

Ron: Well, that’s cool. I know one of the other challenges in the AI field is just a lack of talent and a lack of folks who have the expertise to build these training models and to apply AI or develop AI systems. And so from a different perspective, not just addressing the areas of bias in AI and machine learning data, what do you see as some of the challenges in increasing the skilled AI workforce, and how can AI4ALL help address these challenges, especially from this diversity and inclusion perspective?

Tess: Yeah. It’s a really great question, because it’s not just individuals missing out, it’s companies missing out on the talents of individuals or talent pools of people. I think access is a really big issue. As we see that computer science and tech skills generally are becoming the new basic skills really needed to participate in the economy, we also see at the same time that only forty percent of US schools teach computer science, which is astounding, and a third of US states don’t even count computer science course credits towards graduation requirements…

Kathleen: Now is this college or high school?

Tess: This is high school.

Ron: Oh okay.

Tess: So we’re seeing a real lag in terms of what young people have access to in order to develop these skills, and as we talked about earlier, this early exposure is critical for people to get into the field in the future. So if they’re already missing out in high school, it’s less likely that they’re going to go into it in college and beyond. And so I think that at AI4ALL we’re really trying to democratize access to AI and technical education early, and also develop these communities of students to ensure that we are developing a diverse group of future leaders in the field, as well as ensure that companies don’t miss out on these amazing talent pools. We’ve really heard amazing results from our students. Just to share, one of the students remarked to me after our course this summer, “I used to think I wasn’t smart enough to do computer science and AI, but now I’ve gained so much confidence because of all the support and being around other girls who are into the same thing. They make me feel comfortable to speak out and ask questions.” Another student said to me, “I never thought that people who look like me could succeed in computer science and AI.” And many of our students go on to create their own AI clubs or programs that teach coding and AI to younger populations, which is amazing. They’re already showing tremendous leadership, and so really this early exposure and just democratizing these tools makes all the difference. We can’t wait to see what our students will create in the future in the AI field and beyond, and companies would be missing out if there wasn’t this access and exposure that we need to provide at a wider level in the US.

Kathleen: Okay, yeah. Very interesting. So what do you see as some of the greatest challenges in AI adoption for, you know, enterprises?

Tess: Yeah. I think broadly speaking there are several major challenges. I think because of the tremendous global economic impact that this will make, and because companies are sort of jumping at the chance to incorporate and research AI technology as it can have major efficiency gains and other benefits, we need to make sure we’re investing in the talent pool. There was a New York Times article that came out recently that cited that only 10,000 people in the world have the AI skills to actually innovate in this field, and that companies are paying hundreds of thousands and millions of dollars to recruit this talent. And so we need to not just try to get the talent now and go into a talent war by trying to get the best people, but really invest in the future talent pool, and that needs to start now or we’re just going to be in more of a talent war down the road.

Ron: Exactly, yeah. That’s actually really good feedback, because we just had this interview with Steve Kuyan in one of our earlier Cognilytica podcasts about the AI brain drain. Not only is it hard to find the talent to implement AI and understand how to continue development and research of AI, but a lot of the faculty members at universities that are teaching these folks are being poached; they’re being approached by folks in the industry who are offering them incredibly lucrative salaries. If you’re a PhD right now with any amount of experience in AI, let alone, let’s say, ten years of experience, I mean, you could basically write your own ticket right now, and you can go to any company that’s doing any sort of AI, whether it’s here in the US or abroad, and find employment for a multiple of what you’re getting paid at the universities. So it’s sort of a dual challenge: it’s not just that we need more talent in the pool to be able to build AI systems, but we need to find a way to retain the trainers. Otherwise we’re going to be in this unusual situation of people trying to enter the job pool and trying to enter the AI workforce and not having the skills to do that, which is a real problem area. So hopefully you can find a way to address that challenge as well.

Tess: That’s interesting and brings up a great point, and it’s one of the reasons why we’re not just a coding bootcamp, if you will, to get people into AI development. We really are looking at the different opportunities in AI. You can be an AI educator, you can work in policy, and this is going to be one of the biggest challenges that policymakers or lawyers will be facing in terms of what the ramifications are for the different parts of society. We need people to be working on all segments of the different industries that are going to be touched by AI, not just development, and as you’re saying, if we don’t have enough teachers and researchers because they’re getting poached by companies, that’s a big problem. And same thing if we don’t have policymakers that deeply understand AI and its implications; it’s going to limit the types of solutions that we can come up with for what the technology is going to bring in terms of disruption to our society…