Episode #7
September 26, 2019
Computer Vision
With Graham Leslie From JBKLabs
Host James Benham is joined by JBKnowledge’s Head of R&D, Graham Leslie. Learn about Computer Vision in-depth, its uses in the insurance industry and how in the world all that is related to hot dogs!
INTRO
On episode 7 of the InsurTech Geek Podcast, talking about Computer Vision and Hot Dogs with Graham Leslie, the Chief Geek at JBKLabs!
The InsurTech Geek Podcast powered by JBKnowledge is all about technology that is transforming and disrupting the insurance world. We will be interviewing guests and doing deep dives with our own research and development team in technology that we see changing the industry. We are taking you on a journey through insurance tech, so enjoy the ride and geek out!
INTERVIEW
JAMES: And back for another fun episode on a great day, and back in the studio with us again, Mr. Graham Leslie, chief geek at JBKLabs. We have had a couple of weeks with just awesome interviews and we are back with another awesome guy, our chief geek at JBKLabs, Graham. Graham, how are you doing today?
GRAHAM: I am doing wonderfully. A little hungry so we should not have done the hot dog episode before lunch.
JAMES: I know right? I should have done this over lunch. And maybe a beer.
GRAHAM: I like where your head is at.
JAMES: That would have been better, yeah. I like podcasting with whiskey, it is Joe Rogan, I blame him. He had his whiskey episode with Elon Musk, and I had a couple of whiskey episodes, and it always loosens the lips a little bit and frees up the conversation. But we are not doing that today. Stone-cold sober. We have got some water in our hands and no food in front of us, and yeah, we are going to talk about computer vision and hot dogs today on the InsurTech Geek Podcast. So just a reminder for everybody out there, Graham Leslie is our chief geek at JBKnowledge Labs. He is our director of JBKnowledge Labs. Does all of our research and development here at JBKnowledge. We are a 211-person technology firm dedicated to the insurance and construction industries. And so, Graham has been around here for a while now, since 2014, so five years, and he started as an intern, then went to team lead, and now he is director of research here. And kicks butt and takes names every day. Also shares my love and passion for cars and trucks and likes to tear them apart and rebuild them, and he is super into that, so it is good to have you on today. Today Graham, we are going to talk about computer vision. This is a topic you have had quite a bit of opportunity to research.
GRAHAM: Yeah, you have got it.
JAMES: What do you think is the lay definition of computer vision for the uninitiated out there in the insurance space that is listening to this?
GRAHAM: So, a camera is a pretty simple concept, right? You take an image, computers have been able to do that for a long time, digital cameras were not that much of a revolution in terms of computer vision but, being able to recognize what is inside those images, that is a pretty tricky problem. Back in 1966, they did not think it would be that tricky of a problem, but they turned out to be a little wrong.
JAMES: Yeah right. So, you have got pictures being taken for quite some time. I enjoy, by the way, applying computer vision to old photographs. Computer vision techniques have been used recently to reconstruct things, mainly ancient things, that were destroyed in wars and battles but happened to be photographed. There are some neat applications of computer vision, but computer vision starts with lasers and non-lasers. It starts with light, right? It starts with light hitting some kind of sensor that then does something, right? This is fundamental physics here, and if you look at how old photography is, which tells you how many years we can apply computer vision to, it is pretty old. I like going back to the original photographs. The coining of the word photography is usually attributed to Sir John Herschel in 1839. Of course, its origin is Greek. It means drawing with light. Which I think is a great way of describing it, and then computer vision interprets that drawing with light, right?
GRAHAM: That’s exactly right. And that interpretation is the very, very difficult part. It is very easy for us as humans, but for computers, not so much.
JAMES: Yeah. And we are edging up on the 200-year history of photography. In 1826 was the very first photograph made in a camera. It was in France. Joseph Niépce, I think, is how you say his name, but you can criticize me later for that. It was taken from the upstairs window of his estate in the Burgundy region of France, and when you look at the number of photos taken and you look at the exponential growth we are definitively on, and you are from Pittsburgh, you grew up in Pittsburgh, you know hockey is a big deal there, we are on the steep side of the hockey stick now in the growth of photos. I saw a statistic somewhere, I would have to go and try to reference it, that we take more photos in a year than in all previous years combined, because we are on the exponential growth curve for photos, and the big question that people have been asking is, what the heck do we do with all of them, right?
GRAHAM: Yeah and there are tons of opportunities and we will get to talk about each of those today.
JAMES: Yeah. So, what are the fundamental underlying technologies behind computer vision that are helping us pull it off today as opposed to having to pull it off in the 90s or early 2000s?
GRAHAM: Alright, so it comes down to artificial intelligence. That machine learning aspect helps with the four applications of computer vision. Those are recognition, you know, recognizing something, right, what is that in a photo; motion analysis, imagine taking a couple of photos and figuring out how something is traveling through those photos, how fast and in what direction; scene reconstruction, to your point about reconstructing things that might have been lost. I think there is a big photogrammetry project on Notre Dame from immediately before the roof fire.
JAMES: Yeah.
GRAHAM: That’s the right example where they are using photogrammetry now to recreate that in 3D.
JAMES: So, let us talk about Notre Dame as just an example. This is a beautiful structure. It is a legendary building, it experienced a horrific fire, and was significantly damaged. And the big question is, how do you restore it? Now, they have laser scans that were taken of the structure. That is when a spinning laser, a laser on a spinner that goes around, is moved to different points to laser scan it. Laser scans are precise, but it is hard to catch every detail in a laser scan because you have to move the laser scanner a lot, and it can take up to a few minutes per scan, but what they have billions of is photographs of Notre Dame. And they did not know exactly how it was the day that it burned or the day before it burned, but they did have photographs of that, and so they took photographs. Photogrammetry is the mathematical process of taking more than one photo of a scene and then reconstructing it into a 3D point cloud, right?
GRAHAM: That’s exactly right.
JAMES: So, you are using mathematical interpolation to calculate the X, Y, and Z coordinates of a point of light, so you can reconstruct those points of light into a point cloud, which can then be reconstructed into a mesh and an object. That is exactly what they did with Notre Dame. It took millions of photos taken in the last weeks before the damage, and they were able to generate a very accurate computer model of what was exactly there. But then something else cool happened. I do not know if you saw, with 3D printing, some very creative people took the stone that fell, and they ground it up, and they used a 3D printer to 3D print replacement stone from the reconstructed model. They used those polymer adhesives where you can take a stone or even wood, you have seen these, we have looked at them together. And they 3D printed replacement stone with the actual raw material that fell out of the ceiling.
GRAHAM: That’s amazing, isn’t it?
JAMES: Like the future is now, right? And you think about insurance. A big part of insurance is making a client whole, right? Fixing what was damaged. I mean, if you could restore it to the exact, millimeter-precise object that was there before, with the original building materials, it does not get much better than that, does it?
GRAHAM: That’s it.
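For the technically curious, here is a minimal sketch of the triangulation math that photogrammetry tools rely on, a generic two-view example rather than the actual Notre Dame pipeline: the same point seen in two photos with known camera matrices is solved back into an X, Y, Z position, and repeating that across millions of matched points is what produces the point cloud.

```python
# Minimal two-view triangulation sketch (illustrative, not a production photogrammetry tool).
import numpy as np

def triangulate_point(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one 3D point from two 3x4 camera projection
    matrices and the point's pixel coordinates (u, v) in each image."""
    u1, v1 = uv1
    u2, v2 = uv2
    # Each observation contributes two linear constraints on the homogeneous 3D point.
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # convert homogeneous coordinates back to X, Y, Z

# Repeating this for millions of matched points across thousands of photos yields the
# dense point cloud that later gets meshed into a 3D model.
```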
JAMES: So, you have scene reconstruction and what else do you have?
GRAHAM: And then the last major one is image restoration.
JAMES: And that is when you fix stuff?
GRAHAM: Right, yeah, and traditionally in art restoration it is called inpainting. That process of taking damaged photographs and, it is the same process that humans follow, right? Interpreting what should be there even though they cannot see it and making the best educated guess at how to fill it in.
JAMES: Yeah, you try to figure out what would have been there had the image been captured.
GRAHAM: Exactly. It is that human pattern recognition process, just applied to a computer.
JAMES: Yeah, at scale and in mass. Instead of having one person, because image restoration is extremely expensive, it is a very, very specific skill set that not many people have, and it is a very pricey, very time-consuming process. And you are talking about taking that process and shrinking it into seconds and letting the machine do the restoration.
GRAHAM: Seconds may even be a very long time.
JAMES: Yeah. To quote one of my favorite sci-fi shows, I am a massive Star Trek fan, when Commander Data was offered human skin by the Borg, and Picard said, well, did you think about accepting, he said, I thought about it for 0.0001 nanoseconds, and he goes, for an android, that is a very long time.
GRAHAM: I am a Star Wars fan, but I appreciate it.
JAMES: Cowboys in space, whatever. It is all good. It is fantasy, and it is a good fantasy story. I happen to be a hardcore sci-fi fan, and, unfortunately, you have had to endure my barrage of Star Trek the last several years, but I do appreciate Star Wars. And the vastly profitable enterprise they created. So, there are four applications for computer vision. Recognition, motion analysis, scene reconstruction, and image restoration. These are all extremely useful in the area of insurance, and you even highlighted that with recognition.
GRAHAM: Yeah, absolutely. So, the recognition problem is finding something in an image and potentially finding the different varieties of it. So, in the context of insurance it could be, imagine taking a photo of damaged property and being able to recognize, what part of the property is this? Well, it is a roof. And how damaged is it? If you have some type of classification, you can see some shingles are missing, or there is actual, visible structural damage; you could potentially classify all of that.
JAMES: Yeah, so you can classify how extensive the damage is and how severe it is, where the damage is, and you can even detect damage that humans might not detect on their own, simply because of the scale or size of that damage. So from a claims perspective this could be useful in insurance, even adjudicating a claim. When a claim comes in, you could have a machine score the damage before it goes to a human, so they can see, hey, the machine scored this as an 8/10, and here are circles around the different spots where it detected either a crack or missing tiles or a hole in the roof, burn damage, whatever it is. But first you have to train that machine learning algorithm on what a healthy roof looks like, right?
GRAHAM: That’s exactly right.
JAMES: And, on my other podcast, I covered some news today that ImageNet was just used for a very interesting experiment to show that machine learning can have bias. I do not know if you saw this, but there was an ImageNet experiment where they tagged people's photos, and it went viral on Twitter, and it was called ImageNet Roulette. If you have not done it, and if you are a listener out there in listener land, just Google ImageNet Roulette and upload some selfies that you do not mind being on the Internet, by the way. Keep that in mind. And it will automatically tell you something about you, based on its giant library. That is recognition. Now, it did some interesting things. I had Sebastian go in. Sebastian is our chief operating officer. He had his sunglasses on in a selfie of me and him that we took recently. I uploaded it, and it boxed his head in and said, fighter pilot. And he did have aviator sunglasses on; he looked kind of like a fighter pilot in that picture. And it said, fighter pilot. I was like, wow, that is an interesting conclusion, and we kept on uploading some photos that I had, and it gave some interesting results. It was very good at identifying beards. Those are pretty easily recognizable, right? But versus a fighter pilot, how would the machine learning algorithm determine that someone looks like a fighter pilot? That is an interesting recognition pattern.
GRAHAM: Yes, it is a bias towards bias, right, and it is a tricky problem with computer vision. These neural networks, the way they are trained is, one manner of training is to give them a set of data. And that data has to be annotated. And this is no small set of data, this is huge.
JAMES: Huge.
GRAHAM: This is potentially hundreds of thousands, if not millions, of images that are annotated. So if you provide it with 100,000 photos of fighter pilots with beards and aviator sunglasses, it is going to start to develop that bias towards, I see a beard, I see sunglasses. It is interesting too because we cannot know if that is the bias that is developing, because it learns on its own. With or without your input, you can only guess that that is probably what it is identifying.
JAMES: Yeah, so I think recognition is one of the more useful tools here. We are starting to see roof inspection companies use this as a preliminary analysis tool, and we are starting to see property-casualty companies using it in claims management. In underwriting it is particularly useful, because when you send an underwriting inspector out, typically they do, I think it is called an RF04 roof report, where they are peeling up the tile and looking, taking a lot of pictures of the roof, climbing up on it, and of course now you can fly a drone instead. I mean, drones are integrated into this type of imaging and computer vision, correct?
GRAHAM: And so much safer too.
JAMES: Yeah, because you are not having to send a human being up. The drone has a high-resolution camera, so talk about the correlation between the resolution of the camera and accuracy in computer vision.
GRAHAM: Yeah, it is just the amount of detail that you can get out of that. The resolution of the camera is very important. Also, if you are flying a drone, the accuracy of the positioning of the drone is very important too, so the drone knows where it is when it takes the photo. For the photogrammetry aspect, that is critical to be able to use your stereoscopic vision.
JAMES: Yeah. Do you think that the use of lasers is just getting more and more narrow, because we have gigapixel images, we have massive high-resolution images that we can get just as much data out of?
GRAHAM: That is a super interesting question. So, it all comes down to precision at the end of the day, right? Remember that your drones are limited by their positioning, which is often based on GPS, right? And GPS is limited because the federal government does not want it to be used for ICBMs, so GPS is just statically limited to a fairly low resolution. Drones factor in their IMUs, which track their position and speed, trying to get the best guess. So you are back to point cloud accuracy of plus or minus several inches in a very good case. If you need millimeter accuracy, that is where laser scans are supercritical.
JAMES: Sure. So, we have seen that with accident recreation, we have seen that with underwriting and claims, where people send the laser scanner out and laser scan it, which is a great idea if you can do it. And laser scanners are getting way cheaper and way faster. And Leica makes one, the BLK360, that does an amazing job, and it does not just do a laser scan. It also has thermal imaging. So, you can even use thermal images for computer vision, right?
GRAHAM: Yeah that is exactly right.
JAMES: Because you can identify if that thermal image is a human being, so it is any type of image that can be fed into a computer vision algorithm.
GRAHAM: Exactly. It is any type of data that the algorithm can use to determine some type of pattern.
JAMES: So, sum up motion analysis. That seems to be the big one. I think for loss control, loss prevention. You are trying to prevent a loss, you are trying to prevent a safety event, motion analysis is an interesting field that I think would be super useful in any type of job site, worksite environment, manufacturing and construction, and others because you can detect the movement of a human and figure out if they are running across your workplace instead of walking right?
GRAHAM: That is right. It is just taking those several images and then interpreting based on the time they were taken and the movement of the subject of the photo.
JAMES: Yeah.
GRAHAM: Simple math to figure out the time in between.
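As a toy illustration of that simple math, here is a sketch that turns two detections of the same object, pixel position plus timestamp, into a speed and direction; the pixel-to-meter scale factor is a made-up calibration value for the example.

```python
# Toy motion-analysis sketch: estimate speed and direction from two frames.
import math

METERS_PER_PIXEL = 0.05  # hypothetical calibration: ground distance covered by one pixel

def estimate_motion(p1, t1, p2, t2):
    """p1, p2 are (x, y) pixel positions of the same object; t1, t2 are timestamps in seconds."""
    dx = (p2[0] - p1[0]) * METERS_PER_PIXEL
    dy = (p2[1] - p1[1]) * METERS_PER_PIXEL
    dt = t2 - t1
    speed = math.hypot(dx, dy) / dt             # meters per second
    heading = math.degrees(math.atan2(dy, dx))  # direction of travel in degrees
    return speed, heading

speed, heading = estimate_motion((120, 340), 0.0, (180, 355), 0.5)
print(f"{speed:.1f} m/s heading {heading:.0f} degrees")  # ~6.2 m/s: running, not walking
```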
JAMES: So, we have seen solution providers in the market that are using motion analysis to look at the pattern a worker uses when they lift objects. How would it do that?
GRAHAM: That’s interesting. I imagine the algorithm is probably trying to inform itself about the skeletal position of the worker right, trying to interpret their actual skeletal position and then through those multiple photos, determining the movement taken and the potential stress on any joints in that skeletal view.
JAMES: Yeah, and what is neat about it is they overlay a digital skeleton on that user, or just the body trunk and all the inflection points in the body, like the hips and the knees and the ankles, and then it allows you to see if workers are lifting with their back instead of their legs, right? That is the old lifting adage, lift with your legs and not with your back, and you want to see if workers are doing that. Machines can interpret that now. That is pretty heavy stuff.
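A hedged sketch of how such a check might work once a pose estimator has returned joint keypoints: measure the angle of the back relative to vertical and flag a strongly bent-over lift. The keypoint values and the 45-degree threshold here are purely illustrative.

```python
# Illustrative lift-posture heuristic on top of pose-estimation keypoints.
import math

def back_angle_from_vertical(shoulder, hip):
    """Angle of the shoulder-to-hip segment measured from vertical, in degrees.
    Points are (x, y) image coordinates with y increasing downward."""
    dx = shoulder[0] - hip[0]
    dy = hip[1] - shoulder[1]  # positive when the shoulder sits above the hip
    return abs(math.degrees(math.atan2(dx, dy)))

def flag_risky_lift(shoulder, hip, threshold_deg=45.0):
    """Very rough heuristic: a strongly bent-over back during a lift gets flagged."""
    return back_angle_from_vertical(shoulder, hip) > threshold_deg

# Example frame: shoulder far forward of the hip -> bent-over lift gets flagged.
print(flag_risky_lift(shoulder=(300, 210), hip=(220, 260)))  # True
```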
GRAHAM: It is a little scary, isn’t it?
JAMES: It is, because you wonder how far this is going to go, and I think it is never going to stop. I mean, you are already seeing advertisers using computer vision to determine the gender and age of who is walking by the ads. I remember some good sci-fi movies from the 90s and the early 2000s, and they did exactly this. They had digital ad boards with cameras on them, and as you walked by, they not only identified your age and your gender and then presented age- and gender-specific ads to you, they also identified who you were, with their facial recognition database. Facial recognition is also part of this whole field.
GRAHAM: Oh yeah, and it seems so high tech and futuristic to us, but if you are in Shenzhen in China right now, you can pay for a meal just with facial recognition.
JAMES: You can also check–in for your flights now in the United States. Some airports are doing facial recognition for flight check–in.
GRAHAM: It is Delta piloting that, right?
JAMES: Yes. Delta is piloting it, so there are no boarding passes, no IDs. Your face is your ID. That intimidates a lot of people, though. So how do you overcome some of those objections to computer vision? What are the ways you think we can help people get over the potentially scary part of this?
GRAHAM: That’s a tricky question, James. If I knew the answer to that, I would be informing a lot of companies right now.
JAMES: Maybe assure them of where the data is stored?
GRAHAM: Yeah, I think one thing that you are seeing a lot of, like, I saw a video on Delta's pilot, and I remember they were very explicit that the results of the facial recognition stay within this loop, right? They are not used in any other business process, any marketing process; it is just to determine your flight information and pass you through security. And I think if companies are open about that and make it very clear what your data is being used for, it is kind of analogous to when you download a mobile app today, right, and it asks you, hey, are you okay with us using your photos, yes, are you okay with us using your microphone, yes. If that same type of disclaimer is there, it is some comfort.
JAMES: Hey, I think that Apple has done a great job of leading this charge on the mobile phone side, saying we are going to embed a bunch of machine learning components in the actual processors. You are seeing AI chipsets installed on drones and mobile phones and tablets so that they can do all the processing locally instead of pushing it to the cloud and sending all of your facial and image data with it. They are doing local processing, and they are even guaranteeing that it is not being shared.
GRAHAM: Right, and that avoids the cyclical process of sending data out to the cloud and back to the end device.
JAMES: Yeah, then who owns it when it is in the cloud?
GRAHAM: Exactly.
JAMES: Does it stay there? Are people looking at these images? I do not know. So, there is a real use case for computer vision in construction, whether it is sending a drone in to automatically analyze the damage, or using your iPhone and your drone and a professional camera to take underwriting photos and to make sure that properties are taken care of, or documenting an accident, right? Let us say you have got a work comp accident, pretty severe, and you have a 360 camera, which is not expensive. You can get an LG 360 camera, and for those that do not know what that is, it is two fisheye lenses back to back so it can do a 360 stitched image in under a second. You can use the heck out of that for computer vision, so you could recreate that site by just taking two or three 360 photos. You could digitally recreate an accident site and walk through it in VR, correct?
GRAHAM: Yeah definitely. I think it would take a little more than three or four photos now with today’s technology, but you are absolutely right about where we are headed.
JAMES: Yeah.
GRAHAM: I once heard a comparison. When it comes to surveying, for example, if you come out with a long ruler and measure everything, you are going to capture such a small subset of the data, but if you show up with a 360 camera or a laser scanner or something like that, you have this enormous set of data that you collect that you can use to answer so many future questions you might not even know about yet.
JAMES: I am seeing solution providers that even use 360 video for computer vision.
GRAHAM: Bingo, because then you have all that source data that you can interpret into creating a 3D model.
JAMES: Yeah, well, it is 30 images every second. So, they are breaking apart the video into its component frames and then using all the different angles of the object to capture it and to model and map it.
GRAHAM: Exactly, and that is scene reconstruction.
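A small sketch of that first step, splitting a video into component frames with OpenCV so each frame can feed a reconstruction or recognition pipeline; the file names here are placeholders.

```python
# Split a 360 video into individual frames for downstream photogrammetry or recognition.
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("walkthrough_360.mp4")  # hypothetical input video
frame_index = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of video
    # Keep every 10th frame to thin out near-duplicate views before reconstruction.
    if frame_index % 10 == 0:
        cv2.imwrite(f"frames/frame_{frame_index:05d}.jpg", frame)
    frame_index += 1
cap.release()
```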
JAMES: Yeah, that is scene reconstruction. So, you have got some great applications in recognition, like facial recognition, object recognition, the status of an object; motion analysis in loss control and safety, being able to determine if workers are even running across the job site. We are seeing risk management divisions at large owners, like tech companies that are building data centers, putting webcams on their job sites, feeding that into a computer vision system, and then the webcam looks for any bad activity, like running across a job site, jumping too far, lifting improperly. It can all be done by cloud-connected security cameras now, with the software sitting behind them, right?
GRAHAM: That’s exactly right.
JAMES: Yeah. That does not creep anybody out, does it?
GRAHAM: Not at all.
JAMES: Put this tracker on. We are going to document your face and monitor everything you do, and the machine is going to text you if you lift improperly or go somewhere. So there is a social impact to this that necessitates training, education, reminders. People need to understand what the limitations are, how it is used, where it is stored, what is being done with it. We are not trying to create digital overload; we are trying to keep the human safer, right? And ultimately, this ties back to productivity as well, for that company. So, what is some background? Because this whole story started in 1966.
GRAHAM: That’s right, and it is a funny story. A handful of computer scientists said, you know, we could solve this computer vision problem, and they decided, we will do it as a summer project. So, they got some summer workers and said, let us break it down and we will figure this out over the summer. And at the end of the summer, we will have software that will tell you exactly what is in an image. They failed.
JAMES: Yeah.
GRAHAM: Pretty drastically. Even though people are still working very hard on this today.
JAMES: Yeah. Yeah, 53 years later we are still hot after it.
GRAHAM: Exactly. You know when it popped back up? It was with translation services, right? Do you remember Google Translate? When that broke into the mainstream, how incredible that was, but it was still pretty funny in the beginning, right? Some of the interpretations it would make.
JAMES: It was. It is not anymore, because it is remarkably accurate. And that is not computer vision, but it is still machine learning.
GRAHAM: Exactly.
JAMES: You are still teaching a machine how to deal with natural language or images coming from a human being.
GRAHAM: It is the same neural networks that are recognizing the patterns in language that are recognizing the patterns in images. You know, I think around 2016 was when Google replaced its translation services with neural networks, and that is when neural networks broke into the mainstream, right?
JAMES: Yeah.
GRAHAM: People started recognizing we can apply this for vision problems right now as well.
JAMES: So, 50 years after the original computer vision experiment, it became a big deal. And that is awesome. So, the current state of the art in computer vision, what is that about?
GRAHAM: This is where we get to talk about hot dogs.
JAMES: Yes! I love hot dogs! Especially all-beef hot dogs. I like going to Wrigley Field, by the way, I am a rabid Cubs fan, and I like going there and eating a good all-beef hot dog.
GRAHAM: Across the street is Byron’s hot dogs, best hot dogs in Chicago.
JAMES: Hmm, tasty. I love a good Chicago dog, pickles on it.
GRAHAM: Drag it through the garden as they say.
JAMES: Yeah. Drag it through the garden? I never heard them say that! But that is definitely what I do with a hot dog. I mean, there is everything in the garden on that hot dog.
GRAHAM: So how does this relate to hot dogs?
JAMES: Yes, how does it relate to hot dogs?
GRAHAM: Silicon Valley, the TV show, have you ever seen that episode?
JAMES: Oh yeah. Sadly, very true and hilarious.
GRAHAM: Sadly, very true and very hilarious. We could have that separate venture capital discussion there too, but there is an episode of Silicon Valley where one of the characters develops a mobile app called “Not Hotdog,” and what this app does is detect whether any photo passed in, any photo taken, is a hot dog or not. And it is funny because he presents this to a bunch of VCs and they say it is great because we can use this for so much, and he says, no, you do not understand, it is only for hot dogs. The funny thing about this is that the app is real. I read the blog post from the developers who put it together. They built the real app and released it in the app stores for iOS and Android, and they have a fascinating blog post that is all about how they built it. They built a convolutional neural network, that is what it is called, and that is the most useful type of neural network right now for these kinds of problems. What it does is recognize, as they say, the bread, the sausage, the actual glistening of the sausage, the bread texture, and the general hot dog shape; those are the features, as they are called, that factor into detecting whether something is a hot dog. They have got some hilarious images from the process, like putting some mustard on an arm and the app thinks it is a hot dog, and they realized, okay, we need to factor in the bread. It is fascinating how they put this together.
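To make the idea concrete, here is a minimal sketch of a "hot dog or not" style classifier. It is not the actual Not Hotdog architecture, just the common transfer-learning pattern: reuse a pretrained convolutional network and train a small binary head on top of it. The data folder layout is a hypothetical example.

```python
# Transfer-learning sketch for a binary "hot dog / not hot dog" classifier.
import tensorflow as tf

# Pretrained convolutional backbone, features frozen so only the new head trains.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects inputs in [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),      # probability the image is a hot dog
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical dataset layout: data/hotdog/*.jpg and data/not_hotdog/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", image_size=(224, 224), batch_size=32, label_mode="binary")
model.fit(train_ds, epochs=5)
```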
JAMES: Yeah, I mean there is way more to it, because I think we take for granted, as we try to replicate a human, all the things that our brain does. For example, we have brilliant computer vision, right? Our brain is a computer, and vision is not just having the light hit a sensor. It is interpreting what that all means, right? So photographs are not computer vision. Videos are not computer vision. It is what you do with them later that is computer vision. So, I think we take for granted how brilliant our brains are. They are geared for facial recognition, aren't they?
GRAHAM: But only when the face is right–side up.
JAMES: Not upside down?
GRAHAM: The human mind fails immediately with an upside-down face trying to do facial recognition. It is an interesting example of pattern recognition in humans.
JAMES: So if you take a photograph on your phone, and you randomly load it, before you look at it right side up, you look at it upside down, oh yeah I’m doing it right now, that is confusing as heck!
GRAHAM: It is. There you go.
JAMES: So, just taking random photos upside down and trying to recognize who is in them is very challenging!
GRAHAM: That is the human neural network.
JAMES: Yeah, it is the way that our brains have been trained to operate, and we do not think through that. So one important thing for everybody in insurance out there: there is no easy button for this. I think the answer, like my favorite software developer answer, is, it depends. Can you implement this for my company? Yes, it depends, because you can take the same models that work for faces and people and hot dogs, and it is not going to work for determining the hurricane damage on a building.
GRAHAM: That’s exactly right, and the state of where things are right now is, you can find a lot of general neural networks that are pre-built, and they recognize simple objects, like this is an apple, this is a cup. But if you are trying to do something much more complicated, like recognize and categorize all the damage to a home after a hurricane, as you said, you are going to have to start from scratch, and to do that you are going to need hundreds of thousands of photos of damage, all different, that you can begin to train a model on.
JAMES: Yeah. The data set. That is why tools like ImageNet are very helpful, because they have billions of pre-hinted, pre-tagged images that allow you to train on whatever data set you are trying to train on. So if you go to ImageNet, you can search through and look for a bunch of images tagged with, again, hot dogs, and then use that to train on the recognition of all the components of a hot dog, and that when they are all together, they constitute a hot dog. By the way, I installed that app, “Not Hotdog,” and I am not a hot dog. There you go.
GRAHAM: Thank goodness.
JAMES: Thank goodness. Because that would be concerning. And you as well, let us see if you are a hot dog, and, you are not a hot dog. So, it successfully identified that we are both not hot dogs. Now the key is, since we are so hungry, we need to leave, go get hot dogs, take a picture, and see if it works. But ImageNet and other tools like this, and they are not the only ones out there, have databases of hinted data. And I reported on a news article recently, online, saying artificial intelligence is made of people. It reminds me of my favorite Charlton Heston sci-fi movie, Soylent Green, where Charlton Heston at the end is being carted out of the building yelling, Soylent Green is people, Soylent Green is people! A lot of this is people, right? There are entire companies offshore, mainly in India, where they have thousands of people sitting all day long just tagging and hinting pictures, just to be used for computer vision systems.
GRAHAM: Exactly right, and it is funny. You see a lot of the cloud vendors today, Microsoft and Amazon, and they have released so many tools for building machine learning networks, but they release so many tools too for simply collecting big datasets of images and annotating them.
JAMES: Yes, like Amazon Mechanical Turk. So, if you want to pay people to hint insurance-related imagery, you can go on Amazon Mechanical Turk, which by the way is just Amazon outsourcing mundane, boring tasks to a bunch of human beings who just want to work from home, right?
GRAHAM: That’s exactly right.
JAMES: OK, so it is called Mechanical Turk. But it is made of people. It is a bunch of people that will sit back and analyze any kind of imagery that you want to send across, and then they will hint and fill your data sets out, so you do not have to hire up. It is a pretty tough job to hire up for, generating the datasets that you would need for this to work in a carrier or third-party administrator or a broker. And I can think of, by the way, everybody in the insurance ecosystem that could benefit from computer vision and its related technologies. But it is not a magic bullet, there is no easy button. You have to build the model, and then you have to build data sets, and you have to rebuild the model. And then you have to change your model. And there are some interesting machine learning websites that allow you to apply every known machine learning model, because there are a bunch of them, and then identify which one is the most accurate. So, there is also that, and then, by the way, what works today, next month there might be a new model out that you want to work with, right?
GRAHAM: That’s exactly right. Where I think the future of this goes, is proprietary models and proprietary datasets, right? Being able to lease access to these enormous datasets.
JAMES: So, do you think insurance carriers instead of leasing or licensing it, one of their competitive advantages will be that they built the best machine learning model?
GRAHAM: Absolutely.
JAMES: Which will allow them to automatically adjudicate claims, will allow them to dramatically streamline the underwriting process, and there are a lot of things you can do with that, but that almost looks like it might be a competitive advantage that they do not want anyone else to have, right?
GRAHAM: 100%.
JAMES: But then there will also be technology vendors out there who offer the same datasets to everybody. So, you may end up with a hybrid where you license some tech and then you maintain some tech internally. The end output is a compiled model file, right? It is a file on a computer. And it does not learn in real time, does it? You have to schedule time for learning?
GRAHAM: That’s exactly right. The process is, you give it a data set and it splits it in half. What it does with the first half is learn from it. And on the second half, it tests to see how accurate it is. That whole cycle is training, because at first it is not going to be accurate at all. It will be finding hot dogs everywhere. And over the course of the process it gets more and more accurate as it learns and adjusts and tries to get that second half more and more accurate.
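A rough sketch of that split-and-evaluate loop, using a generic scikit-learn classifier and synthetic data as a stand-in for a real vision model: train on one part of the labeled data, then measure accuracy on the held-out part the model never saw.

```python
# Train/test split and held-out evaluation, the pattern described above.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Stand-in data: X would really be image features, y the labels (1 = hot dog, 0 = not).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Split the labeled set: one half to learn from, the held-out half to test against.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```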
JAMES: Yeah. Yeah, there are some great examples. If you want to play with this type of image recognition, a website you can go to, by the way, is captionbot.ai. Oh, side note, before we go to CaptionBot, because we will load some test images into CaptionBot and see how it does in just a second. I thought it was interesting when I started seeing the captcha companies. If you do not know what a captcha is, it is that annoying thing where you have to interpret the squiggly letters and lines and tell it what the word is, or the numbers and letters, so they can prevent bots from logging into accounts. Well, the captcha companies realized there was an opportunity to use it as a giant Mechanical Turk, and so they started putting in images that they thought they understood with their current model, which it can test you on, but then having humans click on, have you seen when you have to log into something and it says, click on all the buses. Click on all the storefronts. What does the street sign say? It is using you for free to crowdsource its learning model for computer vision.
GRAHAM: That’s why captcha services are free to developers, you know? Google is the leading one with its reCAPTCHA service.
JAMES: Yeah, reCAPTCHA is super popular, and they are doing this. They are crowdsourcing image labeling and it is amazing.
GRAHAM: Feeding all of that data into maps, into their storefront detection.
JAMES: Yeah. It is awesome. So, let us talk about CaptionBot, because this is one our listeners can go check out, because now they have learned the history, they understand the uses, we talked about all of this, and now they want to go and play with something that resembles computer vision and see what it can do. There are some pretty interesting use cases, but CaptionBot is a fun one because it allows you to upload immediately. So you just go into your web browser on your phone and type in, I will do it with you, captionbot.ai. This is built by Microsoft, and it is going to load in your web browser. When it loads, you upload an image, and I am going to pick an image out of my photo library, something I recently took, just one of the pictures of my house, and I will upload it. As soon as you select the image on your phone, it uploads the image, runs it through its AI algorithm, because it has a model that was compiled, last night seemingly, and then it is going to spit back a caption.
So, this one does not produce tags, it writes a searchable caption for the image, and here it goes. I think it is a living room filled with furniture and a fireplace. There was no fireplace in the image, but otherwise it was right, it is a living room filled with furniture. So you can see they are not 100% accurate, but it is pretty darn accurate for me not telling it anything, and then of course what you can do is grade the image on how well it does. And we will go ahead and upload another one, let us get a water scene here of a patio in front of a lake, and see what it says. So, there are some interesting things you can experiment with in CaptionBot. You can also go and play with ImageNet Roulette. And that will allow you to see how it tags people and see if it is in any way remotely accurate for you. At the end of the day, what is the conclusion for the insurance professional out there listening to this and thinking about using computer vision?
GRAHAM: Computer vision is very, very difficult, but it is getting easier. And there is huge potential for competitive advantage across all areas of insurance, but especially the adjudication of claims. Being able to flag them ahead of time. Which ones can we shoot right through, which ones need human review? It is very challenging, but some applications could lead to competitive advantage today.
JAMES: Yeah, and tomorrow. This is about being able to scale a company without having to hire a ton more people, not so much about laying off all of your staff; we are talking about being able to more effectively utilize them for thinking tasks. In particular, when you are trying to search through images and you are trying to search for these objects, that is when this comes in handy. If you have a collection of tags and captions on the images, it is a complete game-changer for you, right?
GRAHAM: Absolutely, and we saw that with Google Photos when they released it. You could just say, show me all my photos with a particular family member in them, and it would sort them automatically.
JAMES: Yeah. And your iPhone does this already too.
GRAHAM: Yeah.
JAMES: So if you have an iPhone, the Photos app on the iPhone does that locally. You can just type in the word dogs, and I am a huge dog fan, I have got a bunch of dogs, and if I type in dogs, it returns thousands of photos that all have a dog in them. It does not catch all my photos that have a dog in them, but all the photos it did catch do have a dog in them. And it is amazing, amazing technology, and what is neat about it is that it can be incorporated into existing systems. So if you have a claims system out there that supports attachment of photos, and you have the ability to type either a caption or details or tags on the photo, you can build a machine learning system into that existing system and have it populate the caption text or tags into the system you already have.
GRAHAM: And talk about something that is pretty straightforward to do today. Services like Amazon and Microsoft and Google all offer off-the-shelf services like this for general object recognition.
JAMES: Yeah.
GRAHAM: So absolutely.
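As an example of how simple the starting point can be, here is a hedged sketch of calling a general-purpose cloud vision service to tag and caption a photo. The request shape follows Azure's Computer Vision "analyze" REST pattern; the endpoint, key, and file name are placeholders, and the current API version should be checked against the vendor's documentation.

```python
# Sketch of calling an off-the-shelf cloud vision API for tags and a caption.
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
KEY = "<your-subscription-key>"                                   # placeholder

def analyze_image(image_path):
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    resp = requests.post(
        f"{ENDPOINT}/vision/v3.2/analyze",
        params={"visualFeatures": "Tags,Description"},
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/octet-stream"},
        data=image_bytes,
    )
    resp.raise_for_status()
    result = resp.json()
    caption = result["description"]["captions"][0]["text"]
    tags = [t["name"] for t in result["tags"]]
    return caption, tags

# e.g. analyze_image("claim_photo.jpg") might return
# ("a house with a damaged roof", ["house", "roof", "outdoor"])
```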
JAMES: Yeah, they are busy training on every object on the planet. They are not getting specific to the insurance business, but you would still get a ton of value out of it, because it is identifying any type of object that is in the image. The same thing with motion and video analysis. It breaks that video apart. There are already services out there you can tap into that will tell you what is going on in a video, and the same thing with photogrammetry. There are a lot of applications that do scene creation, and one of my favorites, by the way, is Pix4D. Pix4D does a phenomenal job at taking multiple photographs and producing a point cloud, which can be turned into a mesh, which can be turned into a 3D model. You can get a 3D model of an accident scene. There is all kinds of neat stuff that can be done with this. We think it is a transformative technology for the insurance business. That is why we have done so much research on it ourselves, and we have built a few of these different applications ourselves, including for inspections and daily logs and safety inspections, right?
GRAHAM: That’s exactly right.
JAMES: And my recollection is that it was a fairly quick development process because of, frankly, how many services you can tap into, so you do not have to build from the ground up, right?
GRAHAM: Yeah, that is right. It is very easy to get started with, and it becomes very challenging as you move into these niche and domain-specific areas.
JAMES: Yeah, you get frustrated quickly if you are not prepared for the fact that Murphy's law applies here too. So that is your how-to-get-started out there if you are listening to the InsurTech Geek Podcast. All the major clouds have computer vision APIs that can be used in a trial to recognize common, everyday objects in photos. And that is a wonderful place to start. Services like Azure have captioning services that will write captions for photos. It is a great place to get started and does not require an enormous amount of effort, but bigger projects will require real effort and time. You have to build your own datasets, you have to build your own models, the deployment, and the applications, and that can take some time. But if you want to be part of the digital transformation of the insurance business, I do not think you have a choice on whether or not you work with computer vision. I think this is a mandatory tool that will be used against you in the court of technology, so be ready for that. Graham, thank you for coming on today and so thoroughly and succinctly summarizing how this technology works for the industry out there.
GRAHAM: Yeah, my pleasure. Ready to get out of here and go and get a hotdog.
JAMES: Yeah exactly it is time to go eat. Let us get some hot dogs. Preferably a Chicago dog. One that has been dragged through the whole garden.
This has been the InsurTech Geek Podcast powered by JBKnowledge, JBKnowledge.com, all about technology that is transforming and disrupting the insurance world. I have been your host, James Benham, jamesbenham.com. Thank you for joining us this week.