AI-Powered Alt Text in PowerPoint

I was working on a slide deck and did something I don't normally do – I added a photo. Usually the graphical content of my presentations involves diagrams or other vector art – I don't often have the need for actual photos. But this time I did, and I was really surprised to see that PowerPoint generated a text description of the photo and placed it in the alt text field. It also superimposed it on top of the image, which is how I discovered this was occurring to begin with.

photo of planetary gears

Well, that's really neat. I don't know when this showed up in PowerPoint, but I was able to find this blog post announcing the feature back in December 2016. This, along with other things, was rolled out as a way of making content more accessible, and I think that's awesome. I often don't think to add alt text, so if something will do that for me, that is a huge help. (And on that note, I really need to start adding alt text to the images in my blog posts. I'll start with this one!)

Fun With The Computer Vision API

That blog post mentions that the feature is powered by the Azure Cognitive Services Computer Vision API, which lets you test out image analysis right on its webpage. Oh, this could be fun!

I started by submitting the photo of the gears to the service to see what it returned. It gave me a whole bunch of information about the description and other image properties, but I'll stick with the tags field:

[
  { "name": "metalware", "confidence": 0.9998753 },
  { "name": "gear", "confidence": 0.9984683 },
  { "name": "auto part", "confidence": 0.9844382 },
  { "name": "tire", "confidence": 0.884439468 },
  { "name": "metal", "confidence": 0.861439943 },
  { "name": "synthetic rubber", "confidence": 0.757674336 },
  { "name": "tread", "confidence": 0.7314086 },
  { "name": "wheel", "confidence": 0.7102231 },
  { "name": "engine", "confidence": 0.6209707 },
  { "name": "silver", "confidence": 0.532617569 }
]

Not too shabby! It identified it as metal and gears with pretty high confidence. Let's try something else. My father and I have on several occasions handed one of these to the younger members of my family and enjoyed watching them try to figure out what it is:

photo of a metal object

The Computer Vision API didn't actually return any tags for this photo. Instead, I'll paste the "description" field (which includes its own tags):

{
  "tags": [ "thing", "water", "sitting", "bench", "park", "air", "man", "green", "laying", "beach", "body", "wooden", "board", "table", "ocean", "sand", "white", "cat", "trick", "jumping", "doing" ],
  "captions": []
}

Okay, that's really hard, and maybe not fair. The page specifically mentions that the service can "Recognize more than 1,500 global brands and logos, 1 million celebrities from business, politics, sports and entertainment, as well as 9,000 natural and manmade landmarks from around the world." So let's try a photo of the Sphinx:

photo of The Great Sphinx

Survey Says:

[
  { "name": "outdoor", "confidence": 0.99820435 },
  { "name": "sky", "confidence": 0.985456049 },
  { "name": "ground", "confidence": 0.9564954 },
  { "name": "ancient", "confidence": 0.9246716 },
  { "name": "ruins", "confidence": 0.911889434 },
  { "name": "desert", "confidence": 0.9089508 },
  { "name": "ruin", "confidence": 0.841672063 },
  { "name": "ancient history", "confidence": 0.8359611 },
  { "name": "mountain", "confidence": 0.8351296 },
  { "name": "archaeology", "confidence": 0.822488248 },
  { "name": "stone", "confidence": 0.7800616 },
  { "name": "archaeological site", "confidence": 0.7232571 },
  { "name": "pyramid", "confidence": 0.6723917 },
  { "name": "monument", "confidence": 0.634320557 },
  { "name": "historic site", "confidence": 0.620408833 },
  { "name": "history", "confidence": 0.5490571 },
  { "name": "egyptian temple", "confidence": 0.518620968 },
  { "name": "brick", "confidence": 0.3100215 },
  { "name": "slope", "confidence": 0.306334674 },
  { "name": "day", "confidence": 0.12448746 }
]

It's definitely hitting on some key terms, but not outright identifying it as the Sphinx. Still, it's quite impressive. I definitely couldn't do any better!
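If you wanted to turn responses like these into a first-draft alt text yourself, here is a minimal sketch in Python. To be clear, this is not how PowerPoint does it; I have no idea what it does internally. The helper names (confident_tags, draft_alt_text), the 0.9 cutoff, and the assumption that each caption is an object with a "text" key are all mine. The sample data is trimmed from the responses above.

```python
import json

# Trimmed copy of the Sphinx tags response shown earlier in this post.
SPHINX_TAGS = json.loads("""[
  {"name": "outdoor", "confidence": 0.99820435},
  {"name": "ancient", "confidence": 0.9246716},
  {"name": "pyramid", "confidence": 0.6723917},
  {"name": "brick",   "confidence": 0.3100215},
  {"name": "day",     "confidence": 0.12448746}
]""")

# Trimmed copy of the "description" field returned for the mystery object.
MYSTERY_DESCRIPTION = {"tags": ["thing", "water", "sitting"], "captions": []}


def confident_tags(tags, threshold=0.9):
    """Return the names of tags at or above `threshold`, highest confidence first.

    The 0.9 default is an arbitrary cutoff chosen for illustration.
    """
    kept = [t for t in tags if t["confidence"] >= threshold]
    kept.sort(key=lambda t: t["confidence"], reverse=True)
    return [t["name"] for t in kept]


def draft_alt_text(description, tags, threshold=0.9):
    """Hypothetical helper: use a caption if there is one, else confident tags.

    Assumes each caption is an object with a "text" key. Returns None when
    there is nothing trustworthy to say, so a human can write the alt text.
    """
    captions = description.get("captions", [])
    if captions:
        return captions[0]["text"]
    names = confident_tags(tags, threshold)
    if names:
        return "Image tagged: " + ", ".join(names)
    return None


if __name__ == "__main__":
    print(confident_tags(SPHINX_TAGS))
    print(draft_alt_text(MYSTERY_DESCRIPTION, []))
    print(draft_alt_text(MYSTERY_DESCRIPTION, SPHINX_TAGS))
```

Returning None for the mystery object is deliberate: with no captions and no scored tags, guessing from unscored words like "cat" and "beach" would be worse than leaving the field blank for a person to fill in.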

Disabling It

If you need to disable this feature, perhaps because your organization is concerned about submitting images to a cloud service, you can do so through the Options menu. In the "Ease of Access" tab, uncheck the "Automatically generate alt text for me" box.

Diagram of Microsoft PowerPoint options

This is also in MS Word!

Microsoft Word has the exact same functionality, but from what I can tell it's not quite as automatic. After inserting a photo into Word, right-click on it and select "Edit Alt Text".

Screenshot of Microsoft Word image context menu

In the Alt Text menu that opens, click "Generate a description for me" and it will populate the alt text box automatically.

Screenshot of Microsoft Word alt text menu

So there you have it. Word and PowerPoint can make our content just a little bit better and more accessible through the magic of artificial intelligence. Neat!