
Activity Summary

If you’re accessing this activity directly, did you know there are nine other activities in this series up on our website? Visit our AI page to see a breakdown of the activities and our recommended order to complete them in! Also, these activities introduce AI concepts and terminology. If you find yourself unfamiliar with any of the words in this activity, the landing page also has a glossary of AI terms. Happy space-station-fixing!

To recap: You and your group-mates are astronauts and scientists aboard the Actua Orbital Station. Unfortunately, your station just got bombarded by magnetic rays and your electronics have begun to shut down! The only one who can save you is the station’s AI, DANN. DANN stands for Dedicated Actua Neural Network, and it’s gone a little loopy. Brush up on your technical skills, learn about AI, and save yourself and your crewmates! So far, we’ve restored DANN’s basic thinking abilities and its mathematical skills. Now we need to make sure we can communicate with it!

DANN has finished its diagnostic, and we have finished our study from “Regression Analysis: Making Predictions Using Data.” Now, we can begin to fix DANN! When it was functioning, DANN was a state-of-the-art AI system that you could interact with using your voice. Because of the damage, however, DANN’s audio core was knocked offline and, unfortunately, you don’t seem to be able to access the audio core without DANN’s help. Our mission specialists think that you might be able to use DANN’s visual core to communicate, but the visual core isn’t currently set up to recognize our requests. They suggest training an image classification model to recognize our poses or hand shapes so that we can get access to the audio core. Once we do that, we can bring DANN’s other senses back online in “Hand Commands: Training Image Classification Models”!

In this activity, participants will experiment with machine vision’s application to classification tasks. They will learn to identify classification schemes and classes. Participants can then explore and compare two pre-trained machine vision models: COCO-SSD and MobileNet. Participants will evaluate these models and test their suitability for a defined task. This activity will lay the foundation for participants to train their own image classification model in a following activity.

Activity Procedure

Our mission specialists think that you might be able to use DANN’s visual core to communicate, but the visual core isn’t currently set up to recognize our requests. They think we need to train an AI to recognize our poses and hand shapes so that we can get access to the audio core and reinitialize it. To help orient us to the task, they’ve sent some additional information and exercises to bring us up to speed on classification tasks.

What is classification?

Classification is the process of separating data into distinct categories, also known as classes, based on certain features of the data. For example, you could classify your friends or family members by eye colour (e.g. brown, blue, hazel), hair colour (e.g. blonde, brown, black, red), or height (e.g. tall, average, short). You can also classify objects! Cars can be classified by type (e.g. compact, sedan, SUV), colour, or manufacturer (e.g. Honda, Ford, Toyota).

{ Individually / In small groups / As a large group }, come up with 1-3 groups of objects that can be classified, and what features you could use to classify them. Think about broad groups of objects that can have lots of differences, like cars or books!

Facilitator’s note: Many possible answers. For any suggested classification scheme, ask for examples of classes that might make it up if they are not volunteered. Possible example: foods for different meals of the day (e.g. breakfast foods, such as cereal, pancakes, or eggs; lunch and dinner foods, such as pizza or pasta; or snacks, such as chips or candy).

Having come up with examples, consider these two questions: What’s the point of classifying things (e.g., objects, items, data)? What does classification do?

  • Classification is a way of understanding, sorting, and simplifying data. Computers in general are very good at tasks that give them clear, well-defined data. Through classification, data can be converted into a format that computers can more easily work with.
  • Some classes can also be thought of as “shorthand” for a collection of features. For example, classes of pants might include sweatpants, jeans, and pajamas. Each of these classes might tell us something about the features of the pants: leg length, whether or not they’re made of denim, or whether they have a zipper. Sweatpants will likely have full-length legs, not be made of denim, and have no zipper, so we can classify them! (A short code sketch of this idea appears after this list.)
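
If it helps to see that idea written down concretely, here is a minimal sketch of a hand-written, rule-based classifier. Everything in it (the feature names, the rules, the class labels) is invented for illustration; the models later in this activity learn their rules from example images instead of having them typed in by hand.

```typescript
// A toy, hand-written classifier. The feature names and rules below are
// invented for illustration only; they are not part of the activity's models.
type PantsFeatures = {
  legLength: "full" | "short";
  isDenim: boolean;
  hasZipper: boolean;
};

type PantsClass = "jeans" | "sweatpants" | "shorts" | "unknown";

// Map a set of features to a class label using simple if/then rules.
function classifyPants(p: PantsFeatures): PantsClass {
  if (p.isDenim && p.hasZipper) return "jeans";
  if (p.legLength === "short") return "shorts";
  if (p.legLength === "full" && !p.isDenim && !p.hasZipper) return "sweatpants";
  return "unknown";
}

// Full-length legs, not denim, no zipper -> "sweatpants"
console.log(classifyPants({ legLength: "full", isDenim: false, hasZipper: false }));
```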

Image classification is the task of determining what’s shown in an image (e.g. photo, video frame). This might mean identifying what an object is, or what action is taking place, or who specifically is in the image. This is the kind of classification task that we will be exploring in this activity, since Mission Control hopes that we will be able to use it to develop a solution to move DANN’s repairs forward. To get us started, specialists at Mission Control have set up two AI models (Model 𝛼 and Model 𝛽, below) for us to explore and test. These models are pre-trained, which means that they should be ready for us to use. They’ve included some instructions to help with our testing and exploration, which starts with Model 𝛼. 

Activity 1: Playing with Machine Learning

  • Click “Start Model 𝛼” (or similar wording) to load the first AI model and begin detecting.
    1. You may be asked for permission to use your webcam. No data from your webcam is stored; the whole process happens on your computer.
    2. After a few moments, the model window should say that it’s currently “detecting”. You should see a live, but somewhat choppy, feed from your webcam, and you may also see a pink rectangle around parts of the image, with other information there as well. In small groups, jot down your answers to the following questions about the webcam feed:
      1. What is happening in the video window? Make 2-3 observations that describe what you are seeing (e.g. “I see a rectangle that moves when I move.”).
      2. What do you think the shapes, words, and numbers that appear on camera mean? What is their purpose?
    • Facilitator’s note: The rectangle is a frame that indicates the program has recognized an object. The word is what the program thinks the object is, and the number is how confident the program is that it’s correct (from 0.00 to 1.00).

The notes from Mission Control refer to Model 𝛼 as “Common Objects in Context, Single Shot MultiBox Detector”, or COCO-SSD for short. They include a few extra details about the name:

  • COCO is the name of the collection of images (the dataset) that was used to train this model to recognize different types of objects.
  • SSD refers to what the model has been trained to do, which is detect (i.e. “are they there at all?”) and then classify (i.e. “what are they?”) multiple objects in an image or video.
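
The activity page wires all of this up for you, but if you are curious what “detect, then classify” looks like in code, here is a minimal sketch. It assumes the browser (TensorFlow.js) build of COCO-SSD, published as @tensorflow-models/coco-ssd; the station’s actual page may use a different wrapper, but the overall flow is the same.

```typescript
import "@tensorflow/tfjs"; // TensorFlow.js runtime the model package depends on
import * as cocoSsd from "@tensorflow-models/coco-ssd";

// Sketch of a webcam detection loop, assuming the TensorFlow.js COCO-SSD model.
async function runDetection(video: HTMLVideoElement): Promise<void> {
  // Ask the browser for webcam access; frames stay on this computer.
  video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });
  await video.play();

  // Load the pre-trained model (the weights are downloaded once, then reused).
  const model = await cocoSsd.load();

  const detectFrame = async () => {
    // Detect objects in the current frame: each prediction has a bounding
    // box, a class label, and a confidence score between 0 and 1.
    const predictions = await model.detect(video);
    for (const p of predictions) {
      console.log(p.class, p.score.toFixed(2), p.bbox); // bbox = [x, y, width, height]
    }
    requestAnimationFrame(detectFrame); // repeat on the next frame
  };
  detectFrame();
}
```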

With that in mind, you will need to conduct a short test of the detection and classification capabilities of this model to see if it could be used for your purposes. To do so:

  1. Gather a few handheld objects from the space around you. These can be things like writing utensils, paper, silverware, a phone, or anything like that.
  2. Create a data table to record your observations. The table needs to have one row for each of the objects that you gathered, so if you gathered 5 objects, there should be at least 5 rows. A data table template has been included below to help you out, but you are free to create your own.

     Object | Recognized? | Model output (words and numbers) | Other observations (e.g. orientation, distance)
     [Add a row for each object]

  3. Show each object to your webcam.
    1. What happens? Does the model react in any way?
    2. Does moving the object change the result? Try changing how close or far the object is from the camera and changing the orientation (e.g. whether it’s vertical or horizontal).
    3. A rectangle will be drawn around each object that is detected by the model. Within each rectangle, there should be a word or a few words followed by a number. Record these words and numbers.
    4. If you show the AI an object and it is not recognized (i.e. no rectangle, no words, and no numbers), write “Not recognized”, “Not detected”, or “No” in the “Recognized?” column if you’re using the data table template.
    5. Record any other information you wish to observe.

COCO-SSD looks for potential objects in a given image or video frame and draws a rectangle, called a bounding box, around each one. It also attempts to classify each object that it thinks it has found: the words you see within the bounding box are the model’s guess at what the object is, and the number next to the words is how confident the model is in that guess. Confidence is reported as a number from 0 to 1, where 0 is 0% confident and 1 is 100% confident (i.e. the model is completely sure it knows what the object is). Most models will only report guesses above a certain confidence threshold (for example, 0.5, or 50%). According to Mission Control’s notes, the overlay on the video also uses a threshold of 50% and won’t display any guess below it, since a guess that uncertain would likely be wrong.
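
To make the threshold idea concrete, here is a small sketch of how guesses below the 50% cut-off might be filtered out before anything is drawn on the video. The Detection shape below mirrors what a detector like COCO-SSD reports (label, score, bounding box); the example objects and numbers are made up.

```typescript
const CONFIDENCE_THRESHOLD = 0.5; // the 50% threshold from Mission Control's notes

interface Detection {
  label: string;                          // the model's guess at what the object is
  score: number;                          // confidence from 0 (0%) to 1 (100%)
  bbox: [number, number, number, number]; // [x, y, width, height] of the bounding box
}

// Keep only the guesses the model is reasonably sure about, then format them
// the way they appear in the overlay: "label 0.87".
function confidentLabels(detections: Detection[]): string[] {
  return detections
    .filter((d) => d.score >= CONFIDENCE_THRESHOLD)
    .map((d) => `${d.label} ${d.score.toFixed(2)}`);
}

// Made-up example: only the keyboard clears the 50% bar.
console.log(confidentLabels([
  { label: "keyboard", score: 0.87, bbox: [10, 20, 300, 120] },
  { label: "cell phone", score: 0.32, bbox: [400, 50, 80, 160] },
])); // -> ["keyboard 0.87"]
```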

Activity 2: Can You Break It?

The second model that Mission Control provided, Model 𝛽, is a simpler machine vision model. Instead of searching for separate objects within an image, Model 𝛽 tries to classify the image or video frame as a whole. Similar to COCO-SSD, Model 𝛽 reports how confident it is in its classification. The instructions for testing Model 𝛽 are similar to those for COCO-SSD:

  • Gather a few objects from the space around you, similar to the ones you used for the last activity. If you can find a few new objects, that’s great!
  • Create a data table to record your observations. The table needs to have one row for each of the objects that you gathered. If you have 5 objects to test with, there should be at least 5 rows in your table. You can use the table structure from the last activity for this.
  • Show each object to your webcam.
    1. What happens? Does the model react in any way?
    2. Does moving the object change anything? Try changing how close or far the object is from the camera and changing the orientation (e.g. whether it’s vertical or horizontal).
    3. The model’s guess for classification (i.e. the label for the image) and its confidence in its guess are reported at the bottom of the video window. Record the model’s guesses (classification and confidence).
    4. If the model’s guess does not match the object you are showing it (remember, this model labels the whole frame rather than drawing rectangles), write “Not recognized” or “Not detected”.
    5. Record any other information you wish to observe.

Mission Control refers to Model 𝛽 as “MobileNet” and notes that this model was trained using the ImageNet dataset. ImageNet is a popular dataset that contains millions of labelled images of all sorts of things, meaning our second model here is very extensively trained.
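
For facilitators who want to peek under the hood, here is a sketch of how MobileNet is typically used in the browser, assuming the TensorFlow.js build published as @tensorflow-models/mobilenet (the activity page may wrap it differently). Notice that, unlike COCO-SSD, it returns labels for the frame as a whole rather than boxes around individual objects.

```typescript
import "@tensorflow/tfjs"; // TensorFlow.js runtime the model package depends on
import * as mobilenet from "@tensorflow-models/mobilenet";

// Sketch of whole-frame classification, assuming the TensorFlow.js MobileNet model.
async function classifyFrame(video: HTMLVideoElement): Promise<void> {
  // Load the pre-trained MobileNet model (trained on the ImageNet dataset).
  const model = await mobilenet.load();

  // classify() looks at the whole frame and returns its top guesses,
  // each with a probability between 0 and 1.
  const predictions = await model.classify(video);
  for (const p of predictions) {
    console.log(p.className, `${(p.probability * 100).toFixed(1)}%`);
  }
}
```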

Reflection & Debrief

With your results and observations from the quick tests of both AI models in hand, { in small groups / as a large group } discuss the following questions:

  1. Did the AI models successfully recognize all of the objects that you showed them? How many of the objects did they successfully recognize?
  2. Why can the AI models recognize some objects, but not others?
  3. Did either of the AI models confidently (any confidence over 70%) label some objects that they were wrong about?
  4. Can you think of a reason why either of the AI models might have thought that the object that you showed it was something else?
  5. Do you think that either of these AI models could be used without additional training or other modifications to accomplish a hand shape classification task? How might either the model or the task be changed to suit the other?
  6. How can visual recognition AI programs like these be used in society? What would a bigger version of this program look like, and how could it help people?

Facilitator’s note: Many answers are possible here, though they should be backed up by evidence and observations from the data tables.

As you look through your notes from Mission Control to see what the next step is, you come across a section with a heading in all caps. It has one word that’s been double-underlined. It reads: TRAINING.

In the next activity, we will learn about model training and how to train our own image classifier using Google’s Teachable Machine platform. We can take the model we produce with Teachable Machine and upload it to DANN’s visual core so that we can continue with repairs.

Extensions & Modifications

How might you adapt the time, space, materials, group sizes, or instructions to make this activity more approachable or more challenging?

Extensions

  • Try finding larger or more complicated objects that the AI models might not recognize, and see what they classify those objects as. Is there a pattern in the models’ incorrect answers? Does either of the models have “default” answers that show up more often?
  • Try testing each model with multiple objects at once. Can it recognize them separately? Does it lump them together into another object? How does this affect the accuracy of the model?

Modifications

  • In a virtual environment, ensure each participant has a selection of objects to use for the classification activities.
  • If participants do not have reliable access to writing supplies, create a shared Google Doc with observation tables to fill out.

