This activity is a fun way to introduce machine learning (ML) and deep machine learning (DML) without requiring much background knowledge. Cats serve throughout as both the comic relief and the focus of study. Participants will learn about NODES, neural networks, and many other ML tools. The activity opens by having participants question how we really know what things are, and how a computer could learn to do the same. By its conclusion, participants will have had a fun introduction to skills that will influence digital literacy and employability in the years to come.
By Actua – 2018 for CS Ed Week
- Show the participants an image (preferably digital) of a cat. Ask the participants to identify what is in the image. After they identify the cat, ask them how they know it is a cat. Expand the questioning to cover the idea that the image of the cat is made up of many things:
- What aspects of the image show that it is a cat?
- How do YOU know it is a cat? Why do you know what a cat is? Have you seen one before?
- Would you be able to identify it as a cat if you had never seen a cat before?
- Advanced (if digital) – Isn't this cat really a representation of 1s and 0s? It is a digital image. Is there a series of 1s and 0s that represents what a cat should look like?
- Do your experiences identifying cats in the past help you to categorize this image as a cat now?
- Participants should be able to identify that an accumulated history of seeing cats (in various forms – images, video, real, stories, etc.) helps them to identify features of a cat and categorize them. The more experience with cats, the better they are at identifying cats.
- Ask the participants to identify the specific features that define what a cat looks like. Record the features that are identified, as these will be used in the next step. Some common suggestions are:
- Features should not be limited to this list, allow the participants to make a complete list.
Section 1: What is a cat?
- This is a very common type of problem that computer scientists and data scientists want to solve. It is extremely helpful if we can train a computer to identify, classify or cluster objects into categories. The process of setting up a computer to do this is called machine learning. Machine learning involves developing a network to analyze objects based on some parameters and letting it learn. In many cases, the goal is to have the machine learn how to categorize things itself (called Classification of Data), based on some examples of objects that have been pre-categorized (called a Supervised learning process). In simple terms – set up a network and show it what cats look like and what cats don’t look like. After a while, it will be able to tell you if the object you show it is a cat.
- Explain to the participants that they will be setting up a simple human powered machine learning network. This network will be a simplified neural network (a network made up of nodes, kind of like the human brain).
- Group the participants into small groups of 2-3 people. Allow the groups to take turns selecting an important feature (from those listed above) until each group has at least 1 feature and most of the features have been selected. As they select, ask them to consider which features they think are the most important for identification. There can be no duplicate features at this time. The network will need at least 6 NODES to run well, but more will improve how well it works.
- The feature (or features) that the groups have selected will be the NODE or NODES that they are responsible for during this activity. Hand out 1 sheet of blank paper for each feature. This paper represents everything that the NODE knows about cats. Have the groups write the feature in the middle of the paper, with a circle around it. Plenty of space should be left around the label so that information can be recorded on the paper.
- It is now time for the participants to complete a supervised learning process to train the NODE to recognize the feature that has been selected. Participants take the role of computer scientists, feeding the NODE information about what to look for in that feature and writing it down on the NODE sheet. Participants should use Google Images to search for the evidence to feed into the NODE. Don’t rely on memory – find images that provide evidence of the information you write down. For example, let’s say we were training the “Nose” NODE:
- Document what a cat nose looks like. Draw some examples, describe its shape, size & placement (relative to something else, like the cat’s eyes), what colour it is, etc. Include things that are obvious! Remember that the NODE has never seen a cat before, so you need to teach it everything it needs to know.
- Document what it doesn’t look like too. Build some parameters around what it cannot be. It shouldn’t have more than two nostrils, it shouldn’t be much bigger than the cat’s eye, it shouldn’t be lime green, etc.
- Give the participants 15-20 minutes to complete the supervised training of the NODE, doing image searches and writing down everything they can about the feature.
- Encourage collaboration between the groups – what strategies are each team using to train their NODE?
- Note that groups should take their time to write clearly, as others will be reading the information.
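For facilitators who want to show what a trained NODE sheet might look like to a computer, the "Nose" example above can be sketched in Python. This is only an illustration, not how a real neural network stores what it has learned: the function name `nose_node`, the dictionary keys, and the 1.5 size cut-off are all assumptions made up for this sketch. The three "cannot be" rules come from the Nose example.

```python
def nose_node(observation):
    """Decide whether an observed nose fits the rules on the Nose NODE sheet.

    `observation` is a hypothetical dictionary describing the nose seen in a
    picture, or None when no nose is visible. Returns True for "Cat" and
    False for "Not a Cat".
    """
    if observation is None:
        # The NODE has no other context: no visible nose means no cat.
        return False
    if observation["nostrils"] > 2:
        # "It shouldn't have more than two nostrils."
        return False
    if observation["colour"] == "lime green":
        # "It shouldn't be lime green."
        return False
    if observation["size_vs_eye"] > 1.5:
        # "It shouldn't be much bigger than the cat's eye."
        # The 1.5 cut-off is an assumed value for "much bigger".
        return False
    return True


typical_nose = {"nostrils": 2, "colour": "pink", "size_vs_eye": 0.8}
odd_nose = {"nostrils": 2, "colour": "lime green", "size_vs_eye": 0.8}

print(nose_node(typical_nose))
print(nose_node(odd_nose))
print(nose_node(None))
```

Each rule on the paper sheet becomes one check, which makes it easy to see that a NODE only knows what it was taught.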
Section 2: I know my cats!
- Now that we have trained NODES, it is time to see if our neural network is working as expected. Have the groups trade their NODE sheets to other groups so that every group has a new NODE. Give the groups a couple of minutes to familiarize themselves with their new NODES.
- Show images of different things (cats and not cats) and see how well the NODES work. Participants should take on the role of the NODE and only use the information on the sheet to determine if the NODE finds the feature it is looking for. NODES have no other context, so they base it only on what they have on their NODE papers (what they have been trained for).
- Example: If the picture is of a cat but its nose can’t be seen, the nose NODE should determine that the image is not a cat.
- Show an image and have the NODES call out one at a time if the image is of a cat. Tally the counts of “Cat” and “Not a Cat” on the whiteboard or chart paper. Whichever vote has the majority determines if the image is a cat or not.
- After categorizing each image, ask the participants whether a human would have categorized this image as a cat or not. Place a ✔ or ❌ depending on whether the network made the correct prediction. This is the real test of whether a neural network is working well – how well does it mimic human decision making?
- Continue to show additional images from the collection as desired.
- Discuss as a group the outcome of the test.
- How well did your machine learn what a cat is? Was your neural network effectively trained?
- What types of changes would you make to how you trained the NODES?
- What types of changes would you make to the network itself – other NODES, etc?
- Were there certain aspects that worked well or didn’t work well?
- As a Computer Scientist, what did you learn from this test?
- What about other items that use the term cat (CAT 5 cable, cat treats, etc.)? Should these be identified as cats? Why or why not?
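The majority vote used in this section can be written as a short Python sketch. The function name `network_vote`, the example NODE names, and the rule that a tie counts as "Not a Cat" are assumptions for illustration; the activity itself does not say how to break a tie.

```python
from collections import Counter


def network_vote(node_votes):
    """Return "Cat" or "Not a Cat" by simple majority of the NODE votes.

    `node_votes` maps a NODE name to True (the NODE found its feature)
    or False. A tie is reported as "Not a Cat" (an assumed tie rule).
    """
    tally = Counter("Cat" if found else "Not a Cat" for found in node_votes.values())
    if tally["Cat"] > tally["Not a Cat"]:
        return "Cat"
    return "Not a Cat"


# One picture, six NODES calling out their votes (example values only).
node_votes = {
    "nose": False,
    "eyes": True,
    "ears": True,
    "whiskers": True,
    "tail": False,
    "legs": True,
}

prediction = network_vote(node_votes)
human_label = "Cat"  # what a human says the picture shows
mark = "✔" if prediction == human_label else "❌"
print(prediction, mark)
```

Comparing `prediction` with `human_label` mirrors the ✔ or ❌ recorded on the whiteboard after each image.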
Section 3: Deep into cats!
- Computer Scientists use the term “Deep learning” to describe when additional layers of complexity are added to a neural network. This can be done in a number of ways:
- Having layers of NODES, where NODES will take information from other NODES and use that to help categorize. For example, the network could have the nose, eyes, mouth and ear NODES feed into a face NODE. This NODE could use information like where in the picture the parts were seen to better determine if a cat’s face is present.
- Weighting NODES can make results from one NODE more important than another. If the legs NODE sees 2 legs, that might be less important data than seeing eyes, ears and a nose that look like a cat.
- Allowing NODES to cluster results. This means that the NODES select a whole bunch of images that have similar features and group them together. These images can then be analysed using a different network to look for specific things.
- Commonly, undefined NODES can be used where the network itself is allowed to define what the NODES are looking for. Instead of training the neural network to look for a cat’s nose, you let the network learn to pick out its own important aspects. As the training images are processed, the NODES will be defined by the network.
- Ask the participants to consider how they could develop their neural network into a network that thinks deeper.
- What changes might the group make to how the network operates?
- Additional training to the NODES?
- Give the participants 10 minutes to improve their neural network.
- Once ready, select a few images from the previous test to re-test. Include at least one image that was properly identified and a number of images that were not. Again, keep a tally of the results on the whiteboard.
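Two of the ideas from this section, layering and weighting, can be combined in a small Python sketch. Everything specific here is an assumption chosen for illustration: the function names `face_node` and `weighted_vote`, the rule that a face needs at least 3 of its 4 parts, the weights, and the 0.5 threshold. A real deep learning network learns such values from data instead of having them written by hand.

```python
def face_node(found):
    """Second-layer NODE: report a face when at least 3 of 4 face parts were found."""
    parts = ["nose", "eyes", "mouth", "ears"]
    return sum(1 for part in parts if found.get(part, False)) >= 3


def weighted_vote(found, weights, threshold=0.5):
    """Return True ("Cat") when the weighted share of firing NODES reaches the threshold."""
    total = sum(weights.values())
    score = sum(weight for name, weight in weights.items() if found.get(name, False))
    return score / total >= threshold


# First layer: what each feature NODE reported for one picture (example values).
found = {"nose": True, "eyes": True, "mouth": False, "ears": True, "legs": False, "tail": False}

# Second layer: the face NODE takes its input from other NODES.
found["face"] = face_node(found)

# Weighting: the face NODE counts three times as much as legs or tail.
weights = {"face": 3.0, "legs": 1.0, "tail": 1.0}

print("Cat" if weighted_vote(found, weights) else "Not a Cat")
```

In this example only 3 of the 6 first-layer NODES found their feature, so an unweighted vote would be tied, while the layered and weighted version leans on the face evidence.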
Reflection & Debrief
- Discuss the final results from the test of the neural network:
- Was the network more effective at identifying cats?
- Were the improvements effective?
- What other changes might you want to make?
- In this activity, there has been a lot of human involvement in training the NODES. As described above, many machine learning processes actually take undefined NODES and train them to look for features themselves. This eliminates the time-consuming process of teaching the NODES.
- Other discussion points: Allow the participants some time to personally reflect on the question before discussing as a group:
- Consider how Google Images knows to look for cats when you type ‘cats’. Do you think that Google uses machine learning to categorize information?
- How is the machine learning process similar to human learning? How are human brains set up like a neural network? Are these networks trained in the same way?
- How might machine learning benefit your life? Could machine learning be harmful?
Extensions & Modifications
How might you adapt the time, space, materials, group sizes, or instructions to make this activity more approachable or more challenging?
- Ways to make this activity more challenging:
- Create 3 layers in the deep learning network. How might a 3rd layer be beneficial?
- What other information might a NODE want to output, besides cat/no cat? How can this be used to improve the deep learning of the network?
- Consider how a computer might process these images – The network does not “See” the image, rather it processes the image pixel by pixel examining the colour of the pixel relative to the pixels around it. What are some examples of information the NODES might be looking for in this case?
- Ways to make this activity more approachable:
- Create a few sample NODE pages for the exercise.
- Brainstorm information to include on the NODE pages as a whole group.
- Where internet access is limited – Print a few pages of Google Images search results for “cats”.
- Prompt groups with questions to improve their NODE construction:
- What if the cat is facing away from the camera?
- What if the cat is a cartoon?
- What if a dog is in the picture instead?
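For the extension about processing an image pixel by pixel, the following Python sketch shows one thing a NODE might compute: how much a pixel differs from the pixels around it. The function name `neighbour_contrast` and the tiny greyscale grid are assumptions for illustration; real networks learn many such comparisons at once and work with colour images.

```python
def neighbour_contrast(image, row, col):
    """Return a pixel's brightness minus the average brightness of its neighbours.

    `image` is a list of rows of greyscale values. Pixels on an edge or in a
    corner simply have fewer neighbours. A large positive or negative result
    means the pixel stands out, which can hint at an edge such as the outline
    of an ear.
    """
    height, width = len(image), len(image[0])
    neighbours = [
        image[r][c]
        for r in range(max(0, row - 1), min(height, row + 2))
        for c in range(max(0, col - 1), min(width, col + 2))
        if (r, c) != (row, col)
    ]
    if not neighbours:
        # A single-pixel image has nothing to compare against.
        return 0.0
    return image[row][col] - sum(neighbours) / len(neighbours)


# A 3x3 picture: one bright pixel surrounded by dark pixels.
image = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]

print(neighbour_contrast(image, 1, 1))  # the bright centre pixel
print(neighbour_contrast(image, 0, 0))  # a dark corner next to the bright pixel
```

Participants can compare the two printed values to discuss why the centre pixel "stands out" while the corner pixel looks darker than its surroundings.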