Using Machine Learning for Kids, participants will have the opportunity to add data and begin training a machine learning model. In this activity, we recommend using up to eight labels to organize the training data; an example would be Sci-Fi and Fantasy text (e.g. "The laser whizzed" vs. "The dragon bellowed"). From here, participants will add sample text under these labels before training the machine.
Once the model has been trained, participants will copy and paste the code stub into a text editor. There they will modify it slightly to make it more effective before inputting new lines of text and watching as the model classifies them under their labels.
The whole activity can be adapted with different text classification options, and it still works if you want to use a different number of labels. Keep in mind that the more data or labels you input, the longer the machine will take to train.
Developed by Actua’s Network Member Science Venture.
To Do in Advance
- To carry out this activity, you will need to create some accounts in order to access all the required resources. All of the accounts you will be asked to register for are free to use for the purposes of this lesson.
- You can follow the instructions, which can be downloaded from here, to set up your free Lite account on the IBM Cloud. Once you have completed this, you will have the API keys necessary to plug into MLK (Machine Learning for Kids).
- Set up five student accounts with different pairs of categories in each account. There is a button on the bottom of the student account management page where you can reset all student passwords to the same thing.
- Examples for labels that you can use in each account include:
- Sci-Fi vs. Fantasy, Positive vs. Negative movie reviews, Greeting vs. Farewell, Chatspeak vs. Formal
Opening Hook: Understanding Meaning in Voice
- For this activity, we will begin by introducing the participants to how machines and artificial intelligence can “think.” When you search for “dog” on Google, you’ll get pictures of dogs, but if you search for “sad,” you might get a picture of a sad dog. So how do Google and other search engines assign keywords to images? Today we’ll do a quick activity to see if we can figure it out!
- Make sure you’ve either printed out or written out the five sets of four keywords, and have the five corresponding images ready to showcase to the class.
- We will begin by recapping the information given in step one to the participants, before explaining that once participants see the four words, they will have two minutes to sketch out what they think the photo is going to be.
- Once two minutes are up, give the group some time to share their sketches with each other and have some laughs.
- Following this, ask some of the participants why they chose to draw what they did.
- Next, show the actual image to the class. Once participants have had thirty seconds to think about it, begin a conversation on why they believe the words correspond to the image, and whether they would have made a different choice than Google.
- Repeat this process for any number of keywords and photos. You are encouraged to create your own set of cards to suit your own group, but an example set can be found here.
- When a computer is trying to classify text based on factors such as emotional undertone, word choice, and grammar, it can often make choices that we wouldn’t necessarily think of making. This activity will highlight how not all images are tagged how we think they should be, yet we still get just what we want from Google when we search for something.
Section 1: Introduction to Coding with Python
- First, have the participants pick a model and log into the appropriate account. Have them add some lines of text to their training model (under train). Tell the participants to make sure they do the same number of examples in each category. If one category has more examples than the other, the computer will start to favour choosing the category with more examples.
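To make the balance requirement concrete for participants comfortable with Python, a small helper (hypothetical, not part of the MLK site) can check whether every label has the same number of training examples:

```python
from collections import Counter

def is_balanced(examples):
    """Return True if every label has the same number of training examples.

    `examples` is a list of (text, label) pairs, mirroring the training
    data participants enter on the MLK site.
    """
    counts = Counter(label for _, label in examples)
    return len(set(counts.values())) == 1

# One label has more examples than the other, so this set is unbalanced.
training = [
    ("The laser whizzed", "scifi"),
    ("The warp drive hummed", "scifi"),
    ("The dragon bellowed", "fantasy"),
]
print(is_balanced(training))  # prints False
```

An unbalanced set like the one above is exactly the situation that nudges the model toward the larger category.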
- Each time they run the test file, have them write down the average of all of the confidence levels and the amount of training data at the time of testing. This way, they can see the confidence level increase as the amount of data increases.
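The "write it down" step can also be done in a short script. This is an optional sketch; the confidence numbers below are made up for illustration:

```python
def average_confidence(confidences):
    # Average the confidence percentages from one round of test lines.
    return sum(confidences) / len(confidences)

# (training examples, confidence of each test line) after each round.
# These numbers are invented; participants would record their own.
log = [
    (10, [55, 62, 48]),
    (20, [68, 71, 64]),
    (30, [80, 77, 83]),
]
for examples, confidences in log:
    print(f"{examples} examples -> average confidence "
          f"{average_confidence(confidences):.1f}%")
```

Plotting or tabulating these averages makes the "more data, more confidence" trend easy to see.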
- After each category has at least five examples on each side, it is time to have one person from each account train the model, under “Learn and Test.” Be careful with how you do this: whenever the model is being trained, no one will be able to run any test lines. So if too many participants do this, or one participant does it too often, it will seem like the program is not working.
- Under Make, you will see the above screen. Have your participants copy and paste the Python code into Notepad++ or a similar text editor. Have them save it as “cat1cat2.py”, where cat1 and cat2 are replaced by the relevant categories, e.g. “scififantasy.py”. First, have them run the code with just “The text that you want to test” changed. Once they have a command line open in the correct directory, they can run the program by entering “python scififantasy.py”. The result will be the computer’s guess at which category the text belongs in, along with the confidence level.
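The exact stub MLK generates varies by project, but its overall shape can be sketched as below. Here classify() is a hypothetical stand-in for the generated code, which sends the text to your trained model using your project's API key:

```python
def classify(text):
    """Hypothetical stand-in for the generated MLK code, which sends `text`
    to your trained model and returns (best_label, confidence_percent)."""
    return ("scifi", 87)  # made-up result for illustration

# This mirrors the one line participants edit before their first run.
label, confidence = classify("The text that you want to test")
print(f"Guess: {label} ({confidence}% confidence)")
```

Keeping this shape in mind helps participants see that their later edits only change where the test text comes from, not how it is classified.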
Section 2: Helping the Computer Learn
- After they test some lines of text, this is an excellent time to check in with the group as a whole.
- What have they noticed with the test lines? Does the model classify the test lines as they would expect? Is the confidence high or low?
- Ask the group to return to adding training data for some fixed time (5 or 10 minutes at most). Then, ask the same participants from before to press the “train model” button under “Learn and Test.” Now, have them rerun their python programs to see how the confidence levels have changed.
Note that they only need to copy the Python code from the website once, after the first time the model is trained; the API key it contains is a link to your project, so the program will stay up to date each time the model is retrained.
Reflection & Debrief
Ask the participants how this changes the way they look at computers. How does this process benefit us, and is there a better way we could interpret text with an AI? Have the participants turn to an elbow buddy and talk for 30 seconds about what they would do differently if given another chance.
Extensions & Modifications
How might you adapt the time, space, materials, group sizes, or instructions to make this activity more approachable or more challenging?
- Instead of hardcoding in test lines each time, let’s have the program ask for input! For example, testLine = input("Enter test text: "), then replace “The text that you want to test” with testLine. Now when they run the program, it will prompt them for input.
- To further improve the program, put the bottom set of text into a loop so that the program continues to prompt for new test lines until an empty string is entered.
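The two modifications above (prompting for input, then looping until an empty string) can be sketched together. Here classify() is a placeholder for the classification code already in the generated stub, and read_line stands in for input() so the sketch runs on its own:

```python
def classify(text):
    # Placeholder for the model call already in the generated stub.
    return ("scifi", 87)

def run_tests(read_line, classify):
    """Keep prompting for test lines until an empty string is entered."""
    results = []
    while True:
        testLine = read_line("Enter test text (blank to stop): ")
        if testLine == "":
            break  # empty string ends the session
        results.append(classify(testLine))
    return results

# In the participants' script, read_line would simply be input;
# here we feed canned lines so the sketch runs without a keyboard.
canned = iter(["The laser whizzed", "The dragon bellowed", ""])
print(run_tests(lambda prompt: next(canned), classify))
```

In the participants' version, the call would just be run_tests(input, classify), with classify replaced by the stub's own code.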
- Test from file and further challenges: here’s the code snippet for opening the file, putting the lines into a list, removing the newlines/whitespace, then testing each line and printing the result to the console.

```python
testFile = open("scififantasyLines.txt", "r")
testLines = testFile.readlines()
for line in testLines:
    testLine = line.rstrip()  # needed to remove newline character
    # Replace the hardcoded test text in the stub with testLine here,
    # so each line from the file is classified and its result printed.
    print(testLine)  # placeholder until the classification code is pasted in
testFile.close()
```
- If you pre-train the models with the minimal amount of data, you can have the participants start with testing using the python program. This works well if you want to avoid multiple training cycles, and you still get the benefit of seeing the model improve after the participants add training data.
- Either have the participants come up with a set of test data, or have the instructors write a set of test data beforehand. For the sci-fi vs. fantasy example, I used scififantasyLines.txt to name the test file.
- If you want to avoid Python programming, you can have all of the participants try test lines on the “Learn and Test” page instead. The only issue here is making it clear that the “Train model” button should only be pressed by one person at specific times (once after there is the minimum amount of training data, and once after they have done another cycle of training).