Developing an Artificial Intelligence Model for Wildlife Image Classification

As many of you know, camera traps are a fantastic tool for monitoring wildlife. They capture a wealth of images that help us understand animal behaviour, population dynamics, and more. However, the sheer volume of images can be overwhelming, making it difficult to sort through and classify them efficiently. That’s where researchers from the DEEP (Dynamics of eco-evolutionary patterns) lab at the University of Tasmania and their Mega Efficient Wildlife Classification (MEWC) model come in. I’m excited to share with you a little about how it works, and what it means for our WildTracker citizen science program.

The Challenge

In the field of wildlife monitoring, we often face the daunting task of processing millions of images from camera traps. Each image needs to be examined to identify the species captured, which can be incredibly time-consuming and often requires expert knowledge. One of the reasons the TLC started WildTracker was to enlist the help of citizen scientists in managing the data, but we understand that the workload can also be overwhelming for landholders!

What is Computer Vision?

Before diving into the MEWC model, let’s talk about computer vision. This field of AI focuses on teaching computers to interpret and understand the visual world. By using algorithms and models, computers can analyse images, recognise patterns, and even identify objects – much like humans. This technology has many applications, from facial recognition systems to medical imaging, and now, wildlife monitoring.

The MEWC Model

The research team at UTAS leveraged cutting-edge computer vision techniques, inspired by the visual cortex in animals. These techniques involve deep learning, a subset of machine learning that has revolutionised computer vision. Convolutional Neural Networks (CNNs) are a kind of deep-learning algorithm designed specifically for processing structured grid data (like the pixels of an image), making them powerful for image analysis and enabling computers to learn from vast amounts of training data. These networks mimic the human brain’s visual processing, allowing for highly accurate object recognition, classification, and detection.
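
To make this a little more concrete, here’s a minimal sketch of what a small CNN classifier looks like in code, written with the Keras library. It illustrates the general idea only; it is not MEWC’s actual network, and the image size and number of species classes are made up.

```python
# A minimal CNN image classifier in Keras, illustrating the general
# architecture only (not MEWC's actual network).
from tensorflow.keras import layers, models

num_species = 10  # hypothetical number of species classes

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),        # an RGB image snip
    layers.Conv2D(32, 3, activation="relu"),  # learn local edge/texture filters
    layers.MaxPooling2D(),                    # downsample, keep strongest responses
    layers.Conv2D(64, 3, activation="relu"),  # learn more complex patterns
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_species, activation="softmax"),  # one probability per species
])
model.summary()
```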

At this point, I must apologise for the inevitable avalanche of tech terms and acronyms! There is a secret language among computer scientists that even I am still learning to understand.

Training the MEWC model involved feeding it millions of labelled camera trap images in which the species had been identified by experts. The model learns from these examples, improving its ability to classify new, unseen images accurately. This training process uses a technique called supervised learning, where the model iteratively adjusts its parameters to minimise prediction errors.
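
And here’s what that supervised training step might look like in code, continuing the toy Keras model from the sketch above. The dataset folders (one per species, filled with expert-labelled snips) are hypothetical.

```python
# Supervised learning in a nutshell: show the network labelled examples and
# let an optimiser nudge its weights to reduce prediction error. Continues
# the toy model from the previous sketch; dataset paths are hypothetical.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "camera_trap_snips/train",  # one folder per species, labelled by experts
    image_size=(224, 224),
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "camera_trap_snips/val",
    image_size=(224, 224),
)

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # penalises wrong species predictions
    metrics=["accuracy"],
)

# Each epoch is one full pass over the labelled images; accuracy on the
# held-out validation set shows how well the model generalises.
model.fit(train_ds, validation_data=val_ds, epochs=10)
```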

The MEWC workflow involves several steps: detection, snipping, prediction, and annotation.

Detection: After pre-processing the images (e.g. resizing), MEWC employs another machine learning model, the open-source ‘MegaDetector’ developed by Microsoft’s AI for Earth team. MegaDetector passes each image through a CNN, generating a heatmap that indicates regions of the image likely to contain animals, people, or vehicles. If nothing is detected, the image is placed in a ‘blank’ folder. Blank images submitted to WildTracker will eventually be moved to a cold cloud storage server, where they will still exist but won’t clog up our database. MegaDetector’s output depends on a confidence threshold, set to balance capturing real animals against ignoring rocks and lumps of wood.
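
To illustrate the thresholding step, here’s a small script that sorts images into ‘detections’ and ‘blank’ folders from a MegaDetector results file. It assumes the standard MegaDetector batch-output JSON format; the threshold value and folder names are just for the example.

```python
# Sort camera trap images using a MegaDetector results file. Assumes the
# standard batch-output JSON format; threshold and folders are illustrative.
import json
import shutil
from pathlib import Path

CONFIDENCE_THRESHOLD = 0.2  # below this, detections are treated as noise

with open("megadetector_output.json") as f:
    results = json.load(f)

for folder in ("detections", "blank"):
    Path(folder).mkdir(exist_ok=True)

for image in results["images"]:
    # Keep only detections the model is reasonably confident about
    confident = [d for d in image.get("detections", [])
                 if d["conf"] >= CONFIDENCE_THRESHOLD]
    dest = "detections" if confident else "blank"
    shutil.copy(image["file"], Path(dest) / Path(image["file"]).name)
```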

Snipping: Each detected animal is then cropped out of the larger image and resized once more (see the ‘snips’ to the right), ready to be classified.
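
In code, snipping boils down to a crop and a resize. The sketch below uses the Pillow imaging library and assumes MegaDetector-style bounding boxes given as [x, y, width, height] fractions of the image size; the file name and coordinates are invented.

```python
# Snipping: crop a detected animal from the full frame and resize it for the
# classifier. Assumes a relative bbox of [x_min, y_min, width, height].
from PIL import Image

def snip(image_path, bbox, size=(224, 224)):
    img = Image.open(image_path)
    w, h = img.size
    x, y, bw, bh = bbox  # fractions of image width/height from the detector
    crop = img.crop((int(x * w), int(y * h),
                     int((x + bw) * w), int((y + bh) * h)))
    return crop.resize(size)

# Hypothetical example: crop a box near the centre of the frame
snip("IMG_0001.JPG", bbox=[0.41, 0.52, 0.18, 0.22]).save("IMG_0001_snip.jpg")
```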

Prediction: MEWC offers a choice of CNN models to classify each image snip, predicting the species present. It assigns a probability score to each possible species and ranks them accordingly. For example, an object might be classified with 60% confidence as a pademelon, 30% confidence as a wallaby, 5% confidence as a bettong, and so on. The classifier’s accuracy depends greatly on the quality of the training data. Fortunately, MEWC has seen A LOT of Tasmanian animals in different postures, lighting conditions, and coat colours.
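
Under the hood, those confidence scores come from a ‘softmax’ step that turns the network’s raw outputs into probabilities that sum to one. Here’s a tiny sketch, with made-up species and numbers chosen to mirror the example above.

```python
# Turning raw network outputs (logits) into a ranked species list via softmax.
# Species names and logit values are illustrative only.
import numpy as np

species = ["pademelon", "wallaby", "bettong", "other"]
logits = np.array([2.49, 1.80, 0.0, 0.0])      # raw scores from the network
probs = np.exp(logits) / np.exp(logits).sum()  # softmax: probabilities sum to 1

for name, p in sorted(zip(species, probs), key=lambda t: -t[1]):
    print(f"{name}: {p:.0%}")  # pademelon: 60%, wallaby: 30%, ...
```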

Annotation: MEWC overwrites some of the less useful fields in the image metadata (e.g. focal length, ISO) with codes indicating species classifications and confidence levels. These fields can then be read by WildTracker and displayed online, along with bounding boxes around the detected objects, pulled from MegaDetector.
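
For the extra curious, here’s roughly what writing a classification into a photo’s metadata can look like, using the piexif library to overwrite the EXIF ImageDescription field. The choice of field and the ‘label:confidence’ format are assumptions for illustration; MEWC’s actual scheme may differ.

```python
# Write a species classification into a JPEG's EXIF metadata using piexif.
# The field (ImageDescription) and "label:confidence" format are assumptions.
import piexif

def annotate(image_path, label, confidence):
    exif = piexif.load(image_path)     # read the existing EXIF tags
    tag = f"{label}:{confidence:.2f}"  # e.g. "pademelon:0.60"
    exif["0th"][piexif.ImageIFD.ImageDescription] = tag.encode("ascii")
    piexif.insert(piexif.dump(exif), image_path)  # write tags back in place

annotate("IMG_0001.JPG", "pademelon", 0.60)
```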

As AI technology continues to evolve, the MEWC model and others like it will become even more powerful and accurate. By integrating AI into the WildTracker citizen science program, we have a unique opportunity to use human feedback to refine the models. WildTracker participants will be able to accept or reject the AI classification, or refer it to an expert (one of us ecologists at TLC) for validation.

If this blog wasn’t nerdy enough for you, the detailed architecture of MEWC and how it is being deployed can be found in this open access preprint.