2022-12-29
I finished the first pass over the data from my last trip on the 17th of December.
Prior to the trip I had trained my own binary classifier on 160 GB of mostly Pomona data. About 10% was from Secretary Island, and in total 4% of the audio was actual kiwi calls. It had very good validation statistics on the 20% of data held out from training to check results after each epoch. OpenSoundscape/PyTorch made very good use of compute once I put the data on the SSD instead of the hard drive: it completed an epoch, including validation, every hour, using 12 cores and big chunks of 100% GPU. It trained for 95 epochs, but I lost the best model due to a mistake I made and ended up using the best model from around 75 epochs; there was very little difference in their statistics, only in the decimal places.
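As a rough sketch, a training run like that looks something like this in the OpenSoundscape I was using (0.7.x era); the label file, the "kiwi" column name and the batch size are placeholders, not my actual setup:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from opensoundscape.torch.models.cnn import CNN

# Labels indexed by (file, start_time, end_time) with a 0/1 column per class
labels = pd.read_csv("labels.csv", index_col=[0, 1, 2])

# 20% held out; validation runs after every epoch
train_df, valid_df = train_test_split(labels, test_size=0.2, random_state=0)

# 5 s samples, matching the prediction chunks described below
model = CNN(architecture="resnet18", classes=["kiwi"], sample_duration=5.0)
model.train(
    train_df,
    valid_df,
    epochs=95,
    batch_size=128,    # placeholder; tune to the GPU
    num_workers=12,    # the 12 cores mentioned above
    save_path="model_runs/binary_kiwi",
)
```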
Predicting with the model was extremely impressive: it finished processing 800 GB of raw audio in hours, where AviaNZ would have taken a week running 24/7. I ran it over all my data, close to 6 TB, in a few days, less time than it would have taken me to process one trip's worth of audio using AviaNZ. The results are unprocessed except for the new data fetched in December. Time is short.
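For the record, batch prediction over a trip looks roughly like the sketch below; the directory layout and model path are placeholders, and the exact return value of predict() differs between OpenSoundscape versions:

```python
from glob import glob
from opensoundscape.torch.models.cnn import load_model

model = load_model("model_runs/binary_kiwi/best.model")  # placeholder path
files = sorted(glob("recordings/**/*.WAV", recursive=True))

output = model.predict(files, batch_size=128, num_workers=12)
# Older OpenSoundscape releases return a (scores, preds, unsafe_samples) tuple,
# newer ones return a single scores dataframe
scores = output[0] if isinstance(output, tuple) else output
scores.to_csv("predictions.csv")
```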
I overran my usage limits on Airtable, which I was using for the first pass over detections, so I found a very efficient workflow on my MacBook instead. I merge the spectrogram image and audio file into a video, all the details I need are in the file name, and I use tags in Finder to label files. The tag is written into the file metadata and I can later retrieve a long list of labels for each file. Working with Finder is extremely efficient: you can tag files with keyboard shortcuts, even tag multiple files at the same time, sort by tags, and so on. Finder is brilliant. The labels stick to the file because they are part of it.
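A minimal sketch of that workflow, assuming ffmpeg is on the PATH and macOS's mdls is available; the file names are placeholders and the tag parsing is deliberately rough:

```python
import subprocess

def make_review_video(spectrogram_png, audio_wav, out_mp4):
    """Burn a spectrogram image and its audio clip into one small video file."""
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-loop", "1", "-i", spectrogram_png,  # still image as the video track
            "-i", audio_wav,                       # audio track
            "-c:v", "libx264", "-tune", "stillimage",
            "-c:a", "aac", "-shortest",
            out_mp4,
        ],
        check=True,
    )

def finder_tags(path):
    """Read back the Finder tags (Spotlight kMDItemUserTags) on a file."""
    result = subprocess.run(
        ["mdls", "-raw", "-name", "kMDItemUserTags", path],
        capture_output=True, text=True, check=True,
    )
    raw = result.stdout.strip()
    if raw in ("", "(null)"):
        return []
    # mdls prints a parenthesised, comma-separated list; strip it down to names
    return [t.strip().strip('"') for t in raw.strip("()").split(",") if t.strip()]
```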
When predicting, OpenSoundscape looks at 5 s chunks of audio with a 2.5 s overlap. I get a long list of segments for each file, each assigned a 0 or 1. There is some machinery in there to get a binary result, and I may need to do some refining, but overall it is very good. I then take that list and exclude any detection that falls in the day, defined by civil twilight. I made something that chunks the rest up into actual calls: essentially I discard any detection that has no other detections nearby, but anything within 10 s of another detection gets merged into the same call. It works extremely well, no more half calls, except when they overflow a file during recording.
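A sketch of that post-processing, assuming the astral package for the civil twilight times and rough placeholder coordinates for the site:

```python
from astral import LocationInfo
from astral.sun import sun

# Approximate location; astral's default depression of 6 degrees means the
# "dawn"/"dusk" times below are civil twilight
SITE = LocationInfo(name="Pomona", region="NZ", timezone="Pacific/Auckland",
                    latitude=-45.5, longitude=167.7)

def is_night(timestamp):
    """True if a timezone-aware datetime falls between civil dusk and civil dawn."""
    s = sun(SITE.observer, date=timestamp.date(), tzinfo=timestamp.tzinfo)
    return timestamp <= s["dawn"] or timestamp >= s["dusk"]

def merge_into_calls(detection_starts, chunk=5.0, gap=10.0):
    """Group detection start times (seconds into the file) into calls.

    Detections within `gap` seconds of another detection are merged into one
    call; detections with no neighbour inside `gap` are discarded.
    """
    calls, current = [], []
    for t in sorted(detection_starts):
        if current and t - current[-1] <= gap:
            current.append(t)
        else:
            if len(current) > 1:                      # drop isolated one-off chunks
                calls.append((current[0], current[-1] + chunk))
            current = [t]
    if len(current) > 1:
        calls.append((current[0], current[-1] + chunk))
    return calls
```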
I have simplified my labels: for kiwi, all I now label are Male/Female, and I mark Close calls so I can find them. I plan to find duets algorithmically in future.
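I have not written that yet, but the idea is something like the speculative sketch below: treat a duet as a male call and a female call that overlap or fall within a short window of each other (the 20 s window and the (label, start, end) call tuples are assumptions):

```python
def find_duets(calls, window=20.0):
    """Pair up male and female calls that overlap or sit within `window` seconds."""
    males = [c for c in calls if c[0] == "Male"]
    females = [c for c in calls if c[0] == "Female"]
    duets = []
    for _, m_start, m_end in males:
        for _, f_start, f_end in females:
            # overlapping, or separated by no more than `window` seconds
            if m_start <= f_end + window and f_start <= m_end + window:
                duets.append((m_start, f_start))
    return duets
```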
I had a mind-boggling 10,716 detections to wade through:
11 files got a ?
7 Geese
22 Kaka, yes 22! C05, D03, F05, F09, H04, M04, S13T, T10.
5 Kea, D03 & D09
237 LTC (long-tailed cuckoo; I need to add them to the model, but it's doing well regardless, mostly distant LTC detected.)
Many creaking trees from J11; these also need to be added to the model.
Loud close frogs are no longer a problem, but distant frogs now are! Also need to add them to the model. The model pulls a lot of kiwi, even distant ones, out of the din of frogs at N20.
Only 22 morepork, my new model is right onto it.
A lot of dawn chorus, but also a lot of kiwi in the middle of it. Dawn chorus also needs to be added to the model. Kiwi are active in the early mornings on Pomona, all the time.
This model detects a lot more calls than I am used to. Previously I had detected a total of around 5,400 calls on Pomona. I must have been missing many, many calls. I will find them.
First I need to reduce my false positives.
The Plan:
Finish labelling this data
Construct a new dataset
Train a new model; this one will likely classify Male/Female/NotKiwi (see the sketch after this list)
Go through all my data with hopefully fewer false positives.
Train another new model to use on the next tranche of fresh data.
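The Male/Female/NotKiwi model would only need a small change to the training setup, something like this hedged sketch (the label file is a placeholder):

```python
import pandas as pd
from opensoundscape.torch.models.cnn import CNN

# Labels indexed by (file, start_time, end_time) with Male/Female/NotKiwi columns
labels = pd.read_csv("labels_mf.csv", index_col=[0, 1, 2])

model = CNN(
    architecture="resnet18",
    classes=["Male", "Female", "NotKiwi"],
    sample_duration=5.0,
    single_target=True,   # each 5 s chunk gets exactly one of the three labels
)
```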
I am aiming to label calls automatically in future and just weed out a few exceptions by hand. Or that is what my goal is, anyway.
In future my skraak notebook will need to be rewritten to accommodate a simpler, more automated labelling scheme.
Video
Audio