Abstract
Many bioacoustic signals consist of a sequence of discrete stereotyped sounds occurring in repeated patterns. A natural question to ask is how best to characterize the underlying structure of the source producing the sequence of sounds. The structure of the source manifests itself as constraints on the patterns observed in the sequence of sounds. These constraints determine how predictable the order of the sounds is. The information entropy of a discrete symbol sequence is a quantitative measure of how unpredictable the sequence is. A straightforward but biased technique for estimating the entropy of an unknown source is to substitute observed symbol frequencies into parametric models such as Markov models. More general nonparametric entropy estimators exploit the relationship between the entropy and the average length of matching patterns within the sequence. This nonparametric entropy estimate forms an upper bound on the amount of information conveyed by the sequence of sounds. Additionally, comparing entropy estimates from the parametric and nonparametric models provides a hypothesis test determining whether the parametric model sufficiently captures the constraints of the source. These techniques are illustrated in analyses of humpback whale songs and leopard seal calling bouts.