The relationship between target strength frequency response and vertical swim velocity: a new approach for fish discrimination

In-situ identification of fish species using acoustic methods is a key issue for fisheries research and ecological applications. We propose a novel approach to fish discrimination based on the relationship between target strength frequency response (TS(f)) and vertical swim velocity (VSV), as a proxy of fish body orientation. The measurements were carried out with a wideband echosounder on live fish of five species confined in a net cage. The data show a large dependence of TS(f) on VSV. To compare the variability of frequency responses of different fishes, we calculated ΔTS(f, VSV) as the difference between the TS(f) at given VSV and the TS(f) at VSV = 0, i.e. when the fish was swimming horizontally. We demonstrated that the relationships between ΔTS and VSV were similar for fish of the same species but dissimilar for different species. This implies that the acoustic fish discrimination in nature might be performed when the variations of the VSV can be measured from acoustically tracked fish. This can be a promising method for remote fish discrimination, for instance, for fish with diurnal vertical migrations. Further validation of this approach for fish recognition is required.


SM1. Single target and fish track detection
We used the software Echoview 9 (Echoview Software Pty. Ltd.) to obtain the echoes of single targets and the tracks of the studied fish from the pulse-compressed wideband backscattering strength data. At each time step, the single target detection function of Echoview located the peaks in the vertical profile of wideband backscattering strength, applying criteria for excluding overlapping targets that are based on the conventional method for narrowband data (Ona and Barange, 1999). The detected peaks at each time step were defined as echoes from single targets. From the detected single targets, the fish tracks were estimated with the fish track detection function of Echoview (Blackman, 1986; Bertsekas, 1990). Single targets that were detected sequentially without noticeable jumps in depth, time, or position in the beam were defined as a track of the same object (fish).
The fish track detection function was run with the parameters summarized in Table S1. We selected the "4D" algorithm (using range, major/minor-axis angles, and time) of the fish track detection function. More details about the algorithms and parameters of the function are described on the website of Echoview (Echoview Software Pty. Ltd., 2020).

In addition to the parameters shown in Table S1, the acceptance criteria for detected tracks were as follows: minimum number of single targets in a track = 3, minimum number of pings in a track = 3, and maximum gap between single targets = 5 pings.
False track detections caused by unwanted targets (e.g., the cage wireframe in this study) were excluded manually.
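The acceptance criteria above can be illustrated with a minimal Python sketch. It is not Echoview's implementation: the function name `accept_track`, the representation of a track as a list of ping numbers, and the exact definitions of "track length in pings" (first to last ping, inclusive) and "gap" (number of missing pings between consecutive detections) are assumptions made for illustration only.

```python
from typing import List

# Acceptance thresholds quoted in the text.
MIN_TARGETS = 3      # minimum number of single targets in a track
MIN_PINGS = 3        # minimum number of pings in a track
MAX_GAP_PINGS = 5    # maximum gap between single targets, in pings


def accept_track(ping_indices: List[int]) -> bool:
    """Return True if a candidate track meets the three acceptance criteria.

    ping_indices holds the ping number of each single target in the track
    (a hypothetical representation, not Echoview's internal format).
    """
    if len(ping_indices) < MIN_TARGETS:
        return False
    pings = sorted(ping_indices)
    # Assumed definition: track length counts pings from first to last detection.
    if pings[-1] - pings[0] + 1 < MIN_PINGS:
        return False
    # Assumed definition: a gap is the number of pings without a detection
    # between two consecutive single targets.
    gaps = [later - earlier - 1 for earlier, later in zip(pings, pings[1:])]
    return all(gap <= MAX_GAP_PINGS for gap in gaps)
```

For example, `accept_track([10, 11, 12])` passes all three criteria, whereas `accept_track([1, 2, 9])` is rejected because six pings are missing between the second and third detections.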

SM2. Calculation of frequency response
The frequency response of target strength, TS(f), was calculated in Echoview following the method suggested by Demer et al. (2017).
The signal received by the echosounder was pulse compressed to improve time-domain resolution and signal-to-noise ratio (e.g., Chu and Stanton, 1998). First, Echoview defined a short segment of the time series of the received signal, corresponding to a section of the vertical profile of backscattering strength centered on a single-target echo. Next, the short segment was transformed into the frequency spectrum of the received signal by the fast Fourier transform (FFT) and normalized by the frequency spectrum of the transmitted signal. The received signal was padded with zeros to increase the number of data points for the FFT calculation to 1024 and thereby improve the frequency resolution of the resulting spectrum. The frequency spectrum of the received signal was converted into TS(f) by using the impedances of the receiver and transducer, the frequency-dependent calibration coefficients, and compensations such as the range compensation of 40 log R (where R is range), absorption, and beam compensation, which are commonly used for the conversion from narrowband signal to target strength (e.g., Simmonds and MacLennan, 2005).

The frequency resolution of the resulting TS(f) is fs/N, where fs is the sampling rate of the received signal after decimation (Demer et al., 2017) and N is the FFT length in data points. In this study, fs was 62.5 kHz and N was 1024; thus the frequency resolution of TS(f) was ~0.06 kHz.
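The arithmetic behind this figure can be checked with a short NumPy sketch. It only illustrates zero-padding a segment to N = 1024 points and the resulting spacing fs/N; the function name `zero_padded_spectrum` and the use of `numpy.fft` are assumptions for illustration and do not represent Echoview's processing chain.

```python
import numpy as np

FS_HZ = 62.5e3   # sampling rate after decimation, from the text
N_FFT = 1024     # zero-padded FFT length, from the text


def zero_padded_spectrum(segment, n_fft=N_FFT):
    """FFT of a short segment, zero-padded to n_fft points (hypothetical helper)."""
    segment = np.asarray(segment, dtype=complex)
    # np.fft.fft appends trailing zeros when n exceeds the segment length.
    return np.fft.fft(segment, n=n_fft)


# Spacing between adjacent frequency samples of the spectrum: fs / N.
bin_spacing_hz = FS_HZ / N_FFT                 # 61.03515625 Hz, i.e. ~0.06 kHz
freqs_hz = np.fft.fftfreq(N_FFT, d=1.0 / FS_HZ)
```

Here `bin_spacing_hz` equals `freqs_hz[1]`, the spacing of the frequency samples at which TS(f) is evaluated. Zero-padding refines this sampling grid; it does not add information beyond what the original segment contains.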

SM3. TS(f) quality control
The target strength frequency response, TS(f), was calculated from the wideband acoustic [...] (Fig. S3). In addition, atypical TS(f) curves with irregular shapes that were not detected by the TS standard deviation were removed manually.
[Fig. S3 caption, partially recovered: (see Table 1 in the main text for species abbreviations). For each studied fish, the 90th percentile was used as the threshold to exclude the TS(f)s with exceptionally large TS variations.]
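A percentile-based exclusion of this kind can be sketched as follows. The original text is incomplete here, so several points are assumptions made only for illustration: that the TS standard deviation is taken across frequency within each TS(f) curve, that it is computed on dB values, and that the threshold is applied to one fish at a time. The function name `keep_by_ts_spread` is hypothetical.

```python
import numpy as np


def keep_by_ts_spread(ts_f, percentile=90.0):
    """Flag TS(f) curves whose spread does not exceed a percentile threshold.

    ts_f: 2-D array for one fish, one row per single-target echo and one
    column per frequency, in dB (assumed layout).
    Returns a boolean mask of curves to keep and the threshold used.
    """
    ts_f = np.asarray(ts_f, dtype=float)
    # Assumed metric: standard deviation of TS across frequency for each curve.
    spread = ts_f.std(axis=1)
    threshold = np.percentile(spread, percentile)
    return spread <= threshold, threshold
```

Applied per fish, the returned mask removes roughly the 10% of curves with the largest spread; the manual removal of irregular curves described above is not reproduced by this sketch.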
In addition, all single-target echoes located close (< 5 cm) to the top or bottom of the cage, where fish changed their swimming direction (flipped), were excluded from the analyses, because they could impair the consistency between fish vertical swim velocity (VSV) and fish body orientation in a cage of confined dimensions. For the same reason, we used only the single-target echoes with a small variance of the VSV over the 5 seconds centered at each time step. This allowed us to select echoes recorded while the fish was swimming at a stable velocity. The upper threshold of the VSV variance over 5 seconds was set to 1.3 × 10⁻³ m² s⁻², which corresponds to the 95th percentile of the 5-second VSV variance measured from all studied specimens (Fig. S4).
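The two exclusion rules in this paragraph can be expressed as a minimal NumPy sketch. The function name `stable_echo_mask`, the array inputs, the depth convention (positive downward, so the cage top is shallower than the bottom), and the use of the population variance are assumptions for illustration; only the 5 cm margin, the 5 s window, and the variance threshold come from the text.

```python
import numpy as np

VSV_VAR_MAX = 1.3e-3    # m^2 s^-2, upper variance threshold from the text
EDGE_MARGIN_M = 0.05    # echoes closer than 5 cm to cage top/bottom are excluded
WINDOW_S = 5.0          # window length centered at each time step


def stable_echo_mask(t, depth, vsv, cage_top, cage_bottom):
    """Return a boolean mask of single-target echoes to keep.

    t, depth, vsv: time (s), depth (m, positive downward) and vertical swim
    velocity (m/s) of each echo in one track (assumed inputs).
    """
    t = np.asarray(t, dtype=float)
    depth = np.asarray(depth, dtype=float)
    vsv = np.asarray(vsv, dtype=float)
    keep = np.zeros(t.size, dtype=bool)
    for i in range(t.size):
        # Echoes within +/- 2.5 s of the current one form the 5 s window.
        in_window = np.abs(t - t[i]) <= WINDOW_S / 2
        variance_ok = np.var(vsv[in_window]) <= VSV_VAR_MAX
        away_from_edges = (depth[i] - cage_top >= EDGE_MARGIN_M
                           and cage_bottom - depth[i] >= EDGE_MARGIN_M)
        keep[i] = variance_ok and away_from_edges
    return keep
```

An echo is kept only if both conditions hold, so a single abrupt change in VSV removes every echo whose 5 s window contains it.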

SM4. Frequency responses of various fishes
Large variability of TS at given f (e.g. at 70 kHz) and TS(f) shapes were observed for all studied fishes (Fig. S5). The variations of TS(f) were associated mainly with VSV of fish.
The variations of TS(70) were always >10 dB and in some cases (e.g. Mc and Tz) exceeded 20 dB.
[Fig. S5 caption, partially recovered: (see Table 1 in the main text for species abbreviations). The data show large variations in TS at each f. Color bar indicates the fish vertical swim velocity for each TS(f).]

SM5. TS and fish position in the cage and acoustic beam
To examine whether the TS of the studied fishes depended on their position in the cage and in the acoustic beam, TS at 70 kHz was plotted against fish depth and the major/minor-axis angles in the acoustic beam (Fig. S6). The relationships between TS and position did not reveal any patterns or trends for any of the fishes. This indicates that the observed TS changes could not be associated with fish location in the cage and supports our conclusion that the TS changes were dependent on the VSV changes.
[...] show close similarities in patterns at various frequencies.

SM7. Near-field effect
The field close to a transducer or backscatterer is termed the near field, where the relationship between acoustic intensity and range is more complicated than outside the near field (e.g., Simmonds and MacLennan, 2005). The near-field range of the transducer, Rnf, is calculated from λ, the acoustic wavelength, and dt, the largest distance across the active elements of the transducer (Demer et al., 2015). The dt is approximated from θ−3 dB, the nominal half-power (3-dB) beamwidth, and k = 2π/λ, the acoustic wavenumber (Demer et al., 2015). Demer et al. (2015) recommended measuring the backscattering strength at ranges > 3Rnf, where the near-field effect can be considered negligible.