Went back to the #HaarWavelets stuff and found a way to control the rhythmicality of the noise by scaling the variances of the decorrelated Gaussian random numbers that seed it. With sigma = 1 it's rhythmic; smaller is more uniform. Maybe I could control it per octave instead of all octaves together in unison.
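As a sketch, per-octave control could look something like this, assuming the seed Gaussians are the detail coefficients of an inverse Haar transform (the function name and structure here are illustrative, not the actual code):

```python
import numpy as np

rng = np.random.default_rng()

def haar_noise(octaves, sigmas):
    """Inverse Haar transform of Gaussian seeds, one sigma per octave."""
    approx = rng.standard_normal(1)  # coarsest approximation coefficient
    for j in range(octaves):
        # a decorrelated Gaussian detail band per octave, scaled by its sigma
        detail = sigmas[j] * rng.standard_normal(approx.size)
        # inverse Haar step: orthonormal sum/difference upsampling
        up = np.empty(2 * approx.size)
        up[0::2] = (approx + detail) / np.sqrt(2)
        up[1::2] = (approx - detail) / np.sqrt(2)
        approx = up
    return approx

# sigmas = 1 everywhere ~ the rhythmic case; smaller flattens it out
samples = haar_noise(10, np.ones(10))
```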
Now I'm thinking of combining this with the FFT-based stuff: X would be the 10D energy-per-octave analysis (real; 11D with DC, but I think it's best to exclude DC for this) of the same audio whose 128D (complex, excluding DC) FFT analysis is Y. Then I can simulate X with #VectorAutoRegression like the attached (without generating audio) and feed that into the #VARMA to get Y for audio output.
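Roughly the shape of that pipeline, with toy stand-ins for the fitted parameters (all names and dimensions here are illustrative):

```python
import numpy as np

rng = np.random.default_rng()

def var_step(As, hist):
    """One VAR step: y_t = sum_i A_i @ y_{t-i}; hist is oldest-to-newest."""
    return sum(A @ hist[-i] for i, A in enumerate(As, start=1))

def varma_step(As, Bs, y_hist, x_hist):
    """One VARMA step: adds the driving terms sum_i B_i @ x_{t-i},
    with the current x as the last entry of x_hist."""
    y = var_step(As, y_hist)
    return y + sum(B @ x_hist[-(i + 1)] for i, B in enumerate(Bs))

# toy stand-ins for fitted parameters: 2 lags, 10-D X driving a 4-D complex Y
p, q, Dx, Dy = 2, 1, 10, 4
A_x = [0.3 * np.eye(Dx) for _ in range(p)]
A_y = [0.3 * np.eye(Dy, dtype=complex) for _ in range(p)]
B_y = [0.1 * (rng.standard_normal((Dy, Dx)) + 1j * rng.standard_normal((Dy, Dx)))
       for _ in range(q + 1)]

x_hist = [rng.standard_normal(Dx) for _ in range(p)]
y_hist = [np.zeros(Dy, dtype=complex) for _ in range(p)]
for _ in range(64):
    # simulate the octave-energy series with a small innovation...
    x_hist.append(var_step(A_x, x_hist) + 0.1 * rng.standard_normal(Dx))
    # ...and let it drive the spectral VARMA
    y_hist.append(varma_step(A_y, B_y, y_hist, x_hist))
```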
#HaarWavelets #VectorAutoRegression #VARMA
#VectorAutoRegression is a pure feedback model, similar to poles in the pole-zero representation of z-transform filters. Adding zeros to the model is quite simple:
$$ y_t = \sum_{i=1}^p A_i y_{t - i} + \sum_{i=0}^q B_i x_{t-i}$$
and then add the $x$ vectors and $B$ matrices to the least squares stuff. I think the jargon for this is #VARMA, for Vector Auto-Regressive Moving Average (though with $x$ as an external input it's arguably closer to VARX).
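A sketch of how the regression extends, assuming aligned $y$ and $x$ series (the function name and array layout are mine):

```python
import numpy as np

def fit_varma(ys, xs, p, q):
    """Least-squares fit of y_t = sum_{i=1..p} A_i y_{t-i} + sum_{i=0..q} B_i x_{t-i}.
    ys: (T, D) outputs, xs: (T, E) inputs, aligned in time."""
    T, D = ys.shape
    E = xs.shape[1]
    m = max(p, q)
    # one regression row per predicted sample: the p lagged ys, then x_t .. x_{t-q}
    rows = [np.concatenate([ys[t - i] for i in range(1, p + 1)] +
                           [xs[t - i] for i in range(0, q + 1)])
            for t in range(m, T)]
    X = np.array(rows)
    Y = ys[m:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # coef stacks the transposed parameter blocks; split them back out
    As = [coef[i * D:(i + 1) * D].T for i in range(p)]
    Bs = [coef[p * D + i * E: p * D + (i + 1) * E].T for i in range(q + 1)]
    return As, Bs
```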
I worked it through and coded up most of it before realizing a #FatalFlaw: I need $x$ input data corresponding to the $y$ output data to run the regression that estimates the $A$ and $B$ parameter matrices, but so far I've just been using WAV files as the $y$ series, which means I don't have any $x$.
#VectorAutoRegression #VARMA #fatalflaw
The #VectorAutoRegression stuff I've been doing can be summarized as
$$ y_t = \sum_{i=1}^p A_i y_{t - i} $$
where each $y_t$ is a $D$-vector and each $A_i$ is a $D \times D$ matrix.
Given an input series of $y$ values, the $A_i$ can be calculated by #LeastSquares minimization:
$$
Y = [ y_p ; y_{p+1} ; \ldots ; y_{T - 1} ] \\
X = [ y_{p-1}, y_{p-2}, \ldots, y_0 ; y_p, y_{p-1}, \ldots, y_1 ; \ldots ; y_{T-2}, y_{T-3}, \ldots, y_{T-1-p} ] \\
A = \left(X^T X\right)^{-1} X^T Y
$$
where the matrices have these initial dimensions:
$Y$ : $(T - p) \times D$
$X$ : $(T - p) \times (pD)$
$A$ : $(pD) \times D$
then reshape $A$ to $p \times (D \times D)$, i.e. $p$ matrices of $D \times D$
In my case all values are complex (and $^T$ is conjugate transpose) because each $y$ is an FFT of a block of input data; I'm using FFT size $256$ (making $D = 256/2 + 1 = 129$ unique bins for real input), overlapped 4x.
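A sketch of that construction (assuming the frames are stacked row by row; `np.linalg.lstsq` computes the same minimizer as $\left(X^T X\right)^{-1} X^T Y$ without forming it explicitly, complex conjugation included):

```python
import numpy as np

def fit_var(ys, p):
    """Least-squares VAR(p) fit; ys is a (T, D) array of (complex) FFT frames."""
    T, D = ys.shape
    Y = ys[p:]                                      # (T-p) x D
    X = np.array([ys[t - p:t][::-1].reshape(-1)     # lags newest-first
                  for t in range(p, T)])            # (T-p) x (p*D)
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)       # (p*D) x D
    # reshape to p matrices of D x D, transposed back to the y_t = A_i @ y convention
    return A.reshape(p, D, D).transpose(0, 2, 1)
```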
#VectorAutoRegression #LeastSquares
More research on #VectorAutoRegression got me digging into #SpectralRadius, which is the magnitude of the largest-magnitude #Eigenvalue. It's analogous to the pole radius in the regular single-variable #ZTransform representation: if it's less than 1 all should be fine; bigger than 1 and it becomes unstable.
So I'm now dividing all the feedback coefficient matrices by the largest spectral radius among them (and then a bit more), so that the new largest radius is below 1, and it seems stable even in the presence of morphing.
The attached is heavily dynamics-compressed, as it was a bit peaky otherwise.
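The normalization is roughly this, with the exact margin being the "bit more" (the value here is illustrative):

```python
import numpy as np

def spectral_radius(M):
    """Magnitude of the largest-magnitude eigenvalue."""
    return np.abs(np.linalg.eigvals(M)).max()

def normalize_feedback(As, target=0.98):
    """Scale all A_i by the same factor so the largest spectral
    radius among them comes out just under 1."""
    worst = max(spectral_radius(A) for A in As)
    scale = target / worst if worst > 0 else 1.0
    return [A * scale for A in As]
```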
#VectorAutoRegression #SpectralRadius #eigenvalue #ZTransform
I think I got interpolation between feedback coefficient matrices working (thanks to SciPy, which has polar decomposition, log and exp for complex matrices) but it blew up almost immediately, so I'm re-adding the normalization code in the hopes of getting some output that's more than a brief click followed by NaNs.
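The interpolation is roughly this shape, one way to combine those SciPy pieces (it assumes the positive factors are nonsingular so `logm` behaves):

```python
import numpy as np
from scipy.linalg import polar, logm, expm

def interp_feedback(A0, A1, t):
    """Interpolate between two complex matrices via polar decomposition:
    geodesic on the unitary factor, log-linear on the positive factor."""
    U0, P0 = polar(A0)
    U1, P1 = polar(A1)
    U = U0 @ expm(t * logm(U0.conj().T @ U1))    # unitary part
    P = expm((1 - t) * logm(P0) + t * logm(P1))  # positive part (needs nonsingular P)
    return U @ P
```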
Trying to make morphing drones modulated by a third signal or something, not sure what it'll turn into...