short-time Fourier transform (STFT)

The short-time Fourier transform (STFT) is a form of Fourier analysis that estimates the frequency content of a signal over short, fixed-length time intervals. It is based on the conventional Fourier transform, but it divides the signal into overlapping segments, called [[frame]]s, and performs a Fourier analysis on each frame. The result is a two-dimensional representation of the signal, with frequency on one axis and time on the other. Varying the size and position of the frames gives different resolutions for analyzing different aspects of the signal: a longer frame gives finer frequency resolution at the cost of coarser time resolution.

The signal segmentation, called [[windowing]], is performed by multiplying the signal by a window function $w(n)$ that is zero outside a specified interval. An example is the rectangular window:

$ w_{r}(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} $

The [[finite-duration discrete-time signal]] resulting from the multiplication is called a **frame**, and the [[window length]] $N$ is also called the frame size:

$ x_{m}(n) = x(m+n) w(n) $

where $x_m(n)$ is a frame containing $N$ samples of $x(n)$ starting at sample $m$.

The short-time Fourier transform $X(n, \omega)$ is a two-dimensional representation of the signal $x(n)$, obtained by taking the Fourier transform of the frame that starts at sample $n$:

$ X(n, \omega) = \sum_{l=0}^{N-1} x(n+l)w(l) e^{-j\omega l} $

If we use the [[discrete Fourier transform (DFT)]], the frequency axis is sampled at $\omega = 2\pi k/N$:

$ X(n,k) = \sum_{l=0}^{N-1}x(n+l)w(l) e^{-j \frac{2\pi k}{N}l},\ 0\le k \le N-1 $

Given the averaging effect of the window, there is usually no need to compute the STFT at every time sample. To prevent discontinuities, it is also convenient for consecutive frames to overlap. Each frame therefore begins $M$ samples after the beginning of the previous one, with $M < N$. $M$ is referred to as the [[hop length]] and is often defined as a fraction of the [[window length]] $N$. For example, $M = N/4$ gives 75% overlap between consecutive frames.
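
The definitions above can be tried out with a minimal sketch in plain Python. It evaluates the DFT form of the STFT directly from the sum, one frame every `hop` samples, using the rectangular window. The function names `rectangular_window` and `stft` are illustrative and do not come from any library. The sketch also assumes that frames which do not fit entirely inside the signal are skipped, a choice the text above does not specify. The direct sum costs $O(N^2)$ per frame, so this is meant for checking the formula on small inputs, not for real workloads.

```python
import cmath
import math


def rectangular_window(N):
    """Rectangular window w_r(n): 1 for 0 <= n <= N-1 (zero elsewhere)."""
    return [1.0] * N


def stft(x, window, hop):
    """Return a list of frames; each frame is a list of N complex DFT bins.

    Frame i starts at sample n = i * hop and holds
    X(n, k) = sum_{l=0}^{N-1} x(n+l) w(l) exp(-j 2 pi k l / N).
    Assumption: only frames that fit entirely inside x are computed.
    """
    N = len(window)
    frames = []
    for n in range(0, len(x) - N + 1, hop):
        # Windowed frame x_n(l) = x(n + l) * w(l)
        segment = [x[n + l] * window[l] for l in range(N)]
        # Direct evaluation of the DFT sum for each bin k
        spectrum = [
            sum(segment[l] * cmath.exp(-2j * math.pi * k * l / N) for l in range(N))
            for k in range(N)
        ]
        frames.append(spectrum)
    return frames


# Usage: a cosine with exactly 2 cycles per 8-sample frame.
N, M = 8, 2  # window length and hop length (M = N/4)
x = [math.cos(2 * math.pi * 2 * n / N) for n in range(32)]
X = stft(x, rectangular_window(N), M)
magnitudes = [abs(v) for v in X[0]]  # energy concentrates in bins k = 2 and k = N - 2
```

Because the test signal is periodic in the frame length, every frame shows the same magnitude spectrum: the hop only shifts the phase of the bins. With a signal whose frequency content changes over time, the frames would differ, which is the information the STFT is meant to expose.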