Context
I'm working on an audio classification problem and I want to recreate, in grayscale, the spectrogram I get from librosa's built-in plotting.
The reason for doing this is to create images to pass to a neural network. Doing it with Matplotlib is too slow, since it is designed for creating figures, not images.
I have scaled the amplitude using amplitude_to_db(), but the frequency axis still needs to be scaled. With the built-in display.specshow() and y_axis='log', I get the result I want.
Question
How can I apply an equivalent operation to my spectrogram so the Y axis of my image looks like the one provided by librosa? Consider comparing librosa's spectrogram example and mine.
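import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

def get_spectrogram_from_wav(wav: np.ndarray, sample_rate: int) -> np.ndarray:
    # Magnitude STFT, converted to dB relative to the peak value
    spec = np.abs(librosa.stft(wav))
    spec_db = librosa.amplitude_to_db(spec, ref=np.max)
    # log_spec = np.log10(spec_db)
    return spec_db

def plot_slice(wav: np.ndarray):
    # Reference plot: librosa's own display with a log-scaled frequency axis
    spec = np.abs(librosa.stft(wav))
    plt.figure()
    librosa.display.specshow(
        librosa.amplitude_to_db(spec, ref=np.max),
        x_axis='time', y_axis='log'
    )
    plt.title('Power spectrogram')
    plt.show()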
I believe the right way to do this per Dorian's answer is to create a numpy meshgrid using np.logspace for the Y axis. I'm still not sure what the next step should be, but this is a start.
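One possible next step, as a rough sketch rather than a confirmed solution: instead of building a meshgrid, resample the rows of spec_db at log-spaced frequencies with plain linear interpolation. The function name to_log_frequency_image and the parameters height and f_min are made up for illustration, and np.geomspace is used in place of np.logspace only because it takes the endpoint frequencies directly. It assumes the rows of spec_db are evenly spaced from 0 Hz to sample_rate / 2, as they are for an STFT. The output may not match specshow's axis exactly, since its scaling details are not reproduced here.

```python
import numpy as np


def to_log_frequency_image(spec_db: np.ndarray, sample_rate: int,
                           height: int = 256, f_min: float = 20.0) -> np.ndarray:
    """Resample a linear-frequency dB spectrogram onto a log-spaced frequency
    axis and return a grayscale uint8 image (row 0 = highest frequency)."""
    n_bins = spec_db.shape[0]
    nyquist = sample_rate / 2.0

    # Target frequencies, geometrically (log) spaced; f_min must be > 0.
    log_freqs = np.geomspace(f_min, nyquist, height)

    # Fractional row index of each target frequency in the linear spectrogram.
    pos = np.clip(log_freqs / nyquist * (n_bins - 1), 0, n_bins - 1)
    lower = np.floor(pos).astype(int)
    upper = np.minimum(lower + 1, n_bins - 1)
    frac = (pos - lower)[:, None]

    # Linear interpolation between neighbouring frequency rows, all frames at once.
    log_spec = (1.0 - frac) * spec_db[lower] + frac * spec_db[upper]

    # Map the dB range onto 0..255 (a constant input becomes all zeros).
    lo, hi = log_spec.min(), log_spec.max()
    if hi > lo:
        scaled = (log_spec - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(log_spec)
    image = np.round(255 * scaled).astype(np.uint8)

    # Flip vertically so low frequencies end up at the bottom of the image.
    return image[::-1]
```

With the function above, something like to_log_frequency_image(get_spectrogram_from_wav(wav, sample_rate), sample_rate) would give an array that can be saved directly as an image, without going through Matplotlib. Rows below roughly one STFT bin width are interpolated from very few bins, so raising f_min or n_fft may help if the bottom of the image looks smeared.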