About
This is an idea that interests me. I seek feedback, references, and collaborators. I have considered implementing this in Python. My background is in image algorithms, not so much in music.
Summary
Some image algorithms can be generalized and applied to music, if music is represented like image pixmaps.
Music maps
A Map is a 2-D addressable array.
A MusicMap is a Map where the addressable unit is a Musicel, like a Pixel.
A Musicel comprises Musicelels, like Pixelels.
Typical Musicelels:
- pitch
- volume
- waveform (attack/decay)
Visually, if you take a sheet of music and divide it into quads, where each quad is sized so that it contains only one note, you have a MusicMap. Some quads contain silence (rests). I gloss over the fact that not all notes have the same duration: you can spread a long note across many Musicels.
(A PixMap is a 2-D array of Pixels comprising Pixelels i.e. RGB values.)
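A minimal sketch of these definitions in Python, under assumptions that are mine and not settled above: pitch is a MIDI-style note number, volume runs from 0.0 to 1.0, the waveform is a placeholder string, and silence is a Musicel with zero volume. The names `make_music_map` and `place_note` are hypothetical helpers for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Musicel:
    """One addressable unit of a MusicMap, analogous to a Pixel."""
    pitch: int = 0          # assumption: MIDI-style note number
    volume: float = 0.0     # assumption: 0.0 (silent) to 1.0 (loudest)
    waveform: str = "none"  # placeholder for an attack/decay description


# Assumption: silence is modeled as a Musicel with zero volume.
SILENCE = Musicel()


def make_music_map(lines, columns):
    """Build a 2-D array (list of rows) filled with silence."""
    return [[SILENCE for _ in range(columns)] for _ in range(lines)]


def place_note(music_map, line, start, duration, musicel):
    """Spread one long note across `duration` consecutive Musicels."""
    for column in range(start, start + duration):
        music_map[line][column] = musicel


music_map = make_music_map(lines=2, columns=8)
place_note(music_map, line=0, start=2, duration=3,
           musicel=Musicel(pitch=60, volume=0.8, waveform="staccato"))
```

Here a note lasting three time steps occupies columns 2 through 4 of line 0, and every other Musicel remains silent.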
Comparing the self-similarity of PixMaps and MusicMaps
In a PixMap, each Pixel is related (correlated?) to its neighboring Pixels with the same strength in every direction. In a MusicMap, there is less relation between a Musicel and the Musicels above and below it (in the same measure, on the lines of music above and below).
In many images of the real world, there might not be much relation between the top and bottom of an image. In music, there is generally some relation between all large parts of the piece, e.g. a shared scale and set of chord changes. Restating, some images (those of man-made objects as opposed to natural objects) are not very fractal throughout, whereas music is usually fractal throughout.
Generalizing image processing algorithms to music
An image processing algorithm that works on PixMaps can be generalized to work on MusicMaps. You simply make the algorithm polymorphic on the base class of Pixels and Musicels (call the base class Cel).
The Cel class might have these methods:
- compare(self, other)
- average(self, other)
More generally there would be many methods for reducing two Cels to one, e.g. min(self, other)
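The polymorphism described above might look like the following sketch. The field names, the distance used in `compare`, and the helper `pairwise_average` are my assumptions for illustration, not a settled design.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Cel(ABC):
    """Common base class for Pixel and Musicel."""

    @abstractmethod
    def compare(self, other):
        """Return a non-negative difference; 0 means identical."""

    @abstractmethod
    def average(self, other):
        """Reduce two Cels to one Cel of the same kind."""


@dataclass(frozen=True)
class Pixel(Cel):
    r: float
    g: float
    b: float

    def compare(self, other):
        # Assumption: sum of absolute channel differences.
        return abs(self.r - other.r) + abs(self.g - other.g) + abs(self.b - other.b)

    def average(self, other):
        return Pixel((self.r + other.r) / 2,
                     (self.g + other.g) / 2,
                     (self.b + other.b) / 2)


@dataclass(frozen=True)
class Musicel(Cel):
    pitch: float
    volume: float

    def compare(self, other):
        # Assumption: pitch and volume differences are weighted equally.
        return abs(self.pitch - other.pitch) + abs(self.volume - other.volume)

    def average(self, other):
        return Musicel((self.pitch + other.pitch) / 2,
                       (self.volume + other.volume) / 2)


def pairwise_average(row):
    """Generic algorithm: works on any row of Cels, Pixels or Musicels."""
    return [a.average(b) for a, b in zip(row, row[1:])]


smoothed_pixels = pairwise_average([Pixel(0, 0, 0), Pixel(2, 4, 6)])
smoothed_music = pairwise_average([Musicel(60, 0.5), Musicel(64, 1.0)])
```

The point of the sketch is that `pairwise_average` never inspects what kind of Cel it is given; it only calls the methods of the base class.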
An example
Consider a blur algorithm, which reduces the resolution of an image (or music). Suppose a symphony of 1,024 lines of 32 notes each. A blur algorithm applied to the symphony might reduce it to 32 lines of one note each (a reduction by a factor of 32 in each dimension).
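As a rough illustration of that arithmetic, here is a sketch of the simplest possible blur: averaging non-overlapping square blocks. It assumes a single numeric Musicelel per Musicel (say, pitch) stored as a list of rows; `block_reduce` is a hypothetical name, and the symphony data is made up.

```python
def block_reduce(grid, factor):
    """Average each non-overlapping factor-by-factor block into one value."""
    lines, columns = len(grid), len(grid[0])
    if lines % factor or columns % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    reduced = []
    for top in range(0, lines, factor):
        reduced_row = []
        for left in range(0, columns, factor):
            block = [grid[r][c]
                     for r in range(top, top + factor)
                     for c in range(left, left + factor)]
            reduced_row.append(sum(block) / len(block))
        reduced.append(reduced_row)
    return reduced


# Made-up data: 1,024 lines of 32 notes each, one pitch value per note.
symphony = [[float((line + column) % 12) for column in range(32)]
            for line in range(1024)]

blurred = block_reduce(symphony, 32)
shape = (len(blurred), len(blurred[0]))  # 32 lines of one note each
```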
Music is played/heard linearly, unlike an image which is parallel processed by the eye/brain. Still, if you played the reduced symphony, you might get a blurred sense of it. Suppose the first movement of the symphony used mostly staccato, short, high notes. That might come through in the blur.
Continuing along this line of thought, there are blur algorithms that are better at retaining ‘features’, known as image summarization algorithms. These are used, for example, to compress images to thumbnails. Can you create a thumbnail for music that is not just a sample?
Which image processing algorithms are interesting when applied to music?
Image processing algorithms categorized as ‘structural image editing’ algorithms seem most applicable to music. For example:
- summarization (reduce resolution while retaining features)
- context sensitive paste (inserting a sample that matches nearby context)
- reconstruction (filling in holes to match nearby context)
- collage (making one from two)
- alteration detection
- texture transfer (applying texture from one to another)
There is a foundational algorithm called bidirectional similarity that is useful in the structural image editing category of image processing algorithms.
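To give a feel for the idea, here is a heavily simplified sketch of a bidirectional similarity measure on a single line of pitches. As I understand it, the measure sums two terms: completeness (every patch of the source should have a close match in the target) and coherence (every patch of the target should have a close match in the source). The simplifications are my assumptions: 1-D sequences instead of 2-D maps, a single patch size rather than multiple scales, squared pitch difference as the patch distance, and brute-force search.

```python
def patches(sequence, size):
    """All contiguous windows of length `size`."""
    return [tuple(sequence[i:i + size])
            for i in range(len(sequence) - size + 1)]


def patch_distance(p, q):
    """Assumption: sum of squared differences between two patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def one_way_distance(from_patches, to_patches):
    """Mean distance from each patch to its nearest neighbor on the other side."""
    return sum(min(patch_distance(p, q) for q in to_patches)
               for p in from_patches) / len(from_patches)


def bidirectional_distance(source, target, size=3):
    """Lower is more similar; 0 means every patch has an exact match both ways."""
    if len(source) < size or len(target) < size:
        raise ValueError("both sequences must be at least `size` long")
    source_patches = patches(source, size)
    target_patches = patches(target, size)
    completeness = one_way_distance(source_patches, target_patches)
    coherence = one_way_distance(target_patches, source_patches)
    return completeness + coherence


melody = [60, 62, 64, 60, 62, 64]
summary = [60, 62, 64]
score = bidirectional_distance(melody, summary)
```

In this example the summary is perfectly coherent (its one patch occurs in the melody), but not complete, because patches that straddle the repeat, such as (62, 64, 60), have no exact match in the summary.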
Silence and transparency
There is another interesting question here, about generalizing silence in music and transparency in images. I think they are analogous, and that many concepts from image processing, such as pre-multiplied alpha, could be carried over. For an example of the question: is a silence represented by a duration Musicelel having value zero, or represented by all its Musicelels having value zero?
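One way to explore the analogy is a sketch that treats volume like alpha. It assumes a Musicel is just a (pitch, volume) pair and that silence means zero volume; both function names are hypothetical. Averaging pitch weighted by volume, as premultiplied alpha does for color, keeps silent Musicels from dragging the pitch toward zero.

```python
def average_naive(cels):
    """Average pitch and volume independently; silence distorts the pitch."""
    count = len(cels)
    return (sum(pitch for pitch, _ in cels) / count,
            sum(volume for _, volume in cels) / count)


def average_premultiplied(cels):
    """Weight each pitch by its volume, as premultiplied alpha weights color."""
    total_volume = sum(volume for _, volume in cels)
    mean_volume = total_volume / len(cels)
    if total_volume == 0:
        return (0.0, 0.0)  # all silent: pitch is undefined, report silence
    weighted_pitch = sum(pitch * volume for pitch, volume in cels)
    return (weighted_pitch / total_volume, mean_volume)


# One full-volume note at pitch 60 next to one silent Musicel.
pair = [(60, 1.0), (0, 0.0)]
naive = average_naive(pair)
premultiplied = average_premultiplied(pair)
```

In this example the naive average halves the pitch to 30, while the volume-weighted average keeps pitch 60 at half volume, which seems closer to what a blurred note next to a rest should sound like.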