3rd Workshop on Intelligent Music Production (WIMP 2017)
Friday 15th September
Media City UK, University of Salford
Registration can be completed via the University of Salford Online Shop. The cost of registration is £20, which includes the full conference plus food and refreshments throughout the day. The welcome event on Thursday evening is also included but spaces are limited.
The venue is located within 200 m of the Metrolink station at MediaCityUK, which has links to Manchester Piccadilly train station (~20 minutes) as well as Manchester Airport (~1 hr, with a change at Cornbrook station). Alternatively, a taxi from the airport costs ~£25, or ~£12 from the city centre.
A large number of hotels and restaurants are located within a 10-minute walk of the main venue.
The conference will be preceded by a welcome event at the University of Salford main campus (Newton Building – see map below), on the evening of 14th September. This will start at 18:30 and conclude at 20:00. This event will feature a tour of the acoustics test facilities (anechoic chamber, reverberation room), demonstrations from our on-going research projects, including Media Device Orchestration, plus a wine reception.
|09:45||Keynote||Alexandros Tsilfidis, Accusonus: TOWARDS SOUND INTELLIGENCE: A MUSIC SOFTWARE STARTUP’S PERSPECTIVE|
|10:45||Demo: Behringer DeepMind12 Synth|
|11:00||Paper Session A|
|Brecht De Man, Joshua D. Reiss and Ryan Stables||TEN YEARS OF AUTOMATIC MIXING|
|Dominic Ward, Hagen Wierstorf, Russell D. Mason, Mark Plumbley and Chris Hummersone||ESTIMATING THE LOUDNESS BALANCE OF MUSICAL MIXTURES USING AUDIO SOURCE SEPARATION|
|Nicholas Jillings and Ryan Stables||AUTOMATIC CHANNEL ROUTING USING MUSICAL INSTRUMENT LINKED DATA|
|13:15||Posters & Late Breaking Demos + Coffee|
|David Su||USING THE FIBONACCI SEQUENCE TO TIME-STRETCH RECORDED MUSIC|
|Jordie Shier, Kirk McNally and George Tzanetakis||SIEVE: A PLUGIN FOR THE AUTOMATIC CLASSIFICATION AND INTELLIGENT BROWSING OF KICK AND SNARE SAMPLES|
|Alex Wilson||PERCEPTUALLY-MOTIVATED GENERATION OF ELECTRIC GUITAR TIMBRES USING AN INTERACTIVE GENETIC ALGORITHM|
|Marco A. Martínez Ramírez and Joshua D. Reiss||DEEP LEARNING AND INTELLIGENT AUDIO MIXING|
|Spyridon Stasis, Nicholas Jillings, Sean Enderby and Ryan Stables||AUDIO PROCESSING CHAIN RECOMMENDATION USING SEMANTIC CUES|
|14:00||Keynote||Andy Elmsley, Melodrive: (SOME) CHALLENGES IN REAL-TIME INTELLIGENT MUSIC PRODUCTION|
|14:45||Coffee Break + Demos|
|15:15||Paper Session B|
|Balandino Di Donato and James Dooley||MYOSPAT: A SYSTEM FOR MANIPULATING SOUND AND LIGHT PROJECTIONS THROUGH HAND GESTURES|
|Trevor Agus and Chris Corrigan||ADAPTING AUDIO MIXES FOR HEARING-IMPAIRMENTS|
|Alex Wilson, Róisín Loughran and Bruno M. Fazenda||ON THE SUITABILITY OF EVOLUTIONARY COMPUTING TO DEVELOPING TOOLS FOR INTELLIGENT MUSIC PRODUCTION|
|16:45||Panel session||TOPIC: Are Sound Engineering Jobs at Risk?
Alexandros Tsilfidis (Accusonus): TOWARDS SOUND INTELLIGENCE: A MUSIC SOFTWARE STARTUP’S PERSPECTIVE
Accusonus’ mission is to build products that open new creative possibilities for everyone involved in the music making process. In this talk, we will discuss the challenges of trying to transform a modern signal processing algorithm into a successful software product. I will highlight specific problems we confront when we develop and iterate on our machine learning based software (e.g. drumatom and Regroover). Given our previous research experience, I will also share some thoughts on how developing an algorithm is different if you are looking to productize it rather than publish it. Finally I will present our vision on how we want to use and apply machine learning in a way that empowers musicians, producers and sound engineers.
Andy Elmsley (Melodrive): (SOME) CHALLENGES IN REAL-TIME INTELLIGENT MUSIC PRODUCTION
We are hopefully all familiar with the concept of linear music. In a film score, even if the music reflects the action on-screen, it is always the same, no matter how many times you watch the film. In interactive media, such as VR and games, things are not so linear. The player has the ability to affect the world around them, triggering events and different states at any time. The music in these situations should depict these emotional settings and adapt to the player and the experience. This is known as adaptive music.
However, as interactive experiences grow in size and scope we hit a problem: it’s impossible for a composer to conceive and produce every musical outcome to address all the possible decisions and behaviours of a player in a game.
With Melodrive, we’re building an AI platform to help composers with the mammoth task they face. Melodrive is an AI music platform that can automatically generate emotionally-driven adaptive music in realtime. The aim of our system is not to replace composers, but rather to augment them by creating infinite emotional variations of their music, and automatic transitions between musical states.
To succeed, our system must produce convincing emotive variations that respond in realtime, with the highest production quality and the lowest possible latency, memory use and processor load. This talk will address some of the challenges we’re facing in building such a system.
The panel session, entitled “Are Sound Engineering Jobs at Risk?”, will address the growing popularity of AI and intelligent systems in sound engineering. Experts from academia and industry will discuss recent developments, along with the implications that they have for jobs in the field. Confirmed panel members are: Alessandro Palladini (Music Group), Enrique Perez-Gonzalez (Solid State Logic), Hyunkook Lee (University of Huddersfield) and Steinunn Arnardottir (Native Instruments).
TEN YEARS OF AUTOMATIC MIXING [pdf]
Brecht De Man, Joshua D. Reiss and Ryan Stables
Reflecting on a decade of Automatic Mixing systems for multitrack music processing, this paper positions the topic in the wider field of Intelligent Music Production, and seeks to motivate the existing and continued work in this area. Tendencies such as the introduction of machine learning and the increasing complexity of automated systems become apparent from examining a short history of relevant work, and several categories of applications are identified. Based on this systematic review, we highlight some promising directions for future research for the next ten years of Automatic Mixing.
ESTIMATING THE LOUDNESS BALANCE OF MUSICAL MIXTURES USING AUDIO SOURCE SEPARATION [pdf]
Dominic Ward, Hagen Wierstorf, Russell D. Mason, Mark Plumbley and Chris Hummersone
To assist with the development of intelligent mixing systems, it would be useful to be able to extract the loudness balance of sources in an existing musical mixture. The relative-to-mix loudness level of four instrument groups was predicted using the sources extracted by 12 audio source separation algorithms. The predictions were compared with the ground truth loudness data of the original unmixed stems obtained from a recent dataset involving 100 mixed songs. It was found that the best source separation system could predict the relative loudness of each instrument group with an average root-mean-square error of 1.2 LU, with superior performance obtained on vocals.
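As a rough illustration of the error metric quoted in this abstract, the sketch below computes a root-mean-square error between predicted and ground-truth relative loudness levels. The function name and the example values are hypothetical and are not taken from the paper; they only show how an error expressed in LU could be aggregated across instrument groups.

```python
import math

def rms_error_lu(predicted, reference):
    """Root-mean-square error between two equal-length lists of loudness levels (LU)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical relative-to-mix loudness levels (LU) for four instrument groups.
reference = [-3.0, -8.0, -10.0, -6.0]   # assumed ground truth from unmixed stems
predicted = [-2.5, -9.0, -11.5, -6.0]   # assumed estimates from separated sources
error = rms_error_lu(predicted, reference)
print(f"RMS error: {error:.2f} LU")
```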
AUTOMATIC CHANNEL ROUTING USING MUSICAL INSTRUMENT LINKED DATA [pdf]
Nicholas Jillings and Ryan Stables
Audio production encompasses more than just mixing a series of input channels. Most sessions involve tagging tracks, applying audio effects, and configuring routing patterns to build sub-mixes. Grouping tracks together gives the engineer more control over a group of instruments, and allows the group to be processed simultaneously using audio effects. Knowing which tracks should be grouped together is not always clear as this involves subjective decisions from the engineer in response to a number of external cues, such as the instrument or the musical content. This study introduces a novel way to automatically route a set of tracks through groups and subgroups in the mix. It uses openly available linked databases to infer the relationship between instrument objects in a DAW session, utilising graph theory and hierarchical clustering to obtain the groups. This can be used in any intelligent production environment to configure the sessions’ routing parameters.
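The grouping step mentioned in this abstract can be pictured with a small agglomerative clustering sketch. This is not the authors' implementation: the track names, the pairwise distances and the threshold are all made up for illustration, and single linkage is an assumed choice. In the paper the relationships between instruments are inferred from linked data, not from a hand-written table.

```python
def cluster_tracks(names, distance, threshold):
    """Single-linkage agglomerative clustering: repeatedly merge the two closest
    groups until the closest pair is further apart than `threshold`."""
    groups = [[n] for n in names]

    def linkage(a, b):
        # Single linkage: distance between the closest members of two groups.
        return min(distance[frozenset((x, y))] for x in a for y in b)

    while len(groups) > 1:
        pairs = [(linkage(groups[i], groups[j]), i, j)
                 for i in range(len(groups)) for j in range(i + 1, len(groups))]
        d, i, j = min(pairs)
        if d > threshold:
            break
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

# Hypothetical pairwise distances (0 = closely related, 1 = unrelated).
tracks = ["kick", "snare", "violin", "cello"]
distance = {
    frozenset(("kick", "snare")): 0.1,
    frozenset(("violin", "cello")): 0.2,
    frozenset(("kick", "violin")): 0.9,
    frozenset(("kick", "cello")): 0.9,
    frozenset(("snare", "violin")): 0.8,
    frozenset(("snare", "cello")): 0.9,
}
groups = cluster_tracks(tracks, distance, threshold=0.5)
print(groups)
```

Lowering the threshold yields more, smaller subgroups; raising it merges them into broader groups, which is one way a hierarchy of groups and subgroups could be read off.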
USING THE FIBONACCI SEQUENCE TO TIME-STRETCH RECORDED MUSIC [pdf]
David Su
Human creativity in the music production process is often augmented by technological tools that modify existing musical input, thereby generating unanticipated ideas. The Fibonacci stretch algorithm is a means of time-stretching a music recording, using the Fibonacci sequence as a theoretical basis, such that the resulting music recording is perceived as being in a new meter. Because the overall impulse shape of the recording remains intact, Fibonacci stretch generates novel rhythmic ideas rooted in a natural-sounding transformation of existing musical material. This paper explores the possibilities for Fibonacci stretch to assist in the music production process, and also introduces an audio plugin prototype that encompasses a real-time variant of the algorithm.
SIEVE: A PLUGIN FOR THE AUTOMATIC CLASSIFICATION AND INTELLIGENT BROWSING OF KICK AND SNARE SAMPLES [pdf]
Jordie Shier, Kirk McNally and George Tzanetakis
The use of electronic drum samples is widespread in contemporary music productions, with music producers having an unprecedented number of samples available to them. To be efficient, users of these large collections require new tools to assist them in sorting, selection and auditioning tasks. This paper presents a new plugin for working with a large collection of kick and snare samples within a music production context. A database of 4230 kick and snare samples, representing 250 individual electronic drum machines, is analyzed by segmenting the audio samples into different sample lengths and characterizing these segments using audio feature analysis. The resulting multidimensional feature space is reduced using principal component analysis (PCA). Samples are mapped to a 2D grid interface within an audio plug-in built using the JUCE software framework.
PERCEPTUALLY-MOTIVATED GENERATION OF ELECTRIC GUITAR TIMBRES USING AN INTERACTIVE GENETIC ALGORITHM [pdf]
Alex Wilson
This paper presents a system for the interactive modification of electric guitar timbre. A genetic algorithm was used to explore the parameter space of a simplified re-amping circuit, which consisted of an initial high-pass filter, a soft-clipping circuit, equalisation and cabinet simulation. This allowed perceptually optimal solutions to be found in the parameter space, e.g. to find sounds that are “warm”, “bright”, “heavy” or any other perceptual quality, as perceived by the user. Such a system could be used to increase accessibility in music production. Additionally, it is hoped that this system can be used in future psychoacoustic experiments investigating the perception of electric guitar timbre, or that of similar instruments.
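For readers unfamiliar with the approach, the sketch below shows the general shape of a genetic algorithm driven by a rating function. It is not the system described in the paper: the four normalised parameters, the population size, the operators and the simulated rating are all assumptions. In an interactive setting the `rate` function would be replaced by a listener's judgement of each candidate sound.

```python
import random

def evolve(rate, n_params=4, pop_size=8, generations=20, mutation=0.1, seed=0):
    """Minimal genetic algorithm; `rate` stands in for the listener's rating."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=rate, reverse=True)
        parents = ranked[: pop_size // 2]          # keep the better-rated half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped to the normalised range [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation))) for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=rate)

# Stand-in for a human rating: prefers settings close to a hypothetical target.
target = [0.2, 0.8, 0.5, 0.6]

def simulated_rating(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))

best = evolve(simulated_rating)
print([round(g, 2) for g in best])
```

Because the better-rated half of each generation is carried over unchanged, the best rating in the population never gets worse from one generation to the next.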
DEEP LEARNING AND INTELLIGENT AUDIO MIXING [pdf]
Marco A. Martínez Ramírez and Joshua D. Reiss
Mixing multitrack audio is a crucial part of music production. With recent advances in machine learning techniques such as deep learning, it is of great importance to conduct research on the applications of these methods in the field of automatic mixing. In this paper, we present a survey of intelligent audio mixing systems and their recent incorporation of deep neural networks. We propose to the community a research trajectory in the field of deep learning applied to intelligent music production systems. We conclude with a proof of concept based on stem audio mixing as a content-based transformation using a deep autoencoder.
AUDIO PROCESSING CHAIN RECOMMENDATION USING SEMANTIC CUES [pdf]
Spyridon Stasis, Nicholas Jillings, Sean Enderby and Ryan Stables
Sound engineers typically allocate audio effects to a channel strip in series. This allows the engineer to perform a complex set of operations to fine-tune different tracks in a mixing or mastering environment. In this research, trends in plugin chain selection are investigated, focusing on transformations which modify the timbral characteristics of a sound. Using this information, a recommendation system can be constructed to generate full processing chains in a Digital Audio Workstation (DAW).
MYOSPAT: A SYSTEM FOR MANIPULATING SOUND AND LIGHT PROJECTIONS THROUGH HAND GESTURES [pdf]
Balandino Di Donato and James Dooley
MyoSpat is an interactive audio-visual system that aims to augment musical performances by empowering musicians and allowing them to directly manipulate sound and light through hand gestures. We present the second iteration of the system that draws from the research findings to emerge from an evaluation of the first system.
MyoSpat 2 is designed and developed using the Myo gesture control armband as input device and Pure Data as gesture recognition and audio-visual engine. The system is informed by human-computer interaction (HCI) principles: tangible computing and embodied, sonic and music interaction design (MiXD). This paper reports a description of the system and its audio-visual feedback design. We present an evaluation of the system, its potential use in different multi-media contexts and in exploring embodied, sonic and music interaction principles.
ADAPTING AUDIO MIXES FOR HEARING-IMPAIRMENTS [pdf]
Trevor Agus and Chris Corrigan
A hearing impairment can make it harder to pick apart typical soundtracks, whose dialog, music, and sound effects have likely been mixed with normal-hearing listeners in mind. This paper reviews the potential for enhancing audio mixes, with a focus on preserving the original intentions of the sound engineer, even if this involves changing the mix. Hearing aid strategies are contrasted with more extended enhancements that would be possible off-line. The solutions proposed range from remastering from an established mix, through remixing from available stems, to more extensive processing of individual stems. The arrival of object-based audio makes these solutions more feasible and offers the opportunity to test and develop psychoacoustical theories in the complex but controlled world of the sound engineer.
ON THE SUITABILITY OF EVOLUTIONARY COMPUTING TO DEVELOPING TOOLS FOR INTELLIGENT MUSIC PRODUCTION [pdf]
Alex Wilson, Róisín Loughran and Bruno M. Fazenda
Intelligent music production tools aim to assist the user by automating music production tasks. Many previous systems sought to create the best possible mix based on technical parameters but rarely has subjectivity been directly incorporated. This paper proposes that a new generation of tools can be designed based on evolutionary computation methods, which are particularly suited to dealing with the non-linearities and complex solution spaces introduced by perceptual evaluation. These techniques are well-suited to studio applications, in contrast to many previous systems which prioritized the live environment. Furthermore, there is potential to address accessibility issues in existing systems which rely greatly on visual feedback. A survey of previous literature is provided before the current state-of-the-art is described and a number of suggestions for future directions in the field are made.
Call for Papers:
The production of music often involves interdisciplinary challenges, requiring creativity, extensive knowledge of audio processing and exceptional listening skills. Many of these complex production processes have rules that could be made more intuitive, or managed by intelligent processes. Intelligent Music Production focuses on developing systems that map these requirements into automated or adaptive processes within the production chain to achieve results which are both efficient and aesthetically pleasing.
This event will provide an overview of the tools and techniques currently being developed in the field, whilst providing insight for audio engineers, producers and musicians looking to gain access to new technologies. The day will consist of presentations from leading academics, keynotes, posters and demonstrations.
We welcome submissions on intelligent music production from researchers worldwide at all stages of their careers.
Suggested topics include:
- Intelligent music production systems for common tasks such as level-balancing, equalisation, dynamic range processing, audio editing, etc.
- Intelligent music production systems capable of: generating/performing music; supporting the musical creativity of human users; incorporating affective responses.
- Philosophical foundations of IMP systems
- Accessibility in IMP systems
- Surveys of state-of-the-art techniques in the area
- Studies on the applicability of IMP techniques to other research areas
As well as submissions in the following general areas:
- Perception, psychoacoustics and evaluation
- Source separation
- Semantic audio processing
- Musical similarity and structure analysis
Paper submissions are accepted in PDF format and should be between 2 and 4 pages, including references. Templates for submission are available for Word and LaTeX. Papers should be sent to firstname.lastname@example.org.
On submission, authors should express in their email a preference for either poster or oral presentation. Posters will be presented during the coffee and lunch breaks.
10 April 2017 1st call for papers
10 May 2017 2nd call for papers
23 June 2017 (extended from 15 June 2017) Deadline for full-paper submission
19 July 2017 Notification of acceptance
15 August 2017 Camera-ready paper submission
14 September 2017 Welcome event and demos
15 September 2017 Conference
Further details will be made available here, at http://www.semanticaudio.co.uk/events/wimp2017/. For more information about the event, please get in touch with the Intelligent Music Production committee, or follow our Twitter account.
Bruno Fazenda (chair), email@example.com
Alex Wilson (co-chair), firstname.lastname@example.org
Ryan Stables, email@example.com
Josh Reiss, firstname.lastname@example.org
Brecht De Man, email@example.com