Replacing lighting programmers with AI

The title is click bait, but I can picture situations where we could replace lighting programmers with artificial intelligence using language understanding techniques that power Alexa, Google Assistance and Cortana. I’m going to discuss how I’m working on this using Microsoft language understanding and intelligence technology.

Console Syntax

Most DMX Lighting control systems have a command line interface which supports a domain specific language for programming lights for theatre shows, concerts, tv and everything in between. The syntax is usually pretty similar across different manufacturers, but there always lies some subtle differences that can trip up experienced users when switching systems.

As I’ve been investigating how to create my language for users to program their shows on my control system, I’ve continually come back to the idea that the lighting industry has standardised what they do but its the tools that offer the variables.

An extreme of this is the theatre world where the lighting designer may call for specific fixtures and exact intensities. They may say something like “one and five at fifty percent”. The lighting programmer will then type in something like 1 + 5 @ 50 Enter on one brand of lighting consoles and perhaps Fixture 1 + 5 @ 5 on another console. The intent is the same but the specific syntax changes.

It’s currently the role of the lighting programmer to understand the intent of the designer and execute that on the hardware in front of them. The best lighting programmers can translate more abstract and complex queries into machine understandable commands using the domain-specific language of the control system their using. They understand the lighting consoles every feature and are master translators. They’re a bridge between the creative and the technical, but they are still fundamentally just translating intents.

Removing the need for human translation

Voice to text would go some way to being able to remove the lighting programmer as for simple commands like the one demonstrated earlier, it’s easy to convert the utterance to an action, but most designers don’t build scenes like this. For more complex commands, the console will likely get it wrong, and with no feedback loop, it won’t have the opportunity to learn from its mistakes like a human.

This is where utilising AI will significantly help. I’m currently working on my console featuring the ability to use machine learning, powered by the cloud so that eventually, even the most complex of requests should be fulfilled by the console alone. While cloud-connected in a lighting console probably seems strange, it is just a stepping stone to a full-offline supported system.

Language Understanding with AI

Let me walk you through the training process as I go about teaching it to understand a range of syntax and utterances. I’ve started this in the blog post from scratch, but in reality, I have a fleshed out version of this with many more commands and supports syntax from a variety of consoles.
The first step is to create a new ‘App’ within the Language Understanding and Intelligence service (LUIS).

NewAPp

By default, our app has no intents as we’re able to build a solution to suit our needs and this isn’t a pre-canned or prebuilt solution. To get started we’ll try and define an intent to effect the Intensity parameter of a fixture.

empty intents

EmptyIntent

We need to provide some examples of what the user might say to trigger this intent. To make this powerful, we want to give a variety of examples. The most natural being something like “1 @ 50”. This contains numeric symbols because that’s what our consoles interfaces provide us with, but if we’re using voice to text solutions, we will get the following response “One at fifty”. To solve this, we need to create some entities so that our AI understands that one is also 1. Thankfully to make developers lives easier, Microsoft provides a whole host of prebuilt entities so we can use their number entity rather than build our own.

entities

Matching to numbers is helpful, but we also need to provide information about other types of lighting specific entities. Below I define a few SourceTypes as the offline syntax for my console follows the grammar rules of ‘Source, Command Destination’.

Creating custom entities

I also provide synonyms which mean if a lighting designer for some crazy reason calls all lighting fixtures “device” then we can arcuately calculate the intent. Synonyms are incredibly powerful when we’re building out language understanding as you’ll see below in the PaletteType entity. I’ve created synonyms for Intensity which allows designers to say things like “Fixture 1 brightness at 50 percent” rather than knowing that the console thinks of brightness as intensities. I’ve also made sure to misspell Colour for Americans…

paletteTypes

Even with just three entities, our intent is more useful than just setting intensities. We can now handle basic fixture manipulation. For example “Group 3 @ Position 5” would work correctly with this configuration. For this reason, I renamed the intent to something more sensible (Programmer.SourceAtDestination).

renamming

Training and testing the AI

Before we can use our AI service, we must train it. The more data, the better but we can get some great results with what we already have.

TrainedApp

Below you can see I passed in the utterance of “fixture 10 @ colour 5”.

testresults

The top scoring intent (we only have one so its a little bit of a cheat) is Programmer.SourceAtDestination. The source type is Fixture and the number 10.

What’s next?

Language and conversation will be used in many technologies that may not be obvious right now. I believe it won’t be long until a lighting control manufacturer releases some form of AI bot or language understanding within their consoles and these get better with every day there used. Maybe I’ll be first, but I can’t believe no one else has seen this technology and not thought how to build it into a control system so perhaps we’ll start to see this as the norm in a few years.
Right now its early days but I’d put money on there being some virual lighting programmers shipping with consoles. What type of personality the virtual programmers have will be down to each manufacturer. I hope that they realise that their virtual programmer needs some personality or it’ll be no more engaging than voice to text. I’ve given some thought to my bots personality, and it stems from real people. I’m hoping to provide both female and male assistance, and they’ll be named Nick and Rachel.

Takeaways

It’s never too early to start investigating how AI can disrupt your industry. This post focuses on the niche that is the lighting control industry, but this level of disruption will be felt across all industries. Get ahead of the curve and learn how you can shape the future of your industry with Microsoft AI services.

 

 

 

Consuming Microsoft Cognitive Services with Swift 4

This post is a direct result of a conversation with a colleague in a taxi in Madrid. We were driving to Santiago Bernabéu (the Real Madrid Stadium) to demonstrate to business leaders the power of artificial intelligence.

The conversation was around the ease of use of Cognitive Services for what we call “native native” developers. We refer to those that use Objective-C, Swift or Java as ‘native native’ as frameworks like ReactNative and Xamarin are also native, but we consider these “XPlat Native”. He argued that the lack of Swift SDKs prevented the adoption of our AI services such as our Vision APIs.

I maintained that all Cognitive Service APIs are well documented, and we provide an easy to consume suit of REST APIs, which any Swift developer worth their salt should be able to use with minimal effort.

Putting money where my mouth is

Having made such a statement, it made sense for me to test if my assertion was correct by building a sample app that integrates with Cognitive Services using Swift.

Introducing Bing Image Downloader. A fully native macOS app for downloading images from Bing, developed using Swift 4.

Screen Shot 2018-05-10 at 11.11.55.png

I’ve put the code on Github for you to download and play with if you’re interested in using Cognitive Services within your Swift apps, but I’ll also explain below how I went about building the app.

Where the magic happens

In the interest of good development practices, I started by creating a Protocol (C# developers should think of these as Interfaces) to define what functions the ImageSearch class will implement.

Protocol

protocol ImageServiceProtocol {
// We will take the results and add them to hard-coded singleton class called AppData. 
func searchForImageTerm(searchTerm : String)

// We pass in a completion handler for processing the results of this func
func searchForImageTerm(searchTerm : String, completion : @escaping ([ImageSearchResult]) -> ())
}

Two Implementations for one problem

I’ve made sure to include two implementations to give you options on how you’d want to interact with Cognitive Services. The approach used in the App makes use of the Singleton class for storing AppData as well as using Alamofire for handling network requests. We’ll look at this approach first.

search For Image Term

This is the public func, which is easiest to consume.

func searchForImageTerm(searchTerm : String) {

    //Search for images and add each result to AppData
    DispatchQueue.global.(qos: .background).async {
        let totalPics = 100
        let picsPerPage = 50 
        let numPages = totalPics / picsPerPage 
        (0 ..< numPages)             
            .compactMap { self.createUrlRequest(searchTerm: searchTerm, pageOffset: $0 }             
            .foreach{ self.fetchRequest(request: $0 as NSURLRequest) }         
        .RunLoop.current.run()     } 
} 

create Url Request

private func createUrlRequest(searchTerm : String, pageOffset : Int) -> URLRequest {

    let encodedQuery = searchTerm.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)!
    let endPointUrl = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"

    let mkt = "en-us"
    let imageType = "photo"
    let size = "medium" 

    // We should move these variables to app settings
    let imageCount = 100
    let pageCount = 2
    let picsPerPage = totalPics / picsPerPage 

    let url = URL(string: "\(endPointUrl)?q=\(encodedQuery)&count=\(picsPerPage)&offset=\(pageOffset * picsPerPage)&mkt=\(mkt)&imageType=\(imageType)&size=\(size)")!
        
    var request = URLRequest(url: url)
    request.setValue(apiKey, forHTTPHeaderField: "Ocp-Apim-Subscription-Key")
        
    return request
}

fetch Request

This is where we attempt to fetch and parse the response from Bing. If we detect an error, we log it (I’m using SwiftBeaver for logging).

If the response contains data we can decode, we’ll loop through and add each result to our AppData singleton instance.

private func fetchRequest(request : NSURLRequest){
    //This task is responsbile for downloading a page of results
    let task = URLSession.shared.dataTask(with: request as URLRequest){ (data, response, error) -> Void in
            
    //We didn't recieve a response
    guard let data = data, error == nil, response != nil else {
        self.log.error("Fetch Request returned no data : \(request.url?.absoluteString)")
        return
    }
            
    //Check the response code
    guard let httpResponse = response as? HTTPURLResponse,
        (200...299).contains(httpResponse.statusCode) else {
        self.handleServerError(response : response!)
        return
    }
            
    //Convert data to concrete type
    do
    {
        let decoder = JSONDecoder()
        let bingImageSearchResults = try decoder.decode(ImageResultWrapper.self, from: data)
                
        let imagesToAdd = bingImageSearchResults.images.filter { $0.encodingFormat != EncodingFormat.unknown }
            AppData.shared.addImages(imagesToAdd)            
        } catch {
            self.log.error("Error decoding ImageResultWrapper : \(error)")
            self.log.debug("Corrupted Base64 Data: \(data.base64EncodedString())")
        }     
     }
        
     //Tasks are created in a paused state. We want to resume to start the fetch.
     task.resume()
}   

Option two (with no 3rd party dependancies)

As a .NET developer, the next approach threw me for a while and took a little bit of reading about Closures to fully grasp. With this approach, I wanted to return an Array of ImageSearchResult type, but this proved not to be the best approach. Instead, I would need to pass in a function that can handle the array of results instead.

// Search for images with a completion handler for processing the result array
func searchForImageTerm(searchTerm : String, completion : @escaping ([ImageSearchResult]) -> ()) {
        
    //Because Cognitive Services requires a subscription key, we need to create a URLRequest to pass into the dataTask method of a URLSession instance..
    let request = createUrlRequest(searchTerm: searchTerm, pageOffset: 0)
       
    //This task is responsbile for downloading a page of results
    let task = URLSession.shared.dataTask(with: request, completionHandler: { (data, response, error) -> Void in
            
    //We didn't recieve a response
    guard let data = data, error == nil, response != nil else {
        print("something is wrong with the fetch")
        return
    }
            
    //Check the response code
    guard let httpResponse = response as? HTTPURLResponse,
    (200...299).contains(httpResponse.statusCode) else {
        self.handleServerError(response : response!)
        completion([ImageSearchResult]())
        return
    }
            
    //Convert data to concrete type
    do
    {
        let decoder = JSONDecoder()
        let bingImageSearchResults = try decoder.decode(ImageResultWrapper.self, from: data)
                
        //We use a closure to pass back our results.
        completion(bingImageSearchResults.images)
                
    } catch { self.log.error("Decoding ImageResultWrapper \(error)") }
    })
    task.resume()
}

Wrapping Up

You can find the full project on my Github page which contains everything you need to build your own copy of this app (maybe for iOS rather than macOS?).

If you have any questions, then please don’t hesitate to comment or email me!